Group Functions using SQL
Group functions produce results based on a group or set of rows. For example, with group
functions you can get counts, sums/totals, averages, maximums, minimums, etc. If you include
the distinct clause then you will evaluate only non-duplicate values while if you include ALL
(the default) you will evaluate all values. The data types may be VARCHAR2, CHAR, NUMBER or DATE
for the functions COUNT, MAX and MIN. Null values will be ignored by all group functions except
COUNT(*). The NVL function can be used to overcome this.
FUNCTION list:
- COUNT - counts the group
- SUM - totals the group
- AVG - averages the group
- MAX - finds maximum in the group
- MIN - find minimum in the group
- STDDEV - finds the standard deviation
- VARIANCE - finds the variance
The only group function that deals with null values is COUNT(*). Other functions can be forced
to deal with null values by using the NVL function.
SQL CODE:
SQL> SELECT COUNT(*) FROM donor;
COUNT(*)
---------
6
SQL> SELECT COUNT(NVL(yrgoal,0)), AVG(NVL(yrgoal,0)), SUM(NVL(yrgoal,0))
2 FROM donor;
COUNT(NVL(YRGOAL,0)) AVG(NVL(YRGOAL,0)) SUM(NVL(YRGOAL,0))
-------------------- ------------------ ------------------
6 183.33333 1100
SQL>
The rest of these examples will ignore null values if the column being tested contains a null
value.
SQL CODE:
SQL> SELECT COUNT(yrgoal), AVG(yrgoal), SUM(yrgoal)
2 FROM donor;
COUNT(YRGOAL) AVG(YRGOAL) SUM(YRGOAL)
------------- ----------- -----------
5 220 1100
SQL> SELECT MIN(yrgoal), MAX(yrgoal)
2 FROM donor;
MIN(YRGOAL) MAX(YRGOAL)
----------- -----------
50 500
SQL> SELECT COUNT(datefst), MAX(datefst), MIN(datefst)
2 FROM donor;
COUNT(DATEFST) MAX(DATEF MIN(DATEF
-------------- --------- ---------
6 03-JUL-98 04-MAR-92
SQL> SELECT COUNT(DISTINCT(yrgoal)) FROM donor;
COUNT(DISTINCT(YRGOAL))
-----------------------
4
SQL> SELECT SUM(DISTINCT(yrgoal))
2 FROM donor;
SUM(DISTINCT(YRGOAL))
---------------------
1050
SQL> SELECT STDDEV(yrgoal), VARIANCE(yrgoal)
2 FROM donor;
STDDEV(YRGOAL) VARIANCE(YRGOAL)
-------------- ----------------
213.8925 45750
Instead of treating the whole table as the group, the GROUP BY clause can be used to make groups
based on particular criteria. In the first example below, I am printing the state and a count
of the number of cities in each state. The select line has state and count(city) and the GROUP
BY clause has state (note that state did not have to be included in the select by, only in the
group by). SQL will produce one display line for each state or group. Including the GROUP BY
function means that you will only get grouped results, you will not see individual groups. A
WHERE clause can be used to exclude things from the GROUPS as shown in the second example below.
One other important thing to note, you must use the column not the column alias in the group by
clause. One other note, the group by clause causes sorting to be in ascending order by the
column in the group by clause, to override this, you can use the order by clause. This is shown
in the third example which is ordered by state desc.
SQL CODE:
SQL> SELECT state, COUNT(city)
2 FROM donor
3 GROUP BY state;
ST COUNT(CITY)
-- -----------
MA 4
RI 2
SQL>
SQL> SELECT state, COUNT(city)
2 FROM donor
3 WHERE city !='Seekonk'
4 GROUP BY state;
ST COUNT(CITY)
-- -----------
MA 3
RI 2
SQL> SELECT state, COUNT(city)
2 FROM donor
3 GROUP BY state
4 ORDER BY state desc;
ST COUNT(CITY)
-- -----------
RI 2
MA 4
SQL>
You can also use a function in the order by, let's say I wanted to order by the count of the
cities in descending order as opposed to the state name.
SQL CODE:
SQL> SELECT state, COUNT(city)
2 FROM donor
3 GROUP by state
4 ORDER BY COUNT(city) desc;
ST COUNT(CITY)
-- -----------
MA 4
RI 2
You can also group within a group. In this example, I grouped by state and then by contact
within state. This means that first the rows will be grouped by state and then on contact
within state. The SUM of yrgoals is for each contact within each state. Note again that with
the GROUP BY I can only display grouped information so the only things in the select can be the
columns I group by and group functions. Again, I could have eliminated a particular contact, as
shown in the second example below, by the use of the WHERE clause. In the example, I eliminated
contact John Adams from the results.
SQL CODE:
SQL> SELECT state, contact, SUM(yrgoal)
2 FROM donor
3 GROUP BY state, contact;
ST CONTACT SUM(YRGOAL)
-- ------------ -----------
MA Amy Costa 150
MA John Adams 50
MA John Smith 500
RI Susan Jones 400
SQL> SELECT state, contact, SUM(yrgoal)
2 FROM donor
3 WHERE contact !='John Adams'
4 GROUP BY state, contact;
ST CONTACT SUM(YRGOAL)
-- ------------ -----------
MA Amy Costa 150
MA John Smith 500
RI Susan Jones 400
A tricky part of grouping is when you can use the WHERE clause and when you must use the HAVING
clause. My understanding is that if you restrict prior to forming the groups the WHERE clause
can be used. In the example above I eliminated contact John Adams. However if you restrict the
actual groups such as in the example below where I was testing to see if SUM(yrgoal) was > 100
then you need to use the HAVING clause. The third example below, brings the WHERE and the
HAVING together with the WHERE eliminating JOHN ADAMS and the HAVING eliminating groups that do
not have a SUM(yrgoal) > 100.
SQL CODE:
SQL> SELECT state, contact, SUM(yrgoal)
2 FROM donor
3 WHERE SUM(yrgoal) > 100
4 GROUP by state, contact;
WHERE SUM(yrgoal) > 100
*
ERROR at line 3:
ORA-00934: group function is not allowed here
SQL> SELECT state, contact, SUM(yrgoal)
2 FROM donor
3 GROUP BY state, contact
4 HAVING SUM(yrgoal) > 100;
ST CONTACT SUM(YRGOAL)
-- ------------ -----------
MA Amy Costa 150
MA John Smith 500
RI Susan Jones 400
SQL> SELECT state, contact, SUM(yrgoal)
2 FROM donor
3 WHERE contact != 'John Adams'
4 GROUP BY state, contact
5 HAVING SUM(yrgoal) > 100
6 ORDER by SUM(yrgoal);
ST CONTACT SUM(YRGOAL)
-- ------------ -----------
MA Amy Costa 150
RI Susan Jones 400
MA John Smith 500
SQL>
The two examples below show the minimum or the average yrgoals by state (example 1) and by
contact within state (example 2).
SQL CODE:
SQL> SELECT MIN(AVG(yrgoal))
2 FROM donor
3 GROUP BY state;
MIN(AVG(YRGOAL))
----------------
175
SQL> SELECT MIN(AVG(yrgoal))
2 FROM donor
3 GROUP BY state, contact;
MIN(AVG(YRGOAL))
----------------
50
SQL>