559 attributed reviews in the last 3 years
Refreshingly small course sizes
Outstandingly good courseware
Whizzy online classrooms
Wise Owls only (no freelancers)
Almost no cancellations
We have genuine integrity
We invoice after training
Review 30+ years of Wise Owl
View our top 100 clients
Use the GROUP BY clause in SQL for statistics
Part three of a four-part series of blogs
If you want to average, count, sum or perform any other statistic on values in SQL, you need to learn the GROUP BY and HAVING parts of a SQL statement. That's where this online tutorial comes in!
This part of the tutorial on grouping in SQL looks at various special cases of aggregation, beginning with counting rows.
You can count rows in 3 different ways:
|Method||What it will return|
|COUNT(*)||The number of rows|
|COUNT(ColumnName)||Number of rows having non-null column values.|
|COUNT(DISTINCT ColumnName)||The number of unique values for a column.|
Here is an example of each type of syntax:
AS 'Number films in total' FROM tblFilm
AS 'Number films with certificate id' FROM tblFilm
SELECT COUNT(DISTINCT FilmCertificateId)
AS 'Number unique certificate ids' FROM tblFilm
Here's what this returns for the Wise Owl movies database:
There are 261 films in the database, but 101 of these have a null value for the certificate id. There are only 6 distinct certificate values: U, PG, 12, 12A, 15 and 18.
Averaging integer fields can be a pain! Suppose that you want to find out the average number of Oscars won for our database. You might think this would do it:
avg(FilmOscarWins) AS 'Average Oscars won'
However, it doesn't. Because the Oscars column is an integer, the data type returned is also an integer - and approximates to zero:
The query returns 0, even though the true answer must be more than this.
The solution is to convert the column being averaged into a non-integer data type before you process it:
avg(CAST(FilmOscarWins AS float)) AS 'Average Oscars won'
However, it's a good idea to then display the result to a smaller number of decimal places:
CAST(avg(CAST(FilmOscarWins AS float)) AS decimal(5,2))
AS 'Average Oscars won'
The before-and-answer results for the above two queries are as follows:
|Before rounding||Rounded to 2 decimal places|
When you sum a set of values, any nulls will be ignored. For example, here are the 3 films in our database whose names end with A:
One of the films (Kagemusha) has a null value for its budget.
The following query would return the value 73,000,000, the sum of 60,000,000 and 13,000,000:
-- total budget for films ending with "A"
FilmName like '%a'
If you look at the results, there's no sign that you may have missed out values:
The total budget returned by this query.
However, the Messages tab tells the full story:
The waning as to what has happened.
Note that Micrsooft Access would return a null value from this query - SQL behaves differently.
When averaging values (some of which may be null), your best bet is to strip out nulls first (see a separate blog for more on ways to deal with nulls). Consider this query:
-- average budget including Kagemusha
AVG(IsNull(FilmBudgetDollars,0)) AS 'Average budget 1',
-- average budget excluding Kagemusha
AVG(FilmBudgetDollars) AS 'Average budget 2'
FilmName like '%a'
This would return two different numbers:
Which is correct? It depends whether you feel that the film Kagemusha should count towards the statistics.
Next: how to use subqueries in grouping queries.
|Parts of this blog|
25 Aytoun Street