Lesson 9
SQL provides powerful capabilities for grouping query results and calculating aggregates to analyze and summarize data. Mastering these techniques unlocks the ability to answer more insightful questions from database tables.
The GROUP BY clause groups a result set by one or more columns. For each distinct value in those columns, it aggregates the rows into a single summary row.
For example, to count customers per city:
SELECT city, COUNT(*) AS num_customers
FROM customers
GROUP BY city;
This groups all rows by the "city" column. For each unique city, it outputs a single row with the city name and count of rows for that city.
We can group by multiple columns to create subgroupings - like customers per city per country:
SELECT country, city, COUNT(*) AS num_customers
FROM customers
GROUP BY country, city;
Now for each country and city combination, we get a count.
Common aggregate functions used with GROUP BY are:
COUNT(): Counts rows in a group. Can pass *
or a column name.
SUM(): Sums a numeric column.
AVG(): Averages a numeric column.
MAX()/MIN(): Maximum and minimum values in a column.
Let's see some examples of using aggregates to gain business insights:
SELECT MAX(salary) AS highest_salary
FROM employees;
Finds highest salary paid.
SELECT department, SUM(sales) AS total_sales
FROM sales_data
GROUP BY department;
Gets total sales by department. Useful for department performance comparisons.
Where GROUP BY groups rows by columns, HAVING filters row groups based on aggregate conditions. This is done after aggregation, unlike WHERE which filters before.
For example, to get high performing departments with over $100,000 in sales:
SELECT department, SUM(sales) AS total_sales
FROM sales_data
GROUP BY department
HAVING SUM(sales) > 100000;
The HAVING clause filters the groups after computing totals. This leaves only departments with sales greater than the given amount.
You can combine HAVING with WHERE. WHERE filters individual rows before aggregation. HAVING then filters whole groups after calculating summaries.
Mastering GROUP BY, aggregates like COUNT/SUM/AVG/MIN/MAX, along with the HAVING clause opens up a breadth of reporting and business analytics capabilities from database tables. With a bit of SQL knowledge you can derive surprisingly powerful insights.