SQL aggregate functions
Posted on Dec 23, 2023
SQL aggregate functions are used to calculate multiple values in a table column. They allow you to summarize and extract useful information from a data set. Here are some commonly used SQL aggregate functions:
COUNT() - Returns the number of rows in a column.
For example:
SELECT COUNT(customer_id) FROM customers
would return the total number of customers.
SUM() - Returns the total sum of values in a numeric column.
For example:
SELECT SUM(salary) FROM employees
would calculate the total payroll.
AVG() - Returns the average value of a numeric column.
For example:
SELECT AVG(salary) FROM employees
would return the average salary.
MAX() - Returns the maximum value in a specified column.
For example:
SELECT MAX(price) FROM products
would return the highest product price.
MIN() - Returns the minimum value in a column.
For example:
SELECT MIN(order_date) FROM orders
would return the earliest order date.
GROUP BY - Allows you to group results by one or more columns before applying aggregate functions.
For example:
SELECT customer_id, SUM(order_total) FROM orders GROUP BY customer_id
would sum the orders by the customer.
Aggregate functions are commonly used in conjunction with the GROUP BY clause to group data before applying a calculation. For example:
SELECT department, SUM(salary)
FROM employees
GROUP BY department;
This would calculate the total salary expenditure per department. The GROUP BY clause first groups all rows by the department column before SUM() calculates the totals.
Some key points about SQL aggregate functions:
- They ignore NULL values in their calculations. You can use the COALESCE function to substitute NULLs with a default value.
- Used without GROUP BY, they calculate results across the entire table.
- Can be used on numeric, character, and date data types in most database engines.
- Are often used in mathematical and financial applications to analyze and summarize large data sets.
- Important part of data analysis to generate reports, statistics and business insights.
- Offer shortcuts to avoid manual calculations using procedural code.
- Help reduce database processing overhead compared to running row-by-row calculations in application code.
Here are some less-known tips and tricks for using SQL aggregate functions:
- The DISTINCT keyword can be used within aggregate functions to remove duplicates. For example, SELECT COUNT(DISTINCT customer_id) would only count unique customer IDs.
- Some database systems like MySQL support GROUP_CONCAT() to concatenate values from a group into a single string. Helpful in creating CSV lists.
- The ARRAY_AGG() function collects all values from a group into an array. Enables easier manipulation compared to concatenation.
- The statistical aggregate functions like STDDEV(), VAR_SAMP(), CORR() etc. allow powerful analytics without leaving the database.
- The FILTER clause can filter which rows are included in the aggregation based on a condition.
- Aggregates like PERCENTILE_CONT() can calculate percentiles and quantiles for advanced analytics.
- The CUBE and ROLLUP options for GROUP BY can generate multiple grouping subtotals in a single query.
- GROUPING SETS provide even finer control over grouping and aggregation choices within one query.
- The GROUPING() function identifies which columns are not part of the grouping in a GROUP BY query. It helps distinguish super-aggregate rows.
- Window functions like SUM() OVER() allow aggregation while retaining detail rows instead of grouping.
- Subqueries within aggregate functions enable calculations like running totals or moving averages without self-joins.
Please be aware that there is a difference in aggregate function support between MySQL and Postgres. MySQL has GROUP_CONCAT, while Postgres has more advanced analytical functions like ARRAY_AGG, GROUPING SETS, window functions, etc. But both support common aggregates like COUNT, SUM, AVG, MIN, and MAX.
By mastering aggregate functions like COUNT, SUM, AVG, MAX, and MIN, you can write powerful queries to unlock insights into your business data. They are essential tools for data analysts, business intelligence professionals, and application developers working with SQL databases.