Using Aggregate Functions
So far, you've learned how to retrieve specific data using WHERE clauses and organize results with ORDER BY and LIMIT. But what if you need to analyze your data rather than just display it? Aggregate functions help you answer questions like "How many customers signed up in 2024?" or "What is the average product price?" using the marketplace schema.
In this lesson, you'll learn to:
- Count records and handle NULL values
- Calculate sums, averages, and extremes
- Combine multiple aggregates in single queries
- Understand when to use
DISTINCTwith aggregates
What Are Aggregate Functions?
Aggregate functions perform calculations across multiple rows and return a single summary value. They're essential for data analysis, reporting, and business intelligence.
The most common aggregate functions include:
COUNT()- counts the number of rowsSUM()- calculates the total sumAVG()- computes the average valueMAX()- finds the maximum valueMIN()- finds the minimum value
Counting Records with COUNT()
The COUNT() function returns the number of rows that match your criteria.
-- Count all customers
SELECT COUNT(*) AS total_customers FROM customer;
-- Count customers who prefer Electronics (category_id = 1)
SELECT COUNT(*) AS electronics_fans
FROM customer
WHERE preferred_category_id = 1;
-- Count distinct order statuses
SELECT COUNT(DISTINCT status) AS order_statuses
FROM orders;
Use COUNT(*) when you want to count all rows, including those with NULL values. Use COUNT(column_name) to count only non-NULL values in that specific column.
Calculating Totals with SUM()
SUM() adds up all values in a numeric column.
-- Total revenue from delivered orders
SELECT SUM(order_total) AS delivered_revenue
FROM orders
WHERE status = 'Delivered';
-- Total inventory value for Electronics (category_id = 1)
SELECT SUM(price * stock_qty) AS electronics_inventory_value
FROM product
WHERE category_id = 1;
Finding Averages with AVG()
AVG() calculates the arithmetic mean of values in a numeric column.
-- Average product price
SELECT AVG(price) AS average_product_price
FROM product;
-- Average price for Home & Kitchen products (category_id = 2)
SELECT AVG(price) AS avg_home_kitchen_price
FROM product
WHERE category_id = 2;
AVG() automatically excludes NULL values from its calculation. If you need to include NULLs as zeros, use COALESCE(column_name, 0).
Finding Extremes with MAX() and MIN()
These functions help you identify boundary values in your data.
-- Most recent and earliest customer signup years
SELECT
MAX(signup_year) AS newest_signup_year,
MIN(signup_year) AS earliest_signup_year
FROM customer;
-- Highest and lowest priced products
SELECT
MAX(price) AS max_product_price,
MIN(price) AS min_product_price
FROM product;
Combining Multiple Aggregate Functions
You can use multiple aggregate functions in a single query to get comprehensive insights.
-- Order overview for March 2025
SELECT
COUNT(*) AS total_items,
COUNT(DISTINCT o.customer_id) AS unique_customers,
SUM(oi.quantity * oi.unit_price) AS gross_revenue,
AVG(oi.unit_price) AS avg_item_price,
MAX(oi.unit_price) AS max_item_price
FROM order_item AS oi
JOIN orders AS o ON o.order_id = oi.order_id
WHERE o.order_date LIKE '2025-03%';
Using DISTINCT with Aggregate Functions
Combine DISTINCT with aggregates to work with unique values only.
-- Count unique customers who placed orders in March 2025
SELECT COUNT(DISTINCT customer_id) AS unique_customers
FROM orders
WHERE order_date LIKE '2025-03%';
-- Average of unique product prices
SELECT AVG(DISTINCT price) AS avg_unique_prices
FROM product;
Be careful with DISTINCT in aggregates. AVG(DISTINCT price) gives a very different result from AVG(price) - it averages only unique prices rather than all prices.
Common Pitfalls
- NULL values confusion: Remember that
COUNT(*)counts every row (even ifdelivered_atis NULL), whileCOUNT(delivered_at)counts only delivered orders - Mixing aggregates and regular columns: You can't mix aggregate functions with non-aggregated columns without
GROUP BY(we'll cover this next lesson) - Data type mismatches:
SUM()andAVG()only work with numeric data types - Performance with DISTINCT:
COUNT(DISTINCT column)can be slow on large tables - Floating-point precision:
AVG()on floating-point numbers may have precision issues
Summary
Aggregate functions transform raw data into meaningful insights:
COUNT()helps you understand data volumeSUM()andAVG()provide financial and statistical analysisMAX()andMIN()identify boundaries and outliers- Combine multiple functions for comprehensive reporting on order activity
- Use
DISTINCTcarefully to focus on unique values
In the next lesson, you'll learn how to use GROUP BY to apply these aggregates to subsets of your data.
Quiz
Show quiz
-
What does
COUNT(*)return?- A) The number of non-NULL values in the first column
- B) The total number of rows in the result set
- C) The number of distinct values in all columns
-
Which aggregate function would you use to find the highest product price?
- A)
SUM(price) - B)
AVG(price) - C)
MAX(price)
- A)
-
What's the difference between
COUNT(*)andCOUNT(delivered_at)when counting orders?- A)
COUNT(*)is faster - B)
COUNT(column_name)excludes NULL values - C)
COUNT(*)only counts distinct values
- A)
-
Can you use multiple aggregate functions in a single SELECT statement?
- A) No, only one aggregate per query
- B) Yes, but only if they're the same function
- C) Yes, you can use multiple different aggregates
-
What happens if you use
SUM()on a column containing NULL values (for example, order totals from a LEFT JOIN where some orders are missing)?- A) NULL values are treated as zero
- B) NULL values are excluded from the calculation
- C) The entire result becomes NULL
Answers:
- B -
COUNT(*)returns the total number of rows, regardless of NULL values - C -
MAX(price)finds the highest price - B -
COUNT(column_name)counts only non-NULL values in that column - C - You can use multiple different aggregate functions in one query
- B - NULL values are excluded from
SUM()calculations