Using Aggregate Functions
So far, you've learned how to retrieve specific data using WHERE clauses and organize results with ORDER BY and LIMIT. But what if you need to analyze your data rather than just display it? Aggregate functions help you answer questions like “How many students joined in 2024?” or “What is the average credit load of Computer Science courses?” using the University schema.
In this lesson, you'll learn to:
- Count records and handle NULL values
- Calculate sums, averages, and extremes
- Combine multiple aggregates in single queries
- Understand when to use
DISTINCTwith aggregates
What Are Aggregate Functions?
Aggregate functions perform calculations across multiple rows and return a single summary value. They're essential for data analysis, reporting, and business intelligence.
The most common aggregate functions include:
COUNT()- counts the number of rowsSUM()- calculates the total sumAVG()- computes the average valueMAX()- finds the maximum valueMIN()- finds the minimum value
Counting Records with COUNT()
The COUNT() function returns the number of rows that match your criteria.
-- Count all students
SELECT COUNT(*) AS total_students FROM student;
-- Count students in the Computer Science department (dept_id = 1)
SELECT COUNT(*) AS cs_students
FROM student
WHERE dept_id = 1;
-- Count distinct semesters that have enrollments
SELECT COUNT(DISTINCT semester) AS semesters_with_activity
FROM enrollment;
Use COUNT(*) when you want to count all rows, including those with NULL values. Use COUNT(column_name) to count only non-NULL values in that specific column.
Calculating Totals with SUM()
SUM() adds up all values in a numeric column.
-- Total credits offered across all courses
SELECT SUM(credits) AS total_credits_offered
FROM course;
-- Total credits offered by the Computer Science department
SELECT SUM(credits) AS cs_credits
FROM course
WHERE dept_id = 1;
Finding Averages with AVG()
AVG() calculates the arithmetic mean of values in a numeric column.
-- Average credits per course
SELECT AVG(credits) AS average_course_credits
FROM course;
-- Average credits for Information Technology courses (dept_id = 2)
SELECT AVG(credits) AS it_avg_credits
FROM course
WHERE dept_id = 2;
AVG() automatically excludes NULL values from its calculation. If you need to include NULLs as zeros, use COALESCE(column_name, 0).
Finding Extremes with MAX() and MIN()
These functions help you identify boundary values in your data.
-- Most recent and earliest student cohorts
SELECT
MAX(enrollment_year) AS newest_enrollment_year,
MIN(enrollment_year) AS earliest_enrollment_year
FROM student;
-- Highest and lowest credit courses
SELECT
MAX(credits) AS max_course_credits,
MIN(credits) AS min_course_credits
FROM course;
Combining Multiple Aggregate Functions
You can use multiple aggregate functions in a single query to get comprehensive insights.
-- Enrollment overview for Spring 2025
SELECT
COUNT(*) AS total_enrollments,
COUNT(DISTINCT student_id) AS unique_students,
SUM(c.credits) AS total_credit_hours,
AVG(c.credits) AS average_course_credits,
MAX(c.credits) AS max_course_credits
FROM enrollment AS e
JOIN course AS c ON c.course_id = e.course_id
WHERE e.semester = 'Spring 2025';
Using DISTINCT with Aggregate Functions
Combine DISTINCT with aggregates to work with unique values only.
-- Count unique students enrolled in Spring 2025
SELECT COUNT(DISTINCT student_id) AS unique_students
FROM enrollment
WHERE semester = 'Spring 2025';
-- Average of unique course credit values
SELECT AVG(DISTINCT credits) AS avg_unique_credits
FROM course;
Be careful with DISTINCT in aggregates. AVG(DISTINCT price) gives a very different result from AVG(price) - it averages only unique prices rather than all prices.
Common Pitfalls
- NULL values confusion: Remember that
COUNT(*)counts every row (even ifgradeis NULL), whileCOUNT(grade)counts only non-NULL grades - Mixing aggregates and regular columns: You can't mix aggregate functions with non-aggregated columns without
GROUP BY(we'll cover this next lesson) - Data type mismatches:
SUM()andAVG()only work with numeric data types - Performance with DISTINCT:
COUNT(DISTINCT column)can be slow on large tables - Floating-point precision:
AVG()on floating-point numbers may have precision issues
Summary
Aggregate functions transform raw data into meaningful insights:
COUNT()helps you understand data volumeSUM()andAVG()provide financial and statistical analysisMAX()andMIN()identify boundaries and outliers- Combine multiple functions for comprehensive reporting on course demand or student load
- Use
DISTINCTcarefully to focus on unique values
In the next lesson, you'll learn how to use GROUP BY to apply these aggregates to subsets of your data.
Quiz
Show quiz
-
What does
COUNT(*)return?- A) The number of non-NULL values in the first column
- B) The total number of rows in the result set
- C) The number of distinct values in all columns
-
Which aggregate function would you use to find the highest course credit value?
- A)
SUM(price) - B)
AVG(price) - C)
MAX(price)
- A)
-
What's the difference between
COUNT(*)andCOUNT(grade)when counting enrollment records?- A)
COUNT(*)is faster - B)
COUNT(column_name)excludes NULL values - C)
COUNT(*)only counts distinct values
- A)
-
Can you use multiple aggregate functions in a single SELECT statement?
- A) No, only one aggregate per query
- B) Yes, but only if they're the same function
- C) Yes, you can use multiple different aggregates
-
What happens if you use
SUM()on a column containing NULL values (for example, credits from a LEFT JOIN where some courses are missing)?- A) NULL values are treated as zero
- B) NULL values are excluded from the calculation
- C) The entire result becomes NULL
Answers:
- B -
COUNT(*)returns the total number of rows, regardless of NULL values - C -
MAX(credits)finds the highest credit value - B -
COUNT(column_name)counts only non-NULL values in that column - C - You can use multiple different aggregate functions in one query
- B - NULL values are excluded from
SUM()calculations