Working with Subqueries
Subqueries let you embed one query inside another. They are perfect for questions like “Which courses are harder than the catalog average?” or “Which students share a department with Professor Mehta?” In this lesson we will write subqueries that rely on the University schema so you can see how nested logic complements joins.
Learning Goals
- Write subqueries in
WHERE,SELECT, andFROMclauses - Distinguish between single-row and multi-row subqueries
- Use correlated subqueries to compare each row against its peers
- Decide when a join or a subquery communicates intent more clearly
Single-Row Subqueries
A single-row subquery returns exactly one value that you can compare with operators such as =, >, or <.
SELECT course_code,
course_title,
credits
FROM course
WHERE credits > (
SELECT AVG(credits)
FROM course
);
The inner query calculates the catalog-wide average credit value. The outer query keeps only courses above that average.
Multi-Row Subqueries
Multi-row subqueries return a list of values. Combine them with operators like IN, ANY, or ALL.
SELECT first_name,
last_name,
dept_id
FROM instructor
WHERE dept_id IN (
SELECT dept_id
FROM department
WHERE dept_name IN ('Computer Science', 'Mathematics')
);
Here the subquery identifies the department IDs for two programs, and the outer query returns instructors who belong to either.
Subqueries in the SELECT Clause
You can use a subquery to compute a value for each row in the result set. This pattern is handy when you need a count or aggregate that depends on the current row.
SELECT d.dept_name,
(SELECT COUNT(*)
FROM course AS c
WHERE c.dept_id = d.dept_id) AS course_count
FROM department AS d
ORDER BY course_count DESC;
Each time the outer query processes a department, the subquery counts how many courses that department owns.
Subqueries in the FROM Clause
Subqueries in the FROM clause behave like temporary tables. They are often called derived tables.
SELECT dept_name,
avg_credits
FROM (
SELECT d.dept_name,
AVG(c.credits) AS avg_credits
FROM department AS d
JOIN course AS c ON c.dept_id = d.dept_id
GROUP BY d.dept_name
) AS dept_stats
WHERE avg_credits >= 3.5
ORDER BY avg_credits DESC;
The inner query calculates the average credit load for each department; the outer query filters that derived table.
Correlated Subqueries
Correlated subqueries reference columns from the outer query. They run once for every row produced by the outer query, making them powerful and occasionally expensive.
SELECT c.course_code,
c.course_title,
d.dept_name,
c.credits
FROM course AS c
JOIN department AS d ON d.dept_id = c.dept_id
WHERE c.credits = (
SELECT MAX(c2.credits)
FROM course AS c2
WHERE c2.dept_id = c.dept_id
);
For each course, the subquery checks the maximum credit value inside that course's department. Only the top-credit courses per department remain.
EXISTS for Presence Checks
Another common correlated pattern uses EXISTS to test whether related rows exist.
SELECT dept_name
FROM department AS d
WHERE EXISTS (
SELECT 1
FROM enrollment AS e
JOIN course AS c ON c.course_id = e.course_id
WHERE c.dept_id = d.dept_id
AND e.semester = 'Spring 2025'
);
Subqueries vs. JOINs
Many problems can be solved using either subqueries or joins. Pick the approach that best expresses the requirement.
-- Using a subquery
SELECT course_code, course_title
FROM course
WHERE dept_id = (
SELECT dept_id
FROM department
WHERE dept_name = 'Computer Science'
);
-- Using a join
SELECT c.course_code, c.course_title
FROM course AS c
JOIN department AS d ON d.dept_id = c.dept_id
WHERE d.dept_name = 'Computer Science';
The join version is often easier to extend when you need columns from both tables, while the subquery version can feel cleaner for simple lookups.
Common Pitfalls
- Unexpected multi-row results: Single-row subqueries combined with
=or>must return exactly one value. If the inner query returns more, the statement fails. - Correlated subquery performance: Because they execute once per outer row, they can be slow on large datasets.
- NULL comparisons: If a subquery returns
NULL, comparisons like=orINmay not behave as expected. UseEXISTS/NOT EXISTSor handleNULLexplicitly. - Readability: Deeply nested subqueries are hard to maintain. Consider breaking them into views or common table expressions when they become complex.
Summary
- Subqueries can appear in
WHERE,SELECT, andFROMclauses. - Use multi-row subqueries with operators like
IN,ANY, andALL. - Correlated subqueries evaluate once per row and unlock comparisons within groups.
- Decide between joins and subqueries based on clarity, required columns, and performance.
Quiz
Show quiz
-
Which operator would you use with a subquery that returns several department IDs?
A)=
B)IN
C)BETWEEN
D)LIKE -
What makes a subquery “correlated”?
A) It returns multiple rows
B) It appears in theFROMclause
C) It references columns from the outer query
D) It usesGROUP BY -
In the query that counts courses per department within the
SELECTlist, how many times does the subquery run?
A) Once total
B) Once per department row returned
C) Once per course
D) It depends on the database engine -
True or False: You can always rewrite a subquery as a join without changing the result.
A) True
B) False
Answers
- B
- C
- B
- B