Filtering Data with WHERE Clause
Now that you've learned the basic structure of SQL queries, let's dive into one of the most powerful features: filtering data with the WHERE clause. We'll use the marketplace tables to move from simply viewing raw records to answering specific questions like "Who signed up in 2024?" or "Which products cost more than 50?"
Learning Goals
- Understand the purpose and syntax of the WHERE clause
- Learn to use comparison operators for filtering
- Master pattern matching with LIKE and wildcards
- Combine multiple conditions with AND, OR, and NOT
- Handle NULL values properly in filters
Understanding the WHERE Clause
The WHERE clause allows you to filter records in your SELECT statements, returning only the rows that meet specific conditions. Think of it as asking a precise question rather than getting all the information at once.
SELECT column1, column2, ...
FROM table_name
WHERE condition;
Comparison Operators
SQL provides various operators to compare values in your WHERE conditions:
-- Equal to
SELECT product_id, sku, category_id
FROM product
WHERE category_id = 1;
-- Greater than
SELECT sku, product_name, price
FROM product
WHERE price > 50;
-- Less than or equal to
SELECT customer_id, first_name, signup_year
FROM customer
WHERE signup_year <= 2023;
-- Not equal to (two ways)
SELECT category_id, category_name
FROM category
WHERE category_name <> 'Books & Stationery';
SELECT category_id, category_name
FROM category
WHERE category_name != 'Books & Stationery';
When comparing strings, most databases are case-sensitive by default. 'Delivered' and 'delivered' would be considered different values unless your database is configured otherwise.
Pattern Matching with LIKE
The LIKE operator lets you search for patterns in text data using wildcards:
%- Matches any sequence of characters (zero or more)_- Matches exactly one character
-- Find customer first names starting with 'A'
SELECT customer_id, first_name, last_name
FROM customer
WHERE first_name LIKE 'A%';
-- Find customer emails ending with '@shop.local'
SELECT customer_id, email
FROM customer
WHERE email LIKE '%@shop.local';
-- Find SKUs that start with 'EL' and have 4 digits
SELECT sku, product_name
FROM product
WHERE sku LIKE 'EL-____';
-- Find seller last names containing 'an'
SELECT seller_id, first_name, last_name
FROM seller
WHERE last_name LIKE '%an%';
Combining Conditions
You can combine multiple conditions using AND, OR, and NOT:
-- AND: Both conditions must be true
SELECT sku, product_name, price
FROM product
WHERE category_id = 1 AND price >= 30;
-- OR: At least one condition must be true
SELECT customer_id, first_name, last_name, preferred_category_id
FROM customer
WHERE preferred_category_id = 1 OR preferred_category_id = 2;
-- NOT: Exclude matching records
SELECT order_id, status
FROM orders
WHERE NOT status = 'Delivered';
-- Complex combinations with parentheses
SELECT oi.order_item_id,
CONCAT(cust.first_name, ' ', cust.last_name) AS customer_name,
p.sku,
o.order_date,
o.status
FROM order_item AS oi
JOIN orders AS o ON o.order_id = oi.order_id
JOIN customer AS cust ON cust.customer_id = o.customer_id
JOIN product AS p ON p.product_id = oi.product_id
WHERE (o.status = 'Delivered' OR o.status = 'Shipped')
AND p.category_id = 1;
Handling NULL Values
NULL represents missing or unknown data, and it requires special handling:
-- Find customers who haven't set a preferred category yet
SELECT customer_id, first_name, last_name
FROM customer
WHERE preferred_category_id IS NULL;
-- Find customers who have set a preferred category
SELECT customer_id, first_name, last_name
FROM customer
WHERE preferred_category_id IS NOT NULL;
-- This WON'T work as expected!
SELECT customer_id, first_name, last_name
FROM customer
WHERE preferred_category_id = NULL; -- Always returns no results
Never use = NULL or != NULL in your conditions. Always use IS NULL or IS NOT NULL instead. NULL represents an unknown value, so any comparison with NULL (even NULL = NULL) returns NULL, which is treated as false.
BETWEEN and IN Operators
These operators provide convenient ways to specify ranges and multiple values:
-- BETWEEN: Inclusive range
SELECT sku, product_name, price
FROM product
WHERE price BETWEEN 20 AND 80;
-- IN: Match any value in a list
SELECT category_id, category_name
FROM category
WHERE category_name IN ('Electronics', 'Books & Stationery', 'Sports & Outdoors');
-- NOT IN: Exclude values in a list
SELECT sku, product_name, seller_id
FROM product
WHERE seller_id NOT IN (1, 2);
Common Pitfalls
- Case sensitivity: String comparisons are often case-sensitive
- NULL handling: Forgetting to use IS NULL/IS NOT NULL
- Date formatting: Using incorrect date formats in comparisons
- Operator precedence: AND has higher precedence than OR, use parentheses to clarify
- Floating point precision: Exact equality comparisons with decimal numbers can be unreliable
-- This might not do what you expect!
SELECT sku, price, category_id
FROM product
WHERE category_id = 1 OR category_id = 2 AND price >= 50;
-- Use parentheses to make it clear
SELECT sku, price, category_id
FROM product
WHERE (category_id = 1 OR category_id = 2) AND price >= 50;
Summary
The WHERE clause is your primary tool for filtering data in SQL. You now know how to:
- Use comparison operators (
=,>,<,<>) on columns likeprice,category_id, orsignup_year - Search for patterns with LIKE and wildcards (
%,_) - Combine conditions with AND, OR, and NOT
- Handle NULLs safely with
IS NULLandIS NOT NULL
Practice filtering product, orders, and customer tables to get comfortable before moving on to sorting and limiting results.
Quiz
Show quiz
- Which clause filters rows based on a condition?
- What wildcard would you use to match any number of characters in a LIKE pattern?
- How do you correctly filter rows where
preferred_category_idis missing? - Which operator is used to check if a value falls inside a list of options?
- Why should you avoid
price = 19.99checks when using floating point values?
Answers:
- The WHERE clause
%WHERE preferred_category_id IS NULLIN- Floating point precision can cause exact comparisons to fail; use ranges or rounding if needed