Skip to main content

Filtering Data with WHERE Clause

Now that you've learned the basic structure of SQL queries, let's dive into one of the most powerful features: filtering data with the WHERE clause. We'll use the marketplace tables to move from simply viewing raw records to answering specific questions like "Who signed up in 2024?" or "Which products cost more than 50?"

Learning Goals

  • Understand the purpose and syntax of the WHERE clause
  • Learn to use comparison operators for filtering
  • Master pattern matching with LIKE and wildcards
  • Combine multiple conditions with AND, OR, and NOT
  • Handle NULL values properly in filters

Understanding the WHERE Clause

The WHERE clause allows you to filter records in your SELECT statements, returning only the rows that meet specific conditions. Think of it as asking a precise question rather than getting all the information at once.

Basic WHERE Clause Syntax
SELECT column1, column2, ...
FROM table_name
WHERE condition;

Comparison Operators

SQL provides various operators to compare values in your WHERE conditions:

Using Comparison Operators
-- Equal to
SELECT product_id, sku, category_id
FROM product
WHERE category_id = 1;

-- Greater than
SELECT sku, product_name, price
FROM product
WHERE price > 50;

-- Less than or equal to
SELECT customer_id, first_name, signup_year
FROM customer
WHERE signup_year <= 2023;

-- Not equal to (two ways)
SELECT category_id, category_name
FROM category
WHERE category_name <> 'Books & Stationery';

SELECT category_id, category_name
FROM category
WHERE category_name != 'Books & Stationery';
tip

When comparing strings, most databases are case-sensitive by default. 'Delivered' and 'delivered' would be considered different values unless your database is configured otherwise.

Pattern Matching with LIKE

The LIKE operator lets you search for patterns in text data using wildcards:

  • % - Matches any sequence of characters (zero or more)
  • _ - Matches exactly one character
Pattern Matching Examples
-- Find customer first names starting with 'A'
SELECT customer_id, first_name, last_name
FROM customer
WHERE first_name LIKE 'A%';

-- Find customer emails ending with '@shop.local'
SELECT customer_id, email
FROM customer
WHERE email LIKE '%@shop.local';

-- Find SKUs that start with 'EL' and have 4 digits
SELECT sku, product_name
FROM product
WHERE sku LIKE 'EL-____';

-- Find seller last names containing 'an'
SELECT seller_id, first_name, last_name
FROM seller
WHERE last_name LIKE '%an%';

Combining Conditions

You can combine multiple conditions using AND, OR, and NOT:

Combining Multiple Conditions
-- AND: Both conditions must be true
SELECT sku, product_name, price
FROM product
WHERE category_id = 1 AND price >= 30;

-- OR: At least one condition must be true
SELECT customer_id, first_name, last_name, preferred_category_id
FROM customer
WHERE preferred_category_id = 1 OR preferred_category_id = 2;

-- NOT: Exclude matching records
SELECT order_id, status
FROM orders
WHERE NOT status = 'Delivered';

-- Complex combinations with parentheses
SELECT oi.order_item_id,
CONCAT(cust.first_name, ' ', cust.last_name) AS customer_name,
p.sku,
o.order_date,
o.status
FROM order_item AS oi
JOIN orders AS o ON o.order_id = oi.order_id
JOIN customer AS cust ON cust.customer_id = o.customer_id
JOIN product AS p ON p.product_id = oi.product_id
WHERE (o.status = 'Delivered' OR o.status = 'Shipped')
AND p.category_id = 1;

Handling NULL Values

NULL represents missing or unknown data, and it requires special handling:

Working with NULL Values
-- Find customers who haven't set a preferred category yet
SELECT customer_id, first_name, last_name
FROM customer
WHERE preferred_category_id IS NULL;

-- Find customers who have set a preferred category
SELECT customer_id, first_name, last_name
FROM customer
WHERE preferred_category_id IS NOT NULL;

-- This WON'T work as expected!
SELECT customer_id, first_name, last_name
FROM customer
WHERE preferred_category_id = NULL; -- Always returns no results
warning

Never use = NULL or != NULL in your conditions. Always use IS NULL or IS NOT NULL instead. NULL represents an unknown value, so any comparison with NULL (even NULL = NULL) returns NULL, which is treated as false.

BETWEEN and IN Operators

These operators provide convenient ways to specify ranges and multiple values:

BETWEEN and IN Examples
-- BETWEEN: Inclusive range
SELECT sku, product_name, price
FROM product
WHERE price BETWEEN 20 AND 80;

-- IN: Match any value in a list
SELECT category_id, category_name
FROM category
WHERE category_name IN ('Electronics', 'Books & Stationery', 'Sports & Outdoors');

-- NOT IN: Exclude values in a list
SELECT sku, product_name, seller_id
FROM product
WHERE seller_id NOT IN (1, 2);

Common Pitfalls

  • Case sensitivity: String comparisons are often case-sensitive
  • NULL handling: Forgetting to use IS NULL/IS NOT NULL
  • Date formatting: Using incorrect date formats in comparisons
  • Operator precedence: AND has higher precedence than OR, use parentheses to clarify
  • Floating point precision: Exact equality comparisons with decimal numbers can be unreliable
Operator Precedence Example
-- This might not do what you expect!
SELECT sku, price, category_id
FROM product
WHERE category_id = 1 OR category_id = 2 AND price >= 50;

-- Use parentheses to make it clear
SELECT sku, price, category_id
FROM product
WHERE (category_id = 1 OR category_id = 2) AND price >= 50;

Summary

The WHERE clause is your primary tool for filtering data in SQL. You now know how to:

  • Use comparison operators (=, >, <, <>) on columns like price, category_id, or signup_year
  • Search for patterns with LIKE and wildcards (%, _)
  • Combine conditions with AND, OR, and NOT
  • Handle NULLs safely with IS NULL and IS NOT NULL

Practice filtering product, orders, and customer tables to get comfortable before moving on to sorting and limiting results.

Quiz

Show quiz
  1. Which clause filters rows based on a condition?
  2. What wildcard would you use to match any number of characters in a LIKE pattern?
  3. How do you correctly filter rows where preferred_category_id is missing?
  4. Which operator is used to check if a value falls inside a list of options?
  5. Why should you avoid price = 19.99 checks when using floating point values?

Answers:

  1. The WHERE clause
  2. %
  3. WHERE preferred_category_id IS NULL
  4. IN
  5. Floating point precision can cause exact comparisons to fail; use ranges or rounding if needed