Introduction to Databases and SQL
Welcome to your first lesson in SQL. In this course we will explore database concepts by working with a fictional e-commerce marketplace. You will create categories, manage sellers, track customers, publish products, and record orders and transactions, all inside a relational database engine. Every query you write in the upcoming modules relies on the shared E-commerce schema you assemble in the setup lessons, using MySQL as the database engine.
Learning Goals
- Understand what databases are and where SQL fits in
- Explore core relational concepts through the marketplace schema
- Review the sample tables and relationships you will query throughout the course
- Run your first SQL statements against that shared dataset
What is a Database?
A database is an organized collection of data that can be stored, searched, and updated efficiently. Imagine running a marketplace on spreadsheets: keeping product catalogs, seller rosters, customer accounts, and order histories in sync quickly becomes messy. A relational database stores those entities in tables and enforces relationships between them so the information stays consistent.
Why Use Databases?
Relational databases help you:
- Store large volumes of information across multiple related entities
- Enforce rules like "each product belongs to exactly one category"
- Allow multiple stakeholders (sellers, support teams, analysts) to access the same truth
- Secure sensitive records, such as customer contact details and payments
- Run complex analytical queries without duplicating data in spreadsheets
Relational databases provide the ACID properties (Atomicity, Consistency, Isolation, Durability). These guarantees protect your order and payment data even if multiple transactions run simultaneously or an unexpected failure occurs.
Types of Databases
Relational Databases (SQL Databases)
Relational databases organize information into tables with rows (records) and columns (attributes). Relationships between tables are maintained through keys. Popular relational engines include PostgreSQL, MySQL, SQL Server, and SQLite.
Non-relational Databases (NoSQL)
NoSQL systems store information using alternative formats such as JSON documents, key-value pairs, or graphs. They shine in scenarios that require flexible schemas or massive horizontal scaling. While this course focuses on SQL, it is helpful to recognize where other models fit.
| Type | Description | Example Database | Official Documentation |
|---|---|---|---|
| Document Database | Stores data as JSON-like documents | MongoDB | MongoDB Docs |
| Key-Value Store | Uses key-value pairs for extremely fast lookups | Redis | Redis Documentation |
| Column-Family Store | Organizes data by columns for analytics workloads | Apache Cassandra | Cassandra Documentation |
| Graph Database | Focuses on nodes and relationships | Neo4j | Neo4j Documentation |
Meet the E-commerce Schema
We will revisit the same core tables in every lesson:
- category - Stores
category_idandcategory_name. - seller - Contains store and contact details for marketplace sellers.
- customer - Lists shoppers, signup year, and optional category preference.
- product - Represents each catalog item, including price, stock, and category.
- orders - Stores purchases with status and delivery timestamps.
- order_item - Bridge table linking orders to products with quantity and price at checkout.
- transactions - Records payment attempts and outcomes per order.
Creating the schema in MySQL looks like this:
CREATE TABLE category (
category_id INT AUTO_INCREMENT PRIMARY KEY,
category_name VARCHAR(80) NOT NULL UNIQUE
) ENGINE=InnoDB;
CREATE TABLE seller (
seller_id INT AUTO_INCREMENT PRIMARY KEY,
store_name VARCHAR(120) NOT NULL,
first_name VARCHAR(80) NOT NULL,
last_name VARCHAR(80) NOT NULL,
email VARCHAR(190) UNIQUE,
joined_at DATE NOT NULL,
rating DECIMAL(2,1) CHECK (rating BETWEEN 1 AND 5)
) ENGINE=InnoDB;
CREATE TABLE customer (
customer_id INT AUTO_INCREMENT PRIMARY KEY,
first_name VARCHAR(80) NOT NULL,
last_name VARCHAR(80) NOT NULL,
email VARCHAR(190) UNIQUE,
preferred_category_id INT,
signup_year INT CHECK (signup_year >= 2000),
city VARCHAR(80),
FOREIGN KEY (preferred_category_id) REFERENCES category(category_id)
) ENGINE=InnoDB;
Relational Building Blocks
Relational databases rely on a few core constructs:
- Tables define the shape of each entity. For example, the
producttable captures SKU, name, price, stock, and the corresponding category. - Rows hold individual records. Each row in
order_itemrecords a product in a specific order. - Primary keys uniquely identify each row (
customer.customer_id,product.product_id). - Foreign keys connect rows across tables (
order_item.order_idtoorders.order_id). - Constraints enforce data quality, such as making
category_nameunique or limiting prices to positive numbers.
Your First SQL Query
SQL (Structured Query Language) is the standard way to interact with relational databases. A basic query requests specific columns from a table:
-- Inspect the product catalog
SELECT sku, product_name, price
FROM product;
-- Retrieve customers who prefer the Electronics category
SELECT c.customer_id,
c.first_name,
c.last_name,
cat.category_name
FROM customer AS c
JOIN category AS cat ON cat.category_id = c.preferred_category_id
WHERE cat.category_name = 'Electronics';
SQL keywords such as SELECT and FROM declare what you want to read, while clauses like WHERE and JOIN refine the results.
Sample Data and Results
The schema lesson includes seed data so you can experiment right away:
INSERT INTO category (category_name) VALUES
('Electronics'),
('Home & Kitchen'),
('Fashion');
INSERT INTO product (sku, product_name, price, stock_qty, category_id, seller_id) VALUES
('EL-1001', 'Wireless Earbuds', 59.99, 120, 1, 1),
('HK-2001', 'Nonstick Cookware Set', 89.00, 60, 2, 2),
('FA-3001', 'Cotton Hoodie', 39.50, 150, 3, 3);
INSERT INTO orders (customer_id, order_date, status, order_total, shipping_city, delivered_at) VALUES
(1, '2025-03-02', 'Delivered', 89.98, 'Bengaluru', '2025-03-06'),
(3, '2025-03-08', 'Delivered', 89.00, 'Delhi', '2025-03-12'),
(5, '2025-03-15', 'Pending', 59.99, 'Hyderabad', NULL);
Running the product catalog query returns:
sku | product_name | price
------- | ---------------------- | -----
EL-1001 | Wireless Earbuds | 59.99
HK-2001 | Nonstick Cookware Set | 89.00
FA-3001 | Cotton Hoodie | 39.50
Why Learn SQL?
SQL continues to be one of the most requested skills across software engineering, analytics, and data science roles because it:
- Works similarly across major relational databases
- Enables complex data questions to be expressed clearly
- Plays nicely with reporting tools, BI dashboards, and application backends
- Ensures you can reason about data quality, relationships, and performance
- Provides a foundation for advanced analytics and data engineering topics
Common Pitfalls
- Forgetting relationships: Queries that ignore joins can return duplicate rows or miss needed context.
- Relying on
SELECT *: Fetch only the columns you need to avoid surprises when schemas evolve. - Ignoring NULL semantics: Always use
IS NULLorIS NOT NULLwhen filtering optional columns likedelivered_at. - Mixing up identifiers: Pay attention to whether you need
category_id(a numeric key) orcategory_name(the friendly label). - Skipping transactions: When inserting lots of seed data, wrap the statements in a transaction for safety and speed.
Summary
In this lesson you learned:
- How relational databases store information in tables and why that model suits the marketplace scenario
- The key tables, columns, and relationships that form our shared schema
- How to run a basic SQL query against familiar tables like
customerandproduct - Where the provided DDL and seed data live so you can recreate the environment anytime
With the groundwork in place, move on to the next lesson to ensure your SQL environment is configured to run these scripts smoothly.
Quiz
Show quiz
-
Which table links orders to the products purchased?
A)category
B)product
C)order_item
D)seller -
What does the
preferred_category_idcolumn in thecustomertable represent?
A) The customer's account manager
B) The category the customer prefers
C) The product the customer last bought
D) The customer's primary payment method -
Which SQL keyword lists the columns you want to retrieve?
A)WHERE
B)FROM
C)SELECT
D)JOIN -
Why should you avoid using
= NULLin your queries?
A) It runs too slowly
B) SQL converts it to zero
C) Comparisons with NULL always return unknown, so the filter fails
D) It only works in document databases -
Which property of relational databases guarantees that a group of statements either all succeed or all fail together?
A) Durability
B) Consistency
C) Atomicity
D) Isolation
Answers
- C)
order_item - B) The category the customer prefers
- C)
SELECT - C) Comparisons with NULL always return unknown
- C) Atomicity