Skip to main content

Introduction to Databases and SQL

Welcome to your first lesson in SQL. In this course we will explore database concepts by working with a fictional e-commerce marketplace. You will create categories, manage sellers, track customers, publish products, and record orders and transactions, all inside a relational database engine. Every query you write in the upcoming modules relies on the shared E-commerce schema you assemble in the setup lessons, using MySQL as the database engine.

Learning Goals

  • Understand what databases are and where SQL fits in
  • Explore core relational concepts through the marketplace schema
  • Review the sample tables and relationships you will query throughout the course
  • Run your first SQL statements against that shared dataset

What is a Database?

A database is an organized collection of data that can be stored, searched, and updated efficiently. Imagine running a marketplace on spreadsheets: keeping product catalogs, seller rosters, customer accounts, and order histories in sync quickly becomes messy. A relational database stores those entities in tables and enforces relationships between them so the information stays consistent.

Sample customer records
Customer ID
First Name
Last Name
Preferred Category
Signup Year
1
Aman
Sharma
Electronics
2024
3
Rohit
Verma
Home & Kitchen
2024
5
Sana
Ali
No preference
2025
8
Tarun
Iyer
Electronics
2024
9
Nisha
Desai
Books & Stationery
2025

Why Use Databases?

Relational databases help you:

  • Store large volumes of information across multiple related entities
  • Enforce rules like "each product belongs to exactly one category"
  • Allow multiple stakeholders (sellers, support teams, analysts) to access the same truth
  • Secure sensitive records, such as customer contact details and payments
  • Run complex analytical queries without duplicating data in spreadsheets
tip

Relational databases provide the ACID properties (Atomicity, Consistency, Isolation, Durability). These guarantees protect your order and payment data even if multiple transactions run simultaneously or an unexpected failure occurs.

Types of Databases

Relational Databases (SQL Databases)

Relational databases organize information into tables with rows (records) and columns (attributes). Relationships between tables are maintained through keys. Popular relational engines include PostgreSQL, MySQL, SQL Server, and SQLite.

Non-relational Databases (NoSQL)

NoSQL systems store information using alternative formats such as JSON documents, key-value pairs, or graphs. They shine in scenarios that require flexible schemas or massive horizontal scaling. While this course focuses on SQL, it is helpful to recognize where other models fit.

TypeDescriptionExample DatabaseOfficial Documentation
Document DatabaseStores data as JSON-like documentsMongoDBMongoDB Docs
Key-Value StoreUses key-value pairs for extremely fast lookupsRedisRedis Documentation
Column-Family StoreOrganizes data by columns for analytics workloadsApache CassandraCassandra Documentation
Graph DatabaseFocuses on nodes and relationshipsNeo4jNeo4j Documentation

Meet the E-commerce Schema

We will revisit the same core tables in every lesson:

  • category - Stores category_id and category_name.
  • seller - Contains store and contact details for marketplace sellers.
  • customer - Lists shoppers, signup year, and optional category preference.
  • product - Represents each catalog item, including price, stock, and category.
  • orders - Stores purchases with status and delivery timestamps.
  • order_item - Bridge table linking orders to products with quantity and price at checkout.
  • transactions - Records payment attempts and outcomes per order.

Creating the schema in MySQL looks like this:

create_tables.sql
CREATE TABLE category (
category_id INT AUTO_INCREMENT PRIMARY KEY,
category_name VARCHAR(80) NOT NULL UNIQUE
) ENGINE=InnoDB;

CREATE TABLE seller (
seller_id INT AUTO_INCREMENT PRIMARY KEY,
store_name VARCHAR(120) NOT NULL,
first_name VARCHAR(80) NOT NULL,
last_name VARCHAR(80) NOT NULL,
email VARCHAR(190) UNIQUE,
joined_at DATE NOT NULL,
rating DECIMAL(2,1) CHECK (rating BETWEEN 1 AND 5)
) ENGINE=InnoDB;

CREATE TABLE customer (
customer_id INT AUTO_INCREMENT PRIMARY KEY,
first_name VARCHAR(80) NOT NULL,
last_name VARCHAR(80) NOT NULL,
email VARCHAR(190) UNIQUE,
preferred_category_id INT,
signup_year INT CHECK (signup_year >= 2000),
city VARCHAR(80),
FOREIGN KEY (preferred_category_id) REFERENCES category(category_id)
) ENGINE=InnoDB;

Relational Building Blocks

Relational databases rely on a few core constructs:

  • Tables define the shape of each entity. For example, the product table captures SKU, name, price, stock, and the corresponding category.
  • Rows hold individual records. Each row in order_item records a product in a specific order.
  • Primary keys uniquely identify each row (customer.customer_id, product.product_id).
  • Foreign keys connect rows across tables (order_item.order_id to orders.order_id).
  • Constraints enforce data quality, such as making category_name unique or limiting prices to positive numbers.

Your First SQL Query

SQL (Structured Query Language) is the standard way to interact with relational databases. A basic query requests specific columns from a table:

first_query.sql
-- Inspect the product catalog
SELECT sku, product_name, price
FROM product;

-- Retrieve customers who prefer the Electronics category
SELECT c.customer_id,
c.first_name,
c.last_name,
cat.category_name
FROM customer AS c
JOIN category AS cat ON cat.category_id = c.preferred_category_id
WHERE cat.category_name = 'Electronics';

SQL keywords such as SELECT and FROM declare what you want to read, while clauses like WHERE and JOIN refine the results.

Sample Data and Results

The schema lesson includes seed data so you can experiment right away:

seed_data.sql
INSERT INTO category (category_name) VALUES
('Electronics'),
('Home & Kitchen'),
('Fashion');

INSERT INTO product (sku, product_name, price, stock_qty, category_id, seller_id) VALUES
('EL-1001', 'Wireless Earbuds', 59.99, 120, 1, 1),
('HK-2001', 'Nonstick Cookware Set', 89.00, 60, 2, 2),
('FA-3001', 'Cotton Hoodie', 39.50, 150, 3, 3);

INSERT INTO orders (customer_id, order_date, status, order_total, shipping_city, delivered_at) VALUES
(1, '2025-03-02', 'Delivered', 89.98, 'Bengaluru', '2025-03-06'),
(3, '2025-03-08', 'Delivered', 89.00, 'Delhi', '2025-03-12'),
(5, '2025-03-15', 'Pending', 59.99, 'Hyderabad', NULL);

Running the product catalog query returns:

sku     | product_name           | price
------- | ---------------------- | -----
EL-1001 | Wireless Earbuds | 59.99
HK-2001 | Nonstick Cookware Set | 89.00
FA-3001 | Cotton Hoodie | 39.50

Why Learn SQL?

SQL continues to be one of the most requested skills across software engineering, analytics, and data science roles because it:

  • Works similarly across major relational databases
  • Enables complex data questions to be expressed clearly
  • Plays nicely with reporting tools, BI dashboards, and application backends
  • Ensures you can reason about data quality, relationships, and performance
  • Provides a foundation for advanced analytics and data engineering topics

Common Pitfalls

  • Forgetting relationships: Queries that ignore joins can return duplicate rows or miss needed context.
  • Relying on SELECT *: Fetch only the columns you need to avoid surprises when schemas evolve.
  • Ignoring NULL semantics: Always use IS NULL or IS NOT NULL when filtering optional columns like delivered_at.
  • Mixing up identifiers: Pay attention to whether you need category_id (a numeric key) or category_name (the friendly label).
  • Skipping transactions: When inserting lots of seed data, wrap the statements in a transaction for safety and speed.

Summary

In this lesson you learned:

  • How relational databases store information in tables and why that model suits the marketplace scenario
  • The key tables, columns, and relationships that form our shared schema
  • How to run a basic SQL query against familiar tables like customer and product
  • Where the provided DDL and seed data live so you can recreate the environment anytime

With the groundwork in place, move on to the next lesson to ensure your SQL environment is configured to run these scripts smoothly.

Quiz

Show quiz
  1. Which table links orders to the products purchased?
    A) category
    B) product
    C) order_item
    D) seller

  2. What does the preferred_category_id column in the customer table represent?
    A) The customer's account manager
    B) The category the customer prefers
    C) The product the customer last bought
    D) The customer's primary payment method

  3. Which SQL keyword lists the columns you want to retrieve?
    A) WHERE
    B) FROM
    C) SELECT
    D) JOIN

  4. Why should you avoid using = NULL in your queries?
    A) It runs too slowly
    B) SQL converts it to zero
    C) Comparisons with NULL always return unknown, so the filter fails
    D) It only works in document databases

  5. Which property of relational databases guarantees that a group of statements either all succeed or all fail together?
    A) Durability
    B) Consistency
    C) Atomicity
    D) Isolation


Answers

  1. C) order_item
  2. B) The category the customer prefers
  3. C) SELECT
  4. C) Comparisons with NULL always return unknown
  5. C) Atomicity