SQL Cheat Sheet for Beginners: Queries, Joins, and Functions

SQL is the universal language of data. Every major database, from PostgreSQL and MySQL to SQLite and SQL Server, speaks it. Whether you are pulling reports, analyzing business data, or building applications, SQL is a skill that pays dividends for your entire career. This cheat sheet covers every essential SQL concept with practical examples you can adapt to your own tables and data.

SELECT Basics

Every SQL query starts with SELECT. It tells the database what columns you want to retrieve.

-- Select specific columns
SELECT first_name, last_name, email
FROM customers;

-- Select all columns
SELECT *
FROM customers;

-- Select with an alias
SELECT first_name AS name, annual_salary / 12 AS monthly_salary
FROM employees;

-- Select distinct (unique) values
SELECT DISTINCT department
FROM employees;

-- Limit the number of rows returned
SELECT *
FROM orders
LIMIT 10;

-- Skip rows with OFFSET (pagination)
SELECT *
FROM products
LIMIT 10 OFFSET 20;    -- rows 21-30

Ordering results:

-- Sort ascending (default)
SELECT * FROM employees ORDER BY last_name;

-- Sort descending
SELECT * FROM employees ORDER BY salary DESC;

-- Sort by multiple columns
SELECT * FROM employees ORDER BY department ASC, salary DESC;

-- Sort by column position (1-based)
SELECT first_name, last_name, salary
FROM employees
ORDER BY 3 DESC;    -- order by salary

Filtering with WHERE

WHERE filters rows before they are returned.

-- Comparison operators
SELECT * FROM products WHERE price > 50;
SELECT * FROM products WHERE price >= 50;
SELECT * FROM products WHERE price != 50;    -- or <>
SELECT * FROM employees WHERE department = 'Engineering';

-- AND / OR
SELECT * FROM employees
WHERE department = 'Sales' AND salary > 60000;

SELECT * FROM employees
WHERE department = 'Sales' OR department = 'Marketing';

-- IN — match against a list
SELECT * FROM employees
WHERE department IN ('Sales', 'Marketing', 'Support');

-- NOT IN
SELECT * FROM employees
WHERE department NOT IN ('Engineering', 'DevOps');

-- BETWEEN — inclusive range
SELECT * FROM orders
WHERE order_date BETWEEN '2025-01-01' AND '2025-12-31';

SELECT * FROM products
WHERE price BETWEEN 10 AND 50;

-- LIKE — pattern matching
SELECT * FROM customers WHERE email LIKE '%@gmail.com';
SELECT * FROM products WHERE name LIKE 'iPhone%';        -- starts with
SELECT * FROM products WHERE name LIKE '%Pro%';           -- contains
SELECT * FROM customers WHERE phone LIKE '555-____';      -- _ matches one character

-- IS NULL / IS NOT NULL
SELECT * FROM customers WHERE phone IS NULL;
SELECT * FROM customers WHERE email IS NOT NULL;

Operator precedence matters. Use parentheses to be explicit:

-- Without parentheses, AND binds tighter than OR — this may not do what you expect
SELECT * FROM employees
WHERE department = 'Sales' OR department = 'Marketing' AND salary > 80000;

-- With parentheses — clear intent
SELECT * FROM employees
WHERE (department = 'Sales' OR department = 'Marketing') AND salary > 80000;

Aggregation and Grouping

Aggregate functions compute a single result from a set of rows.

Function Description
COUNT(*) Number of rows
COUNT(column) Number of non-NULL values
COUNT(DISTINCT col) Number of unique non-NULL values
SUM(column) Total
AVG(column) Average
MIN(column) Minimum value
MAX(column) Maximum value
-- Basic aggregations
SELECT COUNT(*) AS total_orders FROM orders;
SELECT SUM(amount) AS total_revenue FROM orders;
SELECT AVG(salary) AS avg_salary FROM employees;
SELECT MIN(price) AS cheapest, MAX(price) AS most_expensive FROM products;

-- Count distinct values
SELECT COUNT(DISTINCT customer_id) AS unique_customers FROM orders;

GROUP BY splits rows into groups and applies aggregation to each group:

-- Revenue by department
SELECT department, SUM(salary) AS total_payroll
FROM employees
GROUP BY department;

-- Orders per month
SELECT
    EXTRACT(YEAR FROM order_date) AS year,
    EXTRACT(MONTH FROM order_date) AS month,
    COUNT(*) AS order_count,
    SUM(amount) AS revenue
FROM orders
GROUP BY EXTRACT(YEAR FROM order_date), EXTRACT(MONTH FROM order_date)
ORDER BY year, month;

-- Multiple aggregations
SELECT
    department,
    COUNT(*) AS headcount,
    ROUND(AVG(salary), 2) AS avg_salary,
    MIN(salary) AS min_salary,
    MAX(salary) AS max_salary
FROM employees
GROUP BY department
ORDER BY avg_salary DESC;

HAVING filters groups after aggregation (WHERE filters rows before):

-- Departments with average salary above 80k
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 80000;

-- Customers who placed more than 5 orders
SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5
ORDER BY order_count DESC;

Execution order of a SQL query:

  1. FROM (and JOIN)
  2. WHERE
  3. GROUP BY
  4. HAVING
  5. SELECT
  6. DISTINCT
  7. ORDER BY
  8. LIMIT / OFFSET

This is why you cannot use a column alias from SELECT in a WHERE clause — WHERE runs before SELECT.

Joins

Joins combine rows from two or more tables based on a related column.

INNER JOIN — returns only rows with matching keys in both tables:

Table A       Table B
+----+       +----+
| 1  |       | 1  |
| 2  |  ==>  | 3  |  ==>  Result: 1, 3
| 3  |       | 5  |
| 4  |       +----+
+----+
SELECT o.order_id, o.order_date, c.first_name, c.last_name
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id;

LEFT JOIN — all rows from the left table, matching rows from the right (NULL if no match):

-- All customers and their orders (including customers with no orders)
SELECT c.first_name, c.last_name, o.order_id, o.amount
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id;

-- Find customers who have never placed an order
SELECT c.first_name, c.last_name
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;

RIGHT JOIN — all rows from the right table, matching rows from the left:

SELECT o.order_id, c.first_name
FROM customers c
RIGHT JOIN orders o ON c.customer_id = o.customer_id;

FULL OUTER JOIN — all rows from both tables, with NULLs where there is no match:

SELECT c.first_name, o.order_id
FROM customers c
FULL OUTER JOIN orders o ON c.customer_id = o.customer_id;

CROSS JOIN — every row from table A paired with every row from table B (Cartesian product):

-- Generate all combinations of sizes and colors
SELECT s.size_name, c.color_name
FROM sizes s
CROSS JOIN colors c;

Self-join — join a table to itself (useful for hierarchical data):

-- Find each employee and their manager
SELECT
    e.first_name AS employee,
    m.first_name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.employee_id;

Joining multiple tables:

SELECT
    o.order_id,
    c.first_name,
    p.product_name,
    oi.quantity,
    oi.unit_price
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id;

Subqueries

A subquery is a query nested inside another query.

Subquery in WHERE:

-- Employees who earn more than the company average
SELECT first_name, last_name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

-- Products that have been ordered at least once
SELECT product_name
FROM products
WHERE product_id IN (SELECT DISTINCT product_id FROM order_items);

-- Customers whose most recent order is before 2025
SELECT first_name, last_name
FROM customers
WHERE customer_id NOT IN (
    SELECT customer_id
    FROM orders
    WHERE order_date >= '2025-01-01'
);

Subquery in FROM (derived table):

-- Average of department averages
SELECT AVG(dept_avg) AS avg_of_averages
FROM (
    SELECT department, AVG(salary) AS dept_avg
    FROM employees
    GROUP BY department
) dept_salaries;

Correlated subquery — references the outer query (runs once per outer row):

-- Employees who earn more than their department average
SELECT first_name, last_name, salary, department
FROM employees e
WHERE salary > (
    SELECT AVG(salary)
    FROM employees
    WHERE department = e.department
);

-- Most recent order for each customer
SELECT *
FROM orders o
WHERE order_date = (
    SELECT MAX(order_date)
    FROM orders
    WHERE customer_id = o.customer_id
);

EXISTS — check whether a subquery returns any rows:

-- Customers who have placed at least one order
SELECT first_name, last_name
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
);

Window Functions

Window functions perform calculations across a set of rows related to the current row, without collapsing rows into groups.

Syntax: function() OVER (PARTITION BY ... ORDER BY ...)

-- Row number within each department, ordered by salary
SELECT
    first_name,
    department,
    salary,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rank_in_dept
FROM employees;

-- Top 3 earners in each department
SELECT * FROM (
    SELECT
        first_name,
        department,
        salary,
        ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
    FROM employees
) ranked
WHERE rn <= 3;

Ranking functions:

Function Behavior for ties
ROW_NUMBER() Unique number for every row (no ties)
RANK() Same rank for ties, skips numbers (1, 2, 2, 4)
DENSE_RANK() Same rank for ties, no gaps (1, 2, 2, 3)
NTILE(n) Divides rows into n roughly equal groups
SELECT
    product_name,
    revenue,
    RANK() OVER (ORDER BY revenue DESC) AS revenue_rank,
    DENSE_RANK() OVER (ORDER BY revenue DESC) AS dense_revenue_rank,
    NTILE(4) OVER (ORDER BY revenue DESC) AS quartile
FROM products;

LAG and LEAD — access previous or next row values:

-- Compare each month's revenue to the previous month
SELECT
    month,
    revenue,
    LAG(revenue, 1) OVER (ORDER BY month) AS prev_month,
    revenue - LAG(revenue, 1) OVER (ORDER BY month) AS change
FROM monthly_revenue;

-- Next order date for each customer
SELECT
    customer_id,
    order_date,
    LEAD(order_date, 1) OVER (PARTITION BY customer_id ORDER BY order_date) AS next_order
FROM orders;

Running totals and moving averages:

-- Running total of revenue
SELECT
    order_date,
    amount,
    SUM(amount) OVER (ORDER BY order_date) AS running_total
FROM orders;

-- Running total per customer
SELECT
    customer_id,
    order_date,
    amount,
    SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS customer_running_total
FROM orders;

-- 3-month moving average
SELECT
    month,
    revenue,
    AVG(revenue) OVER (ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg_3
FROM monthly_revenue;

Data Modification

INSERT:

-- Insert a single row
INSERT INTO customers (first_name, last_name, email)
VALUES ('Alice', 'Smith', 'alice@example.com');

-- Insert multiple rows
INSERT INTO customers (first_name, last_name, email)
VALUES
    ('Bob', 'Jones', 'bob@example.com'),
    ('Charlie', 'Lee', 'charlie@example.com'),
    ('Diana', 'Clark', 'diana@example.com');

-- Insert from a SELECT
INSERT INTO archived_orders
SELECT * FROM orders WHERE order_date < '2024-01-01';

UPDATE:

-- Update specific rows
UPDATE employees
SET salary = salary * 1.10
WHERE department = 'Engineering';

-- Update multiple columns
UPDATE products
SET price = 29.99, updated_at = CURRENT_TIMESTAMP
WHERE product_id = 42;

-- Update with a subquery
UPDATE employees
SET salary = salary * 1.05
WHERE department_id IN (
    SELECT department_id FROM departments WHERE name = 'Sales'
);

DELETE:

-- Delete specific rows
DELETE FROM orders WHERE order_date < '2020-01-01';

-- Delete all rows (careful)
DELETE FROM temp_table;

-- TRUNCATE — faster way to delete all rows (resets auto-increment)
TRUNCATE TABLE temp_table;

Always use a WHERE clause with UPDATE and DELETE unless you intentionally want to affect every row in the table.

Table Operations

-- Create a table
CREATE TABLE employees (
    employee_id SERIAL PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE,
    department VARCHAR(50),
    salary DECIMAL(10, 2) DEFAULT 0,
    hire_date DATE NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create a table with a foreign key
CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    order_date DATE NOT NULL,
    amount DECIMAL(10, 2)
);

-- Add a column
ALTER TABLE employees ADD COLUMN phone VARCHAR(20);

-- Drop a column
ALTER TABLE employees DROP COLUMN phone;

-- Rename a column
ALTER TABLE employees RENAME COLUMN salary TO annual_salary;

-- Change column type
ALTER TABLE employees ALTER COLUMN salary TYPE DECIMAL(12, 2);

-- Add a constraint
ALTER TABLE employees ADD CONSTRAINT salary_positive CHECK (salary >= 0);

-- Drop a table
DROP TABLE IF EXISTS temp_table;

-- Create an index for faster queries
CREATE INDEX idx_employees_department ON employees(department);
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date);

Common Patterns and Recipes

Top-N Per Group

Find the top 3 highest-paid employees in each department:

SELECT * FROM (
    SELECT
        first_name,
        department,
        salary,
        ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
    FROM employees
) ranked
WHERE rn <= 3;

Running Totals

SELECT
    order_date,
    amount,
    SUM(amount) OVER (ORDER BY order_date ROWS UNBOUNDED PRECEDING) AS running_total
FROM orders;

Find Duplicates

SELECT email, COUNT(*) AS count
FROM customers
GROUP BY email
HAVING COUNT(*) > 1;

Year-Over-Year Comparison

SELECT
    curr.month,
    curr.revenue AS this_year,
    prev.revenue AS last_year,
    ROUND((curr.revenue - prev.revenue) / prev.revenue * 100, 1) AS yoy_pct
FROM monthly_revenue curr
JOIN monthly_revenue prev
    ON curr.month_num = prev.month_num
    AND curr.year = prev.year + 1;

Conditional Aggregation with CASE

SELECT
    department,
    COUNT(*) AS total,
    COUNT(CASE WHEN salary > 80000 THEN 1 END) AS high_earners,
    COUNT(CASE WHEN salary <= 80000 THEN 1 END) AS others,
    ROUND(AVG(CASE WHEN gender = 'F' THEN salary END), 2) AS avg_female_salary,
    ROUND(AVG(CASE WHEN gender = 'M' THEN salary END), 2) AS avg_male_salary
FROM employees
GROUP BY department;

Gaps and Islands

Find gaps in a sequence (missing IDs):

SELECT
    curr.id + 1 AS gap_start,
    next.id - 1 AS gap_end
FROM numbers curr
JOIN numbers next ON next.id = (
    SELECT MIN(id) FROM numbers WHERE id > curr.id
)
WHERE next.id - curr.id > 1;

Find consecutive date ranges (islands):

WITH grouped AS (
    SELECT
        event_date,
        event_date - INTERVAL '1 day' * ROW_NUMBER() OVER (ORDER BY event_date) AS grp
    FROM events
)
SELECT
    MIN(event_date) AS start_date,
    MAX(event_date) AS end_date,
    COUNT(*) AS consecutive_days
FROM grouped
GROUP BY grp
ORDER BY start_date;

CTEs (Common Table Expressions)

CTEs make complex queries readable by breaking them into named steps:

WITH department_stats AS (
    SELECT
        department,
        AVG(salary) AS avg_salary,
        COUNT(*) AS headcount
    FROM employees
    GROUP BY department
),
high_paying AS (
    SELECT * FROM department_stats WHERE avg_salary > 80000
)
SELECT
    e.first_name,
    e.last_name,
    e.salary,
    e.department,
    d.avg_salary AS dept_avg
FROM employees e
JOIN high_paying d ON e.department = d.department
WHERE e.salary > d.avg_salary
ORDER BY e.salary DESC;

Quick Reference

Task Syntax
Select all SELECT * FROM table
Filter WHERE condition
Sort ORDER BY col DESC
Limit rows LIMIT 10
Remove duplicates SELECT DISTINCT col
Count rows SELECT COUNT(*)
Group and aggregate GROUP BY col
Filter groups HAVING condition
Inner join JOIN t2 ON t1.id = t2.id
Left join LEFT JOIN t2 ON t1.id = t2.id
Subquery WHERE col IN (SELECT ...)
Row numbering ROW_NUMBER() OVER (ORDER BY col)
Running total SUM(col) OVER (ORDER BY col)
Previous row LAG(col) OVER (ORDER BY col)
Insert row INSERT INTO t (cols) VALUES (vals)
Update rows UPDATE t SET col = val WHERE ...
Delete rows DELETE FROM t WHERE ...
Create table CREATE TABLE t (col TYPE, ...)
Add index CREATE INDEX idx ON t(col)

This cheat sheet gives you the SQL vocabulary to handle the vast majority of data questions you will face. Keep it bookmarked, practice with real data, and you will find that fluent SQL is one of the most versatile skills you can have.

Learn data analysis with Python in our free Python for Business Beginners textbook.