SQL Cheat Sheet – From Beginner to Advanced
Beginner
SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT.
SELECT & FROM
Read columns from one or more tables.
SQL
SELECT id, name, amount FROM main.default.sales;
List all sales with id, name, and amount from the sales table.
💡 Always qualify table names (catalog.schema.table) in Databricks.
WHERE
Filter rows before grouping.
SQL
SELECT * FROM main.default.orders WHERE year = 2024 AND status = 'COMPLETED';
Only completed orders for 2024.
💡 Use AND/OR and put the most selective condition first when possible.
GROUP BY & HAVING
Group rows and filter groups after aggregation.
SQL
SELECT region, SUM(amount) AS total FROM main.default.sales GROUP BY region HAVING SUM(amount) > 10000;
Regions with total sales over 10k.
💡 HAVING filters after GROUP BY; WHERE filters before.
ORDER BY & LIMIT
Sort results and cap rows returned.
SQL
SELECT * FROM main.default.products ORDER BY price DESC LIMIT 10;
Top 10 most expensive products.
Aggregations
SUM, COUNT, AVG, MIN, MAX — often with GROUP BY.
COUNT & SUM
Count rows or sum a numeric column.
SQL
SELECT COUNT(*), COUNT(email), SUM(amount) FROM main.default.orders;
AVG, MIN, MAX
Average, smallest, or largest value.
SQL
SELECT AVG(price), MIN(created_at), MAX(amount) FROM main.default.products;
Intermediate
LIKE, AND/OR, CASE, IN, UNION, BETWEEN, CAST, COALESCE.
LIKE & IN
Pattern match and membership tests.
SQL
SELECT * FROM customers WHERE name LIKE '%Acme%' AND region IN ('EU', 'US');
CASE & COALESCE
Conditional logic and null replacement.
SQL
SELECT id, CASE WHEN amount > 100 THEN 'High' ELSE 'Low' END, COALESCE(email, 'N/A') FROM users;
UNION & BETWEEN
Combine result sets; filter a range.
SQL
SELECT id, name FROM table_a UNION ALL SELECT id, name FROM table_b; -- BETWEEN is inclusive: WHERE date BETWEEN '2024-01-01' AND '2024-12-31'
Advanced
CTEs, Subqueries, Window Functions, Joins, Rank.
CTE (WITH)
Name a subquery and reuse it in the main query.
SQL
WITH top_customers AS ( SELECT customer_id, SUM(total) AS spent FROM orders GROUP BY customer_id HAVING SUM(total) > 1000 ) SELECT * FROM top_customers ORDER BY spent DESC;
💡 CTEs make complex queries readable and reusable.
Window functions
Compute over a partition without collapsing rows.
SQL
SELECT customer_id, order_date, total, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rn FROM main.default.orders;
Subquery IN / EXISTS
Filter by the result of another query.
SQL
SELECT * FROM orders o WHERE EXISTS ( SELECT 1 FROM customers c WHERE c.id = o.customer_id AND c.country = 'US' );
Joins
INNER JOIN
Only rows that match in both tables.
SQL
SELECT o.id, c.name, o.total FROM main.default.orders o JOIN main.default.customers c ON o.customer_id = c.id;
LEFT JOIN
All left rows; match right or NULLs.
SQL
SELECT c.id, c.name, o.total FROM main.default.customers c LEFT JOIN main.default.orders o ON o.customer_id = c.id;
Window & Rank
RANK / DENSE_RANK
Rank rows within a partition; ties get same rank.
SQL
SELECT region, product, revenue, RANK() OVER (PARTITION BY region ORDER BY revenue DESC) AS rk FROM sales;
Real Patterns
15 practical scenarios: dedup, top-N, running total, etc.
Top N per group
Keep only the top N rows per category using ROW_NUMBER.
SQL
WITH ranked AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS rn FROM products ) SELECT * FROM ranked WHERE rn <= 5;
Top 5 products per category by sales.
Remove duplicates
Keep one row per key using ROW_NUMBER and filter rn = 1.
SQL
WITH dedup AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY updated_at DESC) AS rn FROM events ) SELECT * FROM dedup WHERE rn = 1;
Latest event per user (deduplicate by user_id).
Running total
Cumulative sum with SUM() OVER (ORDER BY …).
SQL
SELECT date, amount, SUM(amount) OVER (ORDER BY date ROWS UNBOUNDED PRECEDING) AS running_total FROM daily_sales;
DDL
CREATE TABLE
SQL
CREATE TABLE main.default.customers ( id BIGINT, name STRING, created_at TIMESTAMP );
ALTER / DROP / TRUNCATE
SQL
ALTER TABLE main.default.customers ADD COLUMN email STRING; DROP TABLE IF EXISTS main.default.old_table; TRUNCATE TABLE main.default.staging_events;
