Window Functions Mastery
Master PARTITION BY, ranking functions, LAG/LEAD, running totals, and window frames for analytics-grade SQL.
๐ฏ What You'll Learn
- How window functions differ from GROUP BY
- ROW_NUMBER, RANK, DENSE_RANK, and NTILE
- LAG/LEAD for time-series and row comparisons
- Running totals, cumulative percentages, and FIRST/LAST_VALUE
- Frame specifications: ROWS, RANGE, and GROUPS
๐ช What Are Window Functions?
Imagine looking through a sliding window on a train โ you see a few rows of scenery at a time, but the train (your result set) keeps all its rows. That's what window functions do: they compute across a "window" of related rows without collapsing them.
๐ก Pro Tip โ Window vs GROUP BY
GROUP BY returns one row per group. Window functions return every row with the computed value attached. Use GROUP BY when you want summary rows; use window functions when you need detail + context.
Window Functions Basics
PARTITION BY and OVER clause fundamentals
-- Window functions compute values across a "window" of rows
-- WITHOUT collapsing them like GROUP BY does
-- GROUP BY collapses rows:
SELECT department, AVG(salary) FROM employees GROUP BY department;
-- Returns one row per department
-- Window function keeps every row:
SELECT name, department, salary,
AVG(salary) OVER (PARTITION BY department) AS dept_avg,
salary - AVG(salary) OVER (PARTITION BY department) AS vs_dept_avg
FROM employees;
-- Returns every employee WITH their dep
...๐ Ranking Functions
Ranking functions assign positions to rows within each partition. The difference between them is how they handle ties:
Ranking Functions
ROW_NUMBER, RANK, DENSE_RANK, and NTILE
-- Ranking functions assign positions within partitions
SELECT name, department, salary,
-- ROW_NUMBER: unique sequential number (no ties)
ROW_NUMBER() OVER (
PARTITION BY department ORDER BY salary DESC
) AS row_num,
-- RANK: same rank for ties, skips next
-- e.g., 1, 2, 2, 4 (skips 3)
RANK() OVER (
PARTITION BY department ORDER BY salary DESC
) AS rank,
-- DENSE_RANK: same rank for ties, no gaps
-- e.g., 1, 2, 2, 3 (no skip)
DENSE_RANK
...โช LAG & LEAD โ Time Travel
LAG looks backward; LEAD looks forward. These are essential for comparing a row to its predecessor or successor โ month-over-month growth, stock price changes, session duration.
โ ๏ธ Common Mistake
Forgetting that LAG returns NULL for the first row (no predecessor). Use the third argument as a default: LAG(col, 1, 0) to avoid NULL arithmetic errors.
LAG, LEAD & Moving Averages
Time-series comparisons and sliding calculations
-- LAG and LEAD access previous/next rows
-- Perfect for time-series comparisons
SELECT order_date,
revenue,
-- Previous month's revenue
LAG(revenue, 1) OVER (ORDER BY order_date) AS prev_month,
-- Next month's revenue
LEAD(revenue, 1) OVER (ORDER BY order_date) AS next_month,
-- Month-over-month growth
ROUND(
(revenue - LAG(revenue, 1) OVER (ORDER BY order_date))
/ LAG(revenue, 1) OVER (ORDER BY order_date) * 100,
...๐ Running Totals & Cumulative Stats
Running totals accumulate values row by row โ think of a bank balance that adds each transaction. Combined with the total (no ORDER BY), you can compute cumulative percentages for Pareto analysis.
Running Totals & FIRST/LAST_VALUE
Cumulative sums, counts, and percentages
-- Running totals and cumulative calculations
SELECT order_date, customer_id, amount,
-- Running total (all rows up to current)
SUM(amount) OVER (
ORDER BY order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total,
-- Running total per customer
SUM(amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
) AS customer_running_total,
-- Cumulative percentage of total revenue
ROUND(
SUM(amount) OVER
...๐ผ๏ธ Window Frames โ Fine-Grained Control
The frame clause determines exactly which rows are included in the window calculation. It's the difference between "all previous rows" and "the last 3 rows".
Window Frame Specifications
ROWS, RANGE, and sliding window boundaries
-- Window frames control EXACTLY which rows are included
-- ROWS: physical row count
-- RANGE: logical value range
-- GROUPS: distinct value groups (SQL:2011)
-- 3-day moving average (physical rows)
SELECT sale_date, revenue,
AVG(revenue) OVER (
ORDER BY sale_date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS avg_3day_rows
FROM daily_sales;
-- Value-based range: all sales within ยฑ$1000
SELECT name, salary,
COUNT(*) OVER (
ORDER BY salary
RANGE BETWEE
...๐ Quick Reference
| Function | Purpose |
|---|---|
| ROW_NUMBER() | Unique sequential number |
| RANK() | Rank with gaps on ties |
| DENSE_RANK() | Rank without gaps |
| LAG(col, n) | Value n rows before |
| LEAD(col, n) | Value n rows after |
| SUM() OVER | Running total / partition sum |
๐ Lesson Complete!
You've mastered window functions โ one of the most powerful features in SQL. Next, explore advanced stored procedures and triggers!
Sign up for free to track which lessons you've completed and get learning reminders.