Advanced Track
Window Functions Mastery
By the end of this lesson you'll write analytics-grade SQL: running totals, rankings, month-over-month comparisons, and moving averages — all while keeping every detail row. The secret is the OVER (...) clause and, above all, the window frame that decides exactly which rows each calculation can see.
What You'll Learn
- ✓Why window functions keep rows that GROUP BY would collapse
- ✓Drive a window with OVER (PARTITION BY ... ORDER BY ...)
- ✓Rank rows with ROW_NUMBER, RANK, DENSE_RANK and NTILE
- ✓Compare neighbouring rows with LAG and LEAD
- ✓Build running totals and moving averages as windows
- ✓Control precisely which rows count using ROWS BETWEEN frames
GROUP BY aggregates, and subqueries/CTEs (WITH ...). Window functions build directly on those ideas. The in-browser editor lets you write and edit SQL; to run it, copy your query into a free playground like sqliteonline.com or db-fiddle.com. Every example below shows the expected result so you can check yourself. Our Sample Table: sales
Every query in this lesson runs against this tiny sales table — six rows across two regions, one row per region per month. It's small on purpose: you can add the numbers up by hand and confirm each result yourself.
Result:
| id | region | month | amount |
|---|---|---|---|
| 1 | East | 1 | 100 |
| 2 | East | 2 | 150 |
| 3 | East | 3 | 120 |
| 4 | West | 1 | 200 |
| 5 | West | 2 | 180 |
| 6 | West | 3 | 250 |
To follow along in a playground, create it with: CREATE TABLE sales (id INT, region TEXT, month INT, amount INT); then insert the six rows above.
1. Window Functions vs GROUP BY
A window function performs a calculation across a set of rows that are related to the current row — its "window" — and returns a value on every row. That's the crucial difference from GROUP BY, which squashes each group down to a single summary row. You reach for window functions whenever you want detail and context in the same result: every sale, with its region's total beside it.
🚆 Real-world analogy
A window function is like looking through a moving train window: the train (your result) keeps all its carriages (rows), but at each seat you can see a few rows of passing scenery. GROUP BY is more like getting off and counting the carriages from the platform — you lose your seat-by-seat view.
GROUP BY. If it's "every row, plus how it compares to its group", use a window function.GROUP BY vs a window
The same total, collapsed vs attached to every row.
-- The key idea: a window function adds a value to EVERY row
-- instead of collapsing rows the way GROUP BY does.
-- GROUP BY collapses 6 rows into 2 summary rows:
SELECT region, SUM(amount) AS region_total
FROM sales
GROUP BY region;
-- → East | 370 , West | 630 (the id/month/amount detail is GONE)
-- A window function keeps all 6 rows AND attaches the total:
SELECT id, region, month, amount,
SUM(amount) OVER (PARTITION BY region) AS region_total
FROM sales;
-- → every row survives,
...Result — window version — all 6 rows survive:
| id | region | month | amount | region_total |
|---|---|---|---|---|
| 1 | East | 1 | 100 | 370 |
| 2 | East | 2 | 150 | 370 |
| 3 | East | 3 | 120 | 370 |
| 4 | West | 1 | 200 | 630 |
| 5 | West | 2 | 180 | 630 |
| 6 | West | 3 | 250 | 630 |
2. PARTITION BY and ORDER BY — Steering the Window
Two clauses inside OVER (...) do the steering. PARTITION BY splits the rows into independent groups — the calculation restarts from scratch for each one (here, East is computed entirely separately from West). ORDER BY sets the running order inside each group, and it changes what an aggregate means: the moment you add ORDER BY to SUM() OVER, you get a running total rather than a flat group total.
PARTITION BY + ORDER BY → running total
A per-region running total by month.
-- PARTITION BY restarts the calculation for each group.
-- ORDER BY decides the running order inside that group.
SELECT id, region, month, amount,
SUM(amount) OVER (
PARTITION BY region -- East and West are calculated separately
ORDER BY month -- accumulate month 1, then 2, then 3
) AS running_total
FROM sales
ORDER BY region, month;
-- With ORDER BY present, SUM() OVER becomes a RUNNING total:
-- it adds up every row from the start of the partit
...Result — running_total per region, by month:
| region | month | amount | running_total |
|---|---|---|---|
| East | 1 | 100 | 100 |
| East | 2 | 150 | 250 |
| East | 3 | 120 | 370 |
| West | 1 | 200 | 200 |
| West | 2 | 180 | 380 |
| West | 3 | 250 | 630 |
Read the East rows top to bottom: 100, then 100+150=250, then 250+120=370. West restarts at 200 because PARTITION BY region gives it its own fresh window.
Your Turn: a running total per region
Fill in the two blanks so the total accumulates month by month, separately for each region. The expected numbers are in the comments so you can self-check.
🎯 Your Turn: running total
Complete the PARTITION BY and ORDER BY.
-- 🎯 YOUR TURN — fill in the blanks, then press "Try it Yourself"
-- Goal: a running total of amount for each region, by month.
SELECT region, month, amount,
SUM(amount) OVER (
PARTITION BY ___ -- 👉 the column that splits East from West
ORDER BY ___ -- 👉 the column that sets the accumulation order
) AS running_total
FROM sales
ORDER BY region, month;
-- ✅ Expected: East → 100, 250, 370 ; West → 200, 380, 6303. Ranking — ROW_NUMBER, RANK, DENSE_RANK, NTILE
Ranking functions assign a position to each row within its partition, in the order you specify. They differ only in how they treat ties (rows with equal ordering values):
ROW_NUMBER()— a unique number for every row, even on ties (an arbitrary winner is chosen).RANK()— ties share a rank, then the next rank skips (1, 2, 2, 4).DENSE_RANK()— ties share a rank with no gap after (1, 2, 2, 3).NTILE(n)— splits the partition intonroughly equal buckets (e.g. quartiles).
Ranking functions
ROW_NUMBER, RANK, DENSE_RANK and NTILE per region.
-- Ranking functions number rows inside each partition.
-- Here: rank regions' months from highest amount to lowest.
SELECT region, month, amount,
ROW_NUMBER() OVER (
PARTITION BY region ORDER BY amount DESC
) AS row_num,
RANK() OVER (
PARTITION BY region ORDER BY amount DESC
) AS rnk,
DENSE_RANK() OVER (
PARTITION BY region ORDER BY amount DESC
) AS dense_rnk,
NTILE(2) OVER (
PARTITION BY region ORDER
...Result — all amounts distinct, so the three ranks agree:
| region | month | amount | row_num | rnk | dense_rnk | half |
|---|---|---|---|---|---|---|
| East | 2 | 150 | 1 | 1 | 1 | 1 |
| East | 3 | 120 | 2 | 2 | 2 | 1 |
| East | 1 | 100 | 3 | 3 | 3 | 2 |
| West | 3 | 250 | 1 | 1 | 1 | 1 |
| West | 1 | 200 | 2 | 2 | 2 | 1 |
| West | 2 | 180 | 3 | 3 | 3 | 2 |
90, 85, 85, 70. The functions diverge — this is the whole reason three of them exist:Result — how the ranks differ on a tie (85, 85):
| score | ROW_NUMBER | RANK | DENSE_RANK |
|---|---|---|---|
| 90 | 1 | 1 | 1 |
| 85 | 2 | 2 | 2 |
| 85 | 3 | 2 | 2 |
| 70 | 4 | 4 | 3 |
Your Turn: rank each region's months
One blank: choose the ranking function that skips numbers after a tie. The PARTITION BY region ORDER BY amount DESC is already wired up for you.
🎯 Your Turn: RANK by amount
Rank highest amount = 1 within each region.
-- 🎯 YOUR TURN — fill in the blank.
-- Goal: rank each region's months from biggest amount to smallest.
SELECT region, month, amount,
___ OVER ( -- 👉 the ranking function (skips numbers on ties)
PARTITION BY region ORDER BY amount DESC
) AS amount_rank
FROM sales
ORDER BY region, amount_rank;
-- ✅ Expected:
-- East: 150 → 1, 120 → 2, 100 → 3
-- West: 250 → 1, 200 → 2, 180 → 34. LAG & LEAD — Looking at Neighbouring Rows
LAG(col, n) reaches back to the row n positions earlier; LEAD(col, n) reaches forward to the row n positions later (n defaults to 1). They walk the order you give in OVER (...), restarting at each partition boundary. This is how you compare a row to the one before it — month-over-month change, period-over-period growth, gaps between events.
LAG returns NULL there (and LEAD returns NULL on the last row). Pass a third argument as a fallback: LAG(amount, 1, 0) uses 0 when there's no prior row.LAG & LEAD
Previous and next month's amount, per region.
-- LAG looks at an EARLIER row; LEAD looks at a LATER row.
-- Both walk the order you give them, per partition.
SELECT region, month, amount,
-- amount of the PREVIOUS month in this region
LAG(amount, 1) OVER (
PARTITION BY region ORDER BY month
) AS prev_month,
-- amount of the NEXT month in this region
LEAD(amount, 1) OVER (
PARTITION BY region ORDER BY month
) AS next_month
FROM sales
ORDER BY region, month;
-- The first row of
...Result — prev/next month within each region:
| region | month | amount | prev_month | next_month |
|---|---|---|---|---|
| East | 1 | 100 | NULL | 150 |
| East | 2 | 150 | 100 | 120 |
| East | 3 | 120 | 150 | NULL |
| West | 1 | 200 | NULL | 180 |
| West | 2 | 180 | 200 | 250 |
| West | 3 | 250 | 180 | NULL |
5. Aggregates as Windows — Moving Averages
Any aggregate — SUM, AVG, COUNT, MIN, MAX — can run as a window. You already built a running SUM. Swap in AVG with a small frame and you get a moving average: a smoothed value over the last few rows. The frame ROWS BETWEEN 1 PRECEDING AND CURRENT ROW means "this row and the one immediately before it" — a 2-row window that slides down the data.
2-month moving average
AVG over the current and previous row.
-- A window FRAME picks exactly which rows feed the aggregate.
-- ROWS BETWEEN 1 PRECEDING AND CURRENT ROW = "this row + the one before".
SELECT region, month, amount,
ROUND(AVG(amount) OVER (
PARTITION BY region
ORDER BY month
ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
), 2) AS moving_avg_2m
FROM sales
ORDER BY region, month;
-- This is a 2-month moving average that slides down the data.
-- Month 1 averages just itself (no earlier row to include).Result — moving_avg_2m per region:
| region | month | amount | moving_avg_2m |
|---|---|---|---|
| East | 1 | 100 | 100 |
| East | 2 | 150 | 125 |
| East | 3 | 120 | 135 |
| West | 1 | 200 | 200 |
| West | 2 | 180 | 190 |
| West | 3 | 250 | 215 |
East month 2 averages (100+150)/2 = 125; month 3 averages (150+120)/2 = 135. Month 1 averages only itself because there's no earlier row in its frame.
6. Window Frames — The Part Everyone Skips
The frame is the exact slice of the partition a function can see from the current row. You write it with ROWS BETWEEN <start> AND <end>, where the bounds are:
UNBOUNDED PRECEDING— the first row of the partition.n PRECEDING—nrows before the current row.CURRENT ROW— the row being calculated.n FOLLOWING—nrows after the current row.UNBOUNDED FOLLOWING— the last row of the partition.
So ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW is a running total, and ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING is the whole-partition total. (RANGE and GROUPS are sibling frame types that count by value or by ties instead of physical rows — reach for them later; ROWS covers the vast majority of cases.)
ORDER BY but omit the frame, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW — i.e. a running total, not the whole partition. So LAST_VALUE() with only an ORDER BY returns the current row, not the final one. To get the true last value, spell out ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.Frames side by side
Running total vs whole-partition total.
-- The frame is the rows the function can "see" from the current row.
-- UNBOUNDED PRECEDING = the very first row of the partition.
-- CURRENT ROW = the row being calculated.
-- UNBOUNDED FOLLOWING = the very last row of the partition.
SELECT region, month, amount,
-- Explicit running total (start-of-partition → here):
SUM(amount) OVER (
PARTITION BY region ORDER BY month
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total,
-- Whole-parti
...Result — same SUM, two different frames:
| region | month | amount | running_total | region_total |
|---|---|---|---|---|
| East | 1 | 100 | 100 | 370 |
| East | 2 | 150 | 250 | 370 |
| East | 3 | 120 | 370 | 370 |
| West | 1 | 200 | 200 | 630 |
| West | 2 | 180 | 380 | 630 |
| West | 3 | 250 | 630 | 630 |
Common Errors (and the fix)
- Window function in
WHERE:WHERE ROW_NUMBER() OVER (...) = 1errors with "window functions are not allowed in WHERE". Window functions run afterWHERE, so wrap the query in a CTE or subquery and filter on the alias:WITH ranked AS (SELECT ..., ROW_NUMBER() OVER (...) AS rn FROM sales) SELECT * FROM ranked WHERE rn = 1; - Missing
ORDER BYin a running total:SUM(amount) OVER (PARTITION BY region)with noORDER BYgives every row the same flat total (370/630), not an accumulating one. AddORDER BY monthto make it run. - Expecting
RANKto be gapless: after a tie,RANKskips (…2, 2, 4…). If you need continuous numbers with no gaps, useDENSE_RANK(…2, 2, 3…). They are not interchangeable. - The
LAST_VALUEdefault-frame trap:LAST_VALUE(x) OVER (ORDER BY month)returns the current row's value because the default frame ends atCURRENT ROW. AddROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWINGto reach the real last row. - Doing
NULLmaths afterLAG: the first row'sLAGisNULL, and150 - NULL = NULL. Use a default (LAG(amount, 1, 0)) or wrap withCOALESCE(...)if you don't want aNULLresult.
📘 Quick Reference
| Syntax | Purpose |
|---|---|
| fn(...) OVER (...) | Run fn over a window of rows; keep every row |
| PARTITION BY col | Restart the calculation per group |
| ORDER BY col | Order rows in the window (turns SUM into a running total) |
| ROW_NUMBER() | Unique 1,2,3… (no ties) |
| RANK() / DENSE_RANK() | Rank with gaps / without gaps on ties |
| NTILE(n) | Split the partition into n buckets |
| LAG(col,n) / LEAD(col,n) | Value n rows back / forward |
| ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW | Running total frame |
| ROWS BETWEEN 1 PRECEDING AND CURRENT ROW | 2-row moving window |
| ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING | Whole partition |
Frequently Asked Questions
Q: When should I use a window function instead of GROUP BY?
Use a window function when you need every detail row and a group-level value beside it (each sale plus its region total, or each row's rank). Use GROUP BY when you only want one summary row per group.
Q: Why can't I put a window function in WHERE or GROUP BY?
Because window functions are evaluated after WHERE, GROUP BY and HAVING — only SELECT and ORDER BY can use them directly. To filter on one, compute it in a CTE/subquery first, then filter on its alias.
Q: RANK vs DENSE_RANK vs ROW_NUMBER — which do I want?
ROW_NUMBER for a strict 1-per-row sequence (e.g. "the single top row"). RANK when ties should share a place and the next place may skip ("joint 2nd, then 4th"). DENSE_RANK when you want tier numbers with no gaps.
Q: Do I always need to write the frame out?
No — for running totals the default already gives you start-of-partition → current row. But spell the frame out whenever you use LAST_VALUE, a moving average, or any "whole partition" calc, so the default doesn't quietly cut your window short.
Mini-Challenge: Month-over-Month Change
Support is faded now — a brief, a blank canvas, and the expected answer in the comments. Write it, then copy it into a playground to confirm.
🎯 Mini-Challenge
Use LAG to show each region's month-over-month change.
-- 🎯 MINI-CHALLENGE — month-over-month change with LAG
-- Using only what this lesson covered:
-- 1. Show region, month, amount.
-- 2. Add a column "change" = this month's amount minus
-- the PREVIOUS month's amount in the same region.
-- 3. Order by region, then month.
-- Hint: LAG(amount) OVER (PARTITION BY ... ORDER BY ...)
--
-- ✅ Expected "change" column:
-- East: NULL, 50, -30
-- West: NULL, -20, 70
-- your query here🎉 Lesson Complete
- ✅ Window functions add a value to every row instead of collapsing them like
GROUP BY - ✅
OVER (PARTITION BY ... ORDER BY ...)steers the window; addingORDER BYmakesSUMa running total - ✅
ROW_NUMBER/RANK/DENSE_RANK/NTILErank rows and differ only on ties - ✅
LAGandLEADreach previous and next rows for period-over-period comparisons - ✅ The frame (
ROWS BETWEEN ...) decides exactly which rows a calculation sees — and the default can surprise you - ✅ Next: package logic on the server with Stored Procedures Advanced
Sign up for free to track which lessons you've completed and get learning reminders.