Courses/SQL/Window Functions

Advanced Track

Window Functions Mastery

By the end of this lesson you'll write analytics-grade SQL: running totals, rankings, month-over-month comparisons, and moving averages — all while keeping every detail row. The secret is the OVER (...) clause and, above all, the window frame that decides exactly which rows each calculation can see.

What You'll Learn

✓Why window functions keep rows that GROUP BY would collapse
✓Drive a window with OVER (PARTITION BY ... ORDER BY ...)
✓Rank rows with ROW_NUMBER, RANK, DENSE_RANK and NTILE
✓Compare neighbouring rows with LAG and LEAD
✓Build running totals and moving averages as windows
✓Control precisely which rows count using ROWS BETWEEN frames

Before you start: you should be comfortable with JOINs, GROUP BY aggregates, and subqueries/CTEs (WITH ...). Window functions build directly on those ideas. The in-browser editor lets you write and edit SQL; to run it, copy your query into a free playground like sqliteonline.com or db-fiddle.com. Every example below shows the expected result so you can check yourself.

Our Sample Table: `sales`

Every query in this lesson runs against this tiny sales table — six rows across two regions, one row per region per month. It's small on purpose: you can add the numbers up by hand and confirm each result yourself.

Result:

id	region	month	amount
1	East	1	100
2	East	2	150
3	East	3	120
4	West	1	200
5	West	2	180
6	West	3	250

To follow along in a playground, create it with: CREATE TABLE sales (id INT, region TEXT, month INT, amount INT); then insert the six rows above.

1. Window Functions vs GROUP BY

A window function performs a calculation across a set of rows that are related to the current row — its "window" — and returns a value on every row. That's the crucial difference from GROUP BY, which squashes each group down to a single summary row. You reach for window functions whenever you want detail and context in the same result: every sale, with its region's total beside it.

🚆 Real-world analogy

A window function is like looking through a moving train window: the train (your result) keeps all its carriages (rows), but at each seat you can see a few rows of passing scenery. GROUP BY is more like getting off and counting the carriages from the platform — you lose your seat-by-seat view.

Rule of thumb: if the question is "one number per group", use GROUP BY. If it's "every row, plus how it compares to its group", use a window function.

GROUP BY vs a window

The same total, collapsed vs attached to every row.

Try it Yourself »

SQL

-- The key idea: a window function adds a value to EVERY row
-- instead of collapsing rows the way GROUP BY does.

-- GROUP BY collapses 6 rows into 2 summary rows:
SELECT region, SUM(amount) AS region_total
FROM sales
GROUP BY region;
-- → East | 370 ,  West | 630   (the id/month/amount detail is GONE)

-- A window function keeps all 6 rows AND attaches the total:
SELECT id, region, month, amount,
       SUM(amount) OVER (PARTITION BY region) AS region_total
FROM sales;
-- → every row survives,
...

Result — window version — all 6 rows survive:

id	region	month	amount	region_total
1	East	1	100	370
2	East	2	150	370
3	East	3	120	370
4	West	1	200	630
5	West	2	180	630
6	West	3	250	630

2. PARTITION BY and ORDER BY — Steering the Window

Two clauses inside OVER (...) do the steering. PARTITION BY splits the rows into independent groups — the calculation restarts from scratch for each one (here, East is computed entirely separately from West). ORDER BY sets the running order inside each group, and it changes what an aggregate means: the moment you add ORDER BY to SUM() OVER, you get a running total rather than a flat group total.

PARTITION BY + ORDER BY → running total

A per-region running total by month.

Try it Yourself »

SQL

-- PARTITION BY restarts the calculation for each group.
-- ORDER BY decides the running order inside that group.

SELECT id, region, month, amount,
       SUM(amount) OVER (
           PARTITION BY region    -- East and West are calculated separately
           ORDER BY month         -- accumulate month 1, then 2, then 3
       ) AS running_total
FROM sales
ORDER BY region, month;

-- With ORDER BY present, SUM() OVER becomes a RUNNING total:
-- it adds up every row from the start of the partit
...

Result — running_total per region, by month:

region	month	amount	running_total
East	1	100	100
East	2	150	250
East	3	120	370
West	1	200	200
West	2	180	380
West	3	250	630

Read the East rows top to bottom: 100, then 100+150=250, then 250+120=370. West restarts at 200 because PARTITION BY region gives it its own fresh window.

Your Turn: a running total per region

Fill in the two blanks so the total accumulates month by month, separately for each region. The expected numbers are in the comments so you can self-check.

🎯 Your Turn: running total

Complete the PARTITION BY and ORDER BY.

Try it Yourself »

SQL

-- 🎯 YOUR TURN — fill in the blanks, then press "Try it Yourself"
-- Goal: a running total of amount for each region, by month.

SELECT region, month, amount,
       SUM(amount) OVER (
           PARTITION BY ___      -- 👉 the column that splits East from West
           ORDER BY ___          -- 👉 the column that sets the accumulation order
       ) AS running_total
FROM sales
ORDER BY region, month;

-- ✅ Expected: East → 100, 250, 370 ; West → 200, 380, 630

3. Ranking — ROW_NUMBER, RANK, DENSE_RANK, NTILE

Ranking functions assign a position to each row within its partition, in the order you specify. They differ only in how they treat ties (rows with equal ordering values):

ROW_NUMBER() — a unique number for every row, even on ties (an arbitrary winner is chosen).
RANK() — ties share a rank, then the next rank skips (1, 2, 2, 4).
DENSE_RANK() — ties share a rank with no gap after (1, 2, 2, 3).
NTILE(n) — splits the partition into n roughly equal buckets (e.g. quartiles).

Ranking functions

ROW_NUMBER, RANK, DENSE_RANK and NTILE per region.

Try it Yourself »

SQL

-- Ranking functions number rows inside each partition.
-- Here: rank regions' months from highest amount to lowest.

SELECT region, month, amount,
       ROW_NUMBER() OVER (
           PARTITION BY region ORDER BY amount DESC
       ) AS row_num,
       RANK() OVER (
           PARTITION BY region ORDER BY amount DESC
       ) AS rnk,
       DENSE_RANK() OVER (
           PARTITION BY region ORDER BY amount DESC
       ) AS dense_rnk,
       NTILE(2) OVER (
           PARTITION BY region ORDER 
...

Result — all amounts distinct, so the three ranks agree:

region	month	amount	row_num	rnk	dense_rnk	half
East	2	150	1	1	1	1
East	3	120	2	2	2	1
East	1	100	3	3	3	2
West	3	250	1	1	1	1
West	1	200	2	2	2	1
West	2	180	3	3	3	2

What if there's a tie? Imagine four scores ordered descending: 90, 85, 85, 70. The functions diverge — this is the whole reason three of them exist:

Result — how the ranks differ on a tie (85, 85):

score	ROW_NUMBER	RANK	DENSE_RANK
90	1	1	1
85	2	2	2
85	3	2	2
70	4	4	3

Your Turn: rank each region's months

One blank: choose the ranking function that skips numbers after a tie. The PARTITION BY region ORDER BY amount DESC is already wired up for you.

🎯 Your Turn: RANK by amount

Rank highest amount = 1 within each region.

Try it Yourself »

SQL

-- 🎯 YOUR TURN — fill in the blank.
-- Goal: rank each region's months from biggest amount to smallest.

SELECT region, month, amount,
       ___ OVER (                -- 👉 the ranking function (skips numbers on ties)
           PARTITION BY region ORDER BY amount DESC
       ) AS amount_rank
FROM sales
ORDER BY region, amount_rank;

-- ✅ Expected:
--   East: 150 → 1, 120 → 2, 100 → 3
--   West: 250 → 1, 200 → 2, 180 → 3

4. LAG & LEAD — Looking at Neighbouring Rows

LAG(col, n) reaches back to the row n positions earlier; LEAD(col, n) reaches forward to the row n positions later (n defaults to 1). They walk the order you give in OVER (...), restarting at each partition boundary. This is how you compare a row to the one before it — month-over-month change, period-over-period growth, gaps between events.

Edge case: the first row of each partition has no earlier neighbour, so LAG returns NULL there (and LEAD returns NULL on the last row). Pass a third argument as a fallback: LAG(amount, 1, 0) uses 0 when there's no prior row.

LAG & LEAD

Previous and next month's amount, per region.

Try it Yourself »

SQL

-- LAG looks at an EARLIER row; LEAD looks at a LATER row.
-- Both walk the order you give them, per partition.

SELECT region, month, amount,
       -- amount of the PREVIOUS month in this region
       LAG(amount, 1) OVER (
           PARTITION BY region ORDER BY month
       ) AS prev_month,
       -- amount of the NEXT month in this region
       LEAD(amount, 1) OVER (
           PARTITION BY region ORDER BY month
       ) AS next_month
FROM sales
ORDER BY region, month;

-- The first row of
...

Result — prev/next month within each region:

region	month	amount	prev_month	next_month
East	1	100	NULL	150
East	2	150	100	120
East	3	120	150	NULL
West	1	200	NULL	180
West	2	180	200	250
West	3	250	180	NULL

5. Aggregates as Windows — Moving Averages

Any aggregate — SUM, AVG, COUNT, MIN, MAX — can run as a window. You already built a running SUM. Swap in AVG with a small frame and you get a moving average: a smoothed value over the last few rows. The frame ROWS BETWEEN 1 PRECEDING AND CURRENT ROW means "this row and the one immediately before it" — a 2-row window that slides down the data.

2-month moving average

AVG over the current and previous row.

Try it Yourself »

SQL

-- A window FRAME picks exactly which rows feed the aggregate.
-- ROWS BETWEEN 1 PRECEDING AND CURRENT ROW = "this row + the one before".

SELECT region, month, amount,
       ROUND(AVG(amount) OVER (
           PARTITION BY region
           ORDER BY month
           ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
       ), 2) AS moving_avg_2m
FROM sales
ORDER BY region, month;

-- This is a 2-month moving average that slides down the data.
-- Month 1 averages just itself (no earlier row to include).

Result — moving_avg_2m per region:

region	month	amount	moving_avg_2m
East	1	100	100
East	2	150	125
East	3	120	135
West	1	200	200
West	2	180	190
West	3	250	215

East month 2 averages (100+150)/2 = 125; month 3 averages (150+120)/2 = 135. Month 1 averages only itself because there's no earlier row in its frame.

6. Window Frames — The Part Everyone Skips

The frame is the exact slice of the partition a function can see from the current row. You write it with ROWS BETWEEN <start> AND <end>, where the bounds are:

UNBOUNDED PRECEDING — the first row of the partition.
n PRECEDING — n rows before the current row.
CURRENT ROW — the row being calculated.
n FOLLOWING — n rows after the current row.
UNBOUNDED FOLLOWING — the last row of the partition.

So ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW is a running total, and ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING is the whole-partition total. (RANGE and GROUPS are sibling frame types that count by value or by ties instead of physical rows — reach for them later; ROWS covers the vast majority of cases.)

The default-frame surprise: when you write ORDER BY but omit the frame, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW — i.e. a running total, not the whole partition. So LAST_VALUE() with only an ORDER BY returns the current row, not the final one. To get the true last value, spell out ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.

Frames side by side

Running total vs whole-partition total.

Try it Yourself »

SQL

-- The frame is the rows the function can "see" from the current row.
-- UNBOUNDED PRECEDING = the very first row of the partition.
-- CURRENT ROW         = the row being calculated.
-- UNBOUNDED FOLLOWING = the very last row of the partition.

SELECT region, month, amount,
    -- Explicit running total (start-of-partition → here):
    SUM(amount) OVER (
        PARTITION BY region ORDER BY month
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total,

    -- Whole-parti
...

Result — same SUM, two different frames:

region	month	amount	running_total	region_total
East	1	100	100	370
East	2	150	250	370
East	3	120	370	370
West	1	200	200	630
West	2	180	380	630
West	3	250	630	630

Common Errors (and the fix)

Window function in WHERE: WHERE ROW_NUMBER() OVER (...) = 1 errors with "window functions are not allowed in WHERE". Window functions run after WHERE, so wrap the query in a CTE or subquery and filter on the alias: WITH ranked AS (SELECT ..., ROW_NUMBER() OVER (...) AS rn FROM sales) SELECT * FROM ranked WHERE rn = 1;
Missing ORDER BY in a running total: SUM(amount) OVER (PARTITION BY region) with no ORDER BY gives every row the same flat total (370/630), not an accumulating one. Add ORDER BY month to make it run.
Expecting RANK to be gapless: after a tie, RANK skips (…2, 2, 4…). If you need continuous numbers with no gaps, use DENSE_RANK (…2, 2, 3…). They are not interchangeable.
The LAST_VALUE default-frame trap: LAST_VALUE(x) OVER (ORDER BY month) returns the current row's value because the default frame ends at CURRENT ROW. Add ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING to reach the real last row.
Doing NULL maths after LAG: the first row's LAG is NULL, and 150 - NULL = NULL. Use a default (LAG(amount, 1, 0)) or wrap with COALESCE(...) if you don't want a NULL result.

📘 Quick Reference

Syntax	Purpose
fn(...) OVER (...)	Run fn over a window of rows; keep every row
PARTITION BY col	Restart the calculation per group
ORDER BY col	Order rows in the window (turns SUM into a running total)
ROW_NUMBER()	Unique 1,2,3… (no ties)
RANK() / DENSE_RANK()	Rank with gaps / without gaps on ties
NTILE(n)	Split the partition into n buckets
LAG(col,n) / LEAD(col,n)	Value n rows back / forward
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW	Running total frame
ROWS BETWEEN 1 PRECEDING AND CURRENT ROW	2-row moving window
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING	Whole partition

Frequently Asked Questions

Q: When should I use a window function instead of GROUP BY?

Use a window function when you need every detail row and a group-level value beside it (each sale plus its region total, or each row's rank). Use GROUP BY when you only want one summary row per group.

Q: Why can't I put a window function in WHERE or GROUP BY?

Because window functions are evaluated after WHERE, GROUP BY and HAVING — only SELECT and ORDER BY can use them directly. To filter on one, compute it in a CTE/subquery first, then filter on its alias.

Q: RANK vs DENSE_RANK vs ROW_NUMBER — which do I want?

ROW_NUMBER for a strict 1-per-row sequence (e.g. "the single top row"). RANK when ties should share a place and the next place may skip ("joint 2nd, then 4th"). DENSE_RANK when you want tier numbers with no gaps.

Q: Do I always need to write the frame out?

No — for running totals the default already gives you start-of-partition → current row. But spell the frame out whenever you use LAST_VALUE, a moving average, or any "whole partition" calc, so the default doesn't quietly cut your window short.

Mini-Challenge: Month-over-Month Change

Support is faded now — a brief, a blank canvas, and the expected answer in the comments. Write it, then copy it into a playground to confirm.

🎯 Mini-Challenge

Use LAG to show each region's month-over-month change.

Try it Yourself »

SQL

-- 🎯 MINI-CHALLENGE — month-over-month change with LAG
-- Using only what this lesson covered:
--   1. Show region, month, amount.
--   2. Add a column "change" = this month's amount minus
--      the PREVIOUS month's amount in the same region.
--   3. Order by region, then month.
--   Hint: LAG(amount) OVER (PARTITION BY ... ORDER BY ...)
--
-- ✅ Expected "change" column:
--   East: NULL, 50, -30
--   West: NULL, -20, 70

-- your query here

🎉 Lesson Complete

✅ Window functions add a value to every row instead of collapsing them like GROUP BY
✅ OVER (PARTITION BY ... ORDER BY ...) steers the window; adding ORDER BY makes SUM a running total
✅ ROW_NUMBER / RANK / DENSE_RANK / NTILE rank rows and differ only on ties
✅ LAG and LEAD reach previous and next rows for period-over-period comparisons
✅ The frame (ROWS BETWEEN ...) decides exactly which rows a calculation sees — and the default can surprise you
✅ Next: package logic on the server with Stored Procedures Advanced

Window Functions Mastery

What You'll Learn

Our Sample Table: sales

1. Window Functions vs GROUP BY

GROUP BY vs a window

2. PARTITION BY and ORDER BY — Steering the Window

PARTITION BY + ORDER BY → running total

Your Turn: a running total per region

🎯 Your Turn: running total

3. Ranking — ROW_NUMBER, RANK, DENSE_RANK, NTILE

Ranking functions

Your Turn: rank each region's months

🎯 Your Turn: RANK by amount

4. LAG & LEAD — Looking at Neighbouring Rows

LAG & LEAD

5. Aggregates as Windows — Moving Averages

2-month moving average

6. Window Frames — The Part Everyone Skips

Frames side by side

Common Errors (and the fix)

📘 Quick Reference

Frequently Asked Questions

Mini-Challenge: Month-over-Month Change

🎯 Mini-Challenge

🎉 Lesson Complete

Cookie & Privacy Settings

Our Sample Table: `sales`