Lesson 29 • Advanced
Rate Limiting & Throttling 🚦
By the end of this lesson you'll be able to protect a PHP API from abuse — choosing the right algorithm, counting hits with Redis, scoping limits per IP, user, or key, and returning a correct 429 with the headers clients expect.
What You'll Learn in This Lesson
- Explain why rate limiting stops abuse, DDoS, and runaway costs
- Compare fixed window, sliding window, token bucket, and leaky bucket
- Implement a token bucket that allows bursts but caps the average
- Count requests across servers with Redis INCR + EXPIRE
- Scope limits per IP, per user, and per API key
- Return 429 with Retry-After and X-RateLimit-* headers from middleware
Run each example with php file.php locally. The examples use a fixed timestamp and an in-memory array so they're deterministic and run anywhere; the Redis and header() lines are shown as comments because they need a live server. The Output panel under each one shows exactly what to expect.
1️⃣ Why Rate Limit At All?
Rate limiting caps how many requests a single client may make in a window of time. Without it, one script can do enormous damage: brute-force a login by trying thousands of passwords a second, take part in a denial-of-service (DoS or DDoS) attack that buries your server in traffic so real users can't get in, scrape your whole catalogue in minutes, or rack up a huge bill when every request hits a paid API or database. A limit turns all of those from "trivially easy" into "not worth trying."
<?php
// Why bother? Imagine a login form with NO rate limit. An attacker can try
// thousands of passwords per second. This loop shows how fast that adds up.
$attemptsPerSecond = 500; // a script firing requests in a tight loop
$secondsRunning = 60; // left alone for just one minute
$total = $attemptsPerSecond * $secondsRunning;
echo "Password guesses in 1 minute: {$total}\n"; // 30000
// 30,000 free guesses a minute is how accounts get cracked, bills explode,
// and servers fall over. A rate limit caps each client at, say, 5 tries
// per 5 minutes — turning a brute-force attack into a non-starter.
$allowed = 5;
echo "With a limit of {$allowed}/5min, guesses in 1 minute: {$allowed}\n";
Output:
Password guesses in 1 minute: 30000
With a limit of 5/5min, guesses in 1 minute: 5
2️⃣ Fixed Window — The Simplest Algorithm
A fixed window chops time into equal slots — say every 60 seconds — and keeps one counter per slot. Each request adds 1; once the counter passes the limit, the rest are denied until the clock ticks into the next slot, which starts a fresh count of zero. You don't need a cleanup job: the window number is baked into the key, so old windows simply stop being referenced. It's the easiest algorithm to get right, which is why it's the place to start.
<?php
// FIXED WINDOW: count requests inside a fixed time slot (e.g. each 60s).
// The window key changes when the clock ticks into the next minute, so old
// counts expire on their own. Simplest algorithm — one counter per window.
function fixedWindow(array &$store, string $key, int $limit, int $windowSec, int $now): array
{
// intdiv($now, 60) gives the window number: 1700000000 -> 28333333.
$bucket = $key . ':' . intdiv($now, $windowSec); // e.g. "ip:1.2.3.4:28333333"
$count = ($store[$bucket] ?? 0) + 1; // ?? 0 = "0 if not seen yet"
$store[$bucket] = $count;
$allowed = $count <= $limit; // true while under the cap
$remaining = max(0, $limit - $count); // never report a negative
return [$allowed, $remaining];
}
$store = [];
$now = 1700000000; // a fixed timestamp
// Limit: 3 requests per 60-second window.
for ($i = 1; $i <= 5; $i++) {
[$ok, $left] = fixedWindow($store, "ip:1.2.3.4", 3, 60, $now);
$status = $ok ? "200 OK" : "429 Too Many Requests";
echo "Request {$i}: {$status} (remaining: {$left})\n";
}
Output:
Request 1: 200 OK (remaining: 2)
Request 2: 200 OK (remaining: 1)
Request 3: 200 OK (remaining: 0)
Request 4: 429 Too Many Requests (remaining: 0)
Request 5: 429 Too Many Requests (remaining: 0)
The catch: a client can fire the full limit at the end of one window and the full limit again at the start of the next, doubling the real rate for a couple of seconds. That boundary burst is exactly what the sliding window and token bucket algorithms fix.
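To see the boundary burst for yourself, here is a minimal sketch that reuses the fixed-window counting from above. The function name fixedWindowAllows and the variable names are made up for this illustration; the timestamps are chosen so that three requests land in the last second of one 60-second window and three in the first second of the next.

```php
<?php
// Same fixed-window counting as above, reduced to the allow/deny decision.
// fixedWindowAllows() is a hypothetical helper for this demo only.
function fixedWindowAllows(array &$store, string $key, int $limit, int $windowSec, int $now): bool
{
    $bucket = $key . ':' . intdiv($now, $windowSec);
    $store[$bucket] = ($store[$bucket] ?? 0) + 1;
    return $store[$bucket] <= $limit;
}

$burstStore = [];
$burstAllowed = 0;

// 1700000039 is the last second of one 60s window; 1700000040 starts the next.
$timeline = [1700000039, 1700000039, 1700000039, 1700000040, 1700000040, 1700000040];

foreach ($timeline as $t) {
    if (fixedWindowAllows($burstStore, 'ip:1.2.3.4', 3, 60, $t)) {
        $burstAllowed++;
    }
}

echo "Limit is 3 per 60s, but {$burstAllowed} requests passed within 2 seconds\n";
```

All six requests pass because each window's counter is still under its own cap, even though the client sent double the limit in two consecutive seconds.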
3️⃣ Token Bucket — Allow Bursts, Cap the Average
The token bucket is one of the most widely used algorithms. Picture a bucket that holds up to capacity tokens and refills at a steady rate (say 2 tokens/second). Every request spends one token; if the bucket is empty, the request is denied. Because unused tokens build up, a client that's been quiet can burst (spend a pile of saved tokens at once), yet the steady refill rate still caps the long-term average: over any stretch of T seconds, at most capacity + rate × T requests get through. It's the friendliest behaviour for real users while staying strict on sustained abuse.
<?php
// TOKEN BUCKET: each client has a bucket of N tokens that refills at a constant
// rate. Each request spends 1 token; an empty bucket means 429. This allows
// short bursts (spend saved-up tokens) while capping the long-term average.
class TokenBucket
{
private float $tokens; // tokens available right now
private float $lastRefill; // when we last topped the bucket up
public function __construct(
private int $capacity, // most tokens the bucket can hold
private float $refillPerSecond, // tokens added every second
float $now,
) {
$this->tokens = $capacity; // start full
$this->lastRefill = $now;
}
/** Try to take one token. Returns [allowed, remaining]. */
public function consume(float $now): array
{
// Add the tokens that accrued since the last call, capped at capacity.
$elapsed = $now - $this->lastRefill;
$this->tokens = min($this->capacity, $this->tokens + $elapsed * $this->refillPerSecond);
$this->lastRefill = $now;
if ($this->tokens >= 1) {
$this->tokens--; // spend a token
return [true, (int) floor($this->tokens)];
}
return [false, 0]; // bucket empty -> deny
}
}
// Bucket of 5 tokens refilling at 2/sec. Fire 6 requests in the SAME instant
// (no time passes, so no refill), then one more 1 second later.
$now = 1700000000.0;
$bucket = new TokenBucket(capacity: 5, refillPerSecond: 2, now: $now);
for ($i = 1; $i <= 6; $i++) {
[$ok, $left] = $bucket->consume($now); // same timestamp each time
$status = $ok ? "200 OK" : "429 Too Many";
echo "t=0s Request {$i}: {$status} (remaining: {$left})\n";
}
[$ok, $left] = $bucket->consume($now + 1.0); // 1s later -> +2 tokens refilled
echo "t=1s Request 7: " . ($ok ? "200 OK" : "429 Too Many") . " (remaining: {$left})\n";
Output:
t=0s Request 1: 200 OK (remaining: 4)
t=0s Request 2: 200 OK (remaining: 3)
t=0s Request 3: 200 OK (remaining: 2)
t=0s Request 4: 200 OK (remaining: 1)
t=0s Request 5: 200 OK (remaining: 0)
t=0s Request 6: 429 Too Many (remaining: 0)
t=1s Request 7: 200 OK (remaining: 1)
Watch the timeline: five requests drain the bucket, the sixth is rejected, then after one second two tokens have refilled, so request 7 succeeds and leaves one in reserve. A leaky bucket is the close cousin: instead of spending tokens, requests drip out of a queue at a constant rate, which smooths output even harder but can add latency while requests wait their turn.
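For comparison, here is a rough sketch of a leaky bucket. Note the assumption: this is the "meter" variant, which rejects a request when the bucket is full rather than holding it in a real queue, so it shows the constant drain rate but not the added latency. The class name LeakyBucket and its parameters are hypothetical.

```php
<?php
// LEAKY BUCKET (meter variant, a simplification): every request adds one unit
// of "water"; the bucket leaks at a constant rate; a full bucket means deny.
final class LeakyBucket
{
    private float $level = 0.0;   // how full the bucket is right now
    private float $lastLeak;      // when we last drained it

    public function __construct(
        private int $capacity,        // most requests the bucket can hold
        private float $leakPerSecond, // requests drained every second
        float $now,
    ) {
        $this->lastLeak = $now;
    }

    public function allow(float $now): bool
    {
        // Drain whatever leaked out since the last call, never below empty.
        $elapsed = $now - $this->lastLeak;
        $this->level = max(0.0, $this->level - $elapsed * $this->leakPerSecond);
        $this->lastLeak = $now;

        if ($this->level + 1 > $this->capacity) {
            return false;             // no room left -> deny
        }
        $this->level += 1;            // this request takes up one slot
        return true;
    }
}

$t = 1700000000.0;
$leaky = new LeakyBucket(capacity: 3, leakPerSecond: 1.0, now: $t);

$results = [];
for ($i = 0; $i < 4; $i++) {
    $results[] = $leaky->allow($t);   // four requests in the same instant
}
$results[] = $leaky->allow($t + 1.0); // one second later, one slot has drained

echo implode(' ', array_map(fn (bool $ok): string => $ok ? 'allow' : 'deny', $results)) . "\n";
```

The first three requests fill the bucket, the fourth is denied, and one second later exactly one slot has drained, so one more request fits.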
4️⃣ Counting With Redis & Returning 429
A PHP variable lives for one request, so it can't track a client across many requests on many servers. The standard tool is Redis, a shared in-memory store every server can reach. Two commands do the work: INCR atomically adds 1 and returns the new total (so two requests landing at once can't both read a stale count), and EXPIRE auto-deletes the key when the window ends. When a client goes over, you return HTTP 429 Too Many Requests with a Retry-After header telling it how long to wait, plus X-RateLimit-* headers on every response so well-behaved clients throttle themselves.
<?php
// IN PRODUCTION you run many servers, so the counter can't live in one process's
// memory — it lives in Redis, shared by all of them. The pattern is two commands:
// INCR (atomically add 1, returns the new value) and EXPIRE (auto-delete later).
// $redis = new Redis(); $redis->connect('127.0.0.1', 6379);
function rateLimit(Redis $redis, string $id, int $limit, int $windowSec): void
{
// One key per client per window. The window number is part of the key, so
// when the minute rolls over you get a brand-new key starting at 0.
$window = intdiv(time(), $windowSec);
$key = "ratelimit:{$id}:{$window}";
$count = $redis->incr($key); // atomic +1, even across servers
if ($count === 1) {
$redis->expire($key, $windowSec); // first hit sets the auto-expiry
}
$remaining = max(0, $limit - $count);
// Tell EVERY response how much budget is left (clients use these to back off).
header("X-RateLimit-Limit: {$limit}");
header("X-RateLimit-Remaining: {$remaining}");
header("X-RateLimit-Reset: " . (($window + 1) * $windowSec));
if ($count > $limit) {
$retryAfter = $windowSec - (time() % $windowSec); // seconds to next window
http_response_code(429); // 429 Too Many Requests
header("Retry-After: {$retryAfter}"); // when it's safe to retry
exit(json_encode(['error' => 'Rate limit exceeded']));
}
}
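// Prefer the logged-in user id; fall back to IP for anonymous callers.
// REMOTE_ADDR is not set when you run this from the command line, hence the last fallback.
$id = $_SESSION['user_id'] ?? $_SERVER['REMOTE_ADDR'] ?? 'unknown';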
// rateLimit($redis, $id, limit: 100, windowSec: 60); // 100 requests/minute
// A blocked response looks like this on the wire:
echo "HTTP/1.1 429 Too Many Requests\n";
echo "X-RateLimit-Limit: 100\n";
echo "X-RateLimit-Remaining: 0\n";
echo "Retry-After: 37\n";
echo "\n";
echo '{"error":"Rate limit exceeded"}' . "\n";
Output:
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
Retry-After: 37

{"error":"Rate limit exceeded"}
The if ($count === 1) check is the key trick: only the request that creates the counter sets its expiry, so the key tidies itself up once its window has passed. The window itself is aligned to the clock, because the window number in the key comes from intdiv(time(), $windowSec). (If the process dies between INCR and EXPIRE, the key is left with no expiry; a small Lua script in Redis can run both commands as a single atomic step.)
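Because rateLimit() calls header() and exit, it's awkward to try without a server. One way to experiment, sketched here under stated assumptions, is to separate the decision from the sending: decide() returns the status and headers as an array, and InMemoryCounter is a hypothetical single-process stand-in for Redis with the same incr/expire shape. Neither name comes from a library.

```php
<?php
// Hypothetical stand-in for Redis: same incr()/expire() idea, but in one process.
final class InMemoryCounter
{
    private array $values = [];
    private array $expiresAt = [];

    public function incr(string $key, int $now): int
    {
        // Drop the key if its expiry has passed, like Redis would.
        if (isset($this->expiresAt[$key]) && $this->expiresAt[$key] <= $now) {
            unset($this->values[$key], $this->expiresAt[$key]);
        }
        return $this->values[$key] = ($this->values[$key] ?? 0) + 1;
    }

    public function expire(string $key, int $ttl, int $now): void
    {
        $this->expiresAt[$key] = $now + $ttl;
    }
}

// Same logic as rateLimit(), but it RETURNS the response instead of sending it.
function decide(InMemoryCounter $store, string $id, int $limit, int $windowSec, int $now): array
{
    $window = intdiv($now, $windowSec);
    $key = "ratelimit:{$id}:{$window}";

    $count = $store->incr($key, $now);
    if ($count === 1) {
        $store->expire($key, $windowSec, $now);
    }

    $headers = [
        'X-RateLimit-Limit'     => $limit,
        'X-RateLimit-Remaining' => max(0, $limit - $count),
        'X-RateLimit-Reset'     => ($window + 1) * $windowSec,
    ];
    $status = 200;
    if ($count > $limit) {
        $status = 429;
        $headers['Retry-After'] = $windowSec - ($now % $windowSec);
    }
    return ['status' => $status, 'headers' => $headers];
}

$store = new InMemoryCounter();
$now = 1700000003; // 23 seconds into a 60-second window

$responses = [];
for ($i = 1; $i <= 3; $i++) {
    $responses[] = decide($store, 'user:42', 2, 60, $now); // limit: 2 per minute
}

foreach ($responses as $i => $r) {
    echo "Request " . ($i + 1) . ": {$r['status']} (remaining: {$r['headers']['X-RateLimit-Remaining']})\n";
}
```

In a real app a thin wrapper would loop over the returned headers with header(), set the status with http_response_code(), and exit on 429.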
5️⃣ Per-IP, Per-User, Per-Key — and Where the Check Goes
The counter is only as good as the identity you count by. Limit per API key for trusted partners, per user ID for logged-in people (so a shared office IP doesn't punish everyone), and per IP only as the last resort for anonymous traffic. Just as important is where the check runs: put it in middleware at the very front of the pipeline so an over-limit request is rejected before it touches your router, controller, or database. Limit too late and you've already paid for the work you were trying to block.
<?php
// WHO do you limit, and WHERE? Pick the right identity, then check it FIRST —
// before any expensive work — so an abuser never reaches your database.
function clientId(array $request): string
{
// 1) Per API key: trusted partners get a known, generous quota.
if (!empty($request['api_key'])) {
return "key:" . $request['api_key'];
}
// 2) Per user: logged-in people are limited as themselves, not their office IP.
if (!empty($request['user_id'])) {
return "user:" . $request['user_id'];
}
// 3) Per IP: the only handle you have on anonymous traffic.
return "ip:" . $request['ip'];
}
// Middleware runs at the FRONT of the pipeline. Return early on 429 so the
// controller, the database, and the rest of the app are never touched.
function handle(array $request): string
{
$id = clientId($request);
// if (overLimit($id)) return "429 Too Many Requests for {$id}";
return "200 OK -> routed to controller for {$id}";
}
$requests = [
['api_key' => 'PARTNER-7'], // -> key:PARTNER-7
['user_id' => 42], // -> user:42
['ip' => '203.0.113.9'], // -> ip:203.0.113.9
];
foreach ($requests as $r) {
echo handle($r) . "\n";
}
Output:
200 OK -> routed to controller for key:PARTNER-7
200 OK -> routed to controller for user:42
200 OK -> routed to controller for ip:203.0.113.9
6️⃣ Your Turn
Time to write the parts that matter. The window logic below is done — you just supply the allow/deny decision and the remaining count. Fill each ___ using the 👉 hint, then run it and check against the Output panel.
<?php
// 🎯 YOUR TURN — finish this fixed-window check, then run it.
// The window already works; you just decide allowed/denied and what's left.
function check(array &$store, string $key, int $limit, int $windowSec, int $now): array
{
$bucket = $key . ':' . intdiv($now, $windowSec);
$count = ($store[$bucket] ?? 0) + 1;
$store[$bucket] = $count;
// 1) A request is allowed while the count has not passed the limit.
$allowed = ___; // 👉 compare $count with $limit (use <=)
// 2) Remaining must never go below zero.
$remaining = ___; // 👉 use max(0, $limit - $count)
return [$allowed, $remaining];
}
$store = [];
for ($i = 1; $i <= 4; $i++) {
[$ok, $left] = check($store, "user:7", 2, 60, 1700000000);
echo "Request {$i}: " . ($ok ? "200 OK" : "429") . " (remaining: {$left})\n";
}
// ✅ Expected output:
// Request 1: 200 OK (remaining: 1)
// Request 2: 200 OK (remaining: 0)
// Request 3: 429 (remaining: 0)
// Request 4: 429 (remaining: 0)
Output:
Request 1: 200 OK (remaining: 1)
Request 2: 200 OK (remaining: 0)
Request 3: 429 (remaining: 0)
Request 4: 429 (remaining: 0)
Hint for the ___ blanks: $count <= $limit and max(0, $limit - $count). Two requests should pass, then the rest get 429.
Now the rejection side. A client is over its limit, so send back a correct 429. Fill in the status code and the two headers a polite response must carry.
<?php
// 🎯 YOUR TURN — a client is over its limit. Send the right rejection.
// Fill in the status code and the two headers a well-behaved 429 must carry.
$limit = 100;
$retryAfter = 30; // seconds until the window resets
// 1) The HTTP status for "you've sent too many requests".
$status = ___; // 👉 the number is 429
// 2) Tells the client HOW LONG to wait before retrying.
$retryHeader = "Retry-After: " . ___; // 👉 use $retryAfter
// 3) Tells the client how many requests it is allowed per window.
$limitHeader = "X-RateLimit-Limit: " . ___; // 👉 use $limit
echo "HTTP/1.1 {$status} Too Many Requests\n";
echo "{$limitHeader}\n";
echo "X-RateLimit-Remaining: 0\n";
echo "{$retryHeader}\n";
// ✅ Expected output:
// HTTP/1.1 429 Too Many Requests
// X-RateLimit-Limit: 100
// X-RateLimit-Remaining: 0
// Retry-After: 30
Output:
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
Retry-After: 30
Hint: the status is 429; Retry-After uses $retryAfter and the limit header uses $limit. Match the four expected lines.
Common Errors (and the fix)
- The count is sometimes wrong under load (race condition): you wrote $count = $redis->get($key); $redis->set($key, $count + 1);. Two requests can read the same value and both write back the same number, losing a hit. Use $redis->incr($key) instead: it reads, adds, and writes as one atomic step.
- Clients can send double the limit around the minute mark: that's the fixed-window boundary burst. It's expected behaviour, not a bug. If the spike matters, switch to a sliding window or a token bucket, which spread the allowance smoothly instead of resetting in one jump.
- Clients hammer your server right after a 429: you returned 429 with no Retry-After header, so the client has no idea when to come back and just retries instantly. Always send Retry-After (and ideally X-RateLimit-Reset) so it backs off.
- Your database is still getting crushed despite a limit: you're limiting too late, after authentication and queries have run. Move the check into middleware at the front of the pipeline and return/exit on 429 before any expensive work.
- "Warning: Cannot modify header information - headers already sent": you echoed output (even a stray blank line before <?php) before calling header() or http_response_code(). Send all status codes and headers before any output.
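The race condition in the first error is easier to believe once you see the interleaving written out. This is a simulation only: a single PHP script can't really run two requests at once, so the steps two concurrent requests might take are spelled out by hand, and atomicIncr() is a hypothetical stand-in for $redis->incr($key).

```php
<?php
// Simulation only: the order below is what two concurrent requests (A and B)
// could produce against a shared store when they use GET followed by SET.
$store = ['hits' => 0];

$readByA = $store['hits'];          // A reads 0
$readByB = $store['hits'];          // B reads 0 (stale: A hasn't written yet)
$store['hits'] = $readByA + 1;      // A writes 1
$store['hits'] = $readByB + 1;      // B also writes 1, so A's hit is lost
$unsafeCount = $store['hits'];

// Atomic increment: read, add, and write happen as one step per request.
// atomicIncr() is a hypothetical stand-in for $redis->incr($key).
function atomicIncr(array &$store, string $key): int
{
    $store[$key] = ($store[$key] ?? 0) + 1;
    return $store[$key];
}

$store['hits'] = 0;
atomicIncr($store, 'hits');                 // request A
$atomicCount = atomicIncr($store, 'hits');  // request B

echo "GET+SET counted {$unsafeCount} of 2 hits; INCR counted {$atomicCount} of 2\n";
```

With GET then SET, a busy client is under-counted and slips past its limit; with an atomic increment every hit is recorded.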
Pro Tips
- 💡 Limit writes harder than reads. A bot scraping GETs is annoying; a bot firing thousands of POSTs to create accounts or spam is dangerous — give write endpoints a tighter cap.
- 💡 Send X-RateLimit-* on every response, not just the 429. Good clients read X-RateLimit-Remaining and slow down before they ever hit the wall.
- 💡 Don't reinvent it in production. Frameworks ship this: Laravel's throttle middleware and Symfony's RateLimiter component are battle-tested. Build it once to understand it, then lean on the framework.
📋 Quick Reference — Algorithms & Headers
| Algorithm / Header | How It Works | Best For |
|---|---|---|
| Fixed Window | One counter per time slot, resets on the boundary | Simplest; OK if boundary bursts are fine |
| Sliding Window | Counts the trailing N seconds from now | Smooth limit, no boundary spike |
| Token Bucket | Spend tokens; refill at a steady rate | Allows bursts, caps the average (most common) |
| Leaky Bucket | Queue drains at a constant rate | Steady output; smooths bursts (adds latency) |
| 429 | HTTP status "Too Many Requests" | The response when a client is over limit |
| Retry-After | Seconds (or date) to wait before retrying | Always send with a 429 |
| X-RateLimit-* | Limit / Remaining / Reset budget | Send on every response so clients self-throttle |
Frequently Asked Questions
Q: Should I rate limit by IP address or by user?
Prefer the most specific identity you have. Limit logged-in callers by their user ID (or API key) so that everyone sharing one office or mobile-carrier IP isn't punished together — dozens of real users can sit behind a single corporate NAT. Fall back to IP only for anonymous traffic where you have nothing better. A common pattern is: API key if present, else user ID, else IP.
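Once each caller has a prefixed identity such as key:PARTNER-7, user:42, or ip:203.0.113.9, the prefix can also pick the quota. The sketch below is one possible way to do that; the function quotaFor() and the numbers in QUOTAS are invented for illustration, so tune them to your own API.

```php
<?php
// Hypothetical quotas (requests per minute) per identity type.
const QUOTAS = [
    'key'  => 1000, // trusted partners with an API key
    'user' => 100,  // logged-in users
    'ip'   => 20,   // anonymous traffic, the weakest identity
];

function quotaFor(string $clientId): int
{
    // Everything before the first ':' is the identity type.
    $type = explode(':', $clientId, 2)[0];

    // Unknown types get the strictest quota rather than an error.
    return QUOTAS[$type] ?? QUOTAS['ip'];
}

foreach (['key:PARTNER-7', 'user:42', 'ip:203.0.113.9'] as $id) {
    echo "{$id} -> " . quotaFor($id) . " req/min\n";
}
```

Falling back to the strictest quota for an unrecognised prefix is a deliberately cautious choice: a bug in the identity code then tightens limits instead of removing them.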
Q: What's the difference between fixed window and sliding window?
A fixed window counts requests inside rigid slots (e.g. each clock minute) and resets to zero at the boundary. It's dead simple but allows a burst at the seam: a client can send the full limit at 0:59 and the full limit again at 1:00 — double the rate across two seconds. A sliding window smooths this out by counting requests in the trailing N seconds from now, or by weighting the previous window, so there's no boundary spike. Token bucket solves the same problem differently by letting saved-up tokens absorb bursts up to a fixed capacity.
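Here is a minimal sketch of the "trailing N seconds" flavour, often called a sliding window log. It assumes you can afford to store one timestamp per allowed request; the function name slidingWindow() and the in-memory $log array are stand-ins for illustration, not a production design.

```php
<?php
// SLIDING WINDOW LOG: keep the timestamps of recent allowed requests and
// count only those inside the trailing window that ends at $now.
function slidingWindow(array &$log, string $key, int $limit, int $windowSec, int $now): bool
{
    $cutoff = $now - $windowSec;

    // Forget every request that has slid out of the trailing window.
    $recent = array_values(array_filter(
        $log[$key] ?? [],
        fn (int $t): bool => $t > $cutoff
    ));

    $allowed = count($recent) < $limit;
    if ($allowed) {
        $recent[] = $now;           // only allowed requests use up budget
    }
    $log[$key] = $recent;
    return $allowed;
}

$log = [];
$passed = 0;

// Three requests at the end of one clock minute, three at the start of the next.
foreach ([1700000039, 1700000039, 1700000039, 1700000040, 1700000040, 1700000040] as $t) {
    if (slidingWindow($log, 'ip:1.2.3.4', 3, 60, $t)) {
        $passed++;
    }
}
echo "Allowed across the boundary: {$passed} of 6\n";

// 60 seconds after the first burst, those hits have slid out of the window.
$later = slidingWindow($log, 'ip:1.2.3.4', 3, 60, 1700000099);
echo "One minute later: " . ($later ? "200 OK" : "429") . "\n";
```

The same six requests that doubled the rate under a fixed window now yield only three passes, because the count follows the client rather than the clock.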
Q: Why use Redis instead of just a PHP variable or the session?
A normal PHP variable lives for one request and then vanishes, and the session is per-user on local disk — neither is shared. Real apps run several PHP processes across several servers behind a load balancer, so the counter has to live in one shared place they can all reach. Redis is the standard choice: its INCR command atomically adds 1 and returns the new value (so two simultaneous requests can't both read a stale count), and EXPIRE auto-deletes the key when the window ends, so you never have to clean up.
Q: What status code and headers should a rate-limited response return?
Return 429 Too Many Requests. Always add a Retry-After header (seconds to wait, or an HTTP date) so a well-behaved client backs off instead of hammering you. It's also good practice to send X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset on every response — including successful ones — so clients can self-throttle before they ever hit the wall.
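On the client side, honouring Retry-After means handling both forms the header can take. The helper below is a small sketch of that; retryDelaySeconds() is a made-up name, and falling back to a 1-second wait when the value can't be parsed is an assumption, not part of any standard.

```php
<?php
// Turn a Retry-After value (seconds or an HTTP date) into seconds to wait.
function retryDelaySeconds(string $retryAfter, int $now): int
{
    $value = trim($retryAfter);

    // Form 1: a plain number of seconds, e.g. "30".
    if (ctype_digit($value)) {
        return (int) $value;
    }

    // Form 2: an HTTP date, e.g. "Tue, 14 Nov 2023 22:13:50 GMT".
    $timestamp = strtotime($value);
    if ($timestamp === false) {
        return 1; // assumption: unparseable header -> short default wait
    }
    return max(0, $timestamp - $now); // a date in the past means "retry now"
}

$now = 1700000000;
$httpDate = gmdate('D, d M Y H:i:s', $now + 30) . ' GMT';

$fromSeconds = retryDelaySeconds('30', $now);
$fromDate    = retryDelaySeconds($httpDate, $now);

echo "Seconds form: wait {$fromSeconds}s\n";
echo "Date form:    wait {$fromDate}s\n";
```

A polite client sleeps for that many seconds before retrying instead of retrying immediately.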
Q: Where in my app should the rate-limit check go?
As early as possible — in middleware at the very front of the request pipeline, before routing, authentication-heavy work, or any database queries. The whole point is to reject abusive traffic cheaply; if you check the limit after running the expensive controller, you've already paid the cost you were trying to avoid. Return the 429 immediately and short-circuit the rest of the pipeline.
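The short-circuit can be sketched with plain closures. This is an illustration of the idea rather than any framework's API: rateLimitMiddleware() is a hypothetical wrapper, the counter lives in memory, and the time window is left out to keep the example short.

```php
<?php
// The "expensive" part of the app. We count calls to prove it gets skipped.
$controllerCalls = 0;
$controller = function (array $request) use (&$controllerCalls): string {
    $controllerCalls++;
    return '200 OK';
};

// Hypothetical middleware: wraps the next handler and rejects over-limit
// requests BEFORE calling it. (No time window here, to keep the demo short.)
function rateLimitMiddleware(callable $next, int $limit): callable
{
    $hits = [];
    return function (array $request) use ($next, $limit, &$hits): string {
        $id = 'ip:' . $request['ip'];
        $hits[$id] = ($hits[$id] ?? 0) + 1;

        if ($hits[$id] > $limit) {
            return '429 Too Many Requests'; // short-circuit: $next never runs
        }
        return $next($request);
    };
}

$app = rateLimitMiddleware($controller, limit: 2);

$statuses = [];
for ($i = 1; $i <= 4; $i++) {
    $statuses[] = $app(['ip' => '203.0.113.9']);
}

echo implode("\n", $statuses) . "\n";
echo "Controller ran {$controllerCalls} times for 4 requests\n";
```

The two rejected requests cost one array lookup each; the controller, and anything it would have queried, is never reached.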
Mini-Challenge: Per-Endpoint Limits
No code is filled in this time — just a brief and an outline. Write it yourself, run it on onecompiler.com/php or your own machine, then check your result against the expected output in the comments. This is the write-run-check loop you'll use on every real limiter.
<?php
// 🎯 MINI-CHALLENGE: A per-endpoint limiter table.
// No code is filled in — work from the steps below, then run it.
//
// 1. Make an array $endpoints. Each entry has a 'limit' and a label, e.g.
// 'auth:login' => ['limit' => 5, 'label' => '5 per 5min'],
// 'api:search' => ['limit' => 30, 'label' => '30 req/min'],
// 'api:read' => ['limit' => 100, 'label' => '100 req/min'],
// 2. Loop over $endpoints with foreach ($endpoints as $name => $cfg).
// 3. For each one, pretend 6 requests just arrived. The remaining budget is
// max(0, $cfg['limit'] - 6).
// 4. echo a line: "<name>: <label> -> <remaining> remaining after 6 hits"
//
// Tip: printf("%-12s %s\n", ...) lines the columns up nicely.
//
// ✅ Expected output:
// auth:login 5 per 5min -> 0 remaining after 6 hits
// api:search 30 req/min -> 24 remaining after 6 hits
// api:read 100 req/min -> 94 remaining after 6 hits
// your code here
Hint: compute max(0, limit - 6) for each, and print one line per endpoint matching the expected output.
🎉 Lesson Complete!
- ✅ Rate limiting stops abuse, brute-force, DDoS, and runaway costs by capping requests per client
- ✅ Fixed window is simplest but bursts at the boundary; sliding window smooths that out
- ✅ Token bucket allows bursts while capping the average; leaky bucket drains at a constant rate
- ✅ Redis INCR + EXPIRE counts hits atomically across many servers
- ✅ Scope limits per API key, user, then IP, and check in middleware, before any real work
- ✅ Reject with 429 plus Retry-After and X-RateLimit-* headers
- ✅ Next lesson: Working with Queues, where you offload slow work so requests stay fast