Lesson 29 • Advanced

Rate Limiting & Throttling 🚦

By the end of this lesson you'll be able to protect a PHP API from abuse — choosing the right algorithm, counting hits with Redis, scoping limits per IP, user, or key, and returning a correct 429 with the headers clients expect.

What You'll Learn in This Lesson

Explain why rate limiting stops abuse, DDoS, and runaway costs
Compare fixed window, sliding window, token bucket, and leaky bucket
Implement a token bucket that allows bursts but caps the average
Count requests across servers with Redis INCR + EXPIRE
Scope limits per IP, per user, and per API key
Return 429 with Retry-After and X-RateLimit-* headers from middleware

Run the examples for real: Every snippet below is genuine PHP. Paste it into onecompiler.com/php (no install) or run php file.php locally. The examples use a fixed timestamp and an in-memory array so they're deterministic and run anywhere; the Redis and header() calls are either commented out or sit inside a function that is never called, because they need a live server. The Output panel under each one shows exactly what to expect.

📌 Prerequisites: This lesson assumes you can build a basic endpoint — see Dynamic APIs. Comfort with arrays, classes, and the request/response cycle will help.

🎯 Real-World Analogy: Rate limiting is the turnstile at a stadium. It lets people through at a controlled pace so the stands never get dangerously crowded. A fixed window is a guard who counts heads and resets the tally every minute — fine, but the door gets crushed right on the minute. A token bucket hands you a roll of entry tickets that tops up slowly: you can rush in using saved tickets (a burst), but once they're gone you wait for more. When the venue is full, the turnstile shows a 429 sign that says exactly how long to wait — the Retry-After header.

1️⃣ Why Rate Limit At All?

Rate limiting caps how many requests a single client may make in a window of time. Without it, one script can do enormous damage: brute-force a login by trying thousands of passwords a second, mount a denial-of-service attack (DoS, or DDoS when it is distributed across many machines) that buries your server in traffic so real users can't get in, scrape your whole catalogue in minutes, or rack up a huge bill when every request hits a paid API or database. A limit turns all of those from "trivially easy" into "not worth trying."

The cost of having no limit

<?php
// Why bother? Imagine a login form with NO rate limit. An attacker can try
// thousands of passwords per second. This loop shows how fast that adds up.

$attemptsPerSecond = 500;     // a script firing requests in a tight loop
$secondsRunning    = 60;      // left alone for just one minute

$total = $attemptsPerSecond * $secondsRunning;
echo "Password guesses in 1 minute: {$total}\n";   // 30000

// 30,000 free guesses a minute is how accounts get cracked, bills explode,
// and servers fall over. A rate limit caps each client at, say, 5 tries
// per 5 minutes — turning a brute-force attack into a non-starter.
$allowed = 5;
echo "With a limit of {$allowed}/5min, guesses in 1 minute: {$allowed}\n";

Output

Password guesses in 1 minute: 30000
With a limit of 5/5min, guesses in 1 minute: 5

This is real code: run it for free at onecompiler.com/php or in your own editor.

2️⃣ Fixed Window — The Simplest Algorithm

A fixed window chops time into equal slots — say every 60 seconds — and keeps one counter per slot. Each request adds 1; once the counter passes the limit, the rest are denied until the clock ticks into the next slot, which starts a fresh count of zero. You don't need a cleanup job: the window number is baked into the key, so old windows simply stop being referenced. It's the easiest algorithm to get right, which is why it's the place to start.

Fixed-window limiter (3 requests / 60s)

<?php
// FIXED WINDOW: count requests inside a fixed time slot (e.g. each 60s).
// The window key changes when the clock ticks into the next minute, so old
// counts expire on their own. Simplest algorithm — one counter per window.

function fixedWindow(array &$store, string $key, int $limit, int $windowSec, int $now): array
{
    // intdiv($now, 60) gives the window number: 1700000000 -> 28333333.
    $bucket = $key . ':' . intdiv($now, $windowSec);   // e.g. "ip:1.2.3.4:28333333"
    $count  = ($store[$bucket] ?? 0) + 1;              // ?? 0 = "0 if not seen yet"
    $store[$bucket] = $count;

    $allowed   = $count <= $limit;                     // true while under the cap
    $remaining = max(0, $limit - $count);              // never report a negative
    return [$allowed, $remaining];
}

$store = [];
$now   = 1700000000;                                   // a fixed timestamp

// Limit: 3 requests per 60-second window.
for ($i = 1; $i <= 5; $i++) {
    [$ok, $left] = fixedWindow($store, "ip:1.2.3.4", 3, 60, $now);
    $status = $ok ? "200 OK" : "429 Too Many Requests";
    echo "Request {$i}: {$status} (remaining: {$left})\n";
}

Output

Request 1: 200 OK (remaining: 2)
Request 2: 200 OK (remaining: 1)
Request 3: 200 OK (remaining: 0)
Request 4: 429 Too Many Requests (remaining: 0)
Request 5: 429 Too Many Requests (remaining: 0)

The catch: a client can fire the full limit at the end of one window and the full limit again at the start of the next, doubling the real rate for a couple of seconds. That boundary burst is what sliding windows and token buckets are designed to smooth out.
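To make the boundary burst concrete, here is a minimal sketch that sends three requests in the last second of one minute and three more in the first second of the next. It compares the fixed-window counter from above with a sliding-window log. The names fixedWindowAllow() and slidingWindowLog() and the in-memory $log array are illustrative assumptions, not part of any library; a production version would keep the timestamps in a shared store.

```php
<?php
// Fixed window: same logic as fixedWindow() above, returning only allow/deny.
function fixedWindowAllow(array &$store, string $key, int $limit, int $windowSec, int $now): bool
{
    $bucket = $key . ':' . intdiv($now, $windowSec);
    $store[$bucket] = ($store[$bucket] ?? 0) + 1;
    return $store[$bucket] <= $limit;
}

// Sliding-window log (illustrative): remember the timestamp of each accepted
// hit and count only those inside the trailing $windowSec seconds.
function slidingWindowLog(array &$log, string $key, int $limit, int $windowSec, int $now): bool
{
    $cutoff  = $now - $windowSec;
    $recent  = array_values(array_filter($log[$key] ?? [], fn (int $t): bool => $t > $cutoff));
    $allowed = count($recent) < $limit;
    if ($allowed) {
        $recent[] = $now;                 // only accepted hits are logged
    }
    $log[$key] = $recent;
    return $allowed;
}

// 1700000040 is an exact minute boundary (28333334 * 60).
$boundary = 1700000040;
$times    = [$boundary - 1, $boundary - 1, $boundary - 1, $boundary, $boundary, $boundary];

$store = [];
$log   = [];
$fixedAllowed   = 0;
$slidingAllowed = 0;
foreach ($times as $t) {
    if (fixedWindowAllow($store, 'ip:1.2.3.4', 3, 60, $t)) {
        $fixedAllowed++;
    }
    if (slidingWindowLog($log, 'ip:1.2.3.4', 3, 60, $t)) {
        $slidingAllowed++;
    }
}

echo "Fixed window allowed:   {$fixedAllowed} of 6\n";   // 6 of 6
echo "Sliding window allowed: {$slidingAllowed} of 6\n"; // 3 of 6
```

The fixed window lets all six through because the counter resets at the boundary. The log still sees the three hits from one second earlier, so it rejects the second batch. The trade-off is memory: the log stores one timestamp per accepted request.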

3️⃣ Token Bucket — Allow Bursts, Cap the Average

The token bucket is one of the most widely used algorithms. Picture a bucket that holds up to capacity tokens and refills at a steady rate (say 2 tokens/second). Every request spends one token; if the bucket is empty, the request is denied. Because unused tokens build up, a client that's been quiet can burst (spend a pile of saved tokens at once), yet the steady refill rate still caps the long-term average. It's the friendliest behaviour for real users while staying strict on sustained abuse.

Token bucket (capacity 5, refill 2/sec)

<?php
// TOKEN BUCKET: each client has a bucket of N tokens that refills at a constant
// rate. Each request spends 1 token; an empty bucket means 429. This allows
// short bursts (spend saved-up tokens) while capping the long-term average.

class TokenBucket
{
    private float $tokens;          // tokens available right now
    private float $lastRefill;      // when we last topped the bucket up

    public function __construct(
        private int   $capacity,        // most tokens the bucket can hold
        private float $refillPerSecond, // tokens added every second
        float         $now,
    ) {
        $this->tokens     = $capacity;  // start full
        $this->lastRefill = $now;
    }

    /** Try to take one token. Returns [allowed, remaining]. */
    public function consume(float $now): array
    {
        // Add the tokens that accrued since the last call, capped at capacity.
        $elapsed = $now - $this->lastRefill;
        $this->tokens = min($this->capacity, $this->tokens + $elapsed * $this->refillPerSecond);
        $this->lastRefill = $now;

        if ($this->tokens >= 1) {
            $this->tokens--;                      // spend a token
            return [true, (int) floor($this->tokens)];
        }
        return [false, 0];                        // bucket empty -> deny
    }
}

// Bucket of 5 tokens refilling at 2/sec. Fire 6 requests in the SAME instant
// (no time passes, so no refill), then one more 1 second later.
$now    = 1700000000.0;
$bucket = new TokenBucket(capacity: 5, refillPerSecond: 2, now: $now);

for ($i = 1; $i <= 6; $i++) {
    [$ok, $left] = $bucket->consume($now);        // same timestamp each time
    $status = $ok ? "200 OK" : "429 Too Many";
    echo "t=0s  Request {$i}: {$status} (remaining: {$left})\n";
}

[$ok, $left] = $bucket->consume($now + 1.0);      // 1s later -> +2 tokens refilled
echo "t=1s  Request 7: " . ($ok ? "200 OK" : "429 Too Many") . " (remaining: {$left})\n";

Output

t=0s  Request 1: 200 OK (remaining: 4)
t=0s  Request 2: 200 OK (remaining: 3)
t=0s  Request 3: 200 OK (remaining: 2)
t=0s  Request 4: 200 OK (remaining: 1)
t=0s  Request 5: 200 OK (remaining: 0)
t=0s  Request 6: 429 Too Many (remaining: 0)
t=1s  Request 7: 200 OK (remaining: 1)

Watch the timeline: five requests drain the bucket, the sixth is rejected, then after one second two tokens have refilled, so request 7 succeeds and leaves one in reserve. A leaky bucket is the close cousin: instead of spending tokens, requests wait in a queue that drains at a constant rate. That smooths output even more, but it can add latency while requests wait their turn, and a full queue still means rejection.
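For comparison, here is a minimal leaky-bucket sketch in the same style as the TokenBucket class. The class name LeakyBucket, the offer() method, and the choice to return the queueing delay are assumptions made for illustration. The sketch only models the queue level; it does not actually hold or replay requests.

```php
<?php
// LEAKY BUCKET (illustrative): requests join a queue of fixed size that drains
// at a constant rate. A full queue means the request is rejected.
class LeakyBucket
{
    private float $level = 0.0;     // requests currently waiting in the queue
    private float $lastLeak;        // when we last drained the queue

    public function __construct(
        private int   $capacity,       // most requests the queue can hold
        private float $leakPerSecond,  // requests processed every second
        float         $now,
    ) {
        $this->lastLeak = $now;
    }

    /** Returns the wait in seconds before the request is processed, or null if rejected. */
    public function offer(float $now): ?float
    {
        // Drain whatever leaked out since the last call, never below empty.
        $elapsed = $now - $this->lastLeak;
        $this->level = max(0.0, $this->level - $elapsed * $this->leakPerSecond);
        $this->lastLeak = $now;

        if ($this->level + 1 > $this->capacity) {
            return null;                                  // queue full -> 429
        }
        $wait = $this->level / $this->leakPerSecond;      // time for requests ahead to drain
        $this->level++;
        return $wait;
    }
}

// Queue of 3 draining at 2/sec. Four requests arrive in the same instant,
// then one more a second later.
$now    = 1700000000.0;
$bucket = new LeakyBucket(capacity: 3, leakPerSecond: 2, now: $now);

$results = [];
for ($i = 1; $i <= 4; $i++) {
    $results[] = $bucket->offer($now);
}
$results[] = $bucket->offer($now + 1.0);

foreach ($results as $i => $wait) {
    $n = $i + 1;
    $text = $wait === null ? "429 (queue full)" : "queued, waits " . number_format($wait, 1) . "s";
    echo "Request {$n}: {$text}\n";
}
```

Where the token bucket lets a burst through immediately, this version accepts the burst but spaces it out: the third request waits a full second, and the fourth is rejected because the queue is full.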

4️⃣ Counting With Redis & Returning 429

A PHP variable lives for one request, so it can't track a client across many requests on many servers. The standard tool is Redis, a shared in-memory store every server can reach. Two commands do the work: INCR atomically adds 1 and returns the new total (so two requests landing at once can't both read a stale count), and EXPIRE auto-deletes the key when the window ends. When a client goes over, you return HTTP 429 Too Many Requests with a Retry-After header telling it how long to wait, plus X-RateLimit-* headers on every response so well-behaved clients throttle themselves.

Redis INCR + EXPIRE and the 429 response

<?php
// IN PRODUCTION you run many servers, so the counter can't live in one process's
// memory — it lives in Redis, shared by all of them. The pattern is two commands:
// INCR (atomically add 1, returns the new value) and EXPIRE (auto-delete later).

// $redis = new Redis(); $redis->connect('127.0.0.1', 6379);

function rateLimit(Redis $redis, string $id, int $limit, int $windowSec): void
{
    // Read the clock once so the key, the reset time, and Retry-After all agree.
    $now = time();

    // One key per client per window. The window number is part of the key, so
    // when the minute rolls over you get a brand-new key starting at 0.
    $window = intdiv($now, $windowSec);
    $key    = "ratelimit:{$id}:{$window}";

    $count = $redis->incr($key);          // atomic +1, even across servers
    if ($count === 1) {
        $redis->expire($key, $windowSec); // first hit sets the auto-expiry
    }

    $remaining = max(0, $limit - $count);

    // Tell EVERY response how much budget is left (clients use these to back off).
    header("X-RateLimit-Limit: {$limit}");
    header("X-RateLimit-Remaining: {$remaining}");
    header("X-RateLimit-Reset: " . (($window + 1) * $windowSec));

    if ($count > $limit) {
        $retryAfter = $windowSec - ($now % $windowSec);     // seconds to next window
        http_response_code(429);                            // 429 Too Many Requests
        header("Retry-After: {$retryAfter}");               // when it's safe to retry
        exit(json_encode(['error' => 'Rate limit exceeded']));
    }
}

// Prefer the logged-in user id; fall back to IP for anonymous callers.
// (The final ?? covers command-line runs, where REMOTE_ADDR is not set.)
$id = $_SESSION['user_id'] ?? $_SERVER['REMOTE_ADDR'] ?? 'cli';
// rateLimit($redis, $id, limit: 100, windowSec: 60);   // 100 requests/minute

// A blocked response looks like this on the wire:
echo "HTTP/1.1 429 Too Many Requests\n";
echo "X-RateLimit-Limit: 100\n";
echo "X-RateLimit-Remaining: 0\n";
echo "Retry-After: 37\n";
echo "\n";
echo '{"error":"Rate limit exceeded"}' . "\n";

Output

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
Retry-After: 37

{"error":"Rate limit exceeded"}

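The if ($count === 1) check is the key trick: only the request that creates the counter sets its expiry, so the TTL is set once instead of being pushed back on every hit, and the key tidies itself up. Because the window number is part of the key, the window itself stays aligned to the clock rather than to the first hit. (INCR and EXPIRE are still two separate commands; if the process dies between them, the key never expires. Running both in a small Lua script inside Redis makes them a single atomic step.)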
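A minimal sketch of that Lua approach follows, assuming the phpredis extension, whose eval() method takes the script, an array of keys followed by arguments, and the number of keys. The constant HIT_SCRIPT and the helper atomicHit() are hypothetical names for illustration. The block only defines the helper; calling it needs a connected Redis client.

```php
<?php
// Lua runs inside Redis as one atomic unit: no other command can slip in
// between the INCR and the EXPIRE.
const HIT_SCRIPT = <<<'LUA'
local count = redis.call('INCR', KEYS[1])
if count == 1 then
    redis.call('EXPIRE', KEYS[1], ARGV[1])
end
return count
LUA;

// Hypothetical helper: $redis is assumed to be a connected phpredis client.
// $now is passed in so the caller reads the clock only once.
function atomicHit($redis, string $id, int $windowSec, int $now): int
{
    $key = "ratelimit:{$id}:" . intdiv($now, $windowSec);

    // One key (KEYS[1]) followed by one argument (ARGV[1]); the 1 says how many are keys.
    return (int) $redis->eval(HIT_SCRIPT, [$key, $windowSec], 1);
}

// Usage against a live server (left commented out, like the Redis lines above):
// $count = atomicHit($redis, 'user:42', 60, time());
```

The return value is the same counter INCR would have returned, so it can replace the incr()/expire() pair without changing the rest of the limiter.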

5️⃣ Per-IP, Per-User, Per-Key — and Where the Check Goes

The counter is only as good as the identity you count by. Limit per API key for trusted partners, per user ID for logged-in people (so a shared office IP doesn't punish everyone), and per IP only as the last resort for anonymous traffic. Just as important is where the check runs: put it in middleware at the very front of the pipeline so an over-limit request is rejected before it touches your router, controller, or database. Limit too late and you've already paid for the work you were trying to block.

Choosing the identity and short-circuiting early

<?php
// WHO do you limit, and WHERE? Pick the right identity, then check it FIRST —
// before any expensive work — so an abuser never reaches your database.

function clientId(array $request): string
{
    // 1) Per API key: trusted partners get a known, generous quota.
    if (!empty($request['api_key'])) {
        return "key:" . $request['api_key'];
    }
    // 2) Per user: logged-in people are limited as themselves, not their office IP.
    if (!empty($request['user_id'])) {
        return "user:" . $request['user_id'];
    }
    // 3) Per IP: the only handle you have on anonymous traffic.
    return "ip:" . $request['ip'];
}

// Middleware runs at the FRONT of the pipeline. Return early on 429 so the
// controller, the database, and the rest of the app are never touched.
function handle(array $request): string
{
    $id = clientId($request);
    // if (overLimit($id)) return "429 Too Many Requests for {$id}";
    return "200 OK -> routed to controller for {$id}";
}

$requests = [
    ['api_key' => 'PARTNER-7'],                  // -> key:PARTNER-7
    ['user_id' => 42],                           // -> user:42
    ['ip' => '203.0.113.9'],                     // -> ip:203.0.113.9
];

foreach ($requests as $r) {
    echo handle($r) . "\n";
}

Output

200 OK -> routed to controller for key:PARTNER-7
200 OK -> routed to controller for user:42
200 OK -> routed to controller for ip:203.0.113.9
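The example above leaves overLimit() commented out. One way to fill it in, sketched here under stated assumptions, is to give each identity type its own quota and reuse the fixed-window counter. The LIMITS table and its numbers are invented for illustration, and the in-memory $store stands in for Redis.

```php
<?php
// Illustrative quotas per identity type; the numbers are assumptions, not recommendations.
const LIMITS = ['key' => 1000, 'user' => 100, 'ip' => 20];

function clientId(array $request): string
{
    if (!empty($request['api_key'])) {
        return "key:" . $request['api_key'];
    }
    if (!empty($request['user_id'])) {
        return "user:" . $request['user_id'];
    }
    return "ip:" . $request['ip'];
}

// Fixed-window check whose limit depends on the prefix of the id.
function overLimit(array &$store, string $id, int $windowSec, int $now): bool
{
    $type  = strstr($id, ':', true);               // "key", "user", or "ip"
    $limit = LIMITS[$type];
    $slot  = $id . ':' . intdiv($now, $windowSec);
    $store[$slot] = ($store[$slot] ?? 0) + 1;
    return $store[$slot] > $limit;
}

// Middleware: decide BEFORE routing, and return early on 429.
function handle(array &$store, array $request, int $now): string
{
    $id = clientId($request);
    if (overLimit($store, $id, 60, $now)) {
        return "429 Too Many Requests for {$id}";  // controller never runs
    }
    return "200 OK -> routed to controller for {$id}";
}

$store = [];
$now   = 1700000000;
$anon  = ['ip' => '203.0.113.9'];

$last = '';
for ($i = 1; $i <= 21; $i++) {                     // 21 anonymous hits against a cap of 20
    $last = handle($store, $anon, $now);
}
echo $last . "\n";                                 // 429 Too Many Requests for ip:203.0.113.9
echo handle($store, ['user_id' => 42], $now) . "\n"; // 200 OK -> routed to controller for user:42
```

Because each identity has its own counter, the anonymous caller hitting its cap has no effect on the logged-in user.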

6️⃣ Your Turn

Time to write the parts that matter. The window logic below is done — you just supply the allow/deny decision and the remaining count. Fill each ___ using the 👉 hint, then run it and check against the Output panel.

🎯 Your turn: complete the limiter

<?php
// 🎯 YOUR TURN — finish this fixed-window check, then run it.
// The window already works; you just decide allowed/denied and what's left.

function check(array &$store, string $key, int $limit, int $windowSec, int $now): array
{
    $bucket = $key . ':' . intdiv($now, $windowSec);
    $count  = ($store[$bucket] ?? 0) + 1;
    $store[$bucket] = $count;

    // 1) A request is allowed while the count has not passed the limit.
    $allowed = ___;                 // 👉 compare $count with $limit (use <=)

    // 2) Remaining must never go below zero.
    $remaining = ___;               // 👉 use max(0, $limit - $count)

    return [$allowed, $remaining];
}

$store = [];
for ($i = 1; $i <= 4; $i++) {
    [$ok, $left] = check($store, "user:7", 2, 60, 1700000000);
    echo "Request {$i}: " . ($ok ? "200 OK" : "429") . " (remaining: {$left})\n";
}

// ✅ Expected output:
//    Request 1: 200 OK (remaining: 1)
//    Request 2: 200 OK (remaining: 0)
//    Request 3: 429 (remaining: 0)
//    Request 4: 429 (remaining: 0)

Output

Request 1: 200 OK (remaining: 1)
Request 2: 200 OK (remaining: 0)
Request 3: 429 (remaining: 0)
Request 4: 429 (remaining: 0)

Fill the two ___ blanks: $count <= $limit and max(0, $limit - $count). Two requests should pass, then the rest get 429.

Now the rejection side. A client is over its limit — send back a correct 429. Fill in the status code and the two headers a polite response must carry.

🎯 Your turn: build the 429 response

<?php
// 🎯 YOUR TURN — a client is over its limit. Send the right rejection.
// Fill in the status code and the two headers a well-behaved 429 must carry.

$limit      = 100;
$retryAfter = 30;                   // seconds until the window resets

// 1) The HTTP status for "you've sent too many requests".
$status = ___;                      // 👉 the number is 429

// 2) Tells the client HOW LONG to wait before retrying.
$retryHeader = "Retry-After: " . ___;   // 👉 use $retryAfter

// 3) Tells the client how many requests it had in total.
$limitHeader = "X-RateLimit-Limit: " . ___;   // 👉 use $limit

echo "HTTP/1.1 {$status} Too Many Requests\n";
echo "{$limitHeader}\n";
echo "X-RateLimit-Remaining: 0\n";
echo "{$retryHeader}\n";

// ✅ Expected output:
//    HTTP/1.1 429 Too Many Requests
//    X-RateLimit-Limit: 100
//    X-RateLimit-Remaining: 0
//    Retry-After: 30

Output

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
Retry-After: 30

The status is 429; Retry-After uses $retryAfter and the limit header uses $limit. Match the four expected lines.

Common Errors (and the fix)

The count is sometimes wrong under load (race condition) — you did $count = $redis->get($key); $redis->set($key, $count + 1);. Two requests can read the same value and both write back the same number, losing a hit. Use $redis->incr($key) instead: it reads, adds, and writes as one atomic step.
Clients can send double the limit around the minute mark — that's the fixed-window boundary burst. It's expected behaviour, not a bug. If the spike matters, switch to a sliding window or a token bucket, which spread the allowance smoothly instead of resetting in one jump.
Clients hammer your server right after a 429 — you returned 429 with no Retry-After header, so the client has no idea when to come back and just retries instantly. Always send Retry-After (and ideally X-RateLimit-Reset) so it backs off.
Your database is still getting crushed despite a limit — you're limiting too late, after authentication and queries have run. Move the check into middleware at the front of the pipeline and return/exit on 429 before any expensive work.
"Warning: Cannot modify header information — headers already sent" — you echoed output (even a stray blank line before <?php) before calling header() or http_response_code(). Send all status codes and headers before any output.

Pro Tips

💡 Limit writes harder than reads. A bot scraping GETs is annoying; a bot firing thousands of POSTs to create accounts or spam is dangerous — give write endpoints a tighter cap.
💡 Send X-RateLimit-* on every response, not just the 429. Good clients read X-RateLimit-Remaining and slow down before they ever hit the wall.
💡 Don't reinvent it in production. Frameworks ship this — Laravel's throttle middleware and Symfony's RateLimiter component are battle-tested. Build it once to understand it, then lean on the framework.

📋 Quick Reference — Algorithms & Headers

Algorithm / Header	How It Works	Best For
Fixed Window	One counter per time slot, resets on the boundary	Simplest; OK if boundary bursts are fine
Sliding Window	Counts the trailing N seconds from now	Smooth limit, no boundary spike
Token Bucket	Spend tokens; refill at a steady rate	Allows bursts, caps the average (most common)
Leaky Bucket	Queue drains at a constant rate	Steady output; smooths bursts (adds latency)
429	HTTP status "Too Many Requests"	The response when a client is over limit
Retry-After	Seconds (or date) to wait before retrying	Always send with a 429
X-RateLimit-*	Limit / Remaining / Reset budget	Send on every response so clients self-throttle

Frequently Asked Questions

Q: Should I rate limit by IP address or by user?

Prefer the most specific identity you have. Limit logged-in callers by their user ID (or API key) so that everyone sharing one office or mobile-carrier IP isn't punished together — dozens of real users can sit behind a single corporate NAT. Fall back to IP only for anonymous traffic where you have nothing better. A common pattern is: API key if present, else user ID, else IP.

Q: What's the difference between fixed window and sliding window?

A fixed window counts requests inside rigid slots (e.g. each clock minute) and resets to zero at the boundary. It's dead simple but allows a burst at the seam: a client can send the full limit at 0:59 and the full limit again at 1:00 — double the rate across two seconds. A sliding window smooths this out by counting requests in the trailing N seconds from now, or by weighting the previous window, so there's no boundary spike. Token bucket solves the same problem differently by letting saved-up tokens absorb bursts up to a fixed capacity.
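The "weighting the previous window" variant can be sketched as follows. It keeps only two counters per client and estimates the trailing count as previous × (1 − fraction of the current window elapsed) + current. The function name slidingWindowCounter() and the in-memory $store are assumptions for illustration, and the estimate is an approximation that treats the previous window's hits as evenly spread.

```php
<?php
// Sliding-window counter (approximation): weight the previous fixed window by
// how much of it still overlaps the trailing window, then add the current count.
function slidingWindowCounter(array &$store, string $key, int $limit, int $windowSec, int $now): bool
{
    $window   = intdiv($now, $windowSec);
    $elapsed  = ($now % $windowSec) / $windowSec;            // fraction of current window used
    $previous = $store["{$key}:" . ($window - 1)] ?? 0;
    $current  = $store["{$key}:{$window}"] ?? 0;

    $estimate = $previous * (1 - $elapsed) + $current;
    if ($estimate + 1 > $limit) {
        return false;                                        // this hit would exceed the limit
    }
    $store["{$key}:{$window}"] = $current + 1;               // only accepted hits are counted
    return true;
}

// Limit 3 per 60s. 1700000040 is an exact minute boundary.
$boundary = 1700000040;
$times = [
    $boundary - 1, $boundary - 1, $boundary - 1,   // fill the limit just before the seam
    $boundary,                                     // previous window still weighs 100%
    $boundary + 30,                                // previous window now weighs 50%
    $boundary + 30,
];

$store   = [];
$results = [];
foreach ($times as $t) {
    $results[] = slidingWindowCounter($store, 'user:7', 3, 60, $t);
}

foreach ($results as $i => $ok) {
    echo "Request " . ($i + 1) . ": " . ($ok ? "200 OK" : "429") . "\n";
}
```

The request at the seam is rejected because the previous window still counts in full. Thirty seconds later half of that weight has decayed, so one more request fits and the next is rejected again.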

Q: Why use Redis instead of just a PHP variable or the session?

A normal PHP variable lives for one request and then vanishes, and the session is per-user and, by default, stored in files on the local server's disk, so neither is shared across servers. Real apps run several PHP processes across several servers behind a load balancer, so the counter has to live in one shared place they can all reach. Redis is the standard choice: its INCR command atomically adds 1 and returns the new value (so two simultaneous requests can't both read a stale count), and EXPIRE auto-deletes the key when the window ends, so you never have to clean up.

Q: What status code and headers should a rate-limited response return?

Return 429 Too Many Requests. Always add a Retry-After header (seconds to wait, or an HTTP date) so a well-behaved client backs off instead of hammering you. It's also good practice to send X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset on every response — including successful ones — so clients can self-throttle before they ever hit the wall.
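As a small sketch of the two Retry-After forms, the helper below builds both the seconds value and the HTTP-date value for a fixed window. The function name retryAfterHeaders() is hypothetical; gmdate() is standard PHP. It returns the header strings rather than calling header(), so it runs without a web server.

```php
<?php
// Hypothetical helper: build Retry-After in both allowed forms for a fixed window.
function retryAfterHeaders(int $windowSec, int $now): array
{
    $seconds = $windowSec - ($now % $windowSec);   // seconds until the next window starts
    $resetAt = $now + $seconds;                    // Unix time of that boundary

    return [
        'seconds' => "Retry-After: {$seconds}",
        // HTTP dates are always expressed in GMT.
        'date'    => 'Retry-After: ' . gmdate('D, d M Y H:i:s', $resetAt) . ' GMT',
    ];
}

$headers = retryAfterHeaders(60, 1700000000);      // 20 seconds into the minute
echo $headers['seconds'] . "\n";                   // Retry-After: 40
echo $headers['date'] . "\n";                      // Retry-After: Tue, 14 Nov 2023 22:14:00 GMT
```

Send only one of the two forms. The seconds form is usually the easier choice because it does not depend on the client's clock being in sync with the server's.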

Q: Where in my app should the rate-limit check go?

As early as possible — in middleware at the very front of the request pipeline, before routing, authentication-heavy work, or any database queries. The whole point is to reject abusive traffic cheaply; if you check the limit after running the expensive controller, you've already paid the cost you were trying to avoid. Return the 429 immediately and short-circuit the rest of the pipeline.

Mini-Challenge: Per-Endpoint Limits

No code is filled in this time — just a brief and an outline. Write it yourself, run it on onecompiler.com/php or your own machine, then check your result against the expected output in the comments. This is the write-run-check loop you'll use on every real limiter.

🎯 Mini-Challenge: build a per-endpoint limit table

<?php
// 🎯 MINI-CHALLENGE: A per-endpoint limiter table.
// No code is filled in — work from the steps below, then run it.
//
// 1. Make an array $endpoints. Each entry has a 'limit' and a label, e.g.
//      'auth:login' => ['limit' => 5,   'label' => '5 per 5min'],
//      'api:search' => ['limit' => 30,  'label' => '30 req/min'],
//      'api:read'   => ['limit' => 100, 'label' => '100 req/min'],
// 2. Loop over $endpoints with foreach ($endpoints as $name => $cfg).
// 3. For each one, pretend 6 requests just arrived. The remaining budget is
//      max(0, $cfg['limit'] - 6).
// 4. Print a line:  "<name> <label> -> <remaining> remaining after 6 hits"
//    (no colon after the name; pad the name to 12 characters).
//
// Tip: printf("%-12s %s\n", ...) lines the columns up nicely.
//
// ✅ Expected output:
//    auth:login   5 per 5min -> 0 remaining after 6 hits
//    api:search   30 req/min -> 24 remaining after 6 hits
//    api:read     100 req/min -> 94 remaining after 6 hits

// your code here

Loop over an endpoints array, compute max(0, limit - 6) for each, and print one line per endpoint matching the expected output.

🎉 Lesson Complete!

✅ Rate limiting stops abuse, brute-force, DDoS, and runaway costs by capping requests per client
✅ Fixed window is simplest but bursts at the boundary; sliding window smooths that out
✅ Token bucket allows bursts while capping the average; leaky bucket drains at a constant rate
✅ Redis INCR + EXPIRE counts hits atomically across many servers
✅ Scope limits per API key, user, then IP — and check in middleware, before any real work
✅ Reject with 429 plus Retry-After and X-RateLimit-* headers
✅ Next lesson: Working with Queues — offload slow work so requests stay fast
