Lesson 44 • Advanced
Performance Optimization ⚡
By the end of this lesson you'll know how to measure a PHP app, find the real bottleneck, and fix it — killing N+1 queries, turning on OPcache, caching expensive work, and keeping memory flat with generators — so your pages stay fast under real traffic.
What You'll Learn in This Lesson
- Measure code with microtime() and profile with Xdebug or Blackfire before you optimise
- Spot and fix the N+1 query problem with eager loading and database indexes
- Speed up responses, commonly by 30-70%, by enabling and tuning OPcache (and know when JIT helps)
- Cache expensive results with APCu or Redis using the check-miss-store pattern
- Move invariant work out of loops and stream data with generators to keep memory flat
- Tune the Composer autoloader and use output buffering / gzip for faster page delivery
Run each example from the command line with php file.php. Timings and OPcache stats reflect your own machine, so the exact numbers will differ from the Output panels; the shape of the result is what matters.
1️⃣ Measure First — Never Guess
The golden rule of performance is measure before you optimise. Your instinct about what's slow is wrong more often than it's right, and "optimised" code that you never measured is just code that's now harder to read. The quickest measuring tool is microtime(true), which returns the current time as a float — take one reading before a block and one after, and the difference is how long it took. Pair it with memory_get_peak_usage(true) to see the most RAM the script ever used.
<?php
// Rule #1 of performance: MEASURE before you change anything.
// microtime(true) gives the current time in seconds as a float.
// Subtract two readings to get how long a block took.
$start = microtime(true); // stopwatch START
$total = 0;
for ($i = 1; $i <= 1_000_000; $i++) { // do some real work
    $total += $i;
}
$elapsedMs = (microtime(true) - $start) * 1000; // seconds -> milliseconds
echo "Result: " . number_format($total) . "\n";
printf("Took: %.2f ms\n", $elapsedMs); // %.2f = 2 decimal places
// memory_get_peak_usage(true) = the most RAM this script ever held.
$peakMb = memory_get_peak_usage(true) / 1048576; // bytes -> MB (1024*1024)
printf("Peak memory: %.2f MB\n", $peakMb);
?>
Output:
Result: 500,000,500,000
Took: 14.30 ms
Peak memory: 2.00 MB
For real apps you'll graduate from manual timers to a proper profiler: a tool that records how long every function and query took in a request. Xdebug is a separate extension that you install (it does not ship with PHP); its profiler writes a cachegrind file you open in a viewer such as KCachegrind or QCacheGrind. Blackfire.io is a hosted profiler that renders call graphs. Both answer the only question that matters: where is the time actually going?
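Before reaching for a full profiler, the two-readings idea can be wrapped in a small reusable helper. The sketch below is only an illustration: timeIt() is a hypothetical name, not part of PHP, and it swaps microtime(true) for hrtime(true), the built-in monotonic nanosecond timer (PHP 7.3+), and keeps the best of several runs to reduce noise.

```php
<?php
// Hypothetical helper (not from the lesson): time any callable with hrtime().
// hrtime(true) returns a monotonic nanosecond counter, so it is not affected
// by system clock adjustments the way microtime() can be.
function timeIt(callable $work, int $runs = 5): array {
    $best = PHP_FLOAT_MAX;
    $result = null;
    for ($r = 0; $r < $runs; $r++) {
        $start = hrtime(true);
        $result = $work();
        $elapsedMs = (hrtime(true) - $start) / 1e6; // nanoseconds -> milliseconds
        $best = min($best, $elapsedMs);             // keep the fastest run
    }
    return ["result" => $result, "best_ms" => $best];
}

$timing = timeIt(function (): int {
    $total = 0;
    for ($i = 1; $i <= 1_000_000; $i++) {
        $total += $i;
    }
    return $total;
});

echo "Result: " . number_format($timing["result"]) . "\n";
printf("Best of 5: %.2f ms\n", $timing["best_ms"]);
```

Taking the best of a few runs is one common convention for micro-timings; for whole requests, a profiler remains the better tool.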
2️⃣ Don't Repeat Work Inside Loops
A loop that runs a million times amplifies any waste inside it a million-fold. The classic mistake is calling something like count($items) in the loop condition, so it's recomputed on every single pass. Anything whose answer can't change inside the loop is invariant — compute it once above the loop and reuse the variable. Compare these two:
<?php
// ❌ BEFORE: count($items) runs on EVERY pass of the loop, once in the
// condition and once in the body. Tiny here, but this exact pattern
// melts servers when the repeated call does real work.
$items = range(1, 5);
$start = microtime(true);
for ($i = 0; $i < count($items); $i++) { // count() called every loop!
    echo "Item " . ($i + 1) . " of " . count($items) . "\n"; // again!
}
printf("Time: %.4f ms\n", (microtime(true) - $start) * 1000);
?>
Output:
Item 1 of 5
Item 2 of 5
Item 3 of 5
Item 4 of 5
Item 5 of 5
Time: 0.0480 ms
<?php
// ✅ AFTER: hoist the invariant out of the loop. count() runs ONCE.
// Anything whose answer can't change inside the loop belongs ABOVE it.
$items = range(1, 5);
$start = microtime(true);
$n = count($items); // compute once, reuse $n
for ($i = 0; $i < $n; $i++) { // condition is now a cheap variable read
    echo "Item " . ($i + 1) . " of " . $n . "\n";
}
printf("Time: %.4f ms\n", (microtime(true) - $start) * 1000);
?>
Output:
Item 1 of 5
Item 2 of 5
Item 3 of 5
Item 4 of 5
Item 5 of 5
Time: 0.0210 ms
The output is identical, but the "after" version evaluates the invariant once instead of on every pass. count() on an array is cheap in PHP because the length is stored with the array, so with five items the difference is invisible; replace it with a database call or a heavy calculation and hoisting is the difference between snappy and unusable.
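To make the saving visible, here is a minimal sketch with a deliberately slow invariant. The $loadTaxRate closure is a hypothetical stand-in for any expensive call whose answer cannot change inside the loop; the counter simply records how often it runs.

```php
<?php
// Hypothetical stand-in for slow, loop-invariant work (a config read,
// a query, a heavy calculation). $calls records how often it runs.
$calls = 0;
$loadTaxRate = function () use (&$calls): float {
    $calls++;
    usleep(1000); // pretend this costs about 1 ms
    return 0.20;
};

$prices = [10.0, 20.0, 30.0, 40.0];

// Before: the rate is fetched again for every price.
$calls = 0;
$slowTotals = [];
foreach ($prices as $price) {
    $slowTotals[] = $price * (1 + $loadTaxRate());
}
$callsBefore = $calls;

// After: fetch once above the loop, reuse the variable inside it.
$calls = 0;
$rate = $loadTaxRate();
$fastTotals = [];
foreach ($prices as $price) {
    $fastTotals[] = $price * (1 + $rate);
}
$callsAfter = $calls;

echo "Calls before: $callsBefore, after: $callsAfter\n";
echo "Same totals: " . ($slowTotals === $fastTotals ? "yes" : "no") . "\n";
```

The results are the same either way; only the number of expensive calls changes, from one per item to one in total.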
3️⃣ The N+1 Query Problem (the big one)
Most PHP slowness lives in the database, and the worst offender is the N+1 query problem: one query to fetch a list, then one more query per row in that list. Fetch 100 users, loop over them asking for each one's orders, and you've fired 101 separate queries — each a round-trip to the database. Here's the trap:
<?php
// ❌ BEFORE: the N+1 query problem.
// 1 query to fetch users, then 1 MORE query PER user to fetch their orders.
// 100 users => 1 + 100 = 101 round-trips to the database. Pages crawl.
// Assumption: $db is an already-open PDO connection.
$users = $db->query("SELECT id, name FROM users")->fetchAll(PDO::FETCH_ASSOC); // query #1
foreach ($users as $user) {
    // ⚠️ a NEW query every single loop -> this is the N+1
    $orders = $db->query(
        "SELECT * FROM orders WHERE user_id = " . (int) $user["id"]
    )->fetchAll(PDO::FETCH_ASSOC);
    echo $user["name"] . ": " . count($orders) . " orders\n";
}
// Total queries for 100 users: 101 (1 + N)
?>
This snippet assumes an open $db connection. The point is the count of queries: 1 + N. Never run a query inside a loop like this.
The fix is eager loading: gather the IDs, fetch every related row in one WHERE user_id IN (...) query, then group the results in PHP with an array. 101 queries become 2, and they stay 2 whether you have 100 users or 100,000.
<?php
// ✅ AFTER: eager-load everything in ONE extra query, then group in PHP.
// 100 users => just 2 queries total, no matter how many users there are.
// Assumption: $db is an already-open PDO connection.
$users = $db->query("SELECT id, name FROM users")->fetchAll(PDO::FETCH_ASSOC); // query #1
$ids = array_map("intval", array_column($users, "id")); // [1, 2, 3, ...] (safe ints)
$rows = [];
if ($ids !== []) { // an empty IN () is a SQL syntax error, so skip the query
    $in = implode(",", $ids); // "1,2,3,..."
    // query #2: every matching order in a single round-trip
    $rows = $db->query("SELECT user_id, id FROM orders WHERE user_id IN ($in)")
        ->fetchAll(PDO::FETCH_ASSOC);
}
// Bucket the orders by user_id so lookups are instant (no more queries).
$byUser = [];
foreach ($rows as $row) {
    $byUser[$row["user_id"]][] = $row;
}
foreach ($users as $user) {
    $count = count($byUser[$user["id"]] ?? []); // array lookup, not a query
    echo $user["name"] . ": " . $count . " orders\n";
}
// Total queries: 2 (1 + 1), and an INDEX on orders.user_id keeps #2 fast.
?>
Add an index on orders.user_id so that single IN() query stays fast as the table grows.
Two more database essentials: an index is a lookup structure that lets the database jump straight to matching rows instead of scanning the whole table, so always index the columns you filter or join on (like a foreign key). And a value used in a query should be parameterised or cast (here intval) so it's safe. ORMs like Laravel's Eloquent expose eager loading as User::with('orders'): same fix, less typing.
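For a version you can run without a database server, here is a minimal sketch of the same eager-loading idea against an in-memory SQLite database. It assumes the pdo_sqlite extension is available; the table contents and the index name idx_orders_user_id are invented for the demo. It also shows the parameterised alternative to intval: one bound placeholder per ID.

```php
<?php
// Assumption: the pdo_sqlite extension is installed. Schema, data and the
// index name are made up purely for this demonstration.
$db = new PDO("sqlite::memory:");
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)");
$db->exec("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)");
// Index the column that the IN (...) query filters on.
$db->exec("CREATE INDEX idx_orders_user_id ON orders (user_id)");
$db->exec("INSERT INTO users (id, name) VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus')");
$db->exec("INSERT INTO orders (user_id) VALUES (1), (1), (2)");

// Query #1: the list.
$users = $db->query("SELECT id, name FROM users ORDER BY id")->fetchAll(PDO::FETCH_ASSOC);
$ids = array_column($users, "id");

// Query #2: all related rows at once, with one bound "?" per ID.
$ordersByUser = [];
if ($ids !== []) {
    $placeholders = implode(",", array_fill(0, count($ids), "?"));
    $stmt = $db->prepare("SELECT user_id, id FROM orders WHERE user_id IN ($placeholders)");
    foreach ($ids as $i => $id) {
        $stmt->bindValue($i + 1, (int) $id, PDO::PARAM_INT); // positions are 1-based
    }
    $stmt->execute();
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
        $ordersByUser[(int) $row["user_id"]][] = (int) $row["id"];
    }
}

// Group lookups happen in PHP; no further queries are sent.
$lines = [];
foreach ($users as $user) {
    $count = count($ordersByUser[(int) $user["id"]] ?? []);
    $lines[] = $user["name"] . ": " . $count . " orders";
}
echo implode("\n", $lines) . "\n";
```

Bound placeholders avoid building SQL from values at all, which is usually the safer habit once IDs come from anywhere other than your own query.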
4️⃣ Cache Expensive Work
If a result is slow to compute and rarely changes, cache it: do the work once, store the answer, and serve the stored copy next time. APCu is the simplest cache — an in-memory store local to one server. Redis works identically but is a shared service across many servers. The pattern never changes: check the cache → on a miss, compute and store with a TTL → return. A TTL (time-to-live) is how many seconds the cached value stays valid before it's recomputed.
<?php
// Caching = compute the slow thing ONCE, reuse the answer.
// APCu (shared memory, per-server) is the simplest cache; Redis works the
// same way but is shared across servers. The pattern is identical:
// 1) try the cache 2) on a miss, compute + store 3) return.
function expensiveReport(): array {
    usleep(200000); // pretend this takes 200 ms (a real DB report)
    return ["sales" => 4200, "orders" => 87];
}
function getReport(): array {
    $key = "daily_report";
    // 1) Try the cache. apcu_fetch sets $hit = true on a hit.
    $cached = apcu_fetch($key, $hit);
    if ($hit) {
        return $cached; // FAST path: no work done
    }
    // 2) Cache miss: do the slow work, then store it for 300 seconds.
    $report = expensiveReport();
    apcu_store($key, $report, 300); // TTL = time-to-live in seconds
    return $report;
}
$first = getReport(); // miss -> ~200 ms (computes and stores)
$second = getReport(); // hit -> ~0 ms (served from memory)
echo "Sales: " . $first["sales"] . ", Orders: " . $first["orders"] . "\n";
echo "Second call served from cache (no recompute).\n";
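?>
Output:
Sales: 4200, Orders: 87
Second call served from cache (no recompute).
This example needs the APCu extension; when run from the command line it also needs apc.enable_cli=1, otherwise nothing is stored and every call is a miss. Swap the apcu_* calls for Redis calls and the logic is identical.
5️⃣ Keep Memory Flat with Generators & Lazy Loading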
Speed isn't the only resource — memory matters too. Building a giant array to loop over it holds every element in RAM at once. A generator (a function that uses yield) produces values lazily, one at a time, so peak memory stays flat no matter how many items you process. This is the same idea as lazy loading: don't fetch or build something until you actually need it.
<?php
// Streaming with a generator (yield) keeps memory FLAT no matter the size.
// An array of 1,000,000 ints needs well over 10 MB at once (the exact
// figure depends on the PHP version). A generator hands you one value at
// a time, so peak memory barely moves.
function readNumbers(int $count): Generator {
    for ($i = 1; $i <= $count; $i++) {
        yield $i; // produced lazily, one at a time
    }
}
$total = 0;
foreach (readNumbers(1_000_000) as $n) { // never builds the full array
    $total += $n;
}
echo "Sum of 1..1,000,000 = " . number_format($total) . "\n";
printf("Peak memory: %.2f MB\n", memory_get_peak_usage(true) / 1048576);
echo "(A real array of 1M ints would use well over 10 MB; the generator stays tiny.)\n";
?>
Output:
Sum of 1..1,000,000 = 500,000,500,000
Peak memory: 2.00 MB
(A real array of 1M ints would use well over 10 MB; the generator stays tiny.)
Use generators whenever you read a large file line-by-line, page through a big query result, or transform a long stream: anything where you don't need every item in memory simultaneously.
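The file case mentioned above might look like the sketch below. readLines() is a hypothetical helper name, and the temporary file exists only so the example runs anywhere; in practice you would pass the path of a large log or CSV file.

```php
<?php
// Hypothetical helper: yield a file's lines one at a time instead of
// loading the whole file with file() or file_get_contents().
function readLines(string $path): Generator {
    $handle = fopen($path, "r");
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            yield rtrim($line, "\r\n"); // strip the line ending
        }
    } finally {
        fclose($handle); // also runs if the caller stops iterating early
    }
}

// Demo data: a small temporary file standing in for a large one.
$path = tempnam(sys_get_temp_dir(), "lines");
file_put_contents($path, "alpha\nbeta\ngamma\n");

$lineCount = 0;
$longest = "";
foreach (readLines($path) as $line) {
    $lineCount++;
    if (strlen($line) > strlen($longest)) {
        $longest = $line; // only the current and longest lines are held
    }
}
unlink($path);

echo "Lines: $lineCount, longest: $longest\n";
```

Only one line is in memory at a time, so the same loop works on a file of any size.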
6️⃣ Turn On OPcache (and Know When JIT Helps)
Every request, PHP normally re-reads and re-compiles your source files into bytecode before running them. OPcache stores that compiled bytecode in shared memory so the parse-and-compile step is skipped — commonly a 30-70% speed-up for free. It ships with PHP; you just enable it. In production set opcache.validate_timestamps=0 so PHP never checks file timestamps per request, and clear the cache on each deploy so new code is picked up.
<?php
// OPcache stores COMPILED bytecode in shared memory, so PHP skips the
// parse-and-compile step on every request (commonly 30-70% faster).
// opcache_get_status() lets you see how well it's working.
if (!function_exists("opcache_get_status")) {
    exit("The OPcache extension is not loaded in this PHP build.\n");
}
$status = opcache_get_status(false);
if ($status === false || empty($status["opcache_enabled"])) {
    exit("OPcache is loaded but disabled (set opcache.enable=1; for CLI runs also opcache.enable_cli=1).\n");
}
$stat = $status["opcache_statistics"];
printf("Cached scripts: %d\n", $stat["num_cached_scripts"]);
printf("Hit rate: %.2f%%\n", $stat["opcache_hit_rate"]); // higher = better
printf("Cache hits: %s\n", number_format($stat["hits"]));
printf("Cache misses: %s\n", number_format($stat["misses"]));
/*
Recommended php.ini for PRODUCTION:
opcache.enable=1
opcache.memory_consumption=128 ; MB of shared memory for opcodes
opcache.max_accelerated_files=10000 ; raise above your file count
opcache.validate_timestamps=0 ; 0 in prod: never stat() files per request
; (clear the cache on each deploy instead)
opcache.jit_buffer_size=64M ; JIT cache (PHP 8+)
opcache.jit=tracing ; enable the tracing JIT for CPU-heavy code
*/
?>
JIT (Just-In-Time compilation), added in PHP 8, goes one step further by compiling hot code paths to native machine code. It's a big win for CPU-bound work (maths, image processing, simulations), but typical web apps spend most of their time waiting on the database, so JIT often helps them only a little. Turn it on with opcache.jit=tracing and a non-zero opcache.jit_buffer_size, then measure; if the database is your bottleneck, fix that first.
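One way to follow the "then measure" advice is to time a deliberately CPU-bound script with and without JIT. The workload below (counting primes by trial division) is just an invented example of CPU-heavy code; any tight numeric loop would do.

```php
<?php
// A deliberately CPU-bound workload (hypothetical, for benchmarking only):
// count the primes below $limit by trial division.
function countPrimes(int $limit): int {
    $count = 0;
    for ($n = 2; $n < $limit; $n++) {
        $isPrime = true;
        for ($d = 2; $d * $d <= $n; $d++) {
            if ($n % $d === 0) {
                $isPrime = false;
                break;
            }
        }
        if ($isPrime) {
            $count++;
        }
    }
    return $count;
}

$start = hrtime(true);
$primes = countPrimes(200_000);
$elapsedMs = (hrtime(true) - $start) / 1e6; // nanoseconds -> milliseconds

echo "Primes below 200,000: $primes\n";
printf("Took: %.2f ms\n", $elapsedMs);
```

Assuming the script is saved as bench.php (a placeholder name), run it once with plain php bench.php and once with php -d opcache.enable_cli=1 -d opcache.jit_buffer_size=64M -d opcache.jit=tracing bench.php, then compare the two timings. How much JIT helps depends on your PHP version and hardware, which is exactly why it needs measuring.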
7️⃣ Autoloader, Output Buffering & gzip
Three quick production wins. First, optimise the Composer autoloader: composer dump-autoload --classmap-authoritative pre-builds one complete class-to-file map, so PHP stops hitting the filesystem to locate classes on every request. Second, output buffering with ob_start() collects your output in memory and sends it in one efficient chunk instead of many small writes. Third, enable gzip compression at the web server (or with ob_start("ob_gzhandler")) so HTML, CSS, and JS travel to the browser much smaller — often a 70%+ reduction in bytes over the wire.
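As a rough illustration of the output-buffering point, the sketch below collects everything a render function echoes into one string before sending it. renderPage() is a hypothetical function standing in for your own template code.

```php
<?php
// Hypothetical render function: echoes a page in many small pieces.
function renderPage(array $items): void {
    echo "<ul>\n";
    foreach ($items as $item) {
        echo "  <li>" . htmlspecialchars($item) . "</li>\n";
    }
    echo "</ul>\n";
}

ob_start();                 // start collecting output in memory
renderPage(["Apples", "Pears", "Fish & chips"]);
$html = ob_get_clean();     // stop buffering and take the collected text

// The whole page is now a single string that can be sent in one write.
echo $html;
printf("Buffered %d bytes\n", strlen($html));
```

Because the page exists as one string before anything is sent, this is also the point where a compression handler such as ob_gzhandler can operate on it.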
8️⃣ Your Turn — Time a Block
Now you try. The script below is almost complete — fill in each ___ using the 👉 hint, then run it and compare against the Output panel. You only need two microtime(true) readings.
<?php
// 🎯 YOUR TURN — time a block of code. Fill in each ___ , then run it.
// You only need TWO microtime(true) readings: one before, one after.
$start = ___; // 👉 take the START reading: microtime(true)
$sum = 0;
for ($i = 1; $i <= 500_000; $i++) { // the work we're timing
    $sum += $i;
}
$elapsedMs = (___ - $start) * 1000; // 👉 take the END reading, then subtract $start
echo "Sum: " . number_format($sum) . "\n";
printf("Took: %.2f ms\n", $elapsedMs);
// ✅ Expected output (numbers vary per machine):
// Sum: 125,000,250,000
// Took: 7.10 ms
?>
Output:
Sum: 125,000,250,000
Took: 7.10 ms
Replace each ___ with microtime(true). The exact ms will differ on your machine; that's fine, as long as the Sum matches.
One more: hoist the invariant count() out of the loop so it's computed once, not on every pass.
<?php
// 🎯 YOUR TURN: remove repeated work by hoisting it out of the loop.
// count($prices) is being recomputed on every pass. Move it ABOVE the loop
// into $n, then use $n in the condition so it's only computed once.
$prices = [10, 20, 30, 40];
___ // 👉 add a line: $n = count($prices);
for ($i = 0; $i < ___; $i++) { // 👉 use your $n here instead of count(...)
    echo "Price #" . ($i + 1) . ": $" . $prices[$i] . "\n";
}
// ✅ Expected output:
// Price #1: $10
// Price #2: $20
// Price #3: $30
// Price #4: $40
?>
Output:
Price #1: $10
Price #2: $20
Price #3: $30
Price #4: $40
Add $n = count($prices); above the loop, then use $n in the condition. The output should be the four prices.
Common Errors (and the fix)
- Premature optimisation: you rewrote code that wasn't even slow. You spent a day micro-tuning a loop while a single N+1 query was the real cost. Fix: always profile first; optimise the proven bottleneck, leave the rest readable.
- Optimising with no profiler ("I'm sure it's the array_map"). Guessing wastes time and usually misses the mark. Fix: measure with microtime(), then move up to Xdebug's profiler or Blackfire to see exactly which calls and queries dominate.
- Page is slow and the database log is full of near-identical queries: the N+1 problem. A query is running inside a loop. Fix: eager-load with one WHERE id IN (...) query and group in PHP, and add an index on the filtered column.
- Production is mysteriously slow but works fine locally: OPcache is off. Without it, PHP recompiles every file on every request. Fix: set opcache.enable=1 and opcache.validate_timestamps=0 in php.ini, then clear the cache on each deploy.
- "Allowed memory size of N bytes exhausted": you loaded a huge result set or file into one array. Fix: stream it with a generator (yield) so only one item is in memory at a time.
Pro Tips
- 💡 Fix the biggest bottleneck first. A 10% gain on the code that accounts for 80% of the request time beats a 50% gain on code that accounts for 1%. Profile, fix the top item, re-profile.
- 💡 The database is almost always the answer. Reach for query count, indexes, and caching before you ever micro-tune PHP code.
- 💡 Cache with a sensible TTL. Too long and users see stale data; too short and you barely cache. Match the TTL to how often the data really changes.
- 💡 OPcache in prod, validate_timestamps off, clear on deploy. This one-time setup is the cheapest speed-up you'll ever ship.
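The arithmetic behind the first tip is worth seeing once: the overall saving is the share of request time a piece of code accounts for, multiplied by how much faster you make it. A tiny sketch, with overallSavingPercent() as a hypothetical helper name:

```php
<?php
// Overall time saved = (share of request time) x (speed-up of that share).
// Both arguments and the result are percentages.
function overallSavingPercent(float $sharePercent, float $gainPercent): float {
    return $sharePercent * $gainPercent / 100;
}

printf("10%% gain on 80%% of the time: %.1f%% overall\n", overallSavingPercent(80, 10));
printf("50%% gain on 1%% of the time: %.1f%% overall\n", overallSavingPercent(1, 50));
```

The first line reports an 8.0% overall saving and the second only 0.5%, which is why the profiler's top entry deserves your attention first.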
📋 Quick Reference — PHP Performance
| Tool / Technique | How | What It Does |
|---|---|---|
| microtime(true) | $t = microtime(true); | Time a block of code |
| Xdebug / Blackfire | profile a request | Find the real bottleneck |
| Eager loading | WHERE id IN (...) | Fix N+1 (1+N → 2 queries) |
| Index | CREATE INDEX ... | Make filtered lookups instant |
| APCu / Redis | apcu_store($k, $v, $ttl) | Cache expensive results |
| Generator | yield $item; | Stream data, keep memory flat |
| OPcache | opcache.enable=1 | Cache compiled bytecode (30-70% faster) |
| JIT | opcache.jit=tracing | Native code for CPU-heavy work |
| Autoloader | dump-autoload --classmap-authoritative | Skip filesystem class lookups |
Frequently Asked Questions
Q: Where should I start when a PHP page is slow?
Always measure before you change anything. Profile the request with Xdebug's profiler or Blackfire.io to see exactly which functions and database queries eat the time, then fix the single biggest offender first. Optimising code you only assumed was slow wastes effort and often makes the code harder to read for no measurable gain. The biggest real-world win is almost always the database — N+1 queries and missing indexes — not micro-tuning a loop.
Q: What is the N+1 query problem and how do I fix it?
N+1 happens when you run one query to fetch a list (the 1), then one extra query for each row in that list (the N) — 100 users becomes 101 database round-trips. The fix is eager loading: collect the IDs and fetch all the related rows in a single WHERE id IN (...) query, then group them in PHP. That turns 101 queries into 2. Add a database index on the foreign-key column (e.g. orders.user_id) so that second query stays fast as the table grows. ORMs like Eloquent expose this as with('orders').
Q: What is OPcache and do I need it in production?
OPcache stores the compiled bytecode of your PHP files in shared memory, so the server skips re-parsing and re-compiling every file on every request — commonly a 30-70% speed-up for free. It is bundled with PHP and you should absolutely enable it in production. Set opcache.validate_timestamps=0 in production so PHP never checks file timestamps on each request, and clear the cache (or restart PHP-FPM) as part of every deploy so new code is picked up.
Q: What is the JIT and will it speed up my web app?
JIT (Just-In-Time compilation), added in PHP 8, compiles hot code paths down to native machine code at runtime. It gives large gains on CPU-bound work like maths, image processing, or simulations, but typical web apps spend most of their time waiting on the database and network, so JIT often helps them only a little. Enable it with opcache.jit=tracing and opcache.jit_buffer_size=64M, then measure — if your bottleneck is the database, fix that first.
Q: When should I reach for caching, APCu, or Redis?
Cache any result that is expensive to compute and changes rarely: a dashboard report, a rendered template fragment, an external API response. APCu is the simplest: an in-memory cache shared by the PHP processes on one server, ideal for a single box. Redis (or Memcached) is a separate service shared across many servers, so use it when you scale out or need the cache to survive a deploy. The pattern is the same everywhere: check the cache, and only on a miss do the slow work and store it with a sensible TTL (time-to-live).
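Because the pattern is the same for every backend, it can be written once against a small interface. The following is a sketch under stated assumptions: CacheStore, ArrayStore and remember() are hypothetical names, not part of APCu, Redis or any library, and ArrayStore is a toy backend that only lives for one script run. It needs PHP 8.0+ for the mixed type.

```php
<?php
// Hypothetical names throughout: CacheStore, ArrayStore and remember() are
// illustrations, not part of APCu, Redis or any library.
interface CacheStore {
    public function get(string $key, ?bool &$hit = null): mixed;
    public function set(string $key, mixed $value, int $ttl): void;
}

// Toy backend that lasts for one script run. A real app would wrap
// apcu_fetch()/apcu_store() or a Redis client behind the same interface.
final class ArrayStore implements CacheStore {
    private array $items = [];

    public function get(string $key, ?bool &$hit = null): mixed {
        $entry = $this->items[$key] ?? null;
        $hit = $entry !== null && $entry["expires"] > time(); // expired = miss
        return $hit ? $entry["value"] : null;
    }

    public function set(string $key, mixed $value, int $ttl): void {
        $this->items[$key] = ["value" => $value, "expires" => time() + $ttl];
    }
}

// check the cache -> on a miss compute and store -> return
function remember(CacheStore $cache, string $key, int $ttl, callable $compute): mixed {
    $value = $cache->get($key, $hit);
    if ($hit) {
        return $value;
    }
    $value = $compute();
    $cache->set($key, $value, $ttl);
    return $value;
}

$computeCalls = 0;
$report = function () use (&$computeCalls): array {
    $computeCalls++; // counts how often the slow path actually runs
    return ["sales" => 4200, "orders" => 87];
};

$cache = new ArrayStore();
$first = remember($cache, "daily_report", 300, $report);  // miss: computes
$second = remember($cache, "daily_report", 300, $report); // hit: cached copy
echo "Computed $computeCalls time(s); sales = " . $second["sales"] . "\n";
```

Keeping the backend behind an interface means moving from a single box to a shared cache changes one class, not every call site.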
Mini-Challenge: Measure-Then-Cache
No code is filled in this time — just a brief and an outline. Write it yourself, run it on onecompiler.com/php or your own machine, then check your result against the expected output in the comments. This is exactly the measure → cache → re-measure loop you'll use on every real bottleneck.
<?php
// 🎯 MINI-CHALLENGE: a measure-then-cache function.
// No code is filled in — work from the steps, then run it.
//
// 1. Write slowSquare(int $n): sleep with usleep(100000) (~100 ms), return $n * $n.
// 2. Write cachedSquare(int $n): use a static array $cache as your cache —
// - if $cache[$n] is set, return it (the FAST path)
// - otherwise call slowSquare($n), store it in $cache[$n], return it.
// 3. Time TWO calls to cachedSquare(8) with microtime(true):
// - the first call should take ~100 ms (a cache miss)
// - the second should take ~0 ms (a cache hit)
// 4. echo the result and both timings.
//
// 💡 A "static" variable inside a function keeps its value between calls —
// perfect for a tiny in-memory cache.
//
// ✅ Expected output (timings vary):
// cachedSquare(8) = 64
// First call: 100.20 ms (miss)
// Second call: 0.00 ms (hit)
// your code here
?>
🎉 Lesson Complete!
- ✅ Measure first with microtime(), then profile with Xdebug or Blackfire; never guess
- ✅ Move invariant work out of loops so it runs once, not N times
- ✅ Kill the N+1 query problem with eager loading (WHERE id IN (...)) and add indexes
- ✅ Cache expensive results with APCu or Redis using check → miss → store with a TTL
- ✅ Keep memory flat by streaming with generators (yield) and lazy loading
- ✅ Enable OPcache in production (and JIT for CPU-heavy work); optimise the autoloader and turn on gzip
- ✅ Next lesson: PHPUnit Testing, where you write unit and integration tests so your fast code stays correct