Advanced • C++
Understanding Undefined Behavior
By the end of this lesson you'll be able to recognise the most common sources of undefined behavior (UB) in C++, explain why "it works on my machine" is never proof of correctness, understand how the optimizer exploits UB, and catch it automatically with sanitizers — replacing each dangerous pattern with a safe alternative.
What You'll Learn
- Define UB and explain why it is more dangerous than a crash
- Spot the common UB sources: out-of-bounds, use-after-free, signed overflow, uninitialized reads, null deref, bad shifts
- Understand dangling references, invalid downcasts, and data races as UB
- Explain how the optimizer assumes-and-exploits UB to delete your checks
- Catch UB with UBSan, ASan, and -Wall -Wextra
- Replace every UB pattern with a safe modern-C++ alternative
💡 Real-World Analogy
UB is like an unsigned legal contract with a blank clause. The C++ standard is the contract; most operations have spelled-out rules. But a handful — reading past an array, dereferencing a freed pointer — fall into a clause that simply says "results are undefined". The compiler is then free to interpret that blank however makes your program fastest, including assuming you'd never trigger it. So when you do trigger it, there's no rule protecting you: the program might crash, might print nonsense, or might silently behave differently after the next recompile. UB isn't "a bug that crashes" — it's "a bug with no guaranteed behavior at all", which is far worse, because you can't even rely on it failing.
⚠️ The Common Sources of UB
| UB Source | What triggers it | Caught by |
|---|---|---|
| Out-of-bounds access | v[5] on a size-3 vector | ASan |
| Use-after-free | use *p after delete p | ASan |
| Dangling reference | return ref to a local | ASan / -Wall |
| Signed overflow | INT_MAX + 1 | UBSan |
| Uninitialized read | int x; use x; | -Wall / MSan |
| Null dereference | *p when p is null | UBSan / ASan |
| Invalid downcast | bad static_cast of base* | UBSan (vptr) |
| Data race | 2 threads, 1 unguarded var | TSan |
UBSan = -fsanitize=undefined, ASan = -fsanitize=address, TSan = -fsanitize=thread, MSan = -fsanitize=memory (Clang only). None of these affect your release build, because you enable them only in debug and test builds.
1. Common UB — and the Safe Fix Beside It
The fastest way to learn UB is to see the dangerous line right next to its safe replacement. In the worked example below, every UB line is commented out (so the program still runs) and the safe version is live. Read each comment, run it, and notice that the fixes — .at(), a null check, initializing, a wider type — cost almost nothing.
Worked example: five UB traps and their fixes
Each UB line is commented out; the safe version runs. Read every comment.
#include <iostream>
#include <vector>
#include <climits>
using namespace std;
// Undefined Behavior (UB) = an operation the C++ standard leaves
// with NO rules. The compiler may crash, print garbage, or seem
// to "work" today and break after the next recompile.
// Below: each UB line is COMMENTED OUT, with the safe fix live.
int main() {
// 1) Signed integer overflow -> UB
// int x = INT_MAX; x = x + 1; // UB: overflow is undefined
long long big = (long long)INT_MAX + 1; // S
...
Notice the pattern: the bounds-checked call v.at(5) throws instead of quietly corrupting memory, and a wider type holds a value that would overflow a 32-bit int. The unsafe versions might appear to work, which is exactly why UB is so treacherous.
2. How the Compiler Exploits UB
Here's the part that surprises everyone. The optimizer is allowed to assume your program never has UB. So if you dereference a pointer, it concludes the pointer can't be null — and may delete a null check you wrote afterwards, because (in its reasoning) that check could only matter in a case that already invoked UB. Your safety net silently disappears. The rule that saves you is simple: check before you act, never after.
Worked example: the optimizer deletes a late check
See why a null check AFTER a dereference can vanish — and how to order it correctly.
#include <iostream>
using namespace std;
// The optimizer ASSUMES your program has no UB. So it reasons:
// "this branch can only run after UB, therefore it never runs,
// therefore I can delete it." Your safety net vanishes silently.
int readFirst(int* p) {
int value = *p; // (A) dereference -> compiler now ASSUMES p != null
if (p == nullptr) { // (B) compiler can DELETE this check —
return -1; // it "proved" p is non-null at (A)
}
return v
...
Now you try. The program below contains UB. Fill in the two blanks marked ___ to make every access safe, using the hints in the comments.
🎯 Your turn: make the access safe
Fill in the ___ blanks so the bad index throws and the null pointer is guarded.
#include <iostream>
#include <vector>
using namespace std;
int main() {
// 🎯 YOUR TURN — this code has UB. Make it safe.
vector<int> scores = {90, 80, 70};
// 1) Read index 5 SAFELY so a bad index throws instead of UB.
// scores[5] would be undefined behavior — use the checked call.
try {
int s = scores.___(5); // 👉 the bounds-checked member: at
cout << s << endl;
} catch (const out_of_range& e) {
cout << "Out of range!" << endl;
}
...
3. Lifetime UB: Dangling References & Use-After-Free
A dangling reference points at memory whose owner has already been destroyed; use-after-free touches memory you've deleted. Both are UB and both are among the hardest bugs to find, because the freed memory often still holds the old value — until something else reuses it. The cure is ownership: return by value instead of returning references to locals, and let std::unique_ptr own heap memory so it can never outlive its owner.
Worked example: dangling refs and use-after-free
See the UB versions (commented) beside ownership-based safe fixes.
#include <iostream>
#include <memory>
#include <string>
using namespace std;
// Lifetime UB: touching memory AFTER its owner is gone.
// Two classics — a dangling reference and use-after-free.
// BAD: returns a reference to a local that dies at the brace.
// const string& makeName() {
// string name = "Ada"; // local
// return name; // UB: 'name' is destroyed on return
// }
// SAFE: return BY VALUE — the caller gets its own copy.
string makeNameSafe() {
string name = "
...
4. Signed Overflow & Catching UB with Sanitizers
Signed integer overflow, like INT_MAX + 1, is UB (unsigned overflow, by contrast, is defined and wraps around). You prevent it by checking before you add, widening the type, or using an unsigned counter. But you can't eyeball UB reliably, so the real workhorses are sanitizers: build with -fsanitize=undefined (UBSan) and -fsanitize=address (ASan) plus -Wall -Wextra, then run your tests. The sanitizers report the file and line of any UB that actually executes, so they can only find bugs on the code paths your tests reach.
🎯 Your turn: stop signed overflow
Fill in the blank so INT_MAX + 1 is blocked instead of overflowing.
#include <iostream>
#include <climits>
using namespace std;
int main() {
// 🎯 YOUR TURN — signed overflow is UB; prevent it with a check.
int a = INT_MAX;
int b = 1;
// 1) Only add when it is SAFE. Adding b would overflow if
// a is already greater than INT_MAX - b.
if (a > INT_MAX - ___) { // 👉 the value being added: b
cout << "Would overflow - blocked!" << endl;
} else {
cout << a + b << endl;
}
// ✅ Expected output:
// Wou
...
This last worked example isn't something to run for output. It's the exact compile command you'll use to catch UB in your own projects. Read the flags and the sample sanitizer reports.
Worked example: the sanitizer build command
The compile flags that catch UB at runtime, with sample reports.
#include <iostream>
using namespace std;
// You cannot SEE UB by reading code alone — tools catch it.
// Build with sanitizers and full warnings, then run your tests.
int main() {
// Compile this file like so, then run it:
//
// g++ -std=c++20 -Wall -Wextra -fsanitize=undefined,address -g main.cpp
//
// -Wall -Wextra turn on the high-value warnings
// -fsanitize=undefined UBSan: overflow, bad shifts, bad casts
// -fsanitize=address ASan: out-o
...
🔎 Deep Dive: invalid downcasts & data races
Two more UB sources catch intermediate programmers. An invalid downcast happens when you static_cast a base pointer to a derived type the object isn't: the compiler trusts you, and using the result is UB. Use dynamic_cast when you're not certain of the runtime type. It returns nullptr on a bad pointer cast, and it requires the base class to be polymorphic (at least one virtual function).
A data race is two threads accessing the same variable at the same time with at least one writing, and no synchronisation — also UB. Protect shared state with a std::mutex or make it std::atomic. The thread sanitizer (-fsanitize=thread) finds these.
Base* b = new Base();
Derived* d = static_cast<Derived*>(b); // ❌ UB: b is not a Derived
Derived* safe = dynamic_cast<Derived*>(b); // ✅ safe -> nullptr here
int counter = 0; // shared by two threads:
// thread A: counter++; // ❌ data race -> UB
// thread B: counter++;
std::atomic<int> ok{0}; // ✅ atomic, or guard with a std::mutex
Pro Tips
- 💡 "It works" proves nothing: UB can produce correct output today and fail after a recompile. Sanitizers catch UB on the paths your tests execute; careful reasoning has to cover the rest.
- 💡 Check before you act: guard a pointer before dereferencing — a check placed after UB can be optimized away.
- 💡 Prefer .at() while learning: it throws on a bad index instead of silently corrupting memory like [].
- 💡 Own memory with smart pointers: std::unique_ptr and std::shared_ptr make use-after-free almost impossible.
- 💡 Make sanitizers your default debug build: -Wall -Wextra -fsanitize=undefined,address on every test run.
Common Errors (and the fix)
- "runtime error: signed integer overflow": UBSan caught a + b exceeding INT_MAX. Check a > INT_MAX - b first, or use a wider/unsigned type.
- "AddressSanitizer: heap-buffer-overflow": you indexed past the end with v[i]. Use v.at(i) or guard i < v.size().
- "AddressSanitizer: heap-use-after-free": you used a pointer after delete. Let a std::unique_ptr own the memory so it can't be used past its lifetime.
- "warning: reference to local variable returned" (-Wall): you returned a reference to a local. Return by value instead.
- "runtime error: load of null pointer" (UBSan): you dereferenced a null pointer. Add if (p) before using *p.
📋 Quick Reference: UB → sanitizer / fix
| UB Source | Catch it with | Safe fix |
|---|---|---|
| Out-of-bounds | -fsanitize=address | .at() / check i < size() |
| Use-after-free | -fsanitize=address | unique_ptr |
| Dangling ref | -Wall / ASan | return by value |
| Signed overflow | -fsanitize=undefined | check / wider type |
| Uninitialized read | -Wall / MSan | int x{}; |
| Null deref | -fsanitize=undefined | if (p) before *p |
| Invalid downcast | -fsanitize=undefined | dynamic_cast |
| Data race | -fsanitize=thread | mutex / atomic |
Frequently Asked Questions
Q: What exactly is undefined behavior in C++?
Undefined behavior (UB) is any operation the C++ standard places no requirements on — like reading past the end of an array. When it happens, the compiler is allowed to do literally anything: crash, return wrong answers, or appear to work today and break after a recompile. It is not the same as a guaranteed crash; that is the danger.
Q: If my program runs fine, can it still have UB?
Yes, and this is the trap that catches everyone. UB can produce the 'right' output on your machine, with your compiler, today, then fail on a different platform or after the optimizer changes. 'It works' is never proof that code is UB-free. Sanitizers give evidence for the paths your tests exercise, and careful reasoning has to cover everything else.
Q: Why does the compiler 'exploit' UB instead of warning me?
The standard says a program with UB has no defined meaning, so the optimizer is free to assume UB never happens. If you dereference a pointer, it assumes the pointer is non-null and may delete a later null check. This produces faster code for correct programs, but silently removes your safety nets in buggy ones.
Q: What is the difference between UBSan and ASan?
UBSan (-fsanitize=undefined) catches language-level UB like signed overflow, invalid shifts, and bad casts. ASan (-fsanitize=address) catches memory errors like out-of-bounds access, use-after-free, and leaks. They complement each other — run both in your debug and test builds.
Q: Is unsigned overflow also undefined behavior?
No. Unsigned integer overflow is fully defined — it wraps around modulo 2^N. Only signed integer overflow is UB. That is exactly why checked arithmetic, wider types, or unsigned counters are the safe fixes when you cannot guarantee a result fits.
Mini-Challenge: Safe Lookup
No blanks this time — just a brief and an outline. Write a lookup that never invokes UB no matter what index it's given. Build it, run it, and check your output against the examples in the comments.
🎯 Mini-Challenge: write a UB-free lookup
Return a vector element safely, printing 'Invalid index' for a bad index.
#include <iostream>
#include <vector>
using namespace std;
int main() {
// 🎯 MINI-CHALLENGE: Safe lookup
// 1. Make a vector<int> called data with {5, 10, 15}.
// 2. Ask for an index with: int i; cin >> i; (or just set int i = 5;)
// 3. SAFELY return data at index i:
// - if i is in range, print the value
// - otherwise print "Invalid index" (no UB, no crash)
// Hint: guard with if (i >= 0 && i < (int)data.size())
// OR use a try/catch a
...
🎉 Lesson Complete
- ✅ UB is an operation the standard leaves with no rules — worse than a guaranteed crash
- ✅ Common sources: out-of-bounds, use-after-free, dangling refs, signed overflow, uninitialized reads, null deref, bad downcasts, data races
- ✅ The optimizer assumes UB never happens and can delete checks placed after it — so check first
- ✅ Catch UB with -fsanitize=undefined (UBSan), -fsanitize=address (ASan), -fsanitize=thread (TSan), and -Wall -Wextra
- ✅ Safe fixes: .at(), smart pointers, return-by-value, checked arithmetic, dynamic_cast, mutex/atomic
- ✅ Next lesson: Inline Assembly, where you drop below C++ to talk to the CPU directly