Courses/C++/Advanced Containers

Lesson 19 • Advanced

Advanced Containers: How the STL Works Inside

By the end of this lesson you'll understand what really happens when a vector grows, why map and unordered_map have completely different performance, how list and deque store their data, and how to pick the right container by its Big-O cost instead of guessing.

What You'll Learn

How vector capacity grows and why push_back is amortised O(1)
When to call reserve() to avoid reallocations and copies
Why growing a vector invalidates iterators, pointers, and references
map (red-black tree, ordered, O(log n)) vs unordered_map (hash, O(1) avg)
How list and deque store data, and their cache trade-offs
Choosing a container by its complexity, not by habit

Before you start: you should already be comfortable using STL containers from STL Algorithm Mastery — creating a vector, looping over a map, and reading Big-O notation like O(n) and O(log n). This lesson opens up how those containers work so you can choose between them on purpose.

💡 Real-World Analogy

Think of a vector as a row of seats bolted to the floor. When the row fills up you can't just add one more seat on the end — you rent a bigger room, carry everyone over, and throw away the old room. That "carry everyone over" is a reallocation, and it's why anyone holding a ticket to "seat 3 in the old room" (an iterator or pointer) is suddenly pointing at nothing. A std::map is more like a library card catalogue: drawers kept in strict alphabetical order, so finding a card means a few quick narrowing steps (O(log n)). An unordered_map is a coat check: your ticket number is hashed straight to a hook, so retrieval is instant on average (O(1)) — but the coats hang in no particular order.

📊 How Each Container Stores Its Data

Container	Structure	Access	Insert / Remove
vector	Contiguous array	O(1) random	O(1) back*, O(n) middle
deque	Chunked array	O(1) random	O(1) front & back
list	Doubly-linked nodes	O(n) sequential	O(1) anywhere**
map	Red-black tree	O(log n)	O(log n)
unordered_map	Hash table (buckets)	O(1) average	O(1) average

* vector push_back is amortised O(1) — most pushes are instant, but the occasional grow copies everything. ** list O(1) insert/erase needs an iterator already pointing at the spot.

1. Inside `vector`: size, capacity & growth

A vector keeps its elements in one contiguous block of memory. Two numbers describe it: size() is how many elements you've added, and capacity() is how many it can hold before it must grab a bigger block. When you push_back past the capacity, the vector reallocates: it allocates a larger buffer (most implementations double it), copies or moves every element across, and frees the old one. Run this and watch capacity jump 1 → 2 → 4 → 8 → 16 instead of climbing by one.

Worked example: watch capacity grow

Run it and see how few reallocations 16 push_backs actually cause.

Try it Yourself »

C++

#include <iostream>
#include <vector>
using namespace std;

int main() {
    // size()     = how many elements you have put in.
    // capacity() = how many it can hold before it must reallocate.
    vector<int> v;                 // empty: no elements, no buffer yet
    cout << "start: size=" << v.size()
         << " capacity=" << v.capacity() << endl;  // size=0 capacity=0

    // Push 16 values and print ONLY when capacity changes.
    for (int i = 0; i < 16; i++) {
        size_t before = v
...

Because the buffer doubles, the rare expensive copy is spread thin across many cheap pushes — so the average cost per push_back is constant. That's what amortised O(1) means. If you already know how many elements are coming, reserve(n) grabs the whole buffer once, so there are zero reallocations while you fill it.

Worked example: reserve vs resize vs data

See how reserve pre-allocates, and how resize differs.

Try it Yourself »

C++

#include <iostream>
#include <vector>
using namespace std;

int main() {
    // reserve(n) pre-allocates the buffer WITHOUT adding elements.
    // It removes the repeated grow-and-copy cost when you know the size.
    vector<int> v;
    v.reserve(1000);               // one allocation for 1000 ints
    cout << "after reserve: size=" << v.size()
         << " capacity=" << v.capacity() << endl;  // size=0 capacity=1000

    for (int i = 0; i < 1000; i++) v.push_back(i);  // NO reallocations now

...

Your turn. Fill in the three blanks marked ___ using the hints in the comments. The goal: pre-allocate for 50 elements so the capacity never has to grow.

🎯 Your turn: reserve up front

Fill in the ___ blanks, then check your output against the expected lines.

Try it Yourself »

C++

#include <iostream>
#include <vector>
using namespace std;

int main() {
    // 🎯 YOUR TURN — replace each ___ then press "Try it Yourself".

    vector<int> scores;

    // 1) You KNOW you will add 50 scores. Pre-allocate so there are
    //    zero reallocations while you fill the vector.
    scores.___(50);               // 👉 the method that pre-allocates capacity

    for (int i = 1; i <= 50; i++) scores.push_back(i * 2);

    // 2) Print how many elements are stored (NOT the capacity)
   
...

🔎 Deep Dive: iterator invalidation on growth

Because a reallocation moves the entire buffer, every iterator, pointer, and reference you were holding into the vector now points at freed memory. Using one after the vector grows is undefined behaviour — often a crash, sometimes a silent wrong answer.

vector<int> v = {1, 2, 3};
int* p = &v[0];     // points into the current buffer
v.push_back(4);     // may reallocate -> old buffer freed
cout << *p;         // ❌ DANGLING: p points at freed memory

// Fix A: reserve so no reallocation happens
v.reserve(100);     // p taken AFTER this stays valid through 100 pushes
// Fix B: re-fetch the pointer/iterator after any growth
p = &v[0];          // ✅ valid again

Rule of thumb: never hold a raw pointer or iterator across an operation that can grow the vector. Either reserve() first, or re-fetch afterwards.

2. `map` (tree) vs `unordered_map` (hash)

Both store key → value pairs, but their internals are nothing alike. std::map is a balanced binary search tree (a red-black tree): keys are always kept sorted, so iterating gives you them in order, and every lookup is O(log n) — a handful of comparisons even for millions of keys. std::unordered_map is a hash table: it runs the key through a hash function to pick a bucket, giving O(1) average lookup, but iteration comes out in no useful order.

Worked example: ordered tree vs hashed buckets

Compare sorted map iteration with a hash table's buckets and load factor.

Try it Yourself »

C++

#include <iostream>
#include <map>
#include <unordered_map>
using namespace std;

int main() {
    // std::map = a balanced binary search tree (red-black tree).
    // Keys are kept SORTED; every lookup/insert is O(log n).
    map<string, int> ages;
    ages["Charlie"] = 35;
    ages["Alice"]   = 30;
    ages["Bob"]     = 25;

    cout << "map iterates in SORTED key order:" << endl;
    for (const auto& [name, age] : ages)        // structured binding (C++17)
        cout << "  " << name << " = 
...

The load factor (size / bucket_count) measures how crowded the buckets are. When it rises past max_load_factor (default 1.0), the table rehashes into more buckets — the hash-table version of a vector reallocation. Choose unordered_map for raw lookup speed; choose map when you need sorted order or range queries.

Now you choose. You need to count word frequencies — fast lookups, order doesn't matter. Fill in the blanks to use the right container and increment each count:

🎯 Your turn: pick the hashed container

Use the O(1)-average map and increment each word's count.

Try it Yourself »

C++

#include <iostream>
#include <unordered_map>
using namespace std;

int main() {
    // 🎯 YOUR TURN — count word frequencies. You need fast key lookup
    // and you do NOT care about sorted order -> the hash container.

    // 1) Declare a container mapping string -> int with O(1) average lookup
    ___<string, int> freq;        // 👉 the hash-table map type

    string words[] = {"red", "blue", "red", "red", "blue"};
    for (const string& w : words) {
        // 2) Increment the count for wor
...

3. Inside `list` and `deque`

A std::list is a doubly-linked list: each element is a separate node holding a value plus pointers to its neighbours. That makes inserting or erasing anywhere O(1) — if you already hold an iterator there — but there's no list[i] random access, and the scattered nodes hurt cache performance. A std::deque stores data in a sequence of fixed-size chunks, which buys you O(1) push at both ends plus O(1) random access — the one thing a vector can't do cheaply at the front.

Worked example: linked nodes vs chunked array

See deque front/back pushes and a list mid-insert via iterator.

Try it Yourself »

C++

#include <iostream>
#include <list>
#include <deque>
using namespace std;

int main() {
    // std::deque = double-ended queue, stored as a set of fixed chunks.
    // O(1) push at BOTH ends AND O(1) random access -> unlike vector.
    deque<int> dq = {2, 3};
    dq.push_front(1);          // O(1) at the front (vector would be O(n))
    dq.push_back(4);           // O(1) at the back
    cout << "deque dq[2] = " << dq[2] << endl;   // 3  (random access works)

    // std::list = a doubly-linked l
...

Pro Tips

💡 Default to vector: contiguous memory is cache-friendly and beats "smarter" containers far more often than beginners expect. Switch only with a measured reason.
💡 reserve() when you know the size: one allocation instead of a dozen grow-and-copy cycles, and your iterators survive the fill.
💡 emplace_back over push_back: it constructs the element in place, skipping a temporary copy.
💡 Don't reach for list reflexively: its O(1) middle insert is often slower in practice than a vector's O(n) because of cache misses. Profile first.

Common Errors (and the fix)

Iterator/pointer invalidation on growth: holding auto it = v.begin(); (or int* p = &v[0];) and then calling push_back leaves it dangling if the vector reallocated. Call v.reserve(n) first, or re-fetch the iterator after the push.
Not reserving in a known-size loop: for (...) v.push_back(x); without reserve triggers repeated reallocations and copies. If you know the count, v.reserve(count); once before the loop.
Assuming map is a hash table: std::map is a tree with O(log n) lookup, not O(1). If you only need fast lookup and don't care about order, use std::unordered_map.
operator[] silently inserts: if (m["missing"]) on a map creates the key with a default value. Use m.count(key) or m.find(key) != m.end() to check without inserting.
Indexing a list: myList[2] won't compile — a linked list has no random access. Use std::next(myList.begin(), 2), or pick a vector/deque if you need indexing.

📋 Quick Reference: container complexity

Container	Lookup	Insert	Ordered?	Use when
vector	O(1) by index	O(1) back*	insertion	Default; indexed data
deque	O(1) by index	O(1) both ends	insertion	Push at front & back
list	O(n)	O(1) at iterator	insertion	Many mid-list edits
map	O(log n)	O(log n)	sorted keys	Need sorted order
unordered_map	O(1) avg	O(1) avg	no	Fastest key lookup

* vector push_back is amortised O(1); a single grow copies all elements.

Frequently Asked Questions

Q: Why does vector capacity jump in big steps instead of growing by one?

Each time a vector runs out of room it allocates a bigger block (typically 1.5x or 2x the old capacity), copies the existing elements over, and frees the old block. Growing by one every time would make N push_backs cost O(N^2). Doubling makes the cost per push_back O(1) on average — this is called amortised O(1).

Q: When should I use std::map instead of std::unordered_map?

Use std::map when you need the keys kept in sorted order, or you need range queries like lower_bound. It is a balanced binary tree, so every lookup is O(log n). Use unordered_map when you only need fast key->value lookup and do not care about order — it is a hash table with O(1) average lookup, usually the faster choice.

Q: Why did my iterators and pointers break after a push_back?

When a vector reallocates to grow, every element moves to a new memory block, so all existing iterators, pointers, and references become dangling. Either call reserve() up front so no reallocation happens, or re-fetch your iterators after any operation that can grow the vector.

Q: Is std::list actually faster than vector for inserting in the middle?

On paper list insertion is O(1) once you hold an iterator, versus O(n) for vector. In practice vector usually still wins for small to medium data because its elements sit contiguously in memory and the CPU cache loves that. list nodes are scattered, so traversal causes cache misses. Measure before reaching for list.

Q: What is the load factor of an unordered_map?

Load factor = number of elements / number of buckets. As you insert, the load factor rises; when it passes max_load_factor (default 1.0) the table rehashes into more buckets to keep collisions low. That rehash is the hash-table equivalent of a vector reallocation, so reserve() up front avoids it.

Mini-Challenge: Prove Amortised O(1)

No blanks this time — just a brief and an outline. Push 100 elements and count how many times the capacity actually changes. If push_back were O(n) you'd see 100 reallocations; with doubling you'll see only a handful. Run it and check the count against the expected note in the comments.

🎯 Mini-Challenge: count the reallocations

Push 100 elements and prove growth is amortised O(1).

Try it Yourself »

C++

#include <iostream>
#include <vector>
using namespace std;

int main() {
    // 🎯 MINI-CHALLENGE: Prove amortised O(1) growth
    // 1. Make an empty vector<int> called v.
    // 2. Loop i from 0 to 99 (100 push_backs total).
    // 3. Each iteration: record v.capacity() BEFORE the push_back,
    //    push_back(i), then if the capacity changed, print the old and
    //    new capacity (a "reallocation happened" line).
    // 4. At the end print how many reallocations occurred.
    //
    // ✅ 
...

🎉 Lesson Complete

✅ vector stores elements contiguously; size() ≤ capacity(), and growth reallocates by doubling
✅ push_back is amortised O(1); reserve(n) pre-allocates to avoid reallocations
✅ Growing a vector invalidates all iterators, pointers, and references into it
✅ map is a red-black tree — sorted, O(log n); unordered_map is a hash table — unordered, O(1) average
✅ list is doubly-linked (O(1) edits at an iterator, no random access); deque is chunked (O(1) at both ends)
✅ Pick a container by its complexity, but default to vector for cache friendliness
✅ Next lesson: Move Semantics Advanced — rvalue references, perfect forwarding, and copy elision

Advanced Containers: How the STL Works Inside

What You'll Learn

💡 Real-World Analogy

📊 How Each Container Stores Its Data

1. Inside vector: size, capacity & growth

Worked example: watch capacity grow

Worked example: reserve vs resize vs data

🎯 Your turn: reserve up front

🔎 Deep Dive: iterator invalidation on growth

2. map (tree) vs unordered_map (hash)

Worked example: ordered tree vs hashed buckets

🎯 Your turn: pick the hashed container

3. Inside list and deque

Worked example: linked nodes vs chunked array

Pro Tips

Common Errors (and the fix)

📋 Quick Reference: container complexity

Frequently Asked Questions

Mini-Challenge: Prove Amortised O(1)

🎯 Mini-Challenge: count the reallocations

🎉 Lesson Complete

Cookie & Privacy Settings

1. Inside `vector`: size, capacity & growth

2. `map` (tree) vs `unordered_map` (hash)

3. Inside `list` and `deque`