Skip to main content
    Courses/C++/Advanced Containers

    Lesson 19 • Advanced

    Advanced Containers: How the STL Works Inside

    By the end of this lesson you'll understand what really happens when a vector grows, why map and unordered_map have completely different performance, how list and deque store their data, and how to pick the right container by its Big-O cost instead of guessing.

    What You'll Learn

    • How vector capacity grows and why push_back is amortised O(1)
    • When to call reserve() to avoid reallocations and copies
    • Why growing a vector invalidates iterators, pointers, and references
    • map (red-black tree, ordered, O(log n)) vs unordered_map (hash, O(1) avg)
    • How list and deque store data, and their cache trade-offs
    • Choosing a container by its complexity, not by habit

    💡 Real-World Analogy

    Think of a vector as a row of seats bolted to the floor. When the row fills up you can't just add one more seat on the end — you rent a bigger room, carry everyone over, and throw away the old room. That "carry everyone over" is a reallocation, and it's why anyone holding a ticket to "seat 3 in the old room" (an iterator or pointer) is suddenly pointing at nothing. A std::map is more like a library card catalogue: drawers kept in strict alphabetical order, so finding a card means a few quick narrowing steps (O(log n)). An unordered_map is a coat check: your ticket number is hashed straight to a hook, so retrieval is instant on average (O(1)) — but the coats hang in no particular order.

    📊 How Each Container Stores Its Data

    ContainerStructureAccessInsert / Remove
    vectorContiguous arrayO(1) randomO(1) back*, O(n) middle
    dequeChunked arrayO(1) randomO(1) front & back
    listDoubly-linked nodesO(n) sequentialO(1) anywhere**
    mapRed-black treeO(log n)O(log n)
    unordered_mapHash table (buckets)O(1) averageO(1) average

    * vector push_back is amortised O(1) — most pushes are instant, but the occasional grow copies everything. ** list O(1) insert/erase needs an iterator already pointing at the spot.

    1. Inside vector: size, capacity & growth

    A vector keeps its elements in one contiguous block of memory. Two numbers describe it: size() is how many elements you've added, and capacity() is how many it can hold before it must grab a bigger block. When you push_back past the capacity, the vector reallocates: it allocates a larger buffer (most implementations double it), copies or moves every element across, and frees the old one. Run this and watch capacity jump 1 → 2 → 4 → 8 → 16 instead of climbing by one.

    Worked example: watch capacity grow

    Run it and see how few reallocations 16 push_backs actually cause.

    Try it Yourself »
    C++
    #include <iostream>
    #include <vector>
    using namespace std;
    
    int main() {
        // size()     = how many elements you have put in.
        // capacity() = how many it can hold before it must reallocate.
        vector<int> v;                 // empty: no elements, no buffer yet
        cout << "start: size=" << v.size()
             << " capacity=" << v.capacity() << endl;  // size=0 capacity=0
    
        // Push 16 values and print ONLY when capacity changes.
        for (int i = 0; i < 16; i++) {
            size_t before = v
    ...

    Because the buffer doubles, the rare expensive copy is spread thin across many cheap pushes — so the average cost per push_back is constant. That's what amortised O(1) means. If you already know how many elements are coming, reserve(n) grabs the whole buffer once, so there are zero reallocations while you fill it.

    Worked example: reserve vs resize vs data

    See how reserve pre-allocates, and how resize differs.

    Try it Yourself »
    C++
    #include <iostream>
    #include <vector>
    using namespace std;
    
    int main() {
        // reserve(n) pre-allocates the buffer WITHOUT adding elements.
        // It removes the repeated grow-and-copy cost when you know the size.
        vector<int> v;
        v.reserve(1000);               // one allocation for 1000 ints
        cout << "after reserve: size=" << v.size()
             << " capacity=" << v.capacity() << endl;  // size=0 capacity=1000
    
        for (int i = 0; i < 1000; i++) v.push_back(i);  // NO reallocations now
    
    ...

    Your turn. Fill in the three blanks marked ___ using the hints in the comments. The goal: pre-allocate for 50 elements so the capacity never has to grow.

    🎯 Your turn: reserve up front

    Fill in the ___ blanks, then check your output against the expected lines.

    Try it Yourself »
    C++
    #include <iostream>
    #include <vector>
    using namespace std;
    
    int main() {
        // 🎯 YOUR TURN — replace each ___ then press "Try it Yourself".
    
        vector<int> scores;
    
        // 1) You KNOW you will add 50 scores. Pre-allocate so there are
        //    zero reallocations while you fill the vector.
        scores.___(50);               // 👉 the method that pre-allocates capacity
    
        for (int i = 1; i <= 50; i++) scores.push_back(i * 2);
    
        // 2) Print how many elements are stored (NOT the capacity)
       
    ...

    🔎 Deep Dive: iterator invalidation on growth

    Because a reallocation moves the entire buffer, every iterator, pointer, and reference you were holding into the vector now points at freed memory. Using one after the vector grows is undefined behaviour — often a crash, sometimes a silent wrong answer.

    vector<int> v = {1, 2, 3};
    int* p = &v[0];     // points into the current buffer
    v.push_back(4);     // may reallocate -> old buffer freed
    cout << *p;         // ❌ DANGLING: p points at freed memory
    
    // Fix A: reserve so no reallocation happens
    v.reserve(100);     // p taken AFTER this stays valid through 100 pushes
    // Fix B: re-fetch the pointer/iterator after any growth
    p = &v[0];          // ✅ valid again

    Rule of thumb: never hold a raw pointer or iterator across an operation that can grow the vector. Either reserve() first, or re-fetch afterwards.

    2. map (tree) vs unordered_map (hash)

    Both store key → value pairs, but their internals are nothing alike. std::map is a balanced binary search tree (a red-black tree): keys are always kept sorted, so iterating gives you them in order, and every lookup is O(log n) — a handful of comparisons even for millions of keys. std::unordered_map is a hash table: it runs the key through a hash function to pick a bucket, giving O(1) average lookup, but iteration comes out in no useful order.

    Worked example: ordered tree vs hashed buckets

    Compare sorted map iteration with a hash table's buckets and load factor.

    Try it Yourself »
    C++
    #include <iostream>
    #include <map>
    #include <unordered_map>
    using namespace std;
    
    int main() {
        // std::map = a balanced binary search tree (red-black tree).
        // Keys are kept SORTED; every lookup/insert is O(log n).
        map<string, int> ages;
        ages["Charlie"] = 35;
        ages["Alice"]   = 30;
        ages["Bob"]     = 25;
    
        cout << "map iterates in SORTED key order:" << endl;
        for (const auto& [name, age] : ages)        // structured binding (C++17)
            cout << "  " << name << " = 
    ...

    The load factor (size / bucket_count) measures how crowded the buckets are. When it rises past max_load_factor (default 1.0), the table rehashes into more buckets — the hash-table version of a vector reallocation. Choose unordered_map for raw lookup speed; choose map when you need sorted order or range queries.

    Now you choose. You need to count word frequencies — fast lookups, order doesn't matter. Fill in the blanks to use the right container and increment each count:

    🎯 Your turn: pick the hashed container

    Use the O(1)-average map and increment each word's count.

    Try it Yourself »
    C++
    #include <iostream>
    #include <unordered_map>
    using namespace std;
    
    int main() {
        // 🎯 YOUR TURN — count word frequencies. You need fast key lookup
        // and you do NOT care about sorted order -> the hash container.
    
        // 1) Declare a container mapping string -> int with O(1) average lookup
        ___<string, int> freq;        // 👉 the hash-table map type
    
        string words[] = {"red", "blue", "red", "red", "blue"};
        for (const string& w : words) {
            // 2) Increment the count for wor
    ...

    3. Inside list and deque

    A std::list is a doubly-linked list: each element is a separate node holding a value plus pointers to its neighbours. That makes inserting or erasing anywhere O(1)if you already hold an iterator there — but there's no list[i] random access, and the scattered nodes hurt cache performance. A std::deque stores data in a sequence of fixed-size chunks, which buys you O(1) push at both ends plus O(1) random access — the one thing a vector can't do cheaply at the front.

    Worked example: linked nodes vs chunked array

    See deque front/back pushes and a list mid-insert via iterator.

    Try it Yourself »
    C++
    #include <iostream>
    #include <list>
    #include <deque>
    using namespace std;
    
    int main() {
        // std::deque = double-ended queue, stored as a set of fixed chunks.
        // O(1) push at BOTH ends AND O(1) random access -> unlike vector.
        deque<int> dq = {2, 3};
        dq.push_front(1);          // O(1) at the front (vector would be O(n))
        dq.push_back(4);           // O(1) at the back
        cout << "deque dq[2] = " << dq[2] << endl;   // 3  (random access works)
    
        // std::list = a doubly-linked l
    ...

    Pro Tips

    • 💡 Default to vector: contiguous memory is cache-friendly and beats "smarter" containers far more often than beginners expect. Switch only with a measured reason.
    • 💡 reserve() when you know the size: one allocation instead of a dozen grow-and-copy cycles, and your iterators survive the fill.
    • 💡 emplace_back over push_back: it constructs the element in place, skipping a temporary copy.
    • 💡 Don't reach for list reflexively: its O(1) middle insert is often slower in practice than a vector's O(n) because of cache misses. Profile first.

    Common Errors (and the fix)

    • Iterator/pointer invalidation on growth: holding auto it = v.begin(); (or int* p = &v[0];) and then calling push_back leaves it dangling if the vector reallocated. Call v.reserve(n) first, or re-fetch the iterator after the push.
    • Not reserving in a known-size loop: for (...) v.push_back(x); without reserve triggers repeated reallocations and copies. If you know the count, v.reserve(count); once before the loop.
    • Assuming map is a hash table: std::map is a tree with O(log n) lookup, not O(1). If you only need fast lookup and don't care about order, use std::unordered_map.
    • operator[] silently inserts: if (m["missing"]) on a map creates the key with a default value. Use m.count(key) or m.find(key) != m.end() to check without inserting.
    • Indexing a list: myList[2] won't compile — a linked list has no random access. Use std::next(myList.begin(), 2), or pick a vector/deque if you need indexing.

    📋 Quick Reference: container complexity

    ContainerLookupInsertOrdered?Use when
    vectorO(1) by indexO(1) back*insertionDefault; indexed data
    dequeO(1) by indexO(1) both endsinsertionPush at front & back
    listO(n)O(1) at iteratorinsertionMany mid-list edits
    mapO(log n)O(log n)sorted keysNeed sorted order
    unordered_mapO(1) avgO(1) avgnoFastest key lookup

    * vector push_back is amortised O(1); a single grow copies all elements.

    Frequently Asked Questions

    Q: Why does vector capacity jump in big steps instead of growing by one?

    Each time a vector runs out of room it allocates a bigger block (typically 1.5x or 2x the old capacity), copies the existing elements over, and frees the old block. Growing by one every time would make N push_backs cost O(N^2). Doubling makes the cost per push_back O(1) on average — this is called amortised O(1).

    Q: When should I use std::map instead of std::unordered_map?

    Use std::map when you need the keys kept in sorted order, or you need range queries like lower_bound. It is a balanced binary tree, so every lookup is O(log n). Use unordered_map when you only need fast key->value lookup and do not care about order — it is a hash table with O(1) average lookup, usually the faster choice.

    Q: Why did my iterators and pointers break after a push_back?

    When a vector reallocates to grow, every element moves to a new memory block, so all existing iterators, pointers, and references become dangling. Either call reserve() up front so no reallocation happens, or re-fetch your iterators after any operation that can grow the vector.

    Q: Is std::list actually faster than vector for inserting in the middle?

    On paper list insertion is O(1) once you hold an iterator, versus O(n) for vector. In practice vector usually still wins for small to medium data because its elements sit contiguously in memory and the CPU cache loves that. list nodes are scattered, so traversal causes cache misses. Measure before reaching for list.

    Q: What is the load factor of an unordered_map?

    Load factor = number of elements / number of buckets. As you insert, the load factor rises; when it passes max_load_factor (default 1.0) the table rehashes into more buckets to keep collisions low. That rehash is the hash-table equivalent of a vector reallocation, so reserve() up front avoids it.

    Mini-Challenge: Prove Amortised O(1)

    No blanks this time — just a brief and an outline. Push 100 elements and count how many times the capacity actually changes. If push_back were O(n) you'd see 100 reallocations; with doubling you'll see only a handful. Run it and check the count against the expected note in the comments.

    🎯 Mini-Challenge: count the reallocations

    Push 100 elements and prove growth is amortised O(1).

    Try it Yourself »
    C++
    #include <iostream>
    #include <vector>
    using namespace std;
    
    int main() {
        // 🎯 MINI-CHALLENGE: Prove amortised O(1) growth
        // 1. Make an empty vector<int> called v.
        // 2. Loop i from 0 to 99 (100 push_backs total).
        // 3. Each iteration: record v.capacity() BEFORE the push_back,
        //    push_back(i), then if the capacity changed, print the old and
        //    new capacity (a "reallocation happened" line).
        // 4. At the end print how many reallocations occurred.
        //
        // ✅ 
    ...

    🎉 Lesson Complete

    • vector stores elements contiguously; size()capacity(), and growth reallocates by doubling
    • push_back is amortised O(1); reserve(n) pre-allocates to avoid reallocations
    • ✅ Growing a vector invalidates all iterators, pointers, and references into it
    • map is a red-black tree — sorted, O(log n); unordered_map is a hash table — unordered, O(1) average
    • list is doubly-linked (O(1) edits at an iterator, no random access); deque is chunked (O(1) at both ends)
    • ✅ Pick a container by its complexity, but default to vector for cache friendliness
    • Next lesson: Move Semantics Advanced — rvalue references, perfect forwarding, and copy elision

    Sign up for free to track which lessons you've completed and get learning reminders.

    Previous

    Cookie & Privacy Settings

    We use cookies to improve your experience, analyze traffic, and show personalized ads. You can manage your preferences below.

    By clicking "Accept All", you consent to our use of cookies for analytics and personalized advertising. You can customize your preferences or reject non-essential cookies.

    Privacy PolicyTerms of Service

    Install LearnCodingFast

    Learn faster with the app on your home screen.