What You'll Learn

    • Measure code with high-resolution timers
    • Cache-friendly vs cache-hostile patterns
    • Find hotspots and fix them with the right data structure
    • Use perf, gprof, VTune, callgrind

    Profiling & Optimizing C++ Applications

    The golden rule of optimization: measure first, optimize second. Guessing where bottlenecks are is almost always wrong. This lesson teaches you how to measure precisely, identify real hotspots, and apply targeted optimizations that actually matter.

    Measuring with High-Resolution Timers

    std::chrono::high_resolution_clock provides the finest tick the implementation offers, typically nanoseconds (for portable benchmarks, std::chrono::steady_clock is often preferred because it is guaranteed never to jump backwards). Wrap timing in an RAII class so you never forget to stop the timer — the destructor prints the elapsed time automatically.

    Pro Tip: Always benchmark with optimizations enabled (-O2 or -O3). Debug builds (-O0) are 10-100× slower and produce misleading results.

    RAII Timer

    Compare loop vs accumulate and sort vs stable_sort

    C++
    #include <iostream>
    #include <chrono>
    #include <string>
    #include <vector>
    #include <algorithm>
    #include <numeric>
    using namespace std;
    
    // Simple RAII timer — measures scope lifetime
    class Timer {
        string label;
        chrono::high_resolution_clock::time_point start;
    public:
        Timer(const string& l) : label(l), start(chrono::high_resolution_clock::now()) {}
        ~Timer() {
            auto end = chrono::high_resolution_clock::now();
            auto us = chrono::duration_cast<chrono::microseconds>(end - start).count();
            cout << label << ": " << us << " µs" << endl;
        }
    };
    
    int main() {
        vector<int> v(1'000'000);
        iota(v.begin(), v.end(), 0);
    
        long long sum1 = 0;
        {
            Timer t("manual loop");
            for (int x : v) sum1 += x;
        }
        long long sum2 = 0;
        {
            Timer t("accumulate");
            sum2 = accumulate(v.begin(), v.end(), 0LL);
        }
        cout << "sums match: " << boolalpha << (sum1 == sum2) << endl;
    
        auto a = v, b = v;
        { Timer t("sort");        sort(a.begin(), a.end(), greater<int>()); }
        { Timer t("stable_sort"); stable_sort(b.begin(), b.end(), greater<int>()); }
    }

    Cache-Friendly Access Patterns

    Modern CPUs load memory in 64-byte cache lines. Sequential access (row-major) hits the cache; jumping across rows (column-major) causes cache misses. A cache miss can cost 100+ CPU cycles, which makes memory layout one of the biggest performance factors in data-heavy code.

    Common Mistake: Using vector<vector<int>> for matrices. Each inner vector is a separate heap allocation — terrible cache locality. Use a flat vector<int> with manual indexing for performance-critical matrices.

    Cache Performance

    Compare row-major, column-major, and flat array traversal

    C++
    #include <iostream>
    #include <string>
    #include <vector>
    #include <chrono>
    using namespace std;
    
    class Timer {
        string label;
        chrono::high_resolution_clock::time_point start;
    public:
        Timer(const string& l) : label(l), start(chrono::high_resolution_clock::now()) {}
        ~Timer() {
            auto us = chrono::duration_cast<chrono::microseconds>(
                chrono::high_resolution_clock::now() - start).count();
            cout << label << ": " << us << " µs" << endl;
        }
    };
    
    int main() {
        const int ROWS = 4000, COLS = 4000;
        vector<vector<int>> m(ROWS, vector<int>(COLS, 1));
        long long sum = 0;
        {
            Timer t("row-major (cache-friendly)");
            for (int r = 0; r < ROWS; ++r)
                for (int c = 0; c < COLS; ++c) sum += m[r][c];
        }
        {
            Timer t("column-major (cache-hostile)");
            for (int c = 0; c < COLS; ++c)
                for (int r = 0; r < ROWS; ++r) sum += m[r][c];
        }
        vector<int> flat((size_t)ROWS * COLS, 1);
        {
            Timer t("flat vector");
            for (size_t i = 0; i < flat.size(); ++i) sum += flat[i];
        }
        cout << "checksum: " << sum << endl;
    }

    Finding Hotspots — Data Structure Choice

    Often the biggest optimization is choosing the right data structure. A linear scan through a vector costs O(n) per lookup; an unordered_map lookup is O(1) on average. Profile first to find where the time goes, then fix the algorithm or data structure — reach for micro-optimizations last.

    Hotspot Analysis

    See how data structure choice dominates performance

    C++
    #include <iostream>
    #include <vector>
    #include <string>
    #include <unordered_map>
    #include <chrono>
    #include <algorithm>
    using namespace std;
    
    class Timer {
        string label;
        chrono::high_resolution_clock::time_point start;
    public:
        Timer(const string& l) : label(l), start(chrono::high_resolution_clock::now()) {}
        ~Timer() {
            auto us = chrono::duration_cast<chrono::microseconds>(
                chrono::high_resolution_clock::now() - start).count();
            cout << label << ": " << us << " µs" << endl;
        }
    };
    
    int main() {
        const int N = 100000;
        vector<pair<string, int>> vec;
        unordered_map<string, int> table;
        for (int i = 0; i < N; ++i) {
            vec.emplace_back("user" + to_string(i), i);
            table["user" + to_string(i)] = i;
        }
        long long sum = 0;
        {
            Timer t("vector linear scan (1000 lookups)");
            for (int i = 0; i < 1000; ++i) {
                string key = "user" + to_string(N - 1 - i);
                auto it = find_if(vec.begin(), vec.end(),
                                  [&](const auto& p) { return p.first == key; });
                sum += it->second;
            }
        }
        {
            Timer t("unordered_map (1000 lookups)");
            for (int i = 0; i < 1000; ++i)
                sum += table["user" + to_string(N - 1 - i)];
        }
        cout << "checksum: " << sum << endl;
    }

    Quick Reference

    Tool        Command                       Best For
    gprof       g++ -pg; gprof a.out          Function-level time
    perf        perf record / perf report     CPU sampling
    callgrind   valgrind --tool=callgrind     Call graphs
    chrono      high_resolution_clock         Micro-benchmarks

    Lesson Complete!

    You can now measure performance accurately, identify real bottlenecks, and apply data-driven optimizations.
