
    Lesson 24 • Advanced

    Profiling & Optimising Python Performance

    High-performance Python isn't about writing "faster code" — it's about finding bottlenecks and eliminating them with scientific precision. You cannot optimise what you do not measure.

    What You'll Learn in This Lesson

    • How to profile CPU usage with cProfile and line_profiler
    • How to measure memory usage with tracemalloc and memory_profiler
    • How to find the real bottleneck in your code (it's rarely where you think)
    • Practical optimisation techniques: caching, algorithm choice, data structures
    • How to write benchmarks using timeit and interpret results correctly
    • Production-level performance patterns used in real Python systems

    🔥 1. Why Profiling Matters

    Beginners try to "guess" what's slow.
    Advanced developers measure what's slow.

    Approach       Method                          Result
    ❌ Guessing     "This loop looks slow"          Waste time optimising the wrong code
    ✔ Profiling    Measure actual execution time   Find and fix real bottlenecks

    The 80/20 Rule:

    20% of code → 80% of runtime

    Optimising the wrong 80% gives almost no improvement!

    ⚙️ 2. Timing Functions with time.perf_counter()

    For quick micro-benchmarks:

    import time
    
    start = time.perf_counter()
    total = sum(range(1_000_000))  # the code you want to time
    end = time.perf_counter()
    
    print("Elapsed:", end - start)

    Use this for comparing:

    • two ways of looping
    • two algorithms
    • two function implementations
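    For steadier numbers than a single perf_counter pair, the standard library's timeit module runs a snippet many times and reports the total. A minimal sketch comparing the two loop styles mentioned above:

```python
import timeit

# number=1000 repeats each snippet 1000 times for a stable total
loop_time = timeit.timeit(
    "result = []\nfor x in range(100):\n    result.append(x * x)",
    number=1000,
)
comp_time = timeit.timeit(
    "[x * x for x in range(100)]",
    number=1000,
)

print(f"for-loop:      {loop_time:.4f}s")
print(f"comprehension: {comp_time:.4f}s")
```

    Absolute times vary by machine; what matters is the ratio between the two.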

    But for full programs, we need real profilers.

    🧠 3. Profiling With cProfile — The Standard Tool

    Run a script with profiling from the command line:

    python -m cProfile myscript.py


    Or profile a specific function in your code:

    import cProfile
    
    def slow():
        # A loop that runs 5 million times
        for _ in range(5_000_000):
            pass
    
    # Profile this specific function call
    cProfile.run("slow()")
    # Output shows: number of calls, total time, time per call
    Column     What It Shows
    ncalls     Number of times the function was called
    tottime    Total time in this function (excluding subcalls)
    cumtime    Cumulative time (including subcalls)

    📊 4. Making Results Readable With pstats

    Use pstats to sort and filter the results:
    import cProfile, pstats
    
    profiler = cProfile.Profile()
    profiler.enable()
    
    # code…
    for _ in range(3_000_000):
        pass
    
    profiler.disable()
    stats = pstats.Stats(profiler)
    stats.sort_stats("tottime").print_stats(10)
    Sort By      What It Shows                  Best For Finding
    "tottime"    Time in the function itself    The actual slow functions
    "cumtime"    Time including all sub-calls   Functions that call slow things
    "ncalls"     Number of times called         Unexpectedly hot loops


    🧠 5. Line-by-Line Profiling With line_profiler

    Install:

    pip install line_profiler


    Mark the function you want profiled with the @profile decorator (kernprof injects it at runtime, so no import is needed):
    @profile
    def slow():
        total = 0
        for i in range(10_000_000):
            total += i

    Run:

    Run Line Profiler

    Run and view results

    Try it Yourself »
    Python
    kernprof -l myscript.py
    python -m line_profiler myscript.py.lprof


    Shows exactly which line is slow.

    This is invaluable for:

    • nested loops
    • ML preprocessing
    • tight functions
    • recursive code

    🧩 6. Memory Profiling

    Install:

    pip install memory_profiler


    Usage:

    from memory_profiler import profile
    
    @profile
    def load_items():
        items = [i for i in range(5_000_000)]
        return items
    Memory Issue    Symptom                        Common Cause
    Memory spike    Sudden +500 MB on one line     Loading a large dataset at once
    Memory leak     Memory grows over time         Data accumulating in loops
    High baseline   Program starts with 200 MB+    Heavy imports (pandas, tensorflow)

    Shows memory growth line-by-line.

    Useful for:

    • large lists
    • pandas
    • numpy allocations
    • memory leaks
    • comparing generators vs lists
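    The lesson overview also mentions tracemalloc, which ships with the standard library, so it needs no install. A minimal sketch of snapshot-based memory tracking:

```python
import tracemalloc

tracemalloc.start()

# Allocate something noticeable
items = [i for i in range(100_000)]

current, peak = tracemalloc.get_traced_memory()
print(f"Current: {current / 1024:.0f} KiB, Peak: {peak / 1024:.0f} KiB")

# The snapshot reports which source lines allocated the most
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)

tracemalloc.stop()
```

    Because tracemalloc hooks Python's allocator directly, it works anywhere, including environments where installing memory_profiler isn't an option.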

    ⚡ 7. Real Techniques for Faster Python

    1. Use built-ins over manual loops

    total = sum(nums)   # built-in: the loop runs in C

    # versus the manual loop, interpreted one bytecode at a time:
    total = 0
    for x in nums:
        total += x

    Built-ins use C-level optimisations.

    2. Prefer list comprehensions

    squares = [x*x for x in nums]   # comprehension: faster

    # versus the explicit loop:
    squares = []
    for x in nums:
        squares.append(x*x)

    3. Use generators for large data

    (x*x for x in nums)

    A generator produces values lazily, one at a time, saving huge amounts of memory.

    4. Use numpy for heavy math

    Pure Python loops are slow.
    NumPy performs operations in C — often 50–200× faster.
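    A minimal sketch of the gap, assuming NumPy is installed (pip install numpy):

```python
import numpy as np

# Sum of squares two ways
py_total = sum(x * x for x in range(1_000_000))   # interpreted, one object per step

arr = np.arange(1_000_000)
np_total = int((arr * arr).sum())                 # runs entirely in compiled C

print(py_total == np_total)  # -> True (same answer, very different speed)
```

    Wrap each version in timeit to see the actual speedup on your machine; the gap grows with array size.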

    5. Cache Results With functools.lru_cache

    from functools import lru_cache

    @lru_cache(maxsize=None)   # unbounded cache
    def fib(n):
        if n < 2:
            return n
        return fib(n-1) + fib(n-2)

    Transforms slow recursive functions → instant.
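    You can confirm the cache is doing the work with cache_info(); a minimal sketch (the function is re-declared here so the snippet runs on its own):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(100))          # 354224848179261915075, computed instantly
print(fib.cache_info())  # hits/misses show the cache short-circuiting recursion
```

    Without the cache, fib(100) would make an astronomical number of recursive calls; with it, each value 0..100 is computed exactly once.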

    6. Use Multiprocessing for CPU

    from multiprocessing import Pool

    if __name__ == "__main__":   # guard is required on Windows/macOS
        with Pool() as pool:
            results = pool.map(func, items)


    Runs tasks on multiple cores.

    🏎️ 8. Avoiding the Biggest Performance Mistakes

    ❌ Mistake                        Why It's Slow                      ✅ Better Approach
    Unnecessary list copies          Copies the entire list in memory   Use slices or itertools
    Python loops for math            Interpreted = slow                 NumPy vectorised operations
    String concatenation in a loop   Creates a new string each time     Use ''.join(list)
    Opening files repeatedly         Disk I/O is expensive              Open once, read/write many times
    Blocking I/O in async code       Blocks the entire event loop       Use run_in_executor()

    Summary of Common Traps:

    • ❌ Unnecessary list copies
    • ❌ Using Python loops for math
    • ❌ Excessive string concatenation
    • ❌ Opening files repeatedly
    • ❌ Overuse of classes when simple functions work
    • ❌ Blocking I/O in async code
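    The string-concatenation trap above is easy to demonstrate; a minimal sketch with two hypothetical helpers:

```python
import timeit

def concat(parts):
    # Builds the string with +=, copying the partial result repeatedly
    s = ""
    for p in parts:
        s += p
    return s

def joined(parts):
    # Allocates the final string once
    return "".join(parts)

parts = ["x"] * 10_000
assert concat(parts) == joined(parts)

print("concat:", timeit.timeit(lambda: concat(parts), number=100))
print("join:  ", timeit.timeit(lambda: joined(parts), number=100))
```

    CPython sometimes optimises += on strings in place, so the penalty varies; ''.join is the portable, reliably linear choice.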

    🧪 9. Real-World Example: Speeding Up JSON Parsing

    Slow version:

    import json
    
    data = [json.loads(x) for x in lines]

    Optimised:

    import orjson
    
    data = [orjson.loads(x) for x in lines]

    ⚠️ Requires: pip install orjson

    orjson is 5–20× faster than Python's JSON parser.

    🎉 Conclusion

    By mastering profiling and optimisation, you gain the ability to:

    ✔ Identify real bottlenecks

    ✔ Build faster APIs and scripts

    ✔ Optimise ML preprocessing

    ✔ Save CPU & memory in production

    ✔ Think like a performance engineer

    Performance comes from measure → diagnose → optimise, not guessing.

    📋 Quick Reference — Profiling & Performance

    Tool / Syntax                        What it does
    cProfile.run('fn()')                 Profile function call counts and time
    timeit.timeit('expr', number=1000)   Benchmark small code snippets
    line_profiler                        Profile line-by-line execution time
    memory_profiler                      Track memory usage per line
    __slots__                            Reduce class memory footprint
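    __slots__ from the table deserves a quick illustration: declaring fixed attribute slots removes the per-instance __dict__, shrinking every object. A minimal sketch:

```python
import sys

class Regular:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Slotted:
    __slots__ = ("x", "y")   # no per-instance __dict__ is created
    def __init__(self, x, y):
        self.x = x
        self.y = y

r, s = Regular(1, 2), Slotted(1, 2)
print(sys.getsizeof(r) + sys.getsizeof(r.__dict__))  # regular instance + its dict
print(sys.getsizeof(s))                              # slotted instance alone
```

    The saving is small per object but adds up fast when you create millions of instances.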

    🎉 Great work! You've completed this lesson.

    You now know how to measure, diagnose, and fix performance bottlenecks — the professional workflow every senior engineer uses.

    Up next: Memory Management — understand how Python's garbage collector works and prevent memory leaks.
