Lesson 25 • Advanced
Memory Management & Garbage Collection Internals
Python may look simple on the surface, but underneath it runs a powerful and complex memory management system. To write high-performance Python — whether you're building ML pipelines, backend servers, or tools that process millions of objects — you need to understand how Python allocates and frees memory: reference counting, garbage collection cycles, memory fragmentation, and how to track leaks and optimise memory usage.
What You'll Learn in This Lesson
- How Python allocates memory at the object level using PyMalloc
- How reference counting works and when it fails (circular refs)
- How the cyclic garbage collector detects and breaks reference cycles
- How to detect memory leaks using tracemalloc and objgraph
- How __slots__ reduces per-instance memory by 40–70%
- How weak references prevent memory leaks in caches and observers
🔥 1. How Python Allocates Memory
Python uses a private memory manager (PyMalloc) layered on top of the OS allocator.
There are three layers:
| Layer | What It Does | Speed |
|---|---|---|
| Object-specific allocators | Custom optimized allocators for ints, lists, dicts, strings | ⚡ Fastest |
| Python memory manager | Handles small object pools, caches freed memory | ⚡ Fast |
| OS-level allocator | malloc(), free() — used for large blocks | 🐢 Slow |
Python tries to avoid calling the OS too often, because OS allocations are slow.
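As a minimal sketch of where the layers split: in CPython, requests up to about 512 bytes go through PyMalloc's pooled allocator, while larger blocks fall through to the OS allocator (this threshold is an implementation detail, not part of the language spec). We can't call PyMalloc directly, but `sys.getsizeof` shows which side of the line an object lands on:

```python
import sys

# Small objects (<= ~512 bytes) are served from pymalloc's pools;
# larger ones go straight to the OS-level allocator.
small = (1, 2, 3)            # a small tuple: pooled by pymalloc
large = bytearray(100_000)   # far over the threshold: OS allocation

print(sys.getsizeof(small) <= 512)  # True
print(sys.getsizeof(large) > 512)   # True
```

This is why creating and destroying many small objects is cheap: the freed blocks stay in Python's pools and are handed back out without touching the OS.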
⚙️ 2. Reference Counting — The Core Mechanism
Every Python object has an internal counter: how many references point to it.
You can inspect it:
Reference Counting
Inspect object reference counts with sys.getrefcount():

```python
import sys

x = []
print(sys.getrefcount(x))  # at least 2: 'x' plus the temporary argument reference

# Whenever you do:
y = x
# ...the refcount increases.

# Whenever a reference goes away:
del y
# ...the refcount decreases.
```

When it reaches 0, Python immediately frees the memory.
| Action | Effect on Refcount | Example |
|---|---|---|
| Create object | +1 | x = [] |
| Assign to another variable | +1 | y = x |
| Delete reference | -1 | del y |
| Leave function scope | -1 | Local variables cleaned up |
Reference counting gives:
- ✔ deterministic cleanup
- ✔ predictable object lifetime
- ✔ fast destruction
But it has one big problem…
🧠 3. The Problem: Reference Cycles
Example:
Reference Cycles
Understand how circular references prevent garbage collection:

```python
a = []
b = []
a.append(b)
b.append(a)

# Both objects reference each other.
# Even after `del a` and `del b`, their refcounts never reach 0.
```

🌀 4. Garbage Collection for Cycles
Python's cyclic garbage collector scans container objects (lists, dicts, sets, classes) to find reference cycles.
| Generation | Contains | Checked |
|---|---|---|
| Gen 0 | Newest objects | Most frequently |
| Gen 1 | Survived 1+ collections | Less often |
| Gen 2 | Long-lived objects | Rarely |
The cycle detector periodically:
- Scans a generation
- Finds unreachable cycles
- Frees them
This prevents memory leaks caused by circular references.
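The cycle collection described above can be observed directly. This sketch builds the cycle from section 3 inside a small helper class and uses a weak reference as a probe to watch the collector free it (the `Node` class is just for illustration):

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

# Build a reference cycle between two instances:
a, b = Node(), Node()
a.other, b.other = b, a

probe = weakref.ref(a)  # lets us observe when 'a' is actually freed

del a, b        # refcounts never hit 0 -- the cycle keeps both alive
gc.collect()    # the cycle detector finds and frees the unreachable pair

print(probe() is None)  # True: the object is gone after collection
```

Without the `gc.collect()` call (or an automatic collection), the pair would linger: reference counting alone cannot free them.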
⚡ 5. Viewing & Controlling the GC
You can inspect thresholds:
Controlling Garbage Collection
Inspect and control the garbage collector:

```python
import gc

print(gc.get_threshold())
# Typical default: (700, 10, 10)
# Meaning:
# - collect gen0 once container allocations minus deallocations exceed 700
# - collect gen1 every 10 gen0 collections
# - collect gen2 every 10 gen1 collections

# Force a collection:
gc.collect()

# Disable automatic GC (not recommended unless profiling):
gc.disable()
```

📦 6. Memory Fragmentation
Python memory isn't always "returned" to the OS immediately.
| Reason | What Happens |
|---|---|
| Freed blocks stay in pools | Python keeps them for reuse |
| Partially used arenas | OS can't reclaim until completely empty |
| Long-lived objects | Create "holes" in memory |
| Extension modules | Allocate outside Python's control |
This is why a Python process may appear large even after freeing objects.
Tools like Heapy, tracemalloc, and Pympler help inspect fragmentation.
🧪 7. Detecting Memory Leaks
Python leaks often happen because:
- ❌ lingering references
- ❌ global caches
- ❌ closures holding variables
- ❌ large lists still in scope
- ❌ unclosed file handles
- ❌ cycles involving custom classes
Using tracemalloc:

Using tracemalloc
Track memory allocations to find leaks:

```python
import tracemalloc

tracemalloc.start()
# ... run the code you want to measure ...
snapshot = tracemalloc.take_snapshot()
top = snapshot.statistics('lineno')
for item in top[:5]:
    print(item)
# Shows exactly where memory is being allocated.
```

Using objgraph:

Using objgraph
Visualize object growth to detect leaks:

```python
# pip install objgraph
import objgraph

objgraph.show_growth()
# Prints object types whose counts grew since the last call --
# helpful for finding leaking objects over time.
```

🧩 8. Efficient Memory Techniques
✅ Prefer generators over lists

Generators vs Lists
Use generators to reduce memory usage:

```python
# A generator expression computes values lazily:
nums = range(1_000_000)
gen = (x * x for x in nums)

# Instead of materialising everything at once:
# lst = [x*x for x in nums]

# The generator holds one item at a time, so it uses far less memory.
print(next(gen))  # 0
print(next(gen))  # 1
```

✅ Use __slots__ for large class collections
Using __slots__
Reduce memory footprint of class instances:

```python
class Point:
    __slots__ = ("x", "y")  # fixed attribute set, no per-instance __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y

# Saves memory by removing the per-object dict.
p = Point(1, 2)
print(p.x, p.y)  # 1 2
```

✅ Avoid unnecessary large structures
Replace:
- huge dictionaries → array, struct, or NumPy
- nested lists → NumPy arrays
- long string concatenation → io.StringIO
✅ Reuse objects when possible
Allocate once and reuse, instead of allocating repeatedly inside loops.
✅ Clean up large variables manually
Manual Cleanup
Explicitly free large objects and run garbage collection:

```python
import gc

big_object = [i for i in range(1_000_000)]
# ... use big_object ...

del big_object
gc.collect()
# Useful between stages in ML/data pipelines.
print("Memory cleaned")
```

🔥 9. Memory & Speed Tradeoffs
Optimising memory may reduce speed, and vice versa.
Examples:
- Lists: faster, more memory
- Generators: less memory, slower iteration
- C extensions: ultra fast, but limited flexibility
Your optimisation depends on whether the bottleneck is:
- RAM usage
- CPU speed
- I/O wait
- GC pauses
Profiling tells you which.
🔥 10. How Python Stores Objects in Memory (Deep Internal View)
Every Python object is stored in a PyObject structure that contains:
- Reference count
- Type pointer
- Object data
But different types store additional metadata:
✔ Integers
Small integers (from –5 to 256) are pre-allocated and reused → "integer cache".
This means:
Integer Caching
Understand Python's small integer cache:

```python
a = 5
b = 5
# Both names point to the same cached object:
print(a is b)  # True

# Larger integers are usually separate objects:
c = 500
d = 500
print(c is d)  # May be False (implementation detail; never use `is` for numbers)
```

✔ Strings
Short strings and identifiers are interned (cached forever) to speed up comparisons.
Used heavily in:
- variable names
- dictionary keys
- tokens in parsers
✔ Lists
Python lists are dynamic arrays with over-allocation (extra capacity) to avoid constant resizing.
Illustrative example:
- Before append: capacity 10
- After the append that overflows: capacity ~14 (extra slots reserved)
This saves CPU time but increases memory use.
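You can watch the over-allocation happen with `sys.getsizeof`: the reported size stays flat while spare capacity is used up, then jumps at each resize (exact numbers vary by CPython version):

```python
import sys

lst = []
sizes = []
for i in range(20):
    lst.append(i)
    sizes.append(sys.getsizeof(lst))

# Flat runs between jumps show appends that reused spare capacity:
print(sizes)
print(len(set(sizes)) < len(sizes))  # True: sizes repeat between resizes
```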
✔ Dictionaries
Use open addressing with "sparse tables".
They resize when the load factor gets too high (~66%).
🧬 11. Arena Allocation (The Deepest Python Memory Detail)
CPython allocates memory in units:
Arenas → Pools → Blocks
| Layer | Size | Purpose |
|---|---|---|
| Arena | ~256 KB | Large chunk from OS |
| Pool | 4 KB | For objects of same size |
| Block | variable | Individual object |
This explains two things:
- Python rarely returns memory to the OS: even if an object is deleted, its arena stays allocated.
- Long-running servers keep growing in memory: an arena is not released until every block in it is free.
🧠 12. Why Lists & Dicts "Grow" in Memory
Python over-allocates:
List:
When you append items, the list grows faster than needed.
Example internal growth pattern:
0 → 4 → 8 → 16 → 25 → 35 → 46 → ...
Dict:
When keys increase, it resizes to maintain fast O(1) access.
This resizing:
- ✔ improves speed
- ❌ increases memory footprint
Understanding this helps you design efficient data structures.
🧨 13. Object Lifetimes — From Creation to Deallocation
1. Object created → refcount = 1
2. More references → refcount increases
3. All references dropped → refcount becomes 0
4. Python immediately frees the object's memory
5. The freed memory may stay inside the arena
6. The cyclic GC occasionally clears unreachable cycles
This means:
Python is deterministic for most objects…
…but not for cycles.
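The deterministic case can be demonstrated with a weak reference as a probe (the `Resource` class is just a placeholder for illustration). Note that the "freed immediately" behaviour is specific to CPython's reference counting; other implementations may defer it:

```python
import weakref

class Resource:
    pass

r = Resource()
probe = weakref.ref(r)  # observes the object without keeping it alive

print(probe() is r)   # True: the object is alive

del r                 # the last strong reference is gone: refcount hits 0
print(probe() is None)  # True on CPython: freed immediately, no GC pass needed
```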
🔍 14. Memory Leak Patterns in Real Python Code
Here are the 7 most common memory leak patterns seen in production:
1. Growing global lists

Growing Global Lists
A common memory leak pattern:

```python
cache = []

def add(x):
    cache.append(x)  # this module-level cache grows forever!

for i in range(100):
    add(i)
print(len(cache))  # 100
```

2. Caches that never expire (Flask, Django, ML models)
3. Closures capturing large objects

Closure Memory Trap
Closures can keep large objects alive:

```python
def make():
    big = [1] * 1_000_000

    def inner():
        return len(big)  # 'big' is kept alive by the closure!

    return inner

fn = make()
print(fn())  # 1000000
```

4. Referencing objects inside loops unintentionally
5. Pandas DataFrames not deleted
6. Open file handles never closed
7. Cycles between class instances
🧪 15. Real-World Debugging — Finding a Leak in a Web Server
Imagine you run a FastAPI app, and memory keeps rising.
Step 1: Enable tracemalloc
Start tracking memory allocations:

```python
import tracemalloc

tracemalloc.start()
print("tracemalloc started")
```

Step 2: Snapshot
Step 2: Take Snapshot
Capture memory state at a point in time:

```python
import tracemalloc

tracemalloc.start()
# ... run some code ...
s1 = tracemalloc.take_snapshot()
print("Snapshot taken")
```

Step 3: Compare after operations
Step 3: Compare Snapshots
Find what's consuming memory:

```python
import tracemalloc

tracemalloc.start()

# Initial state
s1 = tracemalloc.take_snapshot()

# Simulate some allocations
data = [i for i in range(100_000)]

# New state
s2 = tracemalloc.take_snapshot()
stats = s2.compare_to(s1, "lineno")
print(stats[:3])
```

You instantly see:
- ✔ which file
- ✔ which line
- ✔ how much leaked
Step 4: Identify dangling references
Using objgraph:

Step 4: Find Dangling References
Use objgraph to visualize reference chains:

```python
# pip install objgraph
import objgraph

objgraph.show_most_common_types()

# To see what's holding a reference to a specific object:
# objgraph.show_backrefs(target_object)
```

This reveals why something never got garbage-collected.
⚡ 16. Avoiding Fragmentation in Large Applications
Memory fragmentation is a silent killer for long-running apps.
Techniques used by big companies:
- ✔ Restart worker processes periodically (Gunicorn / Celery)
- ✔ Keep objects small
- ✔ Reuse buffers
- ✔ Pre-allocate large structures
- ✔ Move heavy data to NumPy arrays
- ✔ Use memory pools (custom allocators)
- ✔ Offload work to Rust/C for stable memory control
Major apps like Instagram and Dropbox use multi-process setups for this exact reason.
📦 17. Working With Huge Data Without Crashing RAM
If you process:
- a 10 GB CSV
- 50 million database rows
- a million images
- huge logs

...you must avoid loading everything at once.
Use chunking

File Chunking
Read large files in chunks to save memory:

```python
def read_chunks(file, chunk=1024):
    while True:
        data = file.read(chunk)
        if not data:
            break
        yield data

# Usage:
# with open("bigfile.txt") as f:
#     for chunk in read_chunks(f):
#         process(chunk)
```

Use generators
Generator Streaming
Process data one item at a time:

```python
def stream_data():
    for i in range(1_000_000):
        yield {"id": i, "value": i * 2}

# Process one row at a time; nothing is materialised up front:
for row in stream_data():
    if row["id"] >= 5:
        break  # stop early: the generator never builds the full list
    print(row)
```

Use mmap
Memory-map files to avoid RAM explosion.
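A minimal sketch with the standard library's mmap module. The file name and contents are invented for the example; the point is that the OS pages bytes in on demand, so slicing reads only what you touch instead of copying the whole file into a Python object:

```python
import mmap
import os
import tempfile

# Create a small sample file to map (stand-in for a huge real file):
path = os.path.join(tempfile.mkdtemp(), "big.bin")
with open(path, "wb") as f:
    f.write(b"hello mmap world")

with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        print(mm[:5])            # b'hello': only these bytes are read
        print(mm.find(b"mmap"))  # 6: search without loading the whole file
```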
🔧 18. Advanced Optimisation Tools
🟧 1. Cython
Compiles Python code to C → 10×–200× speed + fixed memory layout.
🟩 2. Numba
JIT compiler for numeric loops.
🟦 3. PyPy
Alternative Python interpreter with a fast JIT.
Great for long-running loops.
🟪 4. Mypyc
Compiles typed Python into C extensions.
📊 19. Memory & Performance Profiling Workflow (Professional Method)
Here's the exact workflow used in production:
1. Identify whether the bottleneck is CPU or RAM: use psutil, htop, profiling.
2. Profile CPU: cProfile, line_profiler.
3. Profile memory: tracemalloc, Pympler, Heapy.
4. Check GC behaviour: too many collections? Too few?
5. Find reference cycles: gc.get_objects(), objgraph.
6. Fix or rewrite the hotspot: use NumPy/Numba/Rust if needed.
7. Benchmark again: verify the improvement.
This is the method used by performance engineers at scale.
🧠 20. Python Memory Myths (Corrected)
❌ Myth: Python returns memory to OS when freed
✔ Truth: Almost never — only FULL arenas are returned.
❌ Myth: Garbage collection slows Python
✔ Truth: GC rarely runs unless many container objects are created.
❌ Myth: Variables disappear after function exit
✔ Truth: Closures, globals, caches can keep them alive forever.
❌ Myth: Generators are slower than lists
✔ Truth: They are massively more memory-efficient and usually faster for pipelines.
🎓 21. Final Summary of Python Memory Mastery
You now understand:
- ✔ PyMalloc
- ✔ Object reference counting
- ✔ Garbage collector generations
- ✔ Memory fragmentation
- ✔ Cycles + leak detection
- ✔ Efficient memory coding
- ✔ NumPy vs lists
- ✔ Slots & reusable objects
- ✔ Profiling tools
- ✔ Large-scale memory engineering
This knowledge puts you way above normal Python developers — this is senior-level backend engineer understanding.
🔥 Practical Engineering Summary
How Python Actually Manages Memory (High-Level Recap)
Python uses a multi-layered system:
1. Reference Counting: every object tracks how many references point to it. When the counter hits zero, memory is freed instantly.
2. Garbage Collector (GC): handles cycles (objects referencing each other). Works in three generations, promoting "older" objects that survive.
3. PyMalloc: an internal allocator designed to reduce fragmentation, reuse freed memory, and avoid expensive OS calls.
4. OS Allocator: used only for large blocks. Python returns memory to the OS only when a full arena is unused.
🧠 The Biggest Causes of Memory Problems in Real Systems
1. Reference cycles: especially between custom class instances.
2. Containers that never shrink: lists, dicts, and sets can grow endlessly if not managed.
3. Hidden references: closures, globals, long-lived objects.
4. Fragmentation: Python pools memory and often cannot release it back to the OS.
5. Large objects kept alive accidentally: Pandas DataFrames, NumPy arrays, big lists.
6. Not streaming data: loading 5 GB into RAM instead of processing in chunks.
7. Unclosed resources: sockets, file handles, DB connections.
These are the real culprits when you see "Python memory leak".
⚙️ Practical Checklist for Writing Memory-Safe Python
Here is what senior engineers follow:
- ✔ Use generators for large data: avoid loading huge datasets at once
- ✔ Avoid unnecessary copies: slice carefully, avoid converting between structures
- ✔ Prefer NumPy for math-heavy work: lists of lists are slow and heavy
- ✔ Clear large structures manually: del big_list; gc.collect()
- ✔ Use context managers for resources: files, locks, DB sessions
- ✔ Avoid unbounded in-memory caches: use TTL-based caching (Redis, an LRU cache)
- ✔ Beware of closures keeping unneeded variables: a common memory trap
- ✔ Monitor memory over time: especially in long-running backend services
- ✔ Restart worker processes in production: Gunicorn, Celery, and Uvicorn workers are often auto-restarted to reclaim memory
This checklist alone prevents 90% of real-world problems.
🔥 Ultimate Takeaways (The "If You Remember Only 10 Things…" List)
Memorise this list — it's the essence of professional Python memory engineering:
1. Reference counting frees most objects instantly.
2. GC only handles cycles — not everything.
3. Python rarely returns memory to the OS.
4. Lists/dicts grow but do not shrink automatically.
5. Fragmentation is normal — not a bug.
6. Profiling > guessing. Always measure first.
7. Generators prevent RAM explosion.
8. NumPy is mandatory for large numeric workloads.
9. Unclosed resources cause real leaks.
10. Long-running apps must recycle workers.
If you follow these principles, you'll never struggle with memory issues again.
🎉 Final Conclusion — You Now Understand Memory Like a Senior Engineer
By completing all parts, you now understand:
- ✔ Python's internal allocator
- ✔ Reference counting
- ✔ Garbage collection
- ✔ Fragmentation
- ✔ Cycles
- ✔ Object lifetime
- ✔ Memory profiling
- ✔ Leak detection
- ✔ Optimisation techniques
- ✔ High-scale memory architecture
This is deep Python internals knowledge that most developers never learn.
With this mastery, you're ready to:
- 🔥 write high-performance code
- 🔥 build scalable backends
- 🔥 optimise ML pipelines
- 🔥 debug memory like a professional
- 🔥 build fast, efficient apps and systems
📋 Quick Reference — Memory Management
| Concept / Tool | What it does |
|---|---|
| sys.getrefcount(obj) | Check reference count of an object |
| gc.collect() | Manually trigger garbage collection |
| weakref.ref(obj) | Hold reference without preventing GC |
| __slots__ | Reduce per-instance memory overhead |
| tracemalloc | Trace memory allocations |
🎉 Great work! You've completed this lesson.
You now understand reference counting, the garbage collector, and how to avoid memory leaks in long-running Python programs.
Up next: Type Hints — add static typing to Python code for better tooling and fewer bugs.