Lesson 19 • Advanced
Generators & Iterators Mastery
Generators and iterators are two of Python's most powerful features — allowing you to process massive datasets, stream data, write memory-efficient code, build pipelines, and design systems that behave like professional-grade libraries.
What You'll Learn in This Lesson
- What iterators are and how the iterator protocol works
- How to create generators with yield
- The difference between return and yield
- Generator expressions and memory-efficient data pipelines
- Real-world use cases: streaming large files, infinite sequences, lazy evaluation
This lesson will take you from advanced fundamentals → deep internal mechanics → real-world patterns used in production.
🔥 1. What Exactly Are Iterators?
An iterator is any object that can give you items one at a time. It follows a simple contract:
| Method | What It Does | When It's Called |
|---|---|---|
| __iter__() | Returns the iterator itself | When you start a for-loop |
| __next__() | Returns the next item in sequence | Each iteration of the loop |
| StopIteration | Signals "no more items" | When sequence is exhausted |
Every time you use a for loop in Python, you're using iterators behind the scenes: Python automatically calls these methods for you.
Manual Iterator
Create your own iterator class
```python
# Manual Iterator Example
class CountUpTo:
    def __init__(self, max_value):
        self.max = max_value
        self.current = 1

    def __iter__(self):
        return self

    def __next__(self):
        if self.current > self.max:
            raise StopIteration
        value = self.current
        self.current += 1
        return value

# Usage
for x in CountUpTo(5):
    print(x)
```
Try It Yourself: Generators
Practice creating generators and iterators in Python
```python
# Generators Practice

# Simple generator function
def count_up_to(n):
    count = 1
    while count <= n:
        yield count
        count += 1

print("Counting to 5:")
for num in count_up_to(5):
    print(num)

# Generator expression
squares = (x**2 for x in range(1, 6))
print("\nSquares:")
print(list(squares))

# Fibonacci generator
def fibonacci(n):
    a, b = 0, 1
    for _ in range(n):
        yield a
        a, b = b, a + b

print("\nFirst 10 Fibonacci numbers:")
print(list(fibonacci(10)))
```
⚙️ 2. Why Iterators Matter
Iterators solve the "Big Data" problem:
| Without Iterators | With Iterators |
|---|---|
| Load entire 5GB file into RAM | Process one line at a time |
| Computer crashes with "Out of Memory" | Works smoothly with constant memory |
| Must wait for all data before starting | Start processing immediately |
What iterators enable:
- Streaming: Process data as it arrives (like Netflix streaming)
- Pipelines: Chain operations together (like assembly lines)
- Lazy evaluation: Only compute when needed (saves CPU & RAM)
- Infinite sequences: Generate data forever without memory limits
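These benefits are easy to demonstrate with the standard library. As a quick sketch, itertools.islice takes a bounded slice from a lazy (even infinite) stream:

```python
import itertools

# An infinite lazy stream of even squares; nothing is computed yet
evens = (n * n for n in itertools.count() if n % 2 == 0)

# Lazily take just the first five values; the rest are never computed
first_five = list(itertools.islice(evens, 5))
print(first_five)  # [0, 4, 16, 36, 64]
```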
⚡ 3. Enter Generators — The Shortcut to Iterators
A generator is Python's elegant shortcut for creating iterators using the yield keyword.
| Iterator Class | Generator Function |
|---|---|
| ~15 lines of code | ~5 lines of code |
| Define __init__, __iter__, __next__ | Just use yield |
| Manual state management | Automatic state saving |
Instead of writing an entire class, just use yield:
Generator Shortcut
Create iterators easily with yield
```python
def count_up_to(max_value):
    current = 1
    while current <= max_value:
        yield current
        current += 1

# Usage
for num in count_up_to(5):
    print(num)
```
This produces the same behavior as the iterator class, but with a fraction of the code.
🧠 4. How yield Works Internally
Think of yield like a bookmark in a book: you read up to a point, put in the bookmark (yield), and close the book. Later, you open it and continue exactly where you left off. The generator remembers its "page number"!

When Python sees yield, something magical happens:
Step 1: Function becomes a generator object (not executed yet!)
Step 2: First next() runs until first yield
Step 3: Execution pauses, value is returned
Step 4: Next next() resumes from exact pause point
Step 5: Repeat until function ends → StopIteration
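The five steps above can be traced directly in a short sketch (the greeter function here is just an illustration):

```python
def greeter():
    yield "hello"
    yield "world"

g = greeter()             # Step 1: calling the function builds a generator object
print(type(g).__name__)   # 'generator' -- no body code has run yet

print(next(g))            # Steps 2-3: runs to the first yield, pauses -> 'hello'
print(next(g))            # Step 4: resumes from the pause point -> 'world'

try:
    next(g)               # Step 5: the body ends, so StopIteration is raised
except StopIteration:
    print("exhausted")
```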
return exits a function forever; yield pauses it and lets you resume. Big difference!

| return | yield |
|---|---|
| Function ends permanently | Function pauses, can resume |
| Returns single value | Can yield many values over time |
| All local variables lost | All local variables preserved |
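A minimal side-by-side sketch of the table above (both functions are illustrative examples):

```python
def with_return():
    return 1
    return 2   # unreachable: the function ended permanently at the first return

def with_yield():
    yield 1
    yield 2    # reachable: the function merely paused at the first yield

print(with_return())       # 1
print(list(with_yield()))  # [1, 2]
```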
🧵 5. Real-World Example — Log File Streaming
Imagine parsing a 5GB log file. Without generators, you'd crash. With generators:
Log File Streaming
Stream large files line by line
```python
def read_logs(path):
    with open(path) as f:
        for line in f:
            yield line.strip()

# Simulated usage (without an actual file)
def simulate_logs():
    logs = ["INFO: Started", "ERROR: Failed", "INFO: Done"]
    for log in logs:
        yield log

for log in simulate_logs():
    print(log)
```
Key benefits:
- ✔ Doesn't load the entire file
- ✔ Works line by line
- ✔ Streams infinitely large files
Python itself uses this pattern in its own IO libraries.
💧 6. Infinite Generators (Perfect for Simulations & AI)
Infinite Generators
Create endless sequences for simulations
```python
def infinite_counter(start=0):
    while True:
        yield start
        start += 1

# Take the first 10 values
counter = infinite_counter()
for _ in range(10):
    print(next(counter))
```
Use cases:
- Reinforcement learning training loops
- Game worlds
- Randomized datasets
- Unique ID generation
📦 7. Generator Pipelines (Functional Programming Style)
Chain generators together like Unix shell pipes (command1 | command2 | command3):
Generator Pipelines
Chain generators like Unix pipes
```python
def read_lines(text):
    for line in text.split("\n"):
        yield line.strip()

def filter_errors(lines):
    for line in lines:
        if "ERROR" in line:
            yield line

def extract_messages(lines):
    for line in lines:
        yield line.split(":")[-1].strip()

# Sample log data
logs = """INFO: System started
ERROR: Connection timeout
INFO: User logged in
ERROR: File not found"""

pipeline = extract_messages(filter_errors(read_lines(logs)))
for msg in pipeline:
    print(msg)
```
Each step processes data lazily.
No stage ever holds the full dataset in memory.
🧩 8. yield from — Delegating to Sub-Generators
yield from is like saying "pass all calls to this other generator until it's done." It lets you delegate iteration to another generator:
yield from
Delegate to sub-generators cleanly
```python
def flatten(list_of_lists):
    for sub in list_of_lists:
        yield from sub

nested = [[1, 2], [3, 4], [5, 6]]
print(list(flatten(nested)))
```
This is cleaner and faster than nested loops.
🌀 9. Two-Way Generators (send() Method)
You can send values INTO generators with .send(), making them interactive. Think of it like a walkie-talkie instead of a one-way radio.
Two-Way Generators
Send values into generators with send()
```python
def accumulator():
    total = 0
    while True:
        value = yield total
        if value is not None:
            total += value

g = accumulator()
next(g)            # start the generator
print(g.send(10))  # -> 10
print(g.send(5))   # -> 15
print(g.send(3))   # -> 18
```
This technique is used in:
- Async frameworks
- Stream processors
- Complex event simulation
⚙️ 10. Generator-Based Coroutines (Pre-asyncio Style)
Before async/await, generators were THE way to do async programming. Understanding this helps you read older codebases and see how async works under the hood. Generators can act as coroutines, meaning functions that can pause and receive data:
Generator Coroutines
Cooperative concurrency with generators
```python
def coroutine():
    while True:
        message = yield
        print("Received:", message)

c = coroutine()
next(c)  # Start the coroutine
c.send("Hello")
c.send("World")
```
Now superseded by async def, but still heavily used in internal libraries and advanced scheduling systems.
📚 11. Generators as Context Managers
Remember context managers (with statements) from earlier? You can create them using generators! The code before yield runs on entry, and the code after yield runs on exit. Super elegant!

Using the contextlib.contextmanager decorator:
Generators as Context Managers
Combine generators with cleanup logic
```python
from contextlib import contextmanager

@contextmanager
def managed_resource(name):
    print(f"Opening {name}")
    try:
        yield name
    finally:
        print(f"Closing {name}")

with managed_resource("database") as db:
    print(f"Using {db}")
```
This pattern merges generators with cleanup logic.
🚀 12. Generator Expressions (Faster, Cleaner, Memory Efficient)
Generator Expressions
Memory-efficient inline generators
```python
# Generator expression - computes values on demand
squares = (x*x for x in range(10))
print("Generator:", squares)
print("Values:", list(squares))

# Compare with a list comprehension
squares_list = [x*x for x in range(10)]
print("\nList:", squares_list)
```
The generator itself uses a small, constant amount of memory regardless of the range size.
Compare:
| Structure | Memory Use |
|---|---|
| List comprehension | Loads entire list |
| Generator expression | Streams items one by one |
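You can see the difference with sys.getsizeof: the generator object stays tiny no matter how large the range, while the list grows with its contents.

```python
import sys

big_list = [x * x for x in range(1_000_000)]  # stores all million values
gen = (x * x for x in range(1_000_000))       # stores only its paused state

print(sys.getsizeof(big_list))  # several megabytes
print(sys.getsizeof(gen))       # a couple hundred bytes, independent of the range
```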
🔬 13. Building Your Own Iterable Class (Advanced)
Here's a complete example with detailed comments:
Custom Iterable Class
Build a full iterator from scratch
```python
class Fibonacci:
    def __init__(self, limit):
        self.limit = limit
        self.a, self.b = 0, 1

    def __iter__(self):
        return self

    def __next__(self):
        if self.a > self.limit:
            raise StopIteration
        value = self.a
        self.a, self.b = self.b, self.a + self.b
        return value

for num in Fibonacci(100):
    print(num)
```
🕹 14. How Python's For-Loops Use Iterators Internally
Every for loop you've ever written secretly uses iterators! Understanding this reveals how Python really works. This simple for loop:
For-Loop Internals
How Python implements for-loops
```python
data = [1, 2, 3]

# This for loop:
for x in data:
    print(x)

# Actually does this internally:
print("\nManual iteration:")
it = data.__iter__()
while True:
    try:
        x = it.__next__()
        print(x)
    except StopIteration:
        break
```
Understanding this gives you total control over how objects behave.
📊 15. Real-World Use Cases (Professional Level)
✔ Pandas Chunk Loading
Streams large CSVs with the chunksize parameter.
✔ PyTorch DataLoader
Uses iterators to stream mini-batches.
✔ Web Scrapers (Scrapy)
Iterators control crawling pipelines.
✔ SQLAlchemy Query Results
Rows are lazily streamed.
✔ Async Systems
Generators still power internal scheduling logic.
🧪 16. Mini Project — Build a Streaming Data Pipeline
You will create:
- A data reader
- A validator
- A filter stage
- A transformer
- An exporter
Each one is a generator.
This simulates how Airflow, Spark, and Pandas internally process data.
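As a sketch of what you'll build (the stage names and sample records here are illustrative, not taken from any framework):

```python
def read_records(rows):
    """Reader: yield raw records one at a time."""
    for row in rows:
        yield row

def validate(records):
    """Validator: drop records missing required fields."""
    for rec in records:
        if "name" in rec and "score" in rec:
            yield rec

def keep_passing(records):
    """Filter: keep only scores of 50 or higher."""
    for rec in records:
        if rec["score"] >= 50:
            yield rec

def to_csv_lines(records):
    """Transformer: turn each record into a CSV line."""
    for rec in records:
        yield f"{rec['name']},{rec['score']}"

def export(lines):
    """Exporter: consume the stream (here, just print)."""
    for line in lines:
        print(line)

data = [
    {"name": "Ada", "score": 91},
    {"name": "Bob"},               # invalid: missing score
    {"name": "Cal", "score": 42},  # filtered out: below 50
]
export(to_csv_lines(keep_passing(validate(read_records(data)))))  # prints "Ada,91"
```

Because every stage is a generator, each record flows through the whole pipeline one at a time; no stage ever materializes the full dataset.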
🎉 Conclusion
You now fully understand:
✔ How iterators work internally
✔ How to build custom iterable objects
✔ How to use generators for memory-efficient processing
✔ How Python implements lazy evaluation
✔ How to build generator pipelines
✔ How advanced frameworks use iterators under the hood
📋 Quick Reference — Generators
| Syntax | What it does |
|---|---|
| yield value | Produce a value and pause execution |
| next(gen) | Advance generator to next yield |
| (x for x in lst) | Generator expression (lazy list) |
| yield from iterable | Delegate to a sub-generator |
| list(gen) | Materialise all generator values at once |
🏆 Lesson Complete!
You understand lazy evaluation and how to build memory-efficient pipelines with generators — the same technique used inside Pandas, Airflow, and Spark.
Up next: Advanced Async & Await — write non-blocking concurrent code with asyncio.