
    Lesson 19 • Advanced

    Generators & Iterators Mastery

    Generators and iterators are two of Python's most powerful features — allowing you to process massive datasets, stream data, write memory-efficient code, build pipelines, and design systems that behave like professional-grade libraries.

    What You'll Learn in This Lesson

    • What iterators are and how the iterator protocol works
    • How to create generators with yield
    • The difference between return and yield
    • Generator expressions and memory-efficient data pipelines
    • Real-world use cases: streaming large files, infinite sequences, lazy evaluation

    This lesson will take you from advanced fundamentals → deep internal mechanics → real-world patterns used in production.

    🔥 1. What Exactly Are Iterators?

    An iterator is any object that can give you items one at a time. It follows a simple contract:

    Part | What It Does | When It's Called
    __iter__() | Returns the iterator itself | When you start a for-loop
    __next__() | Returns the next item in sequence | On each iteration of the loop
    StopIteration (an exception, not a method) | Signals "no more items" | When the sequence is exhausted

    Manual Iterator

    Create your own iterator class

    # Manual Iterator Example
    class CountUpTo:
        def __init__(self, max_value):
            self.max = max_value
            self.current = 1
    
        def __iter__(self):
            return self
    
        def __next__(self):
            if self.current > self.max:
                raise StopIteration
            value = self.current
            self.current += 1
            return value
    
    # Usage
    for x in CountUpTo(5):
        print(x)

    Try It Yourself: Generators

    Practice creating generators and iterators in Python

    # Generators Practice
    
    # Simple generator function
    def count_up_to(n):
        count = 1
        while count <= n:
            yield count
            count += 1
    
    print("Counting to 5:")
    for num in count_up_to(5):
        print(num)
    
    # Generator expression
    squares = (x**2 for x in range(1, 6))
    print("\nSquares:")
    print(list(squares))
    
    # Fibonacci generator
    def fibonacci(n):
        a, b = 0, 1
        for _ in range(n):
            yield a
            a, b = b, a + b
    
    print("\nFirst 10 Fibonacci numbers:")
    print(list(fibonacci(10)))

    ⚙️ 2. Why Iterators Matter

    Iterators solve the "Big Data" problem:

    Without Iterators | With Iterators
    Load the entire 5 GB file into RAM | Process one line at a time
    Computer crashes with "Out of Memory" | Works smoothly in constant memory
    Must wait for all data before starting | Start processing immediately

    What iterators enable:

    • Streaming: Process data as it arrives (like Netflix streaming)
    • Pipelines: Chain operations together (like assembly lines)
    • Lazy evaluation: Only compute when needed (saves CPU & RAM)
    • Infinite sequences: Generate data forever without memory limits
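
    Lazy evaluation is easy to see in action; nothing runs until a value is actually requested. A small demo (expensive is an illustrative stand-in for any costly computation):

    ```python
    def expensive(n):
        print(f"computing {n}")
        return n * n

    # Creating the generator runs none of the expensive() calls
    lazy = (expensive(n) for n in range(3))
    print("generator created, nothing computed yet")

    # Each next() triggers exactly one computation
    print(next(lazy))  # prints "computing 0", then 0
    ```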

    ⚡ 3. Enter Generators — The Shortcut to Iterators

    A generator is Python's elegant shortcut for creating iterators using the yield keyword.

    Iterator Class | Generator Function
    ~15 lines of code | ~5 lines of code
    Define __init__, __iter__, __next__ | Just use yield
    Manual state management | Automatic state saving

    Instead of writing an entire class, just use yield:

    Generator Shortcut

    Create iterators easily with yield

    def count_up_to(max_value):
        current = 1
        while current <= max_value:
            yield current
            current += 1
    
    # Usage
    for num in count_up_to(5):
        print(num)

    This produces the same behavior as the iterator class, with a fraction of the code.

    🧠 4. How yield Works Internally

    When Python sees yield, something magical happens:

    Step 1: Function becomes a generator object (not executed yet!)

    Step 2: First next() runs until first yield

    Step 3: Execution pauses, value is returned

    Step 4: Next next() resumes from exact pause point

    Step 5: Repeat until function ends → StopIteration

    return | yield
    Function ends permanently | Function pauses and can resume
    Returns a single value | Can yield many values over time
    All local variables lost | All local variables preserved
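
    The five steps above can be traced directly with next():

    ```python
    def demo():
        print("started")        # runs on the first next()
        yield 1                 # pauses here
        print("resumed")        # runs on the second next()
        yield 2

    g = demo()                  # Step 1: nothing has executed yet
    print(next(g))              # Steps 2-3: prints "started", then 1
    print(next(g))              # Step 4: prints "resumed", then 2
    try:
        next(g)                 # Step 5: the body ends
    except StopIteration:
        print("exhausted")
    ```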

    🧵 5. Real-World Example — Log File Streaming

    Imagine parsing a 5 GB log file. Loading it into memory at once could exhaust RAM. With generators, you stream it line by line:

    Log File Streaming

    Stream large files line by line

    def read_logs(path):
        with open(path) as f:
            for line in f:
                yield line.strip()
    
    # Simulated usage (without actual file)
    def simulate_logs():
        logs = ["INFO: Started", "ERROR: Failed", "INFO: Done"]
        for log in logs:
            yield log
    
    for log in simulate_logs():
        print(log)

    Key benefit:

    • ✔ Doesn't load the entire file
    • ✔ Works line by line
    • ✔ Streams arbitrarily large files

    Python itself uses this pattern in its own IO libraries.
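
    To exercise read_logs against a real file, you can write a small temporary log first (read_logs is repeated here so the block is self-contained):

    ```python
    import os
    import tempfile

    def read_logs(path):
        with open(path) as f:
            for line in f:          # file objects are themselves lazy iterators
                yield line.strip()

    # Write a small sample log to a temporary file
    with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
        tmp.write("INFO: Started\nERROR: Failed\nINFO: Done\n")

    for entry in read_logs(tmp.name):
        print(entry)

    os.unlink(tmp.name)             # clean up the temporary file
    ```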

    💧 6. Infinite Generators (Perfect for Simulations & AI)

    Infinite Generators

    Create endless sequences for simulations

    def infinite_counter(start=0):
        while True:
            yield start
            start += 1
    
    # Take first 10 values
    counter = infinite_counter()
    for _ in range(10):
        print(next(counter))

    Use cases:

    • Reinforcement learning training loops
    • Game worlds
    • Randomized datasets
    • Unique ID generation
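
    itertools.islice is the idiomatic way to take a bounded slice of an infinite generator:

    ```python
    from itertools import islice

    def infinite_counter(start=0):
        while True:
            yield start
            start += 1

    # islice consumes only the values it needs
    first_five = list(islice(infinite_counter(), 5))
    print(first_five)  # [0, 1, 2, 3, 4]
    ```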

    📦 7. Generator Pipelines (Functional Programming Style)

    Chain generators together like Unix shell pipes (command1 | command2 | command3):

    Generator Pipelines

    Chain generators like Unix pipes

    def read_lines(text):
        for line in text.split("\n"):
            yield line.strip()
    
    def filter_errors(lines):
        for line in lines:
            if "ERROR" in line:
                yield line
    
    def extract_messages(lines):
        for l in lines:
            yield l.split(":")[-1].strip()
    
    # Sample log data
    logs = """INFO: System started
    ERROR: Connection timeout
    INFO: User logged in
    ERROR: File not found"""
    
    pipeline = extract_messages(filter_errors(read_lines(logs)))
    for msg in pipeline:
        print(msg)

    Each step processes data lazily.
    Nothing loads into memory at once.

    🧩 8. yield from — Delegating to Sub-Generators

    yield from lets you delegate iteration to another generator:

    yield from

    Delegate to sub-generators cleanly

    def flatten(list_of_lists):
        for sub in list_of_lists:
            yield from sub
    
    nested = [[1, 2], [3, 4], [5, 6]]
    print(list(flatten(nested)))

    This is cleaner and faster than nested loops.
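
    Because yield from composes, a recursive version handles arbitrary nesting:

    ```python
    def deep_flatten(items):
        for item in items:
            if isinstance(item, list):
                yield from deep_flatten(item)   # delegate to the recursive call
            else:
                yield item

    print(list(deep_flatten([1, [2, [3, [4]]], 5])))  # [1, 2, 3, 4, 5]
    ```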

    🌀 9. Two-Way Generators (send() Method)

    You can send values INTO generators, making them interactive:

    Two-Way Generators

    Send values into generators with send()

    def accumulator():
        total = 0
        while True:
            value = yield total
            if value is not None:
                total += value
    
    g = accumulator()
    next(g)        # start generator  
    print(g.send(10))     # -> 10  
    print(g.send(5))      # -> 15
    print(g.send(3))      # -> 18

    This technique is used in:

    • Async frameworks
    • Stream processors
    • Complex event simulation

    ⚙️ 10. Generator-Based Coroutines (Pre-asyncio Style)

    Generators can act as coroutines — functions that can pause and receive data:

    Generator Coroutines

    Cooperative concurrency with generators

    def coroutine():
        while True:
            message = yield
            print("Received:", message)
    
    c = coroutine()
    next(c)  # Start coroutine
    c.send("Hello")
    c.send("World")

    Now superseded by async def, but still heavily used in internal libraries and advanced scheduling systems.

    📚 11. Generators as Context Managers

    Using contextlib.contextmanager decorator:

    Generators as Context Managers

    Combine generators with cleanup logic

    from contextlib import contextmanager
    
    @contextmanager
    def managed_resource(name):
        print(f"Opening {name}")
        try:
            yield name
        finally:
            print(f"Closing {name}")
    
    with managed_resource("database") as db:
        print(f"Using {db}")

    This pattern merges generators + cleanup logic.

    🚀 12. Generator Expressions (Faster, Cleaner, Memory Efficient)

    Generator Expressions

    Memory-efficient inline generators

    # Generator expression - constant memory, values produced on demand
    squares = (x*x for x in range(10))
    print("Generator:", squares)
    print("Values:", list(squares))
    
    # Compare with list comprehension
    squares_list = [x*x for x in range(10)]
    print("\nList:", squares_list)

    Memory use stays constant regardless of the range size, because values are produced one at a time.

    Compare:

    Structure | Memory Use
    List comprehension | Loads the entire list
    Generator expression | Streams items one by one
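
    The difference is measurable with sys.getsizeof, which reports the container's own footprint:

    ```python
    import sys

    big_list = [x * x for x in range(1_000_000)]   # materialises every value
    big_gen = (x * x for x in range(1_000_000))    # stores only iteration state

    print("list bytes:", sys.getsizeof(big_list))  # several megabytes
    print("gen bytes:", sys.getsizeof(big_gen))    # a few hundred bytes at most
    ```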

    🔬 13. Building Your Own Iterable Class (Advanced)

    Here's a complete example with detailed comments:

    Custom Iterable Class

    Build a full iterator from scratch

    class Fibonacci:
        def __init__(self, limit):
            self.limit = limit
            self.a, self.b = 0, 1
    
        def __iter__(self):
            return self
    
        def __next__(self):
            if self.a > self.limit:
                raise StopIteration
            value = self.a
            self.a, self.b = self.b, self.a + self.b
            return value
    
    for num in Fibonacci(100):
        print(num)
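
    One subtlety: because __iter__ returns self, a Fibonacci instance is exhausted after a single loop. Returning a fresh generator from __iter__ makes the object re-iterable (FibonacciIterable is an illustrative variant, not part of the lesson's class above):

    ```python
    class FibonacciIterable:
        def __init__(self, limit):
            self.limit = limit

        def __iter__(self):
            # A generator function: each call returns a fresh iterator,
            # so the object can be looped over any number of times.
            a, b = 0, 1
            while a <= self.limit:
                yield a
                a, b = b, a + b

    fib = FibonacciIterable(10)
    print(list(fib))  # [0, 1, 1, 2, 3, 5, 8]
    print(list(fib))  # same again - not exhausted
    ```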

    🕹 14. How Python's For-Loops Use Iterators Internally

    This simple for-loop:

    For-Loop Internals

    How Python implements for-loops

    data = [1, 2, 3]
    
    # This for loop:
    for x in data:
        print(x)
    
    # Actually does this internally:
    print("\nManual iteration:")
    it = data.__iter__()
    while True:
        try:
            x = it.__next__()
            print(x)
        except StopIteration:
            break

    Understanding this gives you total control over how objects behave.
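
    In everyday code you call the built-ins iter() and next() rather than the dunder methods directly:

    ```python
    data = [1, 2, 3]
    it = iter(data)           # equivalent to data.__iter__()
    print(next(it))           # 1 - equivalent to it.__next__()
    print(next(it))           # 2
    print(next(it))           # 3
    print(next(it, "done"))   # a default value avoids StopIteration
    ```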

    📊 15. Real-World Use Cases (Professional Level)

    ✔ Pandas Chunk Loading

    Streaming large CSVs with the chunksize parameter.

    ✔ PyTorch DataLoader

    Uses iterators to stream mini-batches.

    ✔ Web Scrapers (Scrapy)

    Iterators control crawling pipelines.

    ✔ SQLAlchemy Query Results

    Rows are lazily streamed.

    ✔ Async Systems

    Generators still power internal scheduling logic.
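
    The mini-batch pattern behind DataLoader-style iterators can be sketched with a plain generator (batch_iter is an illustrative helper, not the actual PyTorch API):

    ```python
    def batch_iter(items, batch_size):
        """Group an iterable into lists of at most batch_size items."""
        batch = []
        for item in items:
            batch.append(item)
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:              # emit the final, possibly short, batch
            yield batch

    print(list(batch_iter(range(7), 3)))  # [[0, 1, 2], [3, 4, 5], [6]]
    ```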

    🧪 16. Mini Project — Build a Streaming Data Pipeline

    You will create:

    • A data reader
    • A validator
    • A filter stage
    • A transformer
    • An exporter

    Each one is a generator.

    This mirrors, in miniature, how tools like Airflow, Spark, and Pandas process data in staged pipelines.
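
    A minimal sketch of the five stages (field names and filtering rules are illustrative; adapt them to your own data):

    ```python
    def reader(rows):
        """Stage 1: emit raw records one at a time."""
        for row in rows:
            yield row

    def validator(rows):
        """Stage 2: drop records missing the expected field."""
        for row in rows:
            if "value" in row:
                yield row

    def filter_stage(rows):
        """Stage 3: keep only non-negative values."""
        for row in rows:
            if row["value"] >= 0:
                yield row

    def transformer(rows):
        """Stage 4: double each value."""
        for row in rows:
            yield {**row, "value": row["value"] * 2}

    def exporter(rows):
        """Stage 5: consume the pipeline and emit results."""
        for row in rows:
            print(row)

    data = [{"value": 3}, {"bad": 1}, {"value": -1}, {"value": 5}]
    exporter(transformer(filter_stage(validator(reader(data)))))
    ```

    Because every stage is a generator, no stage materialises the whole dataset; each record flows through all five steps before the next one is read.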

    🎉 Conclusion

    You now fully understand:

    ✔ How iterators work internally

    ✔ How to build custom iterable objects

    ✔ How to use generators for memory-efficient processing

    ✔ How Python implements lazy evaluation

    ✔ How to build generator pipelines

    ✔ How advanced frameworks use iterators under the hood

    📋 Quick Reference — Generators

    Syntax | What it does
    yield value | Produce a value and pause execution
    next(gen) | Advance the generator to its next yield
    (x for x in lst) | Generator expression (lazy sequence)
    yield from iterable | Delegate to a sub-generator
    list(gen) | Materialise all generator values at once

    🏆 Lesson Complete!

    You understand lazy evaluation and how to build memory-efficient pipelines with generators — the same technique used inside Pandas, Airflow, and Spark.

    Up next: Advanced Async & Await — write non-blocking concurrent code with asyncio.
