Lesson 21 • Advanced

The Java Stream API

Stop writing loops to filter, transform, and summarise data. Learn to build readable stream pipelines that say what you want, not how to loop for it.

Before You Start

You should be comfortable with Collections (List, Map) and the basics of lambda expressions: the short x -> x * 2 functions you pass to stream operations. Most streams start from a collection, so if you can create a List<String> you're ready.

What You'll Learn in This Lesson

✓ Create streams with .stream(), Stream.of, and IntStream.range
✓ Chain intermediate ops: filter, map, sorted, distinct, limit
✓ Finish with terminal ops: collect, forEach, reduce, count, toList
✓ Use Collectors: toList, toMap, groupingBy, joining
✓ Understand lazy evaluation: why nothing runs until the end
✓ Use parallel streams safely (and know when not to)

🏭 Real-World Analogy: A Factory Assembly Line

A stream pipeline works exactly like a factory assembly line. Raw materials (your collection) enter at one end. They pass through a series of stations — one keeps only the good parts (filter), one reshapes each part (map), one puts them in order (sorted). At the very end, a worker boxes up the finished goods (collect).

💡 The key insight: the conveyor belt does not move until someone at the end asks for a finished product. The middle stations (intermediate operations) are just instructions written on a clipboard. Only the final terminal operation switches the belt on. That is lazy evaluation, and it is what makes streams efficient.

1️⃣ Creating a Stream

A stream is a one-shot sequence of elements you push through a pipeline. You don't store data in a stream — you flow data through one. There are three common ways to start a stream:

collection.stream() — the usual starting point, from any List<T>, Set, etc.
Stream.of(a, b, c) — when you have loose values, not a collection.
IntStream.range(0, n) — a stream of ints, replacing index-based for loops. The end is exclusive; use rangeClosed to include it.

💡 Mental model: a stream is lazy and single-use. After you run a terminal operation on it, that stream is finished — to process the data again you call .stream() on the original collection a second time.
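The single-use rule is easy to see in a small experiment. The sketch below is illustrative only (the class and method names are made up for this example): it runs two terminal operations on the same stream, catches the resulting exception, and then shows the fix of asking the collection for a fresh stream each time.

```java
import java.util.List;
import java.util.stream.Stream;

class StreamReuseDemo {
    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondUseFails(List<String> source) {
        Stream<String> stream = source.stream();
        stream.count();                       // first terminal op consumes the stream
        try {
            stream.count();                   // second terminal op on the SAME stream
            return false;
        } catch (IllegalStateException e) {
            return true;                      // the stream was already operated upon
        }
    }

    // The fix: call .stream() on the collection again for each pass.
    static long countTwice(List<String> source) {
        long first = source.stream().count();
        long second = source.stream().count();
        return first + second;
    }

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob", "Charlie");
        System.out.println("Second use rejected: " + secondUseFails(names));
        System.out.println("Counted twice: " + countTwice(names));
    }
}
```

The collection itself is untouched by either pass, which is why creating a second stream from it is cheap and safe.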

2️⃣ Intermediate Operations (the Recipe)

Intermediate operations each take a stream and return a new stream, so you can chain them. They are lazy — calling them just records a step; no data moves yet. The ones you'll reach for constantly:

Operation	What it does	Example
filter	Keep elements matching a test	.filter(n -> n > 10)
map	Transform each element	.map(String::toUpperCase)
sorted	Order the elements	.sorted()
distinct	Drop duplicates	.distinct()
limit	Keep only the first N	.limit(3)

Worked Example: Create a Stream & Chain Operations

import java.util.List;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class Main {
    public static void main(String[] args) {
        // ── Three ways to CREATE a stream ──────────────────────────────
        List<String> names = List.of("Alice", "Bob", "Charlie", "Bob");

        // 1) From a collection: .stream()
        names.stream().forEach(n -> System.out.println("From list: " + n));

        // 2) From values: Stream.of(...)
        Stream.of("red", "green", "blue")
              .forEach(c -> System.out.println("Stream.of: " + c));

        // 3) From a range of ints: IntStream.range(start, end)  (end is EXCLUSIVE)
        IntStream.range(1, 4)               // 1, 2, 3  (NOT 4)
                 .forEach(i -> System.out.println("range: " + i));

        // ── INTERMEDIATE ops build a recipe (lazy, nothing runs yet) ───
        // ── The TERMINAL op (collect) runs the whole pipeline ──────────
        List<String> result = names.stream()
            .filter(n -> n.length() > 3)    // keep names longer than 3 chars ("Bob" is out)
            .distinct()                     // drop any duplicate values
            .map(String::toUpperCase)       // transform each surviving element
            .sorted()                       // order alphabetically
            .toList();                      // TERMINAL: collect into a List (Java 16+)

        System.out.println("Pipeline result: " + result);
    }
}

Output

From list: Alice
From list: Bob
From list: Charlie
From list: Bob
Stream.of: red
Stream.of: green
Stream.of: blue
range: 1
range: 2
range: 3
Pipeline result: [ALICE, CHARLIE]

This is real code: run it for free at onecompiler.com/java or in your own editor.

3️⃣ Terminal Operations & Collectors (the Finished Product)

A terminal operation is what finally runs the pipeline and produces a result. After it runs, the stream is consumed. The common ones:

collect(...) — gather elements into a collection, driven by a Collector.
toList() — a shortcut (Java 16+) for collecting into an unmodifiable List.
forEach(...) — perform a side effect (like printing) on each element.
reduce(...) — fold every element into a single value (a sum, a max).
count() — how many elements made it through.

The real power lives in the Collectors class, which you pass to collect:

Collectors.toList() — gather into a List.
Collectors.toMap(keyFn, valueFn) — build a Map lookup.
Collectors.groupingBy(classifier) — bucket elements by a key, like SQL GROUP BY.
Collectors.joining(", ") — concatenate strings with a separator.

Worked Example: Terminal Ops & Collectors

import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class Main {
    record Product(String name, String category, double price) {}

    public static void main(String[] args) {
        List<Product> products = List.of(
            new Product("Apple",  "Fruit",  0.50),
            new Product("Banana", "Fruit",  0.30),
            new Product("Carrot", "Veg",    0.20),
            new Product("Milk",   "Dairy",  1.20),
            new Product("Cheese", "Dairy",  3.50));

        // count() — how many items survive the filter?
        long cheapCount = products.stream()
            .filter(p -> p.price() < 1.00)
            .count();
        System.out.println("Items under $1: " + cheapCount);

        // reduce() — fold every price into one running total
        double total = products.stream()
            .map(Product::price)
            .reduce(0.0, Double::sum);          // start at 0.0, keep adding
        System.out.printf("Total price: $%.2f%n", total);

        // Collectors.toList() — gather names into a List
        List<String> nameList = products.stream()
            .map(Product::name)
            .collect(Collectors.toList());
        System.out.println("Names: " + nameList);

        // Collectors.joining() — concatenate with a delimiter
        String joined = products.stream()
            .map(Product::name)
            .collect(Collectors.joining(", ", "[", "]"));
        System.out.println("Joined: " + joined);

        // Collectors.toMap() — build a name -> price lookup
        Map<String, Double> priceOf = products.stream()
            .collect(Collectors.toMap(Product::name, Product::price));
        System.out.println("Milk costs: $" + priceOf.get("Milk"));

        // Collectors.groupingBy() — like SQL GROUP BY
        Map<String, List<String>> byCategory = products.stream()
            .collect(Collectors.groupingBy(Product::category,
                     Collectors.mapping(Product::name, Collectors.toList())));
        new TreeMap<>(byCategory).forEach((cat, items) ->
            System.out.println(cat + ": " + String.join(", ", items)));
    }
}

Output

Items under $1: 3
Total price: $5.70
Names: [Apple, Banana, Carrot, Milk, Cheese]
Joined: [Apple, Banana, Carrot, Milk, Cheese]
Milk costs: $1.2
Dairy: Milk, Cheese
Fruit: Apple, Banana
Veg: Carrot

🎯 Your Turn #1: Filter Big Numbers

Finish the pipeline below. Fill in the filter predicate so only numbers greater than 10 survive. The expected output is in the comment — run it on your machine and check.

Exercise: filter

import java.util.List;

public class Main {
    public static void main(String[] args) {
        // 🎯 YOUR TURN — fill in the blanks marked with ___

        List<Integer> numbers = List.of(5, 12, 8, 21, 3, 17, 9);

        // 1) Keep only the numbers greater than 10
        // 👉 replace ___ with a filter predicate
        List<Integer> big = numbers.stream()
            .filter(n -> ___)
            .toList();

        // 2) Print the kept numbers
        System.out.println("Big numbers: " + big);

        // ✅ Expected output:
        // Big numbers: [12, 21, 17]
    }
}

🎯 Your Turn #2: Map Words to Lengths

This time fill in two blanks: the intermediate operation that transforms each element, and the expression that gives a word's length. Compare with the expected output in the comment.

Exercise: map

import java.util.List;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        // 🎯 YOUR TURN — fill in the blanks marked with ___

        List<String> words = List.of("sky", "ocean", "sun", "mountain");

        // 1) Transform each word to its length, then collect into a List
        // 👉 replace the first ___ with the op that transforms each element
        // 👉 replace the second ___ with the length of the string (use w.length())
        List<Integer> lengths = words.stream()
            .___(w -> ___)
            .collect(Collectors.toList());

        System.out.println("Lengths: " + lengths);

        // ✅ Expected output:
        // Lengths: [3, 5, 3, 8]
    }
}

4️⃣ Lazy Evaluation & Parallel Streams

Lazy evaluation means the pipeline does nothing until a terminal operation pulls data through. Intermediate operations are fused into a single pass, so .filter().map().filter() does not create three temporary lists; each element flows through the whole chain before the next one starts. (Stateful operations such as sorted are the exception: they must see every element before passing any on.) Short-circuiting operations like limit and findFirst can stop early, so the rest of the source may never be processed.
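Two small experiments make laziness concrete. This is a sketch rather than part of the lesson's own examples: it uses peek (a debugging hook) to count how many elements a pipeline touches when no terminal operation is attached, and Stream.iterate (an infinite source not otherwise covered in this lesson) to show that limit lets a pipeline finish even when its source never ends.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class LazyDemo {
    // Builds a pipeline with NO terminal operation and reports how many
    // elements peek() saw. Nothing is pulled, so the answer is 0.
    static int visitsWithoutTerminalOp() {
        List<Integer> seen = new ArrayList<>();
        Stream<Integer> recipe = List.of(1, 2, 3, 4, 5).stream()
            .peek(seen::add)                 // runs only when data actually flows
            .filter(n -> n % 2 == 0);
        return seen.size();                  // the recipe was never executed
    }

    // limit(3) short-circuits, so this finishes even though the source is infinite.
    static List<Integer> firstThreeEvens() {
        return Stream.iterate(1, n -> n + 1) // 1, 2, 3, ... with no end
            .filter(n -> n % 2 == 0)
            .limit(3)
            .toList();
    }

    public static void main(String[] args) {
        System.out.println("Visited without terminal op: " + visitsWithoutTerminalOp());
        System.out.println("First three evens: " + firstThreeEvens());
    }
}
```

If the stream were evaluated eagerly, the second method would never return, because the filter would have to finish the infinite source before limit could run.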

A parallel stream (.parallel() or collection.parallelStream()) splits the work across CPU cores using the common ForkJoinPool, which is shared across the whole JVM. It can be dramatically faster, but only sometimes.

Parallel caveats: only worth it for large datasets (think tens of thousands of elements) doing CPU-heavy, stateless work. For small collections the splitting overhead makes it slower. Never share mutable state across the threads, and remember the processing order is not guaranteed.
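The "no shared mutable state" rule is the one that bites most often, so here is a minimal sketch of the safe pattern. The class and method names are hypothetical. The unsafe version appears only as a comment, because its result is unpredictable; the safe version lets the stream assemble the list itself.

```java
import java.util.List;
import java.util.stream.IntStream;

class ParallelCollectDemo {
    // UNSAFE (do not do this): several threads call add() on one ArrayList.
    //   List<Integer> out = new ArrayList<>();
    //   IntStream.range(0, n).parallel().forEach(i -> out.add(i * i));

    // SAFE: no shared state. Each worker builds its own partial result and
    // the stream merges them, keeping the encounter order of the source.
    static List<Integer> squaresInParallel(int n) {
        return IntStream.range(0, n)
            .parallel()
            .map(i -> i * i)                 // stateless, CPU-only work
            .boxed()
            .toList();
    }

    public static void main(String[] args) {
        System.out.println(squaresInParallel(5));
    }
}
```

Note that the elements may be processed in any order across threads, but the final list still follows the order of the range, because the terminal operation respects encounter order.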

Real-World Example: Lazy Evaluation Proof & a Safe Parallel Reduction

import java.util.List;
import java.util.stream.IntStream;

public class Main {
    public static void main(String[] args) {
        // ── LAZY EVALUATION proof ──────────────────────────────────────
        // map() prints a tag as each element passes. Because findFirst() is a
        // short-circuiting terminal operation, the pipeline stops at the first
        // match and does NOT process all five. Nothing runs until findFirst() pulls.
        System.out.println("Lazy demo:");
        java.util.Optional<Integer> first = List.of(1, 2, 3, 4, 5).stream()
            .map(n -> {
                System.out.println("  mapping " + n);
                return n * 10;
            })
            .filter(n -> n > 15)
            .findFirst();                   // pulls only until the first match
        System.out.println("First match: " + first.get());

        // ── PARALLEL stream ────────────────────────────────────────────
        // Splits work across CPU cores via the ForkJoinPool. Only worth it
        // for LARGE datasets and CPU-heavy, stateless work. The RESULT of a
        // reduction is deterministic; the processing ORDER is not.
        long sum = IntStream.rangeClosed(1, 1_000_000)
            .parallel()                     // opt in to parallelism
            .filter(n -> n % 2 == 0)        // even numbers only
            .mapToLong(n -> (long) n)
            .sum();                         // associative reduction — safe in parallel
        System.out.println("Sum of evens 1..1,000,000: " + sum);
    }
}

Output

Lazy demo:
  mapping 1
  mapping 2
First match: 20
Sum of evens 1..1,000,000: 250000500000

Mini-Challenge: Big Spenders

Time to fly solo. The starter below has only a comment outline — no filled-in logic. Build a single pipeline that finds the customers who spent $100 or more, sorted alphabetically. The expected output is in the comments.

Challenge: full pipeline

import java.util.List;

public class Main {
    record Order(String customer, double amount) {}

    public static void main(String[] args) {
        // 🎯 MINI-CHALLENGE: Big spenders
        List<Order> orders = List.of(
            new Order("Ana",  120.0),
            new Order("Ben",   45.0),
            new Order("Cara", 200.0),
            new Order("Dan",   80.0));

        // 1. Build a stream from "orders"
        // 2. Keep only orders with amount >= 100
        // 3. Map each surviving order to its customer name
        // 4. Sort the names alphabetically
        // 5. Collect into a List and print it
        //
        // ✅ Expected output:
        // [Ana, Cara]

        // your code here
    }
}

Common Errors (and How to Fix Them)

❌ Reusing a consumed stream: IllegalStateException: stream has already been operated upon or closed. A stream is one-shot. Don't store it in a variable and run two terminal ops on it — call list.stream() again to get a fresh one.
❌ Side effects inside map(): printing or mutating an external list from map(x -> { list.add(x); return x; }) looks fine sequentially but can corrupt data under parallel(). Keep map pure, gather results with collect(...) or toList(), and save forEach for side effects such as printing.
❌ Forgetting the terminal operation: list.stream().filter(...).map(...); compiles and runs but does nothing — intermediate ops are lazy. Nothing happens until you add .collect(...), .forEach(...), or .toList().
❌ Parallel misuse: calling .parallel() on a 10-item list, or reducing into a shared ArrayList, is slower or outright buggy. Use parallel only for large, stateless, CPU-bound work with an associative reduction.
❌ Duplicate keys in toMap(): IllegalStateException: Duplicate key when two elements map to the same key. Pass a merge function: Collectors.toMap(k, v, (a, b) -> a).
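The duplicate-key error is worth seeing once in code. The sketch below reuses the shape of the Order record from the Mini-Challenge (the class and method names are otherwise made up). It shows the two-argument toMap failing on a repeated customer, and the three-argument form resolving the clash by adding the amounts together.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapMergeDemo {
    record Order(String customer, double amount) {}

    // Two-argument toMap: throws IllegalStateException on a repeated key.
    static boolean failsWithoutMerge(List<Order> orders) {
        try {
            orders.stream()
                .collect(Collectors.toMap(Order::customer, Order::amount));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Three-argument toMap: the merge function decides what to do when
    // two orders share a customer. Here the amounts are summed.
    static Map<String, Double> totalByCustomer(List<Order> orders) {
        return orders.stream()
            .collect(Collectors.toMap(Order::customer, Order::amount, Double::sum));
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
            new Order("Ana", 120.0),
            new Order("Ben",  45.0),
            new Order("Ana",  30.0));
        System.out.println("Fails without merge: " + failsWithoutMerge(orders));
        System.out.println("Ana total: " + totalByCustomer(orders).get("Ana"));
    }
}
```

The merge function (a, b) -> a from the list above keeps the first value instead; which one is right depends on what a duplicate means in your data.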

📋 Quick Reference

Goal	Code	Type
Stream from a list	list.stream()	Source
Stream from values	Stream.of(a, b, c)	Source
Range of ints	IntStream.range(0, n)	Source
Keep matching	.filter(pred)	Intermediate
Transform each	.map(fn)	Intermediate
Order / dedupe / cap	.sorted() .distinct() .limit(n)	Intermediate
Collect to list	.toList()	Terminal
Group by key	.collect(groupingBy(fn))	Terminal
Build a map	.collect(toMap(k, v))	Terminal
Join strings	.collect(joining(", "))	Terminal
Fold to one value	.reduce(0, Integer::sum)	Terminal
Run in parallel	.parallel()	Modifier

❓ Frequently Asked Questions

What is the difference between an intermediate and a terminal operation?

Intermediate operations (filter, map, sorted, distinct, limit) return a new stream and are lazy — they only describe the pipeline. A terminal operation (collect, forEach, reduce, count, toList) actually runs the pipeline and produces a result or side effect. A stream does nothing until a terminal operation is called.

Why can't I reuse a stream after calling a terminal operation?

A stream is a one-shot pipeline, not a collection. Once a terminal operation consumes it, the stream is closed and calling another operation throws IllegalStateException: stream has already been operated upon or closed. If you need to process the data again, create a fresh stream from the original collection.

Should I use collect(Collectors.toList()) or the newer toList()?

Both gather a stream into a List. toList() (Java 16+) is shorter and returns an unmodifiable list. collect(Collectors.toList()) returns a list with no guarantee about mutability or type. Use toList() for new code unless you are on an older Java version; if you need a list that is guaranteed to be mutable, ask for one explicitly with collect(Collectors.toCollection(ArrayList::new)).
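The difference shows up the moment you try to modify the result. In this small sketch (names are illustrative), the list from toList() rejects add, while Collectors.toCollection(ArrayList::new) explicitly requests an ArrayList, which is one way to get a list you can safely change afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class ToListDemo {
    // Stream.toList() (Java 16+) returns an unmodifiable list, so add() throws.
    static boolean toListRejectsAdd(List<String> source) {
        List<String> result = source.stream().toList();
        try {
            result.add("extra");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Asking for an ArrayList by name guarantees a mutable result.
    static List<String> mutableCopyWith(List<String> source, String extra) {
        List<String> result = source.stream()
            .collect(Collectors.toCollection(ArrayList::new));
        result.add(extra);
        return result;
    }

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob");
        System.out.println("toList() rejects add: " + toListRejectsAdd(names));
        System.out.println("Mutable copy: " + mutableCopyWith(names, "Charlie"));
    }
}
```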

When are parallel streams actually faster?

Only when you have a large dataset (tens of thousands of elements or more) and stateless, CPU-bound work that splits cleanly. For small collections the cost of splitting and merging across threads outweighs the gain, so a sequential stream wins. Never use parallel streams with shared mutable state.

Can I put a print statement or other side effect inside map()?

You can, but you shouldn't. map() is for pure transformations — input in, transformed value out. Side effects belong in forEach() (a terminal op) or peek() (for debugging only). Side effects in map() break badly under parallel streams and make pipelines hard to reason about.
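As a rough illustration of that split, the sketch below keeps the transformation in a small pure helper (label, a hypothetical name) and leaves the printing to forEach at the very end of the pipeline.

```java
import java.util.List;

class PureMapDemo {
    // Pure: depends only on its argument and changes nothing outside itself.
    static String label(String word) {
        return word + " (" + word.length() + ")";
    }

    // The pipeline only transforms; it returns a value instead of printing.
    static List<String> describe(List<String> words) {
        return words.stream()
            .map(PureMapDemo::label)
            .toList();
    }

    public static void main(String[] args) {
        // The side effect (printing) lives in the terminal operation.
        List.of("sky", "ocean", "sun").stream()
            .map(PureMapDemo::label)
            .forEach(System.out::println);
    }
}
```

Because describe has no side effects, it is easy to test and would behave the same in a parallel stream.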

🎉 Lesson Complete!

You can now build complete stream pipelines — create a stream with .stream(), Stream.of, or IntStream.range, chain intermediate ops like filter and map, and finish with a terminal op such as collect, reduce, or toList. You also understand lazy evaluation and when parallel streams help.

Next: Lambda Expressions Deep Dive — functional interfaces, method references, and the lambdas that power every stream operation you just learned.
