Lesson 21 • Advanced
Streams API Advanced Pipelines
Build complex data pipelines with filter, map, reduce, collect, and flatMap.
Before You Start
You should be comfortable with Java Collections (Lesson 13), Generics (Lesson 14), and the basics of Lambda Expressions. Streams build heavily on functional interfaces like Predicate, Function, and Consumer.
What You'll Learn
- ✅ Stream creation and intermediate operations
- ✅ Terminal operations: collect, reduce, forEach
- ✅ flatMap for nested collections
- ✅ Collectors: groupingBy, partitioningBy, joining
- ✅ Parallel streams for performance
- ✅ Stream best practices and pitfalls
1️⃣ Declarative Data Processing
Streams transform how you think about data. Instead of writing imperative loops ("for each item, if condition, add to list"), you describe what you want: "filter items matching condition, transform each, collect results." This declarative style is more readable, composable, and parallelizable.
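To make the contrast concrete, here is a minimal sketch of the same task written both ways. The class name, method names, and sample names are illustrative assumptions, not part of the lesson's own exercises.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class: the same filter-and-transform task in two styles.
class DeclarativeVsImperative {

    // Imperative: spell out how to iterate, test, transform, and accumulate.
    static List<String> longNamesLoop(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (name.length() > 3) {
                result.add(name.toUpperCase());
            }
        }
        return result;
    }

    // Declarative: describe what you want; the stream handles the iteration.
    static List<String> longNamesStream(List<String> names) {
        return names.stream()
                .filter(name -> name.length() > 3)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob", "Charlie", "Eve");
        System.out.println(longNamesLoop(names));   // [ALICE, CHARLIE]
        System.out.println(longNamesStream(names)); // [ALICE, CHARLIE]
    }
}
```

Both methods return the same list; the stream version simply reads closer to the sentence "filter, transform, collect".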
💡 Analogy: Assembly Line
A stream pipeline is like a factory assembly line. Raw materials (data source) enter at one end, pass through stations (intermediate operations: filter, map, sort), and a finished product comes out at the end (terminal operation: collect, reduce). The key insight: nothing happens until the terminal operation starts — intermediate operations are lazy.
2️⃣ Lazy Evaluation & Stream Pipeline
Streams are lazy — intermediate operations don't execute until a terminal operation triggers the pipeline. This means .filter().map().filter() doesn't create three separate collections; it fuses the operations and processes each element through the entire chain in one pass. This is why streams can be more efficient than multiple loops.
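One way to observe this laziness is to log from inside each stage. The sketch below is an illustration only: the class name and log messages are made up, and writing to an outside list from a lambda is done here purely for tracing on a sequential stream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class: records the order in which stages touch each element.
class LazyEvaluationDemo {

    static List<String> trace() {
        List<String> log = new ArrayList<>();

        // Building the pipeline runs none of the lambdas below.
        Stream<Integer> pipeline = Stream.of(1, 2, 3, 4)
                .filter(n -> { log.add("filter " + n); return n % 2 == 0; })
                .map(n -> { log.add("map " + n); return n * 10; });

        // Logged first, because no terminal operation has run yet.
        log.add("pipeline built");

        // collect() is the terminal operation that pulls elements through the chain.
        List<Integer> result = pipeline.collect(Collectors.toList());
        log.add("result " + result);
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

In the printed trace, "pipeline built" appears before any "filter" line, and each element passes through filter and then map before the next element is examined, rather than the whole stream being filtered first.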
Try It: Filter, Map & Collect
// Java Streams API — Filter, Map, Collect
console.log("=== Stream Basics: Filter → Map → Collect ===\n");
let employees = [
{ name: "Alice", dept: "Engineering", salary: 95000, age: 28 },
{ name: "Bob", dept: "Marketing", salary: 65000, age: 35 },
{ name: "Charlie", dept: "Engineering", salary: 110000, age: 42 },
{ name: "Diana", dept: "HR", salary: 72000, age: 31 },
{ name: "Eve", dept: "Engineering", salary: 88000, age: 26 },
{ name: "Frank", dept: "Marketing", salary
...
3️⃣ Collectors: groupingBy & partitioningBy
Collectors are reduction strategies that you pass to the terminal operation collect() to gather stream results into collections. groupingBy groups elements by a classifier function (like SQL GROUP BY), while partitioningBy splits them into true/false groups using a predicate. joining concatenates strings with a delimiter. These collectors compose: you can pass counting() or averagingDouble() to groupingBy as a downstream collector, or nest groupingBy calls for multi-level aggregations.
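A minimal Java sketch of these collectors follows. It reuses the six complete employee rows from the exercise data; the Employee class, the method names, and the salary threshold are assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class showing groupingBy, partitioningBy, and joining.
class CollectorsDemo {

    // Simple value holder; a record would also work on Java 16+.
    static final class Employee {
        final String name;
        final String dept;
        final double salary;

        Employee(String name, String dept, double salary) {
            this.name = name;
            this.dept = dept;
            this.salary = salary;
        }
    }

    static final List<Employee> EMPLOYEES = List.of(
            new Employee("Alice", "Engineering", 95000),
            new Employee("Bob", "Marketing", 65000),
            new Employee("Charlie", "Engineering", 110000),
            new Employee("Diana", "HR", 72000),
            new Employee("Eve", "Engineering", 88000),
            new Employee("Frank", "Marketing", 78000));

    // groupingBy with a downstream counting() collector: dept -> headcount.
    static Map<String, Long> headcountByDept() {
        return EMPLOYEES.stream()
                .collect(Collectors.groupingBy(e -> e.dept, Collectors.counting()));
    }

    // groupingBy with averagingDouble(): dept -> mean salary.
    static Map<String, Double> averageSalaryByDept() {
        return EMPLOYEES.stream()
                .collect(Collectors.groupingBy(e -> e.dept,
                        Collectors.averagingDouble(e -> e.salary)));
    }

    // partitioningBy always yields exactly two keys: true and false.
    static Map<Boolean, List<String>> namesBySalaryAtLeast(double threshold) {
        return EMPLOYEES.stream()
                .collect(Collectors.partitioningBy(e -> e.salary >= threshold,
                        Collectors.mapping(e -> e.name, Collectors.toList())));
    }

    // joining concatenates strings with a delimiter.
    static String allNames() {
        return EMPLOYEES.stream()
                .map(e -> e.name)
                .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        System.out.println(headcountByDept());
        System.out.println(averageSalaryByDept());
        System.out.println(namesBySalaryAtLeast(80000));
        System.out.println(allNames());
    }
}
```

Note that groupingBy returns a map with no guaranteed key order, so the printed order of departments may vary.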
Try It: GroupBy, Partition & FlatMap
// Collectors: groupingBy, partitioningBy, flatMap
console.log("=== Advanced Collectors ===\n");
let employees = [
{ name: "Alice", dept: "Engineering", salary: 95000 },
{ name: "Bob", dept: "Marketing", salary: 65000 },
{ name: "Charlie", dept: "Engineering", salary: 110000 },
{ name: "Diana", dept: "HR", salary: 72000 },
{ name: "Eve", dept: "Engineering", salary: 88000 },
{ name: "Frank", dept: "Marketing", salary: 78000 },
{ name: "Grace", dept: "HR", salary: 850
...
4️⃣ Parallel Streams
Calling .parallelStream() on a collection, or .parallel() on an existing stream, splits the work across CPU cores using the common ForkJoinPool. But parallel isn't always faster: the overhead of splitting and merging only pays off for large datasets (as a rough guide, more than 10,000 elements) or CPU-intensive operations. Never let a parallel stream mutate shared state, and avoid parallelizing small collections or pipelines that depend on encounter order.
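The sketch below shows one workload where parallelism may help: a CPU-heavy check applied to a large range, with no shared mutable state. The class name, the trial-division helper, and the range size are assumptions chosen for illustration, and the timings it prints depend entirely on the machine.

```java
import java.util.stream.LongStream;

// Hypothetical demo class: the same pipeline run sequentially and in parallel.
class ParallelStreamDemo {

    // Deliberately CPU-heavy trial division, so each element carries real work.
    static boolean isPrime(long n) {
        if (n < 2) {
            return false;
        }
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    // count() is a built-in reduction, so no lambda touches shared state.
    static long countPrimes(long limit, boolean parallel) {
        LongStream range = LongStream.rangeClosed(1, limit);
        if (parallel) {
            range = range.parallel();
        }
        return range.filter(ParallelStreamDemo::isPrime).count();
    }

    public static void main(String[] args) {
        long limit = 2_000_000;

        long start = System.nanoTime();
        long sequential = countPrimes(limit, false);
        long sequentialMs = (System.nanoTime() - start) / 1_000_000;

        start = System.nanoTime();
        long parallel = countPrimes(limit, true);
        long parallelMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("sequential: " + sequential + " primes in " + sequentialMs + " ms");
        System.out.println("parallel:   " + parallel + " primes in " + parallelMs + " ms");
    }
}
```

Both runs must produce the same count; only the elapsed time should differ. A single timed run like this is a rough indication, not a benchmark, so measure on your own data before committing to a parallel stream.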
Try It: Complex Pipeline & Performance
// Complex Pipelines & Parallel Simulation
console.log("=== Complex Stream Pipelines ===\n");
let employees = [
{ name: "Alice", dept: "Engineering", salary: 95000, age: 28 },
{ name: "Bob", dept: "Marketing", salary: 65000, age: 35 },
{ name: "Charlie", dept: "Engineering", salary: 110000, age: 42 },
{ name: "Diana", dept: "HR", salary: 72000, age: 31 },
{ name: "Eve", dept: "Engineering", salary: 88000, age: 26 },
{ name: "Frank", dept: "Marketing", salary: 78000, age:
...
Common Mistakes
- Side effects in map(): it's meant for pure transformations. Use forEach() for side effects.
- Calling .parallelStream() on a list of 10 items is slower than sequential due to thread overhead.
- Adding to an external list with forEach(list::add) is fragile. Use .collect(Collectors.toList()) instead.
Pro Tips
💡 Use Collectors.toUnmodifiableList() (Java 10+) instead of Collectors.toList() when you want an unmodifiable result.
💡 IntStream.range(0, 10) is great for creating streams without a collection — replaces index-based for-loops.
💡 Prefer reduce() for custom aggregations and collect() for building collections.
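These tips can be combined in a short sketch. The class and method names are hypothetical, and the squares and product calculations are stand-ins for whatever aggregation you actually need.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class for IntStream.range, toUnmodifiableList, and reduce.
class StreamTipsDemo {

    // IntStream.range(0, count) yields 0 .. count - 1, replacing an index-based loop.
    // collect() builds the list itself, so nothing is added to an outside list.
    static List<Integer> squares(int count) {
        return IntStream.range(0, count)
                .map(i -> i * i)
                .boxed()
                .collect(Collectors.toUnmodifiableList()); // Java 10+
    }

    // reduce() folds values into one result: an identity plus an accumulator.
    static int product(List<Integer> values) {
        return values.stream().reduce(1, (a, b) -> a * b);
    }

    public static void main(String[] args) {
        List<Integer> result = squares(5);
        System.out.println(result);                         // [0, 1, 4, 9, 16]
        System.out.println(product(List.of(1, 2, 3, 4)));   // 24

        try {
            result.add(25); // the collected list rejects modification
        } catch (UnsupportedOperationException e) {
            System.out.println("list is unmodifiable");
        }
    }
}
```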
📋 Quick Reference
| Operation | Type | Purpose |
|---|---|---|
| filter() | Intermediate | Keep matching elements |
| map() | Intermediate | Transform each element |
| flatMap() | Intermediate | Flatten nested streams |
| sorted() | Intermediate | Order elements |
| collect() | Terminal | Gather into collection |
| reduce() | Terminal | Combine into single result |
| forEach() | Terminal | Side-effect on each element |
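As a rough illustration of how the operations in this table chain together, the sketch below flattens nested lists with flatMap and then filters, maps, sorts, and collects. The class name, method name, and sample skill lists are invented for this example.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class: one pipeline using flatMap, filter, map, sorted, collect.
class FlatMapDemo {

    static List<String> normalizedSkills(List<List<String>> skillsPerPerson) {
        return skillsPerPerson.stream()
                .flatMap(List::stream)             // Stream<List<String>> -> Stream<String>
                .filter(skill -> skill.length() > 2)
                .map(String::toUpperCase)
                .sorted()                          // natural (alphabetical) order
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> skills = List.of(
                List.of("java", "sql"),
                List.of("go", "java"),
                List.of("rust"));

        // forEach is the terminal operation used for the printing side effect.
        normalizedSkills(skills).forEach(System.out::println);
    }
}
```

Without flatMap, mapping each inner list to a stream would leave a stream of streams; flatMap merges them into a single stream of elements.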
🎉 Lesson Complete!
You can now build powerful data processing pipelines with the Java Streams API!
Next: Lambda Expressions Deep Dive — functional interfaces, method references, and lambda best practices.