Lesson 15 • Advanced Track
Advanced LINQ Queries & Optimization
By the end of this lesson you'll be able to group, flatten, join, and reduce real-world data with LINQ — and write queries that stay fast and predictable in production by mastering deferred execution and materialisation.
What You'll Learn
- Group and summarise data with GroupBy (Count, Sum, Average per bucket)
- Flatten nested collections into one sequence with SelectMany
- Combine two sources on a shared key with Join
- Reduce a whole sequence to one value with Aggregate (and a seed)
- Page through results with Skip / Take
- Spot and fix deferred-execution traps with ToList / ToArray / ToDictionary
Prerequisites: Where, Select, OrderBy, the single-item operators (First/FirstOrDefault), and the basics of deferred execution. This lesson builds straight on top of all of that.
💡 Real-World Analogy
Think of a busy mailroom. GroupBy is sorting the post into pigeonholes by recipient, then counting each pile. SelectMany is tipping every pigeonhole out onto one big table — flattening many piles into a single stream. Join is matching each letter to its recipient's address card by a shared name. And deferred execution is the difference between writing the sorting instructions on a clipboard (the query) and actually walking the room to do it (enumerating) — nothing moves until someone reads the clipboard and acts.
📊 Advanced LINQ Methods Reference
| Method | What it does | Returns |
|---|---|---|
| GroupBy | Bucket items by a key | IEnumerable<IGrouping<K, T>> |
| SelectMany | Flatten nested sequences | one flat sequence |
| Join | Match two sources on a key | combined sequence |
| Aggregate | Fold to one value (reduce) | a single value |
| Skip / Take | Page through a sequence | a sub-sequence |
| ToList / ToArray | Run now & snapshot | List<T> / T[] |
| ToDictionary | Build O(1) key lookup | Dictionary<K,V> |
| ToLookup | Materialised one-to-many grouping | ILookup<K,V> |
Every one of these still works on IEnumerable<T>. The To... family is special: those are the methods that force execution and turn a lazy recipe into concrete data.
Running C# locally: install the .NET SDK or use dotnetfiddle.net. Every example here uses an in-memory List<T>, so it runs unchanged in a console app. Keep using System.Linq; at the top.
1. GroupBy: Bucket & Summarise
GroupBy splits a collection into groups using the key its lambda returns. The clever part: each group is an IGrouping<K, T> — it carries a .Key and it is itself a sequence of the items in that bucket. That means you can call Count(), Sum(...), or Average(...) straight on a group. Pair it with Select to turn each group into a tidy summary object. Read this worked example, run it, then you'll write your own.
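Before the full worked example, here is a minimal sketch of the GroupBy + Select pattern described above. The class name `GroupBySketch`, the `Summarise` helper, and the sample data are hypothetical names chosen for illustration; they are not part of the lesson's own code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GroupBySketch
{
    // Hypothetical helper: one summary row per distinct category.
    public static List<(string Category, int Count, decimal Total, decimal Average)> Summarise(
        IEnumerable<(string Category, decimal Price)> items)
    {
        return items
            .GroupBy(i => i.Category)                 // one IGrouping per distinct key
            .Select(g => (
                Category: g.Key,                      // the bucket's key
                Count: g.Count(),                     // a group is itself a sequence
                Total: g.Sum(i => i.Price),
                Average: g.Average(i => i.Price)))
            .ToList();
    }

    static void Main()
    {
        // Assumed sample data, not taken from the lesson.
        var items = new List<(string Category, decimal Price)>
        {
            ("Electronics", 100m),
            ("Books", 20m),
            ("Electronics", 300m),
        };

        foreach (var s in Summarise(items))
            Console.WriteLine($"{s.Category}: {s.Count} items, total {s.Total}, average {s.Average}");
    }
}
```

Groups come back in the order their keys first appear in the source, so "Electronics" is summarised before "Books" here.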
Worked example: GroupBy with Count, Sum & Average
Group orders by customer and by category, then summarise each bucket.
using System;
using System.Linq;
using System.Collections.Generic;

class Order
{
    public string Customer { get; set; }
    public string Product { get; set; }
    public decimal Price { get; set; }
    public string Category { get; set; }
}

class Program
{
    static void Main()
    {
        var orders = new List<Order>
        {
            new Order { Customer = "Alice", Product = "Laptop", Price = 999, Category = "Electronics" },
            new Order { Customer = "Bob", Product = "
...
Your turn. The program below groups sales by region. Fill in the three blanks so it counts the sales and sums the amount per region, then run it and check the output.
🎯 Your turn: group sales and total per region
Fill in the GroupBy key, Count(), and the Sum selector.
using System;
using System.Linq;
using System.Collections.Generic;

class Sale
{
    public string Region { get; set; }
    public decimal Amount { get; set; }
}

class Program
{
    static void Main()
    {
        // 🎯 YOUR TURN: fill in the blanks marked with ___, then run it.
        var sales = new List<Sale>
        {
            new Sale { Region = "North", Amount = 100 },
            new Sale { Region = "South", Amount = 250 },
            new Sale { Region = "North", Amount = 75 },
...
2. SelectMany: Flatten Nested Data
Select maps each item to something new — but if each item is itself a list, you end up with a list of lists. SelectMany solves this by flattening: it runs your lambda, then splices each returned sequence into one continuous stream. It's the go-to for departments→employees, posts→tags, or orders→line-items. A second overload takes a result selector so you can pair each child with its parent in one pass.
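A short sketch of both SelectMany overloads mentioned above may help: plain flattening, and the result-selector form that keeps the parent alongside each child. The `SelectManySketch` class, its two helpers, and the posts/tags data are assumptions made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SelectManySketch
{
    // Flatten every post's tag list into one sequence, then drop repeats.
    public static List<string> UniqueTags(IEnumerable<(string Title, List<string> Tags)> posts) =>
        posts.SelectMany(p => p.Tags).Distinct().ToList();

    // Result-selector overload: pair each tag (child) with its post (parent).
    public static List<string> TagsWithPost(IEnumerable<(string Title, List<string> Tags)> posts) =>
        posts.SelectMany(p => p.Tags, (post, tag) => $"{post.Title}: {tag}").ToList();

    static void Main()
    {
        // Assumed sample data, not taken from the lesson.
        var posts = new List<(string Title, List<string> Tags)>
        {
            ("Intro", new List<string> { "csharp", "linq" }),
            ("Paging", new List<string> { "linq" }),
        };

        Console.WriteLine(string.Join(", ", UniqueTags(posts)));

        // Prints "Intro: csharp", "Intro: linq", "Paging: linq" on separate lines.
        foreach (var line in TagsWithPost(posts))
            Console.WriteLine(line);
    }
}
```

A plain Select over the same data would have produced two inner lists rather than three flat strings.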
Worked example: SelectMany flatten + Distinct
Flatten employees from many departments, then collect unique tags.
using System;
using System.Linq;
using System.Collections.Generic;

class Department
{
    public string Name { get; set; }
    public List<string> Employees { get; set; }
}

class Program
{
    static void Main()
    {
        var departments = new List<Department>
        {
            new Department { Name = "Engineering", Employees = new List<string> { "Alice", "Bob", "Carol" } },
            new Department { Name = "Marketing", Employees = new List<string> { "Dave", "Eve" } },
...
Now you try. Each shopping basket holds its own list of prices. Flatten them all into one sequence, then chain an aggregate to total them. Fill in the two blanks:
🎯 Your turn: flatten baskets, then Sum
Use SelectMany to flatten the price lists, then chain Sum() to total them.
using System;
using System.Linq;
using System.Collections.Generic;

class Basket
{
    public string Owner { get; set; }
    public List<int> Prices { get; set; }
}

class Program
{
    static void Main()
    {
        // 🎯 YOUR TURN: fill in the two blanks, then run it.
        var baskets = new List<Basket>
        {
            new Basket { Owner = "Ada", Prices = new List<int> { 10, 20, 5 } },
            new Basket { Owner = "Ben", Prices = new List<int> { 40, 60 } },
            new B
...
3. Join: Combine Two Sources
Real data is rarely in one list. Join matches rows from two sequences on a shared key — exactly like an inner JOIN in SQL. You give it the second source, a key selector for each side, and a result selector that combines the matched pair. Items with no match on the other side are simply dropped (that's an inner join). It's how you stitch customers to their purchases, or orders to their products.
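The four arguments described above (inner source, outer key, inner key, result selector) can be sketched as follows. `JoinSketch`, `Report`, and the sample rows are hypothetical; the data includes one customer with no purchase and one purchase with no customer to show what an inner join drops.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class JoinSketch
{
    // Hypothetical helper: one line per matched (customer, purchase) pair.
    public static List<string> Report(
        IEnumerable<(int Id, string Name)> customers,
        IEnumerable<(int CustomerId, string Product)> purchases)
    {
        return customers.Join(
                purchases,                                   // the second (inner) source
                c => c.Id,                                   // key on the customer side
                p => p.CustomerId,                           // key on the purchase side
                (c, p) => $"{c.Name} bought {p.Product}")    // combine each matched pair
            .ToList();
    }

    static void Main()
    {
        // Assumed sample data, not taken from the lesson.
        var customers = new List<(int Id, string Name)>
        {
            (1, "Alice"), (2, "Bob"), (3, "Carol"),          // Carol has no purchases
        };
        var purchases = new List<(int CustomerId, string Product)>
        {
            (1, "Laptop"), (2, "Mouse"), (1, "Keyboard"),
            (9, "Desk"),                                     // no customer with Id 9
        };

        // Prints: Alice bought Laptop, Alice bought Keyboard, Bob bought Mouse.
        foreach (var line in Report(customers, purchases))
            Console.WriteLine(line);
    }
}
```

Neither Carol nor the desk appears in the output, because neither has a partner on the other side.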
Worked example: join customers to purchases
Match purchases to customers by ID and build a combined report.
using System;
using System.Linq;
using System.Collections.Generic;

class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

class Purchase
{
    public int CustomerId { get; set; }
    public string Product { get; set; }
    public decimal Amount { get; set; }
}

class Program
{
    static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { Id = 1, Name = "Alice" },
            new Customer { Id = 2, Name = "Bob" },
...
4. Aggregate: Custom Reduction
Sum and Count are pre-built reductions; Aggregate is the general one — LINQ's reduce. It walks the sequence carrying an accumulator, applying your lambda (acc, item) => ... at each step. Always prefer the form that takes a seed (the starting accumulator): it lets the result be a different type from the items, and — crucially — it won't throw on an empty sequence the way the seedless form does.
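To make the seed argument concrete, the sketch below shows a seeded fold, a fold whose result type differs from the item type, and the seedless overload failing on an empty list. The class and method names are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class AggregateSketch
{
    // Seeded fold: the accumulator starts at 1, so an empty input returns 1.
    public static int Product(IEnumerable<int> numbers) =>
        numbers.Aggregate(1, (acc, n) => acc * n);

    // The seed is a string while the items are ints: result type differs from item type.
    public static string Sentence(IEnumerable<int> numbers) =>
        numbers.Aggregate("Numbers:", (acc, n) => acc + " " + n);

    // The seedless overload has no starting value, so an empty sequence throws.
    public static bool SeedlessThrowsOnEmpty()
    {
        try
        {
            new List<int>().Aggregate((acc, n) => acc * n);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    static void Main()
    {
        var numbers = new List<int> { 2, 3, 4 };
        Console.WriteLine(Product(numbers));              // 24
        Console.WriteLine(Product(new List<int>()));      // 1 (just the seed)
        Console.WriteLine(Sentence(numbers));             // Numbers: 2 3 4
        Console.WriteLine(SeedlessThrowsOnEmpty());       // True
    }
}
```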
Worked example: Aggregate with and without a seed
Fold numbers into a product, a running total, and a sentence.
using System;
using System.Linq;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        var numbers = new List<int> { 2, 3, 4 };
        // Aggregate is LINQ's "reduce": it folds a sequence into ONE value.
        // It walks left to right, carrying an accumulator (acc) along the way.
        // Without a seed: acc starts as the FIRST item.
        int product = numbers.Aggregate((acc, n) => acc * n);
        Console.WriteLine($"Product: {product}"); // 2 * 3
...
5. Skip / Take: Paging
When you can't show everything at once (search results, an admin table, an API endpoint), you page. Skip(n) jumps past the first n items; Take(n) keeps the next n. The classic formula is Skip((page - 1) * pageSize).Take(pageSize). Their cousins TakeWhile/SkipWhile work on a condition instead of a count: TakeWhile yields items until the first one that fails the test, and SkipWhile discards items until the first one that fails the test, then yields that item and everything after it.
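The paging formula can be wrapped in a small helper, sketched here with a hypothetical `GetPage` method and a 1-based page number (an assumption; some APIs count pages from 0). The second half shows that TakeWhile and SkipWhile only look at the leading run of items.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PagingSketch
{
    // Hypothetical helper: page is 1-based, so page 1 skips nothing.
    public static List<T> GetPage<T>(IEnumerable<T> source, int page, int pageSize) =>
        source.Skip((page - 1) * pageSize).Take(pageSize).ToList();

    static void Main()
    {
        var items = Enumerable.Range(1, 23).ToList();

        Console.WriteLine(string.Join(", ", GetPage(items, 2, 5)));   // 6, 7, 8, 9, 10
        Console.WriteLine(string.Join(", ", GetPage(items, 5, 5)));   // 21, 22, 23 (short last page)
        Console.WriteLine(GetPage(items, 6, 5).Count);                // 0 (past the end, no exception)

        var numbers = new[] { 1, 2, 3, 10, 2, 1 };
        // Both stop testing at the first failure (10); the later 2 and 1 are never re-checked.
        Console.WriteLine(string.Join(", ", numbers.TakeWhile(n => n < 5)));   // 1, 2, 3
        Console.WriteLine(string.Join(", ", numbers.SkipWhile(n => n < 5)));   // 10, 2, 1
    }
}
```

Asking for a page beyond the data returns an empty list rather than throwing, which is usually what a paging endpoint wants.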
Worked example: page 23 items, 5 per page
Build pages with Skip/Take, then try condition-based TakeWhile.
using System;
using System.Linq;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        // 23 results we want to show 5 per page (like search results).
        var items = Enumerable.Range(1, 23)
            .Select(i => $"Item {i}")
            .ToList();
        int pageSize = 5;
        for (int page = 1; page <= 3; page++)
        {
            // Skip the items on earlier pages, Take this page's worth.
            // Page 1 -> Skip 0 Take 5 ; Page 2 -> Ski
...
6. Deferred Execution & Multiple Enumeration
This is the difference between a senior and a junior LINQ user. A query is deferred: defining it runs nothing — it's a recipe. The work happens every time you enumerate it (foreach, Count(), ToList()…). So enumerating the same query twice runs the whole pipeline twice — wasteful if it's expensive, and a real bug if the source changed in between. The fix is to materialise once with ToList() (or ToArray()) and reuse that snapshot. Watch the evaluation count in the example.
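As a compact sketch of both problems named above (repeated work and a source that changes between passes), the example below counts predicate calls and compares a live query with a ToList() snapshot. `DeferredSketch` and `Run` are hypothetical names used only here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredSketch
{
    public static (int Evaluations, int LiveCount, int SnapshotCount) Run()
    {
        var numbers = new List<int> { 1, 2, 3, 4, 5 };
        int evaluations = 0;

        // Defining the query runs nothing; the lambda has not been called yet.
        IEnumerable<int> query = numbers.Where(n => { evaluations++; return n > 2; });

        List<int> snapshot = query.ToList();   // first pass: 5 predicate calls
        numbers.Add(6);                        // the source changes afterwards

        int liveCount = query.Count();         // second pass: 6 more predicate calls
        return (evaluations, liveCount, snapshot.Count);
    }

    static void Main()
    {
        var (evaluations, liveCount, snapshotCount) = Run();
        Console.WriteLine($"Predicate calls: {evaluations}");     // 11
        Console.WriteLine($"Live query count: {liveCount}");      // 4 (3, 4, 5, 6)
        Console.WriteLine($"Snapshot count: {snapshotCount}");    // 3 (3, 4, 5)
    }
}
```

The live query sees the newly added 6 because it re-reads the list on every enumeration; the snapshot does not.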
Worked example: see a query re-run, then freeze it
Watch the pipeline evaluate twice, then snapshot it once with ToList().
using System;
using System.Linq;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        var numbers = new List<int> { 1, 2, 3, 4, 5 };
        // Defining a query runs NOTHING: it's just a stored recipe.
        var query = numbers.Where(n =>
        {
            Console.WriteLine($" evaluating {n}"); // proves WHEN it runs
            return n > 2;
        });
        Console.WriteLine("Query defined — nothing has run yet.");
        Console.WriteLine("Fi
...
🔎 Deep Dive: ToList vs ToArray vs ToDictionary vs ToLookup
These four all force execution — they turn a lazy query into concrete, in-memory data. Which one you pick depends on how you'll use the result:
query.ToList();        // List<T>: the everyday choice; you'll add/index it
query.ToArray();       // T[]: fixed size, tiny bit leaner than a List
query.ToDictionary(    // Dictionary<K,V>: O(1) lookups by a UNIQUE key
    x => x.Id);        // ⚠ throws if two items share the same key
query.ToLookup(        // ILookup<K,V>: like a Dictionary but ONE key -> MANY
    x => x.Category);  // perfect when keys repeat; never throws on duplicates
Rule of thumb: reach for ToList() by default; ToDictionary when you'll look items up by a unique key; and ToLookup when keys repeat (it's essentially a materialised GroupBy you can index into).
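The difference between ToDictionary and ToLookup on repeated or missing keys can be checked with a short sketch. The `MaterialiseSketch` class, its helpers, and the (Id, Category) rows are hypothetical and chosen only to trigger each case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class MaterialiseSketch
{
    // ToDictionary needs unique keys; a repeated key raises ArgumentException.
    public static bool DuplicateKeyThrows(IEnumerable<(int Id, string Category)> items)
    {
        try
        {
            items.ToDictionary(x => x.Category);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    // ToLookup accepts repeated keys, and a missing key yields an empty sequence.
    public static int CountInCategory(IEnumerable<(int Id, string Category)> items, string category) =>
        items.ToLookup(x => x.Category)[category].Count();

    static void Main()
    {
        // Assumed sample data: Ids are unique, categories repeat.
        var items = new List<(int Id, string Category)>
        {
            (1, "Fruit"), (2, "Veg"), (3, "Fruit"),
        };

        var byId = items.ToDictionary(x => x.Id);                 // fine: Ids are unique
        Console.WriteLine(byId[2].Category);                      // Veg
        Console.WriteLine(DuplicateKeyThrows(items));             // True ("Fruit" appears twice)
        Console.WriteLine(CountInCategory(items, "Fruit"));       // 2
        Console.WriteLine(CountInCategory(items, "Dairy"));       // 0 (no exception)
    }
}
```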
7. Performance: Filter Early, Avoid Re-enumeration
LINQ is readable, but order still matters. Filter early so later steps process fewer items; project late so Select only runs on the items that survive the filter; limit with Take so you stop as soon as you have enough. Prefer Any() over Count() > 0: on a lazy query, Any stops at the first match while Count runs the whole pipeline (on a List<T> or an array, Count() is a cheap property read, but Any() still states the intent more clearly). And when you look the same data up repeatedly, build a ToDictionary once (O(1) lookups) instead of calling Where/First in a loop (O(n) every time, the in-memory version of the classic N+1 trap).
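Two of these claims are easy to measure in a sketch: how many items Any() and Count() actually inspect on a lazy query, and how a dictionary built once replaces a search inside a loop. `PerformanceSketch`, its methods, and the product data are hypothetical names for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PerformanceSketch
{
    // Counts predicate calls made by Any() versus Count() on the same lazy query.
    public static (int AnyChecks, int CountChecks) CompareAnyAndCount()
    {
        int checks = 0;
        IEnumerable<int> evens = Enumerable.Range(1, 1000)
            .Where(n => { checks++; return n % 2 == 0; });

        bool any = evens.Any();                 // stops at the first match (n = 2)
        int anyChecks = checks;

        checks = 0;
        bool anyViaCount = evens.Count() > 0;   // walks all 1,000 items
        return (anyChecks, checks);
    }

    // Build the dictionary once, then every lookup is O(1).
    // Assumes every wanted Id exists; a missing Id would throw KeyNotFoundException.
    public static decimal TotalFor(IEnumerable<int> wantedIds, List<(int Id, decimal Price)> products)
    {
        var byId = products.ToDictionary(p => p.Id);
        return wantedIds.Sum(id => byId[id].Price);
    }

    static void Main()
    {
        var (anyChecks, countChecks) = CompareAnyAndCount();
        Console.WriteLine($"Any checked {anyChecks} items, Count checked {countChecks}");   // 2 and 1000

        var products = Enumerable.Range(1, 1000).Select(i => (Id: i, Price: i * 1.5m)).ToList();
        Console.WriteLine(TotalFor(new[] { 2, 4 }, products));   // 9.0
    }
}
```

The alternative to `TotalFor`, calling `products.First(p => p.Id == id)` for each wanted Id, would scan the list from the start on every lookup.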
Worked example: an optimised 1,000-item pipeline
Filter early, project late, Take, Any(), and a ToDictionary lookup.
using System;
using System.Linq;
using System.Collections.Generic;

class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
    public bool InStock { get; set; }
}

class Program
{
    static void Main()
    {
        var products = Enumerable.Range(1, 1000).Select(i => new Product
        {
            Id = i,
            Name = $"Product {i}",
            Price = i * 1.5m,
            InStock = i % 4 != 0
        }).ToList();
...
Pro Tips
- 💡 Materialise once: if you'll iterate a query more than once, call .ToList() first and reuse it; otherwise the whole pipeline re-runs every time.
- 💡 ToLookup beats repeated GroupBy: it runs immediately and lets you index by key, so it's ideal when you'll query the same buckets again and again.
- 💡 Watch for null keys in GroupBy: items with a null key all land in a single group whose Key is null. Coalesce first (o.Category ?? "Unknown") if that's not what you want.
- 💡 Give Aggregate a seed: the seeded form is empty-safe and lets the result differ in type from the items.
- 💡 .NET 6+ shortcuts: DistinctBy(x => x.Key), MaxBy/MinBy, and Chunk(100) (batch a sequence) often replace clunky GroupBy + First patterns.
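The .NET 6+ shortcuts in the last tip can be sketched as below. This assumes a .NET 6 or later target; the `Net6Sketch` class, its helpers, and the sample people are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Net6Sketch
{
    // DistinctBy keeps the first item seen for each key (here, one person per team).
    public static List<string> FirstPerTeam(IEnumerable<(string Name, string Team, int Age)> people) =>
        people.DistinctBy(p => p.Team).Select(p => p.Name).ToList();

    // MaxBy returns the whole item with the largest key, not just the key.
    // Assumes a non-empty input; an empty sequence of value tuples would throw.
    public static string Oldest(IEnumerable<(string Name, string Team, int Age)> people) =>
        people.MaxBy(p => p.Age).Name;

    // Chunk splits a sequence into arrays of at most `size` items.
    public static List<int> BatchSizes(IEnumerable<int> source, int size) =>
        source.Chunk(size).Select(batch => batch.Length).ToList();

    static void Main()
    {
        // Assumed sample data, not taken from the lesson.
        var people = new List<(string Name, string Team, int Age)>
        {
            ("Ana", "Red", 30), ("Ben", "Blue", 41), ("Cy", "Red", 25),
        };

        Console.WriteLine(string.Join(", ", FirstPerTeam(people)));                       // Ana, Ben
        Console.WriteLine(Oldest(people));                                                // Ben
        Console.WriteLine(string.Join(", ", BatchSizes(Enumerable.Range(1, 5), 2)));      // 2, 2, 1
    }
}
```

The pre-.NET 6 equivalent of `FirstPerTeam` would be `GroupBy(p => p.Team).Select(g => g.First())`.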
Common Errors (and the fix)
- "Possible multiple enumeration of IEnumerable" (analyser warning): you iterate the same deferred query twice, re-running the whole pipeline each time. Call .ToList() once and reuse the result.
- "System.ArgumentException: An item with the same key has already been added": ToDictionary hit a duplicate key. Use ToLookup for one-key-many-values, or de-duplicate with DistinctBy / GroupBy(...).Select(g => g.First()) first.
- "System.InvalidOperationException: Sequence contains no elements" from Aggregate: the no-seed overload was called on an empty sequence. Use the Aggregate(seed, ...) form; a seed makes it empty-safe.
- Group "lost" some items / a stray null bucket: a null grouping key sweeps every null-keyed item into one group. Coalesce the key: GroupBy(o => o.Category ?? "Unknown").
- Trying to index a GroupBy result like a list: a group is an IGrouping, not a List. Enumerate it with foreach, or call .ToList() on the group if you need indexing.
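Two of the fixes above, coalescing null keys and de-duplicating before ToDictionary, might look like this in practice. `ErrorFixSketch` and its helpers are hypothetical, and "first row wins" is an assumed de-duplication policy rather than the only reasonable one.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

static class ErrorFixSketch
{
    // Coalesce null keys so they land in a named bucket instead of a null-keyed group.
    public static Dictionary<string, int> CountByCategory(IEnumerable<string?> categories) =>
        categories
            .GroupBy(c => c ?? "Unknown")
            .ToDictionary(g => g.Key, g => g.Count());

    // Group by the key first so a repeated Id cannot make ToDictionary throw.
    // Assumed policy: the first row for each Id wins.
    public static Dictionary<int, string> FirstNameById(IEnumerable<(int Id, string Name)> rows) =>
        rows.GroupBy(r => r.Id)
            .ToDictionary(g => g.Key, g => g.First().Name);

    static void Main()
    {
        var counts = CountByCategory(new List<string?> { "Books", null, "Books", null, "Toys" });
        Console.WriteLine(counts["Unknown"]);   // 2

        var names = FirstNameById(new List<(int Id, string Name)> { (1, "Alice"), (2, "Bob"), (1, "Alicia") });
        Console.WriteLine(names[1]);            // Alice
    }
}
```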
📋 Quick Reference
| Task | Code | Result |
|---|---|---|
| Group + count | items.GroupBy(x => x.Cat).Select(g => (g.Key, Count: g.Count())) | key + count per group |
| Sum per group | g.Sum(x => x.Amount) | a number |
| Flatten | items.SelectMany(x => x.Sub) | flat sequence |
| Join | a.Join(b, x => x.Id, y => y.AId, ...) | matched pairs |
| Reduce | items.Aggregate(0, (a, x) => a + x) | one value |
| Page | items.Skip(10).Take(10) | items 11–20 |
| Snapshot | query.ToList() | List<T> |
| Key lookup | items.ToDictionary(x => x.Id) | O(1) lookups |
Frequently Asked Questions
Q: When should I use GroupBy versus ToLookup?
Both bucket items by a key. GroupBy is deferred — it re-groups every time you enumerate it. ToLookup runs immediately and gives you an indexable ILookup<K, V> you can reuse. Use ToLookup when you'll hit the same buckets repeatedly; GroupBy when it's a one-pass summary.
Q: What's the real difference between Select and SelectMany?
Select gives you one output per input — if each input is a list, you get a list of lists. SelectMany flattens those inner lists into one continuous sequence. Reach for it whenever a Select would leave you with nested collections.
Q: My query seems slow / runs twice — why?
LINQ is lazy, so enumerating the same query object more than once re-executes the entire pipeline each time. Call .ToList() once to materialise the result, then iterate the list as often as you like.
Q: When do I need a seed in Aggregate?
Use a seed whenever the result type differs from the item type, or whenever the sequence might be empty. The seedless overload throws on an empty sequence; Aggregate(0, ...) just returns the seed.
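The first answer above (GroupBy is deferred, ToLookup is a snapshot) can be demonstrated by changing the source after both are created. This is a minimal sketch; `LookupVsGroupBySketch`, `CountAfterAdd`, and the word list are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LookupVsGroupBySketch
{
    public static (int ViaGroupBy, int ViaLookup) CountAfterAdd()
    {
        var words = new List<string> { "apple", "avocado", "banana" };

        var groups = words.GroupBy(w => w[0]);    // deferred: nothing grouped yet
        var lookup = words.ToLookup(w => w[0]);   // runs now: a snapshot of three words

        words.Add("apricot");                     // the source changes afterwards

        int viaGroupBy = groups.Single(g => g.Key == 'a').Count();   // re-groups now: 3
        int viaLookup = lookup['a'].Count();                         // snapshot: still 2
        return (viaGroupBy, viaLookup);
    }

    static void Main()
    {
        var (viaGroupBy, viaLookup) = CountAfterAdd();
        Console.WriteLine($"GroupBy sees {viaGroupBy}, ToLookup sees {viaLookup}");   // 3 and 2
    }
}
```

The lookup is also indexable (`lookup['a']`), whereas the GroupBy result has to be searched for the group with the right Key.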
Mini-Challenge: Spend Per Account
No blanks this time — just a brief and an outline to keep you on track. Starting from the list of transactions, GroupBy the account name, Sum each group's amount, and print one line per account. Then add the bonus: OrderByDescending the total so the biggest account leads. Run it and check your output against the expected lines in the comments.
🎯 Mini-Challenge: total spend per account
GroupBy account, Sum the amounts, then order biggest-first.
using System;
using System.Linq;
using System.Collections.Generic;

class Transaction
{
    public string Account { get; set; }
    public decimal Amount { get; set; }
}

class Program
{
    static void Main()
    {
        var transactions = new List<Transaction>
        {
            new Transaction { Account = "Savings", Amount = 200 },
            new Transaction { Account = "Current", Amount = 50 },
            new Transaction { Account = "Savings", Amount = 125 },
            new Transact
...
🎉 Lesson Complete
- ✅ GroupBy buckets by a key; each group is an IGrouping you can Count/Sum/Average
- ✅ SelectMany flattens nested collections into one sequence
- ✅ Join matches two sources on a shared key (inner join)
- ✅ Aggregate is reduce: give it a seed to stay empty-safe and change types
- ✅ Skip/Take page results; TakeWhile/SkipWhile page by condition
- ✅ Queries are deferred: materialise once with ToList/ToArray/ToDictionary to avoid re-enumeration
- ✅ Next lesson: Deep Dive Into Delegates, the function-passing power behind every LINQ lambda