
    Lesson 15 • Advanced Track

    Advanced LINQ Queries & Optimization

    By the end of this lesson you'll be able to group, flatten, join, and reduce real-world data with LINQ — and write queries that stay fast and predictable in production by mastering deferred execution and materialisation.

    What You'll Learn

    • Group and summarise data with GroupBy (Count, Sum, Average per bucket)
    • Flatten nested collections into one sequence with SelectMany
    • Combine two sources on a shared key with Join
    • Reduce a whole sequence to one value with Aggregate (and a seed)
    • Page through results with Skip / Take
    • Spot and fix deferred-execution traps with ToList / ToArray / ToDictionary

    💡 Real-World Analogy

    Think of a busy mailroom. GroupBy is sorting the post into pigeonholes by recipient, then counting each pile. SelectMany is tipping every pigeonhole out onto one big table — flattening many piles into a single stream. Join is matching each letter to its recipient's address card by a shared name. And deferred execution is the difference between writing the sorting instructions on a clipboard (the query) and actually walking the room to do it (enumerating) — nothing moves until someone reads the clipboard and acts.

    📊 Advanced LINQ Methods Reference

    Method              What it does                        Returns
    GroupBy             Bucket items by a key               IEnumerable<IGrouping<K, T>>
    SelectMany          Flatten nested sequences            one flat sequence
    Join                Match two sources on a key          combined sequence
    Aggregate           Fold to one value (reduce)          a single value
    Skip / Take         Page through a sequence             a sub-sequence
    ToList / ToArray    Run now & snapshot                  List<T> / T[]
    ToDictionary        Build O(1) key lookup               Dictionary<K,V>
    ToLookup            Materialised one-to-many grouping   ILookup<K,V>

    Every one of these still works on IEnumerable<T>. The To... family is special: those are the methods that force execution and turn a lazy recipe into concrete data.

    Running C# locally: install the .NET SDK or use dotnetfiddle.net. Every example here uses an in-memory List<T>, so it runs unchanged in a console app. Keep the using System.Linq; directive at the top of each file.

    1. GroupBy — Bucket & Summarise

    GroupBy splits a collection into groups using the key its lambda returns. The clever part: each group is an IGrouping<K, T> — it carries a .Key and it is itself a sequence of the items in that bucket. That means you can call Count(), Sum(...), or Average(...) straight on a group. Pair it with Select to turn each group into a tidy summary object. Read this worked example, run it, then you'll write your own.

    Worked example: GroupBy with Count, Sum & Average

    Group orders by customer and by category, then summarise each bucket.

    C#
    using System;
    using System.Linq;
    using System.Collections.Generic;
    
    class Order
    {
        public string Customer { get; set; }
        public string Product { get; set; }
        public decimal Price { get; set; }
        public string Category { get; set; }
    }
    
    class Program
    {
        static void Main()
        {
            var orders = new List<Order>
            {
                new Order { Customer = "Alice", Product = "Laptop",   Price = 999, Category = "Electronics" },
                new Order { Customer = "Bob",   Product = "
    ...
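
    To complement the worked example, here is a compact sketch of the group-then-summarise pattern described above. The Order shape mirrors the lesson's class, but the sample rows and the Summarise helper are assumptions made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Order
{
    public string Customer { get; set; }
    public string Category { get; set; }
    public decimal Price { get; set; }
}

static class GroupByDemo
{
    // Hypothetical helper: one summary row per category.
    public static List<(string Category, int Count, decimal Total, decimal Average)> Summarise(
        IEnumerable<Order> orders)
    {
        return orders
            .GroupBy(o => o.Category)              // key = category
            .Select(g => (Category: g.Key,         // each g is an IGrouping<string, Order>
                          Count: g.Count(),
                          Total: g.Sum(o => o.Price),
                          Average: g.Average(o => o.Price)))
            .ToList();
    }
}

class Program
{
    static void Main()
    {
        // Illustrative data only.
        var orders = new List<Order>
        {
            new Order { Customer = "Alice", Category = "Electronics", Price = 100m },
            new Order { Customer = "Bob",   Category = "Books",       Price = 20m  },
            new Order { Customer = "Alice", Category = "Electronics", Price = 300m },
        };

        foreach (var row in GroupByDemo.Summarise(orders))
            Console.WriteLine($"{row.Category}: {row.Count} orders, total {row.Total}, average {row.Average}");
    }
}
```

    Groups come out in the order their keys first appear in the source, so Electronics is listed before Books here.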

    Your turn. The program below groups sales by region — fill in the three blanks so it counts the sales and sums the amount per region, then run it and check the output.

    🎯 Your turn: group sales and total per region

    Fill in the GroupBy key, Count(), and the Sum selector.

    C#
    using System;
    using System.Linq;
    using System.Collections.Generic;
    
    class Sale
    {
        public string Region { get; set; }
        public decimal Amount { get; set; }
    }
    
    class Program
    {
        static void Main()
        {
            // 🎯 YOUR TURN — fill in the blanks marked with ___, then run it.
    
            var sales = new List<Sale>
            {
                new Sale { Region = "North", Amount = 100 },
                new Sale { Region = "South", Amount = 250 },
                new Sale { Region = "North", Amount = 75  },
    
    ...

    2. SelectMany — Flatten Nested Data

    Select maps each item to something new — but if each item is itself a list, you end up with a list of lists. SelectMany solves this by flattening: it runs your lambda, then splices each returned sequence into one continuous stream. It's the go-to for departments→employees, posts→tags, or orders→line-items. A second overload takes a result selector so you can pair each child with its parent in one pass.

    Worked example: SelectMany flatten + Distinct

    Flatten employees from many departments, then collect unique tags.

    C#
    using System;
    using System.Linq;
    using System.Collections.Generic;
    
    class Department
    {
        public string Name { get; set; }
        public List<string> Employees { get; set; }
    }
    
    class Program
    {
        static void Main()
        {
            var departments = new List<Department>
            {
                new Department { Name = "Engineering", Employees = new List<string> { "Alice", "Bob", "Carol" } },
                new Department { Name = "Marketing",   Employees = new List<string> { "Dave", "Eve" } },
               
    ...
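
    The result-selector overload mentioned above is easiest to see next to the plain flatten. This is a minimal sketch with made-up data; the SelectManyDemo helper names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Department
{
    public string Name { get; set; }
    public List<string> Employees { get; set; }
}

static class SelectManyDemo
{
    // Plain flatten: every employee from every department, in source order.
    public static List<string> AllEmployees(IEnumerable<Department> departments)
    {
        return departments.SelectMany(d => d.Employees).ToList();
    }

    // Result-selector overload: the second lambda receives both parent (d) and child (e).
    public static List<string> EmployeesWithDepartment(IEnumerable<Department> departments)
    {
        return departments
            .SelectMany(d => d.Employees, (d, e) => $"{e} ({d.Name})")
            .ToList();
    }
}

class Program
{
    static void Main()
    {
        // Illustrative data only.
        var departments = new List<Department>
        {
            new Department { Name = "Engineering", Employees = new List<string> { "Alice", "Bob" } },
            new Department { Name = "Marketing",   Employees = new List<string> { "Dave" } },
        };

        Console.WriteLine(string.Join(", ", SelectManyDemo.AllEmployees(departments)));
        // Alice, Bob, Dave
        Console.WriteLine(string.Join(", ", SelectManyDemo.EmployeesWithDepartment(departments)));
        // Alice (Engineering), Bob (Engineering), Dave (Marketing)
    }
}
```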

    Now you try. Each shopping basket holds its own list of prices. Flatten them all into one sequence, then chain an aggregate to total them. Fill in the two blanks:

    🎯 Your turn: flatten baskets, then Sum

    Use SelectMany to flatten the price lists, then chain Sum() to total them.

    C#
    using System;
    using System.Linq;
    using System.Collections.Generic;
    
    class Basket
    {
        public string Owner { get; set; }
        public List<int> Prices { get; set; }
    }
    
    class Program
    {
        static void Main()
        {
            // 🎯 YOUR TURN — fill in the two blanks, then run it.
    
            var baskets = new List<Basket>
            {
                new Basket { Owner = "Ada",  Prices = new List<int> { 10, 20, 5 } },
                new Basket { Owner = "Ben",  Prices = new List<int> { 40, 60 } },
                new B
    ...

    3. Join — Combine Two Sources

    Real data is rarely in one list. Join matches rows from two sequences on a shared key — exactly like an inner JOIN in SQL. You give it the second source, a key selector for each side, and a result selector that combines the matched pair. Items with no match on the other side are simply dropped (that's an inner join). It's how you stitch customers to their purchases, or orders to their products.

    Worked example: join customers to purchases

    Match purchases to customers by ID and build a combined report.

    C#
    using System;
    using System.Linq;
    using System.Collections.Generic;
    
    class Customer
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }
    
    class Purchase
    {
        public int CustomerId { get; set; }
        public string Product { get; set; }
        public decimal Amount { get; set; }
    }
    
    class Program
    {
        static void Main()
        {
            var customers = new List<Customer>
            {
                new Customer { Id = 1, Name = "Alice" },
                new Customer { Id = 2, Name = "Bob" },
     
    ...
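
    To make the inner-join behaviour concrete, the sketch below includes one customer with no purchases and one purchase with no customer; both disappear from the result. The class shapes follow the lesson, while the data and the Report helper are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

class Purchase
{
    public int CustomerId { get; set; }
    public string Product { get; set; }
}

static class JoinDemo
{
    // Hypothetical helper: one line per matched (customer, purchase) pair.
    public static List<string> Report(IEnumerable<Customer> customers, IEnumerable<Purchase> purchases)
    {
        return customers
            .Join(purchases,                  // second source
                  c => c.Id,                  // key on the customer side
                  p => p.CustomerId,          // key on the purchase side
                  (c, p) => $"{c.Name} bought {p.Product}")
            .ToList();
    }
}

class Program
{
    static void Main()
    {
        // Illustrative data only.
        var customers = new List<Customer>
        {
            new Customer { Id = 1, Name = "Alice" },
            new Customer { Id = 2, Name = "Bob" },
            new Customer { Id = 3, Name = "Carol" },   // no purchases: dropped
        };
        var purchases = new List<Purchase>
        {
            new Purchase { CustomerId = 1,  Product = "Laptop" },
            new Purchase { CustomerId = 1,  Product = "Mouse" },
            new Purchase { CustomerId = 2,  Product = "Keyboard" },
            new Purchase { CustomerId = 99, Product = "Orphan" },   // no customer: dropped
        };

        foreach (var line in JoinDemo.Report(customers, purchases))
            Console.WriteLine(line);
        // Alice bought Laptop
        // Alice bought Mouse
        // Bob bought Keyboard
    }
}
```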

    4. Aggregate — Custom Reduction

    Sum and Count are pre-built reductions; Aggregate is the general one — LINQ's reduce. It walks the sequence carrying an accumulator, applying your lambda (acc, item) => ... at each step. Always prefer the form that takes a seed (the starting accumulator): it lets the result be a different type from the items, and — crucially — it won't throw on an empty sequence the way the seedless form does.

    Worked example: Aggregate with and without a seed

    Fold numbers into a product, a running total, and a sentence.

    C#
    using System;
    using System.Linq;
    using System.Collections.Generic;
    
    class Program
    {
        static void Main()
        {
            var numbers = new List<int> { 2, 3, 4 };
    
            // Aggregate is LINQ's "reduce": it folds a sequence into ONE value.
            // It walks left to right, carrying an accumulator (acc) along the way.
    
            // Without a seed: acc starts as the FIRST item.
            int product = numbers.Aggregate((acc, n) => acc * n);
            Console.WriteLine($"Product: {product}");   // 2 * 3
    ...
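
    The two claims about the seed (a different result type, and safety on empty input) can be checked with a short sketch. The AggregateDemo helpers and the sample values are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class AggregateDemo
{
    // Seeded product: the seed 1 is returned unchanged for an empty sequence.
    public static int Product(IEnumerable<int> numbers) =>
        numbers.Aggregate(1, (acc, n) => acc * n);

    // Seeded fold whose result type (int) differs from the item type (string).
    public static int TotalLength(IEnumerable<string> words) =>
        words.Aggregate(0, (acc, w) => acc + w.Length);
}

class Program
{
    static void Main()
    {
        Console.WriteLine(AggregateDemo.Product(new List<int> { 2, 3, 4 }));            // 24
        Console.WriteLine(AggregateDemo.TotalLength(new List<string> { "fold", "left" })); // 8

        var empty = new List<int>();
        Console.WriteLine(AggregateDemo.Product(empty));                                // 1 (just the seed)

        try
        {
            // Seedless overload: there is no first item to start from, so it throws.
            empty.Aggregate((acc, n) => acc * n);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("Seedless Aggregate threw on an empty sequence.");
        }
    }
}
```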

    5. Skip / Take — Paging

    When you can't show everything at once (search results, an admin table, an API endpoint), you page. Skip(n) jumps past the first n items; Take(n) keeps the next n. The classic formula is Skip((page - 1) * pageSize).Take(pageSize). Their cousins TakeWhile/SkipWhile work on a condition instead of a count: TakeWhile yields items until the first one fails the test, then stops; SkipWhile discards items until the first one fails the test, then yields everything after it, including later items that would have passed.

    Worked example: page 23 items, 5 per page

    Build pages with Skip/Take, then try condition-based TakeWhile.

    C#
    using System;
    using System.Linq;
    using System.Collections.Generic;
    
    class Program
    {
        static void Main()
        {
            // 23 results we want to show 5 per page (like search results).
            var items = Enumerable.Range(1, 23)
                .Select(i => $"Item {i}")
                .ToList();
    
            int pageSize = 5;
    
            for (int page = 1; page <= 3; page++)
            {
                // Skip the items on earlier pages, Take this page's worth.
                // Page 1 -> Skip 0 Take 5 ; Page 2 -> Ski
    ...
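
    The paging formula can be wrapped in a small reusable helper. This is only a sketch: Paging, GetPage and PageCount are hypothetical names, and pages are assumed to be 1-based as in the formula above. The last lines contrast TakeWhile with SkipWhile on the same input.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Paging
{
    // Returns the requested 1-based page; an out-of-range page yields an empty list.
    public static List<T> GetPage<T>(IEnumerable<T> source, int page, int pageSize)
    {
        if (page < 1) throw new ArgumentOutOfRangeException(nameof(page));
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));
        return source.Skip((page - 1) * pageSize).Take(pageSize).ToList();
    }

    // Ceiling division: 23 items at 5 per page need 5 pages.
    public static int PageCount(int totalItems, int pageSize) =>
        (totalItems + pageSize - 1) / pageSize;
}

class Program
{
    static void Main()
    {
        var items = Enumerable.Range(1, 23).ToList();

        Console.WriteLine(Paging.PageCount(items.Count, 5));                 // 5
        Console.WriteLine(string.Join(", ", Paging.GetPage(items, 5, 5)));   // 21, 22, 23
        Console.WriteLine(Paging.GetPage(items, 6, 5).Count);                // 0

        var readings = new List<int> { 1, 2, 3, 10, 2, 1 };
        // TakeWhile stops at the first failure (10), even though 2 and 1 follow.
        Console.WriteLine(string.Join(", ", readings.TakeWhile(n => n < 5)));   // 1, 2, 3
        // SkipWhile drops the leading run, then yields everything after it.
        Console.WriteLine(string.Join(", ", readings.SkipWhile(n => n < 5)));   // 10, 2, 1
    }
}
```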

    6. Deferred Execution & Multiple Enumeration

    This is the difference between a senior and a junior LINQ user. A query is deferred: defining it runs nothing — it's a recipe. The work happens every time you enumerate it (foreach, Count(), ToList()…). So enumerating the same query twice runs the whole pipeline twice — wasteful if it's expensive, and a real bug if the source changed in between. The fix is to materialise once with ToList() (or ToArray()) and reuse that snapshot. Watch the evaluation count in the example.

    Worked example: see a query re-run, then freeze it

    Watch the pipeline evaluate twice, then snapshot it once with ToList().

    C#
    using System;
    using System.Linq;
    using System.Collections.Generic;
    
    class Program
    {
        static void Main()
        {
            var numbers = new List<int> { 1, 2, 3, 4, 5 };
    
            // Defining a query runs NOTHING — it's just a stored recipe.
            var query = numbers.Where(n =>
            {
                Console.WriteLine($"  evaluating {n}");   // proves WHEN it runs
                return n > 2;
            });
    
            Console.WriteLine("Query defined — nothing has run yet.");
            Console.WriteLine("Fi
    ...
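
    The "source changed in between" case is worth seeing on its own. In this sketch (the DeferredDemo helper is hypothetical), the list is modified after a snapshot is taken: the live query picks up the change, the snapshot does not, and a counter shows how often the predicate actually ran.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredDemo
{
    public static (int Live, int Frozen, int Evaluations) Run()
    {
        int evaluations = 0;
        var numbers = new List<int> { 1, 2, 3, 4, 5 };

        // Defining the query runs nothing yet; evaluations is still 0 here.
        IEnumerable<int> query = numbers.Where(n => { evaluations++; return n > 2; });

        // First enumeration: 5 predicate calls, snapshot holds 3, 4, 5.
        List<int> snapshot = query.ToList();

        // The source changes after the snapshot was taken.
        numbers.Add(6);

        // Second enumeration re-runs the pipeline over all 6 items and sees the new one.
        int live = query.Count();

        return (live, snapshot.Count, evaluations);
    }
}

class Program
{
    static void Main()
    {
        var result = DeferredDemo.Run();
        Console.WriteLine($"Live query count:  {result.Live}");         // 4
        Console.WriteLine($"Snapshot count:    {result.Frozen}");       // 3
        Console.WriteLine($"Predicate calls:   {result.Evaluations}");  // 11 (5 + 6)
    }
}
```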

    🔎 Deep Dive: ToList vs ToArray vs ToDictionary vs ToLookup

    These four all force execution — they turn a lazy query into concrete, in-memory data. Which one you pick depends on how you'll use the result:

    query.ToList();        // List<T>  — the everyday choice; you'll add/index it
    query.ToArray();       // T[]      — fixed size, tiny bit leaner than a List
    query.ToDictionary(    // Dictionary<K,V> — O(1) lookups by a UNIQUE key
        x => x.Id);        // ⚠ throws if two items share the same key
    query.ToLookup(        // ILookup<K,V> — like a Dictionary but ONE key -> MANY
        x => x.Category);  // perfect when keys repeat; never throws on duplicates

    Rule of thumb: reach for ToList() by default; ToDictionary when you'll look items up by a unique key; and ToLookup when keys repeat (it's essentially a materialised GroupBy you can index into).
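
    A short sketch of that rule of thumb, using a hypothetical Item class and made-up rows: ToDictionary for a unique key, ToLookup for a repeating one, and what happens when ToDictionary is pointed at a key that repeats.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Item
{
    public int Id { get; set; }
    public string Category { get; set; }
    public string Name { get; set; }
}

static class LookupDemo
{
    // Unique key: one Item per Id.
    public static Dictionary<int, Item> ById(IEnumerable<Item> items) =>
        items.ToDictionary(i => i.Id);

    // Repeating key: many Items per Category.
    public static ILookup<string, Item> ByCategory(IEnumerable<Item> items) =>
        items.ToLookup(i => i.Category);
}

class Program
{
    static void Main()
    {
        // Illustrative data only.
        var items = new List<Item>
        {
            new Item { Id = 1, Category = "Books", Name = "Novel" },
            new Item { Id = 2, Category = "Books", Name = "Atlas" },
            new Item { Id = 3, Category = "Toys",  Name = "Robot" },
        };

        var byId = LookupDemo.ById(items);
        Console.WriteLine(byId[2].Name);                     // Atlas

        var byCategory = LookupDemo.ByCategory(items);
        Console.WriteLine(byCategory["Books"].Count());      // 2
        Console.WriteLine(byCategory["Garden"].Count());     // 0 (missing key gives an empty sequence)

        try
        {
            // Category repeats, so it cannot serve as a dictionary key.
            items.ToDictionary(i => i.Category);
        }
        catch (ArgumentException)
        {
            Console.WriteLine("ToDictionary threw on a duplicate key.");
        }
    }
}
```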

    7. Performance — Filter Early, Avoid Re-enumeration

    LINQ is readable, but order still matters. Filter early so later steps process fewer items; project late so you only carry the fields you need; limit with Take so you stop as soon as you have enough. Prefer Any() over Count() > 0: Any stops at the first match, while Count with a predicate (or on a sequence that isn't a collection) has to scan everything. And when you look the same data up repeatedly, build a ToDictionary once (O(1) per lookup) instead of calling Where/First inside a loop (O(n) per lookup, so O(n × m) for m lookups; it's the in-memory cousin of the N+1 query trap).

    Worked example: an optimised 1,000-item pipeline

    Filter early, project late, Take, Any(), and a ToDictionary lookup.

    C#
    using System;
    using System.Linq;
    using System.Collections.Generic;
    
    class Product
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public decimal Price { get; set; }
        public bool InStock { get; set; }
    }
    
    class Program
    {
        static void Main()
        {
            var products = Enumerable.Range(1, 1000).Select(i => new Product
            {
                Id = i,
                Name = $"Product {i}",
                Price = i * 1.5m,
                InStock = i % 4 != 0
            }).ToList();
    
    ...
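
    Two of the claims above can be measured rather than taken on trust. The sketch below counts predicate calls for Any versus Count(...) > 0, and replaces a scan-per-lookup with a dictionary built once. PerfDemo and its data are assumptions for illustration; the prices follow the lesson's i * 1.5m pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PerfDemo
{
    // How many times does the predicate run for each approach?
    public static (int AnyCalls, int CountCalls) PredicateCalls()
    {
        var ids = Enumerable.Range(1, 1000).ToList();
        int anyCalls = 0, countCalls = 0;

        // Any stops at the first match (id 4).
        _ = ids.Any(n => { anyCalls++; return n % 4 == 0; });

        // Count with a predicate must test every item before comparing with 0.
        _ = ids.Count(n => { countCalls++; return n % 4 == 0; }) > 0;

        return (anyCalls, countCalls);
    }

    // Build the dictionary once, then each lookup is a hash probe instead of a scan.
    public static decimal TotalPrice(IEnumerable<int> wantedIds)
    {
        Dictionary<int, decimal> priceById =
            Enumerable.Range(1, 1000).ToDictionary(id => id, id => id * 1.5m);

        return wantedIds.Sum(id => priceById[id]);
    }
}

class Program
{
    static void Main()
    {
        var calls = PerfDemo.PredicateCalls();
        Console.WriteLine($"Any:   {calls.AnyCalls} predicate calls");     // 4
        Console.WriteLine($"Count: {calls.CountCalls} predicate calls");   // 1000

        // 15 + 750 + 1498.5
        Console.WriteLine(PerfDemo.TotalPrice(new[] { 10, 500, 999 }));
    }
}
```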

    Pro Tips

    • 💡 Materialise once: if you'll iterate a query more than once, call .ToList() first and reuse it — otherwise the whole pipeline re-runs every time.
    • 💡 ToLookup beats repeated GroupBy: it runs immediately and lets you index by key, so it's ideal when you'll query the same buckets again and again.
    • 💡 Watch for null keys in GroupBy: items with a null key all land in a single group whose Key is null — coalesce first (o.Category ?? "Unknown") if that's not what you want.
    • 💡 Give Aggregate a seed: the seeded form is empty-safe and lets the result differ in type from the items.
    • 💡 .NET 6+ shortcuts: DistinctBy(x => x.Key), MaxBy/MinBy, and Chunk(100) (batch a sequence) often replace clunky GroupBy + First patterns.
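
    The .NET 6+ shortcuts from the last tip look like this in practice. It is a sketch that assumes a .NET 6 or later target (these methods do not exist in earlier versions); the Net6Demo class and the sample people are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Net6Demo
{
    // Illustrative data only.
    static readonly (string Name, string City, int Age)[] People =
    {
        ("Ann", "Oslo", 31),
        ("Bo",  "Rome", 45),
        ("Cy",  "Oslo", 28),
    };

    // DistinctBy keeps the first item seen for each key.
    public static List<string> FirstPerCity() =>
        People.DistinctBy(p => p.City).Select(p => p.Name).ToList();

    // MaxBy returns the whole item with the largest key, not just the key.
    public static string Oldest() =>
        People.MaxBy(p => p.Age).Name;

    // Chunk splits a sequence into arrays of at most the given size.
    public static List<int> BatchSizes() =>
        Enumerable.Range(1, 7).Chunk(3).Select(batch => batch.Length).ToList();
}

class Program
{
    static void Main()
    {
        Console.WriteLine(string.Join(", ", Net6Demo.FirstPerCity()));   // Ann, Bo
        Console.WriteLine(Net6Demo.Oldest());                            // Bo
        Console.WriteLine(string.Join(", ", Net6Demo.BatchSizes()));     // 3, 3, 1
    }
}
```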

    Common Errors (and the fix)

    • "Possible multiple enumeration of IEnumerable" (analyser warning): you iterate the same deferred query twice, re-running the whole pipeline each time. Call .ToList() once and reuse the result.
    • "System.ArgumentException: An item with the same key has already been added": ToDictionary hit a duplicate key. Use ToLookup for one-key-many-values, or de-duplicate with DistinctBy / GroupBy(...).Select(g => g.First()) first.
    • "System.InvalidOperationException: Sequence contains no elements" from Aggregate: the no-seed overload was called on an empty sequence. Use the Aggregate(seed, ...) form — a seed makes it empty-safe.
    • Group "lost" some items / a stray null bucket: a null grouping key sweeps every null-keyed item into one group. Coalesce the key: GroupBy(o => o.Category ?? "Unknown").
    • Trying to index a GroupBy result like a list: a group is an IGrouping, not a List. Enumerate it with foreach, or call .ToList() on the group if you need indexing.
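
    The last two entries can be reproduced in a few lines. This sketch uses a plain list of category names (made-up data) and a hypothetical NullKeyDemo helper: it shows the null bucket, the coalescing fix, and how to get indexable items out of a group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class NullKeyDemo
{
    // Coalesce the key before grouping so null-keyed items get a named bucket.
    // (A null key would also make ToDictionary throw, so the coalesce is needed here.)
    public static Dictionary<string, int> CountPerCategory(IEnumerable<string> categories)
    {
        return categories
            .GroupBy(c => c ?? "Unknown")
            .ToDictionary(g => g.Key, g => g.Count());
    }
}

class Program
{
    static void Main()
    {
        var categories = new List<string> { "Books", null, "Toys", null };

        // Without coalescing, both nulls land in one group whose Key is null.
        var nullGroup = categories.GroupBy(c => c).First(g => g.Key == null);
        Console.WriteLine(nullGroup.Count());        // 2

        var counts = NullKeyDemo.CountPerCategory(categories);
        Console.WriteLine(counts["Unknown"]);        // 2
        Console.WriteLine(counts["Books"]);          // 1

        // An IGrouping has no indexer; materialise the group first if you need one.
        List<string> toys = categories
            .GroupBy(c => c ?? "Unknown")
            .First(g => g.Key == "Toys")
            .ToList();
        Console.WriteLine(toys[0]);                  // Toys
    }
}
```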

    📋 Quick Reference

    Task            Code                                     Result
    Group + count   items.GroupBy(x => x.Cat)                groups w/ .Key
    Sum per group   g.Sum(x => x.Amount)                     a number
    Flatten         items.SelectMany(x => x.Sub)             flat sequence
    Join            a.Join(b, x => x.Id, y => y.AId, ...)    matched pairs
    Reduce          items.Aggregate(0, (a, x) => a + x)      one value
    Page            items.Skip(10).Take(10)                  items 11–20
    Snapshot        query.ToList()                           List<T>
    Key lookup      items.ToDictionary(x => x.Id)            O(1) lookups

    Frequently Asked Questions

    Q: When should I use GroupBy versus ToLookup?

    Both bucket items by a key. GroupBy is deferred — it re-groups every time you enumerate it. ToLookup runs immediately and gives you an indexable ILookup<K, V> you can reuse. Use ToLookup when you'll hit the same buckets repeatedly; GroupBy when it's a one-pass summary.

    Q: What's the real difference between Select and SelectMany?

    Select gives you one output per input — if each input is a list, you get a list of lists. SelectMany flattens those inner lists into one continuous sequence. Reach for it whenever a Select would leave you with nested collections.

    Q: My query seems slow / runs twice — why?

    LINQ is lazy, so enumerating the same query object more than once re-executes the entire pipeline each time. Call .ToList() once to materialise the result, then iterate the list as often as you like.

    Q: When do I need a seed in Aggregate?

    Use a seed whenever the result type differs from the item type, or whenever the sequence might be empty. The seedless overload throws on an empty sequence; Aggregate(0, ...) just returns the seed.

    Mini-Challenge: Spend Per Account

    No blanks this time — just a brief and an outline to keep you on track. Starting from the list of transactions, GroupBy the account name, Sum each group's amount, and print one line per account. Then add the bonus: OrderByDescending the total so the biggest account leads. Run it and check your output against the expected lines in the comments.

    🎯 Mini-Challenge: total spend per account

    GroupBy account, Sum the amounts, then order biggest-first.

    C#
    using System;
    using System.Linq;
    using System.Collections.Generic;
    
    class Transaction
    {
        public string Account { get; set; }
        public decimal Amount { get; set; }
    }
    
    class Program
    {
        static void Main()
        {
            var transactions = new List<Transaction>
            {
                new Transaction { Account = "Savings", Amount = 200 },
                new Transaction { Account = "Current", Amount = 50  },
                new Transaction { Account = "Savings", Amount = 125 },
                new Transact
    ...

    🎉 Lesson Complete

    • GroupBy buckets by a key; each group is an IGrouping you can Count/Sum/Average
    • SelectMany flattens nested collections into one sequence
    • Join matches two sources on a shared key (inner join)
    • Aggregate is reduce — give it a seed to stay empty-safe and change types
    • Skip/Take page results; TakeWhile/SkipWhile cut a sequence by condition instead of by count
    • ✅ Queries are deferred — materialise once with ToList/ToArray/ToDictionary to avoid re-enumeration
    • Next lesson: Deep Dive Into Delegates — the function-passing power behind every LINQ lambda
