
    Lesson 33 • Advanced

    Semantic Segmentation

    Label every pixel in an image — learn U-Net's encoder-decoder architecture, skip connections, and segmentation metrics like mIoU and Dice.

    ✅ What You'll Learn

• U-Net: encoder-decoder with skip connections
• Semantic vs instance vs panoptic segmentation
• Metrics: mIoU, Dice score, pixel accuracy
• DeepLab and dilated convolutions

    🎨 Colouring Every Pixel

    🎯 Real-World Analogy: Object detection draws rectangles around objects. Semantic segmentation is like colouring every pixel with a different colour for each class — imagine a colouring book where "sky" is blue, "road" is grey, and "car" is red, but you have to colour every single pixel correctly. Self-driving cars need this level of detail to know exactly where the road ends and the sidewalk begins.

    Segmentation is essential for autonomous driving, medical image analysis (tumour boundaries), satellite imagery, and video editing (background removal). U-Net (2015) remains the most influential architecture, especially in medical imaging where labelled data is scarce.

    Try It: U-Net Architecture

    See how the encoder-decoder structure preserves spatial details

    Python
import numpy as np

# U-Net: The Encoder-Decoder Architecture for Segmentation
# Contracts to capture context, expands for precise localisation

np.random.seed(42)

def simulate_encoder(image, levels=4):
    """Encoder: progressively downsample and increase channels."""
    features = []
    x = image
    print("ENCODER (downsampling path):")
    for i in range(levels):
        h, w = x.shape[0] // 2, x.shape[1] // 2
        channels = 64 * (2 ** i)
        x = np.random.randn(h, w)  # Simulated feature map after conv + pool
        features.append(x)         # Saved for the decoder's skip connections
        print(f"  Level {i + 1}: {h}x{w}, {channels} channels")
    return features

image = np.random.randn(256, 256)
skip_features = simulate_encoder(image)
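The decoder half of U-Net mirrors the encoder: it takes the deepest (smallest) feature map, repeatedly upsamples it, and fuses in the matching encoder feature map at each resolution via a skip connection. A minimal sketch in the same simulated style (the `simulate_decoder` helper and the fusion-by-addition shortcut are illustrative, not the lesson's code — a real U-Net concatenates channels and applies convolutions):

```python
import numpy as np

np.random.seed(42)

def simulate_decoder(skip_features):
    """Decoder: upsample and fuse each saved encoder feature map."""
    x = skip_features[-1]  # Start from the deepest (smallest) feature map
    print("DECODER (upsampling path):")
    for i, skip in enumerate(reversed(skip_features[:-1])):
        # Nearest-neighbour upsample by 2 to match the skip's resolution
        x = np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)
        # Skip connection: fuse encoder features (simulated here as addition)
        x = x + skip
        print(f"  Level {i + 1}: {x.shape[0]}x{x.shape[1]} (fused with skip connection)")
    return x

# Four encoder feature maps at 128x128, 64x64, 32x32, 16x16
skips = [np.random.randn(s, s) for s in (128, 64, 32, 16)]
out = simulate_decoder(skips)
print(f"Final map: {out.shape[0]}x{out.shape[1]}")
```

The skip connections are what make U-Net precise: the upsampled path knows *what* is in the image, while the encoder features restore *where* the fine boundaries are.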

    Try It: Segmentation Metrics

    Calculate IoU, Dice, and pixel accuracy per class

    Python
import numpy as np

# Segmentation Metrics: IoU, Dice, Pixel Accuracy
# Evaluate how well each pixel is classified

np.random.seed(42)

def pixel_accuracy(pred, target):
    return np.mean(pred == target)

def iou_per_class(pred, target, n_classes):
    """Intersection over Union per class."""
    ious = []
    for c in range(n_classes):
        pred_c = (pred == c)
        target_c = (target == c)
        intersection = np.sum(pred_c & target_c)
        union = np.sum(pred_c | target_c)
        ious.append(intersection / union if union > 0 else float("nan"))
    return ious

def dice_per_class(pred, target, n_classes):
    """Dice score per class: 2|A∩B| / (|A| + |B|)."""
    scores = []
    for c in range(n_classes):
        pred_c, target_c = (pred == c), (target == c)
        denom = pred_c.sum() + target_c.sum()
        scores.append(2 * np.sum(pred_c & target_c) / denom if denom > 0 else float("nan"))
    return scores

# Toy 8x8 masks with 3 classes; corrupt the top rows of the prediction
target = np.random.randint(0, 3, (8, 8))
pred = target.copy()
pred[:2] = 0

print(f"Pixel accuracy: {pixel_accuracy(pred, target):.2f}")
print(f"mIoU: {np.nanmean(iou_per_class(pred, target, 3)):.2f}")
print(f"Mean Dice: {np.nanmean(dice_per_class(pred, target, 3)):.2f}")

    ⚠️ Common Mistake: Using pixel accuracy as your primary metric. If 80% of your image is background, a model that predicts "background everywhere" gets 80% accuracy but is useless. Always use mIoU — it treats all classes equally regardless of their pixel count.
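The failure mode above is easy to verify with a few lines of arithmetic. Here a lazy model predicts "background everywhere" on an image that is 80% background (the numbers are a made-up illustration, not from the lesson):

```python
import numpy as np

# A 100-pixel image: 80 background pixels (class 0), 20 object pixels (class 1)
target = np.zeros(100, dtype=int)
target[:20] = 1

pred = np.zeros(100, dtype=int)  # Predict "background" for every pixel

pixel_acc = np.mean(pred == target)  # 0.80 — looks decent!

# IoU for background: intersection 80, union 100
iou_bg = np.sum((pred == 0) & (target == 0)) / np.sum((pred == 0) | (target == 0))
iou_obj = 0.0  # No predicted object pixels, so intersection is 0
miou = (iou_bg + iou_obj) / 2  # 0.40 — reveals the failure

print(f"Pixel accuracy: {pixel_acc:.2f}, mIoU: {miou:.2f}")
# → Pixel accuracy: 0.80, mIoU: 0.40
```

Because mIoU averages over classes rather than pixels, the useless prediction for the object class drags the score down no matter how small that class is.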

    💡 Pro Tip: For medical imaging, start with U-Net + Dice loss (not cross-entropy). Dice loss handles class imbalance naturally. For general segmentation, use DeepLabV3+ with an EfficientNet backbone — it's the best accuracy/speed tradeoff. SAM (Segment Anything) can segment any object with zero-shot prompts.
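To see why Dice loss handles imbalance well, here is a minimal soft-Dice sketch on a mask that is mostly background (the `dice_loss` helper and the toy probabilities are illustrative assumptions, not a library API):

```python
import numpy as np

def dice_loss(probs, target, eps=1e-6):
    """Soft Dice loss: 1 - 2|A∩B| / (|A| + |B|).
    probs: predicted foreground probabilities; target: binary mask."""
    intersection = np.sum(probs * target)
    return 1 - (2 * intersection + eps) / (np.sum(probs) + np.sum(target) + eps)

# Tiny imbalanced mask: 4 foreground pixels out of 64
target = np.zeros(64)
target[:4] = 1.0

good = np.where(target == 1, 0.95, 0.05)  # Confident, mostly correct
lazy = np.full(64, 0.01)                   # "Background everywhere"

print(f"Dice loss (good): {dice_loss(good, target):.3f}")
print(f"Dice loss (lazy): {dice_loss(lazy, target):.3f}")
```

Unlike per-pixel cross-entropy, the loss is computed on the overlap ratio itself, so the 60 background pixels cannot drown out the 4 foreground pixels: the lazy predictor's loss is close to 1 while the good predictor's is far lower.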

    📋 Quick Reference

Architecture | Key Feature           | Best For
U-Net        | Skip connections      | Medical, small datasets
DeepLabV3+   | Atrous (dilated) conv | General segmentation
Mask R-CNN   | Instance masks        | Instance segmentation
SegFormer    | Transformer-based     | High accuracy
SAM          | Zero-shot prompts     | Universal segmentation
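The atrous (dilated) convolutions in the DeepLab row widen the receptive field without downsampling by spacing the kernel taps apart. A 1-D sketch of the idea (the `dilated_conv1d` helper is illustrative, not DeepLab's implementation):

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """1-D dilated convolution (valid padding): taps are `dilation` apart."""
    k = len(kernel)
    span = (k - 1) * dilation + 1  # Effective receptive field of the kernel
    out = np.array([
        sum(kernel[j] * x[i + j * dilation] for j in range(k))
        for i in range(len(x) - span + 1)
    ])
    return out, span

x = np.arange(16, dtype=float)
kernel = np.array([1.0, 1.0, 1.0])

for d in (1, 2, 4):
    _, span = dilated_conv1d(x, kernel, d)
    print(f"dilation={d}: 3-tap kernel sees a span of {span} inputs")
# The same 3 weights cover spans of 3, 5, and 9 inputs as dilation grows
```

Stacking layers with increasing dilation lets DeepLab capture wide context at full resolution, avoiding the detail loss that pooling would cause.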

    🎉 Lesson Complete!

    You can now label every pixel in an image! Next, learn how to process audio data for speech recognition.


