Lesson 4 • Beginner
Linear Regression
Your first real ML algorithm — predict continuous values by finding the best line through your data.
✅ What You'll Learn
- Simple linear regression (one feature)
- How gradient descent "learns" parameters
- Multiple regression (many features)
- Evaluation metrics: MAE, MSE, RMSE, R²
📈 What Is Linear Regression?
🎯 Real-World Analogy: Imagine you're a real estate agent who's seen hundreds of houses sell. Over time, you develop an intuition: "bigger houses sell for more." Linear regression formalises that intuition mathematically — it finds the exact formula: price = $150 × size + $50,000.
Linear regression draws the "best fit line" through your data points. "Best fit" means the line that minimises the total prediction error across all data points.
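To make "total prediction error" concrete, here is a small sketch comparing the summed squared error of two candidate lines (the three houses below are made-up numbers chosen to match the analogy's formula, not a real dataset):

```python
import numpy as np

# Toy data: house sizes (sq ft) and sale prices ($)
sizes = np.array([1000, 1500, 2000])
prices = np.array([200000, 275000, 350000])

def total_squared_error(m, b):
    """Sum of squared differences between the line's predictions and actual prices."""
    predictions = m * sizes + b
    return np.sum((prices - predictions) ** 2)

# Lower total error = better fit
print(total_squared_error(150, 50000))  # the line from the analogy: fits these points exactly
print(total_squared_error(100, 50000))  # a worse slope: much larger error
```

"Best fit" simply means the (m, b) pair that makes this total as small as possible.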
📐 The Equation
y = mx + b
- y = prediction (house price)
- m = slope (how much price changes per sq ft)
- x = feature (house size in sq ft)
- b = intercept (base price)
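The equation maps directly to code. As a quick sketch, using the illustrative slope and intercept from the analogy above (not values fitted to real data):

```python
def predict_price(size_sqft, m=150.0, b=50000.0):
    """y = mx + b, with slope m in $/sq ft and intercept b in $."""
    return m * size_sqft + b

print(predict_price(1000))  # → 200000.0
```

Training a linear regression model is just the process of finding good values for m and b.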
Try It: Simple Linear Regression
Predict house prices from size — your first ML model!
import numpy as np
# Simple Linear Regression: y = mx + b
# Find the best line through data points
# Dataset: House size (sq ft) → Price ($)
sizes = np.array([600, 800, 1000, 1200, 1400, 1600, 1800, 2000])
prices = np.array([150000, 180000, 220000, 260000, 290000, 330000, 360000, 410000])
# Calculate slope (m) and intercept (b) using the normal equation
n = len(sizes)
m = (n * np.sum(sizes * prices) - np.sum(sizes) * np.sum(prices)) / \
    (n * np.sum(sizes**2) - np.sum(sizes)**2)
b = (np.sum(prices) - m * np.sum(sizes)) / n

print(f"Fitted line: price = {m:.2f} * size + {b:,.2f}")
print(f"Predicted price for 1500 sq ft: ${m * 1500 + b:,.0f}")
Try It: Gradient Descent
Watch how ML models learn by taking small steps toward the answer
import numpy as np
# Gradient Descent: How ML models actually LEARN
# Instead of solving directly, take small steps toward the answer
np.random.seed(42)
X = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 4.3, 5.8, 8.2, 9.9, 12.1, 13.8, 16.2])
# Start with random guess
m = 0.0 # slope
b = 0.0 # intercept
learning_rate = 0.01
n = len(X)
print("=== Gradient Descent Learning ===")
print(f"Starting: m={m:.2f}, b={b:.2f}")
print()
# Train for several iterations
for epoch in range(1000):
    # Predictions with the current parameters
    y_pred = m * X + b
    # Gradients of MSE with respect to m and b
    dm = (-2 / n) * np.sum(X * (y - y_pred))
    db = (-2 / n) * np.sum(y - y_pred)
    # Take a small step downhill
    m -= learning_rate * dm
    b -= learning_rate * db
    if epoch % 200 == 0:
        mse = np.mean((y - y_pred) ** 2)
        print(f"Epoch {epoch}: m={m:.3f}, b={b:.3f}, MSE={mse:.3f}")

print(f"Final: m={m:.2f}, b={b:.2f}")
Try It: Multiple Regression
Use multiple features (size, bedrooms, age) to predict price
import numpy as np
# Multiple Linear Regression: Multiple features → one prediction
# y = w1*x1 + w2*x2 + w3*x3 + b
# Dataset: Predict house price from multiple features
# Features: [size_sqft, bedrooms, age_years]
X = np.array([
    [1200, 2, 10],
    [1500, 3, 5],
    [1800, 3, 15],
    [2200, 4, 3],
    [900, 1, 20],
    [1600, 3, 8],
    [2000, 4, 2],
    [1100, 2, 12],
])
y = np.array([250000, 320000, 280000, 420000, 180000, 310000, 400000, 230000])
# Add bias column (column of 1s)
X_b = np.c_[np.ones((len(X), 1)), X]

# Solve for [b, w1, w2, w3] by least squares
theta, *_ = np.linalg.lstsq(X_b, y, rcond=None)
print(f"Intercept: {theta[0]:,.0f}")
print(f"Weights (size, bedrooms, age): {theta[1:]}")

# Predict a new house: 1700 sq ft, 3 bedrooms, 7 years old
new_house = np.array([1.0, 1700.0, 3.0, 7.0])
print(f"Predicted price: ${new_house @ theta:,.0f}")
Try It: Evaluation Metrics
Measure how good your model is with MAE, RMSE, and R²
import numpy as np
# Model Evaluation: How do you know if your model is good?
# Simulate predictions vs actual values
np.random.seed(42)
actual = np.array([100, 150, 200, 250, 300, 350, 400, 450, 500, 550])
predicted = actual + np.random.randn(10) * 30 # predictions with noise
print("=== Actual vs Predicted ===")
print(f"{'Actual':>8} {'Predicted':>10} {'Error':>8}")
for a, p in zip(actual, predicted):
    print(f"{a:>8.0f} {p:>10.0f} {p-a:>+8.0f}")
print()
# Metric 1: Mean Absolute Error (average size of the mistakes)
mae = np.mean(np.abs(actual - predicted))
print(f"MAE:  {mae:.2f}")

# Metric 2: Root Mean Squared Error (penalises large errors more heavily)
rmse = np.sqrt(np.mean((actual - predicted) ** 2))
print(f"RMSE: {rmse:.2f}")

# Metric 3: R² (fraction of variance explained; 1.0 = perfect)
ss_res = np.sum((actual - predicted) ** 2)
ss_tot = np.sum((actual - np.mean(actual)) ** 2)
r2 = 1 - ss_res / ss_tot
print(f"R²:   {r2:.3f}")
📋 Quick Reference
| Concept | Formula | Meaning |
|---|---|---|
| Simple LR | y = mx + b | One feature predicts target |
| Multiple LR | y = w₁x₁ + w₂x₂ + b | Many features predict target |
| MAE | mean(|y - ŷ|) | Average absolute error |
| RMSE | √mean((y-ŷ)²) | Root mean squared error |
| R² | 1 - SS_res/SS_tot | Variance explained (1 = perfect; can go below 0) |
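The formulas in the table can be sketched as small reusable functions. The toy arrays below are made up purely to show a key difference the table hides: when errors are all the same size MAE and RMSE agree, but a single outlier inflates RMSE much more than MAE:

```python
import numpy as np

def mae(y, y_hat):
    return np.mean(np.abs(y - y_hat))

def rmse(y, y_hat):
    return np.sqrt(np.mean((y - y_hat) ** 2))

def r2(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1 - ss_res / ss_tot

y     = np.array([100.0, 200.0, 300.0, 400.0])
close = y + np.array([10.0, -10.0, 10.0, -10.0])  # small errors everywhere
spike = y + np.array([0.0, 0.0, 0.0, 40.0])       # one large error

print(mae(y, close), rmse(y, close))  # equal: every error is the same size
print(mae(y, spike), rmse(y, spike))  # same MAE, but RMSE doubles
```

Both cases have MAE = 10, yet the outlier case has RMSE = 20, which is why RMSE is the metric to watch when large mistakes are especially costly.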
🎉 Lesson Complete!
You've built your first ML model! Next, learn classification — predicting categories instead of numbers.