Lesson 50 • Capstone

Final Project: Build & Deploy a Complete ML System 🚀

In this capstone you'll build one complete machine learning project end-to-end — loading data, training a model from scratch, evaluating it, and making predictions — so you finish able to ship real AI work.

What You'll Learn in This Lesson

✓ Load and inspect a dataset using plain Python lists and dicts
✓ Preprocess features and split data into train and test sets
✓ Train a perceptron classifier from scratch, with no libraries
✓ Evaluate with accuracy, precision, recall, and a confusion matrix
✓ Make predictions on brand-new, unseen data correctly
✓ Map your from-scratch code onto the scikit-learn equivalent

Before you start: This capstone pulls together Classification, Model Evaluation, and Data Preprocessing. Every from-scratch code block is plain Python and runs right here in your browser, so no installs are needed. To run the scikit-learn version locally, install Python and then run pip install scikit-learn.

🎯 Real-World Analogy: Teaching a model is like teaching a friend to sort apples from oranges by feel. You show them labelled examples (training), quiz them on fruit they haven't seen (testing), score how often they're right (evaluation), and only then trust them with the unsorted crate (prediction). Every ML project — from spam filters to fraud detection — follows this exact loop.

1. The Project: A Fruit Classifier, Built Across 5 Milestones

You'll build one project the whole way through: a classifier that decides whether a fruit is an apple or an orange from two numbers — its weight and how bumpy its skin is. It's deliberately tiny so the data fits on screen and you can check every number by hand. The workflow, though, is identical to a million-row production system.

The five milestones — the universal ML workflow:

Load & inspect — get the data into Python and look at it
Preprocess & split — clean, scale, and hold out a test set
Train — fit a perceptron from scratch
Evaluate — accuracy, precision, recall, confusion matrix
Predict — classify brand-new fruit

All code below is plain Python — no numpy, no scikit-learn — so nothing is hidden. At the end you'll see the same project in scikit-learn to connect what you built to the tools the industry uses.

2. Milestone 1: Load & Inspect the Dataset

Every project starts by getting data into a shape you can work with and looking at it. Here the dataset is a list of dictionaries — one dict per fruit. Before modelling anything, you check how many samples there are and whether the classes are balanced. A lopsided dataset (say 95% apples) changes every decision that follows.

Run it and read the comments — the # Expected output at the bottom tells you exactly what you should see.

Milestone 1: Load & Inspect

Load the fruit dataset and check its size and class balance

Python

# ============================================
# MILESTONE 1: LOAD & INSPECT THE DATASET
# ============================================
# A tiny fruit dataset. Each row = one fruit.
# Two features: weight (grams) and bumpiness (0 = smooth, 10 = very bumpy).
# Label: "apple" or "orange".

dataset = [
    {"weight": 150, "bumpiness": 2, "label": "apple"},
    {"weight": 170, "bumpiness": 3, "label": "apple"},
    {"weight": 140, "bumpiness": 1, "label": "apple"},
    {"weight": 130, "bumpiness": 2
...
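
The inspection step described above can be sketched as follows. This is a minimal illustration, assuming the eight fruits listed in the scikit-learn section of this lesson make up the full dataset; the helper name class_counts is hypothetical and not part of the lesson's own code.

```python
# Sketch: load the fruit data as a list of dicts and check class balance.
# Assumption: these eight rows (from the scikit-learn section) are the full dataset.
dataset = [
    {"weight": 150, "bumpiness": 2, "label": "apple"},
    {"weight": 170, "bumpiness": 3, "label": "apple"},
    {"weight": 140, "bumpiness": 1, "label": "apple"},
    {"weight": 130, "bumpiness": 2, "label": "apple"},
    {"weight": 180, "bumpiness": 7, "label": "orange"},
    {"weight": 200, "bumpiness": 8, "label": "orange"},
    {"weight": 190, "bumpiness": 9, "label": "orange"},
    {"weight": 210, "bumpiness": 7, "label": "orange"},
]

def class_counts(rows):
    """Count how many rows carry each label (hypothetical helper)."""
    counts = {}
    for row in rows:
        counts[row["label"]] = counts.get(row["label"], 0) + 1
    return counts

counts = class_counts(dataset)
print("Samples:", len(dataset))
for label, n in counts.items():
    # Share of the dataset that belongs to this class
    print(f"{label}: {n} ({n / len(dataset):.0%})")
```

With this data the check reports 8 samples split evenly, 4 apples and 4 oranges, so accuracy is a reasonable first metric here.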

3. Milestone 2: Preprocess & Split

Models work on numbers, so first you encode the text labels (apple → 0, orange → 1). Then you scale the features to a 0-1 range — without this, weight (around 200) would completely drown out bumpiness (around 10) and the model would barely notice the bumpiness at all.

Finally you split the data: most rows train the model, a few are held back to test it. The golden rule of ML is that you never test on data the model trained on — that's like giving students the exam answers in advance.

Fit the scaler on training data only. Here we use the full range for simplicity, but in real projects you compute min/max from the training set and reuse those exact numbers on the test set — otherwise information about the test data leaks into training.

Milestone 2: Preprocess & Split

Encode labels, min-max scale the features, and hold out a test set

Python

# ============================================
# MILESTONE 2: PREPROCESS + TRAIN/TEST SPLIT
# ============================================
dataset = [
    {"weight": 150, "bumpiness": 2, "label": "apple"},
    {"weight": 170, "bumpiness": 3, "label": "apple"},
    {"weight": 140, "bumpiness": 1, "label": "apple"},
    {"weight": 130, "bumpiness": 2, "label": "apple"},
    {"weight": 180, "bumpiness": 7, "label": "orange"},
    {"weight": 200, "bumpiness": 8, "label": "orange"},
    {"weight": 19
...
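
As a rough sketch of the three preprocessing steps (encode, scale, split), the snippet below works under stated assumptions: the eight rows are the full dataset, min/max are taken from the full range as the lesson does for simplicity, values are rounded to two decimals, and the two held-out rows are picked by hand. The names LABELS, scale_row, and test_idx are illustrative only.

```python
# Sketch: encode labels, min-max scale, and hold out a test set.
# Assumptions: full-range min/max (not leak-free), hand-picked test rows.
rows = [
    (150, 2, "apple"), (170, 3, "apple"), (140, 1, "apple"), (130, 2, "apple"),
    (180, 7, "orange"), (200, 8, "orange"), (190, 9, "orange"), (210, 7, "orange"),
]
LABELS = {"apple": 0, "orange": 1}   # text label -> number

w_min, w_max = min(r[0] for r in rows), max(r[0] for r in rows)   # 130, 210
b_min, b_max = min(r[1] for r in rows), max(r[1] for r in rows)   # 1, 9

def scale_row(weight, bumpiness):
    """Map both features into the 0-1 range, rounded for readability."""
    return [round((weight - w_min) / (w_max - w_min), 2),
            round((bumpiness - b_min) / (b_max - b_min), 2)]

test_idx = {3, 7}   # hypothetical choice: hold out one apple and one orange
X_train, y_train, X_test, y_test = [], [], [], []
for i, (weight, bumpiness, label) in enumerate(rows):
    features, target = scale_row(weight, bumpiness), LABELS[label]
    if i in test_idx:
        X_test.append(features)
        y_test.append(target)
    else:
        X_train.append(features)
        y_train.append(target)

print("Train:", X_train, y_train)
print("Test: ", X_test, y_test)
```

After scaling, both features live on the same 0-1 scale, so a 1-gram change in weight no longer outweighs a 1-point change in bumpiness.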

4. Milestone 3: Train a Perceptron From Scratch

A perceptron is the simplest learning unit there is, the great-grandparent of every neural network. It multiplies each feature by a weight, adds a bias, and fires 1 if the total is positive, else 0.

"Training" just means: for each example, make a guess, and if it's wrong, nudge the weights a little in the direction that would have been right. Repeat over the whole dataset a few times (each pass is an epoch) until it stops making mistakes. That single line, weights[i] += learning_rate * error * features[i], is the heart of the learning algorithm. Here error is the true label minus the guess, so it is 0 whenever the guess is right, and the bias gets the same nudge without the feature term.

Milestone 3: Train the Perceptron

Implement the perceptron update rule and train until it converges

Python

# ============================================
# MILESTONE 3: TRAIN A PERCEPTRON FROM SCRATCH
# ============================================
# A perceptron is the simplest neural unit: it multiplies each
# feature by a weight, adds a bias, and fires 1 if the total is
# positive, else 0. Training nudges the weights toward correct answers.

# Scaled training data from Milestone 2 (apple=0, orange=1)
X_train = [
    [0.25, 0.12], [0.50, 0.25], [0.12, 0.00],   # apples
    [0.62, 0.75], [0.88, 0.88]
...
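
One way the training loop described above might look in plain Python is sketched below. The starting weights of zero, learning_rate=0.1, and the max_epochs cap are assumptions chosen for illustration, so the learned numbers need not match the weights quoted in later milestones.

```python
# Sketch of perceptron training in plain Python.
# Assumptions: zero initial weights, learning_rate = 0.1, at most 100 epochs.
X_train = [[0.25, 0.12], [0.50, 0.25], [0.12, 0.00],   # apples
           [0.62, 0.75], [0.88, 0.88], [0.75, 1.00]]   # oranges
y_train = [0, 0, 0, 1, 1, 1]                           # apple=0, orange=1

def predict(features, weights, bias):
    """Weighted sum plus bias; fire 1 only if the total is positive."""
    total = bias
    for i in range(len(features)):
        total += features[i] * weights[i]
    return 1 if total > 0 else 0

def train(X, y, learning_rate=0.1, max_epochs=100):
    weights = [0.0] * len(X[0])
    bias = 0.0
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for features, target in zip(X, y):
            error = target - predict(features, weights, bias)   # -1, 0, or +1
            if error != 0:
                mistakes += 1
                for i in range(len(features)):
                    weights[i] += learning_rate * error * features[i]
                bias += learning_rate * error
        if mistakes == 0:          # a clean pass means training has converged
            return weights, bias, epoch
    return weights, bias, max_epochs

weights, bias, epochs_used = train(X_train, y_train)
print("weights:", [round(w, 3) for w in weights], "bias:", round(bias, 3))
print("epochs used:", epochs_used)
```

Because apples and oranges are linearly separable in this data, the loop is guaranteed to reach a mistake-free pass; the exact weights depend on the starting values, the learning rate, and the row order.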

5. Milestone 4: Evaluate the Model

A trained model is worthless until you know how good it is on data it has never seen. You run it on the held-out test set and count four outcomes into a confusion matrix:

TP (true positive): predicted orange, actually orange
FP (false positive): predicted orange, actually apple — a false alarm
TN (true negative): predicted apple, actually apple
FN (false negative): predicted apple, actually orange — a miss

From those four numbers you compute accuracy (overall correctness), precision (when it says orange, how often it's right), and recall (of all the real oranges, how many it caught).

Milestone 4: Evaluate

Build a confusion matrix and compute accuracy, precision, and recall

Python

# ============================================
# MILESTONE 4: EVALUATE THE MODEL
# ============================================
# Trained perceptron from Milestone 3 (orange is the positive class)
weights = [0.087, 0.15]
bias = -0.1

# Held-out test set from Milestone 2 (apple=0, orange=1)
X_test = [[0.00, 0.12], [1.00, 0.75]]   # one apple (130 g, bumpiness 2), one orange (210 g, bumpiness 7)
y_test = [0, 1]

def predict(features, weights, bias):
    total = bias
    for i in range(len(features)):
        total += features[i] * weights
...
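
To make the four outcomes and the three metrics concrete, here is a small sketch on a made-up set of labels. The y_true and y_pred lists are hypothetical (they are not the lesson's test set), and confusion_counts and safe_div are illustrative helper names.

```python
# Sketch: confusion matrix and metrics from two label lists.
# Assumption: y_true / y_pred are invented for illustration (orange = 1 is positive).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

def confusion_counts(actuals, guesses, positive=1):
    tp = fp = tn = fn = 0
    for actual, guess in zip(actuals, guesses):
        if guess == positive and actual == positive:
            tp += 1        # predicted orange, actually orange
        elif guess == positive:
            fp += 1        # predicted orange, actually apple (false alarm)
        elif actual == positive:
            fn += 1        # predicted apple, actually orange (miss)
        else:
            tn += 1        # predicted apple, actually apple
    return tp, fp, tn, fn

def safe_div(numerator, denominator):
    """Return 0.0 instead of crashing when a metric is undefined."""
    return numerator / denominator if denominator else 0.0

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = safe_div(tp + tn, tp + fp + tn + fn)
precision = safe_div(tp, tp + fp)
recall = safe_div(tp, tp + fn)

print(f"TP={tp} FP={fp} TN={tn} FN={fn}")
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Here the model makes one false alarm and one miss out of eight fruits, so all three metrics come out at 0.75.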

6. Milestone 5: Predict on New Data

This is the payoff: feeding the model fruit it has never seen and getting a label back. The one trick beginners miss is that new data must be scaled with exactly the same min/max you used in training — re-fitting the scaler on new data would shift everything and corrupt the predictions.

Notice the third fruit (160g, bumpiness 6) sits near the boundary — a great reminder that real models output a decision even when the answer is genuinely uncertain.

Milestone 5: Predict New Fruit

Classify three brand-new fruits using the trained model

Python

# ============================================
# MILESTONE 5: PREDICT ON NEW, UNSEEN FRUIT
# ============================================
weights = [0.087, 0.15]
bias = -0.1

# The SAME scaling we fit in Milestone 2 (do not re-fit on new data!)
w_min, w_max = 130, 210
b_min, b_max = 1, 9

def scale(weight, bumpiness):
    w = (weight - w_min) / (w_max - w_min)
    b = (bumpiness - b_min) / (b_max - b_min)
    return [w, b]

def classify(weight, bumpiness):
    features = scale(weight, bumpiness)
...
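
A possible shape for the prediction step is sketched below. It reuses the weights, bias, and min/max values shown in this milestone; the score helper and the first two fruits are assumptions added for illustration, while the 160 g, bumpiness 6 fruit is the borderline case mentioned in the text.

```python
# Sketch: classify new fruit with the trained weights and the ORIGINAL scaling.
weights = [0.087, 0.15]
bias = -0.1
w_min, w_max = 130, 210   # weight range used when scaling in Milestone 2
b_min, b_max = 1, 9       # bumpiness range used when scaling in Milestone 2

def scale(weight, bumpiness):
    w = (weight - w_min) / (w_max - w_min)
    b = (bumpiness - b_min) / (b_max - b_min)
    return [w, b]

def score(weight, bumpiness):
    """Raw weighted sum; values near 0 sit close to the decision boundary."""
    features = scale(weight, bumpiness)
    return bias + sum(f * w for f, w in zip(features, weights))

def classify(weight, bumpiness):
    return "orange" if score(weight, bumpiness) > 0 else "apple"

# First two fruits are hypothetical; the third is the borderline one from the text.
for weight, bumpiness in [(145, 2), (205, 8), (160, 6)]:
    print(f"{weight}g, bumpiness {bumpiness}: "
          f"{classify(weight, bumpiness)} (score {score(weight, bumpiness):+.3f})")
```

Printing the raw score next to the label shows how close the 160 g fruit is to zero: the model still returns a label, but with very little margin.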

🏭 The Industry Version (scikit-learn) — Read-Only

You just built every piece by hand. In a real job you'd reach for scikit-learn, which collapses all five milestones into a handful of lines. Read this and match each call to the milestone it replaces: train_test_split and MinMaxScaler are Milestone 2, Perceptron().fit() is Milestone 3, the metric functions are Milestone 4, and .predict() is Milestone 5.

# ============================================
# THE SAME PROJECT, THE PROFESSIONAL WAY (scikit-learn)
# Read-only: this is what your from-scratch code maps to in industry.
# ============================================
from sklearn.linear_model import Perceptron
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Features [weight, bumpiness] and labels (apple=0, orange=1)
X = [[150, 2], [170, 3], [140, 1], [130, 2],
     [180, 7], [200, 8], [190, 9], [210, 7]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

# Split, scale, train: the three lines that replace ~80 of ours
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
scaler = MinMaxScaler().fit(X_tr)
model = Perceptron().fit(scaler.transform(X_tr), y_tr)

# Evaluate
pred = model.predict(scaler.transform(X_te))
print("Accuracy: ", round(accuracy_score(y_te, pred), 2))
print("Precision:", round(precision_score(y_te, pred, zero_division=0), 2))
print("Recall:   ", round(recall_score(y_te, pred, zero_division=0), 2))

# Predict a brand-new fruit
print("New fruit ->", model.predict(scaler.transform([[205, 8]]))[0])

# Expected output:
# Accuracy:  1.0
# Precision: 1.0
# Recall:    1.0
# New fruit -> 1

Same result, a fraction of the code. The point of building it from scratch first is that none of this is a black box to you anymore — you know precisely what each line is doing.

🎯 Your Turn 1 — Build a k-NN Classifier

You've built a perceptron. Now build the other classic classifier — k-Nearest Neighbours — by filling in two blanks. k-NN classifies a point by letting its closest labelled neighbours vote. Fill in the squared difference in the distance function and the majority threshold.

Your Turn 1: k-NN distance & voting

Fill in the blanks to complete a k-nearest-neighbours classifier

Python

# 🎯 YOUR TURN 1 — k-NN: measure how "close" two fruits are
# k-Nearest Neighbours classifies a point by looking at the labelled
# points nearest to it. The engine is a distance function. Fill it in.

X_train = [[0.25, 0.12], [0.50, 0.25], [0.12, 0.00],
           [0.62, 0.75], [0.88, 0.88], [0.75, 1.00]]
y_train = [0, 0, 0, 1, 1, 1]   # apple=0, orange=1

def distance(a, b):
    total = 0
    for i in range(len(a)):
        diff = a[i] - b[i]
        total += ___          # 👉 add the SQUARED d
...

🎯 Your Turn 2 — Report the F1 Score

On imbalanced problems (fraud, disease, spam) accuracy can be badly misleading, and the F1 score is usually the more informative number. Finish the recall and F1 formulas below, then check your answer against the expected output.

Your Turn 2: precision, recall, F1

Complete the recall and F1 formulas

Python

# 🎯 YOUR TURN 2 — report the F1 score
# Accuracy lies when classes are imbalanced. F1 balances precision
# and recall into one number. Finish the two blanks.

# Results from evaluating a model (positive class = fraud)
tp, fp, fn = 8, 2, 4

precision = tp / (tp + fp)             # of flagged items, how many were right
recall = tp / (tp + ___)              # 👉 of all real positives, how many we caught

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + _
...

🏁 Stretch Challenge — Package It Into a Class

Support has faded: only a comment outline remains. Combine Milestones 2, 3, and 5 into a single reusable FruitClassifier class with fit() and predict() methods. This mirrors how scikit-learn estimators are structured.

Stretch Challenge: FruitClassifier class

Outline only — write the fit() and predict() logic yourself

Python

# 🏁 STRETCH CHALLENGE — wrap the whole pipeline into one class
# Support has faded: only the outline is here. You write the logic.
#
# Goal: a FruitClassifier you can train once and reuse.
#
# class FruitClassifier:
#     def fit(self, X, y):
#         # 1. store min/max of each column for scaling (Milestone 2)
#         # 2. run the perceptron training loop (Milestone 3)
#         # 3. keep self.weights and self.bias
#         ...
#
#     def predict(self, sample):
#         # 1. scale the sam
...

7. Common Pitfalls (And How to Fix Them)

These are the mistakes that quietly ruin ML projects. Spotting them is what separates a working model from a misleading one.

❌ Data leakage — testing on training data

Evaluating on rows the model already saw gives a fake high score:

model.fit(X, y)
score = evaluate(model, X, y)   # ❌ same data — score is meaningless

✅ Fix: always hold out a test set the model never trains on.

model.fit(X_train, y_train)
score = evaluate(model, X_test, y_test)   # ✅ honest score
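
To see why the training-set score is meaningless, consider this deliberately silly sketch: a "model" that only memorises the rows it was shown. The split, the memory dict, and the fallback guess of 0 are all hypothetical choices made for the demonstration.

```python
# Sketch: a memorising "model" looks perfect on training rows and weak on new ones.
# Assumption: hand-made split of scaled fruit rows, apple=0, orange=1.
train_rows = [([0.25, 0.12], 0), ([0.50, 0.25], 0),
              ([0.62, 0.75], 1), ([0.88, 0.88], 1)]
test_rows = [([0.12, 0.00], 0), ([0.75, 1.00], 1)]

# "Training" = store every row verbatim.
memory = {tuple(features): label for features, label in train_rows}

def predict(features):
    # Rows never seen before fall back to a blind guess of 0 (apple).
    return memory.get(tuple(features), 0)

def accuracy(rows):
    correct = sum(predict(features) == label for features, label in rows)
    return correct / len(rows)

print("Accuracy on training rows:", accuracy(train_rows))   # looks perfect
print("Accuracy on held-out rows:", accuracy(test_rows))    # the honest number
```

The memoriser scores 1.0 on the rows it stored and only 0.5 on the held-out rows, which is exactly the gap a test set exists to expose.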

❌ Forgetting to scale new data the same way

Re-fitting the scaler on new data shifts the numbers and breaks predictions:

new_min, new_max = min(new_X), max(new_X)   # ❌ different scale!

✅ Fix: reuse the training set's min/max for every prediction.

scaled = (value - train_min) / (train_max - train_min)   # ✅ same scale
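
A leak-free version of the scaling step could look like the sketch below. The function names fit_min_max and transform are hypothetical (loosely echoing the fit/transform idea), and the choice of which rows count as training data is an assumption for the example.

```python
# Sketch: fit min/max on training rows only, then reuse them everywhere.
train_rows = [[150, 2], [170, 3], [140, 1], [180, 7], [200, 8], [190, 9]]
new_rows = [[130, 2], [210, 7]]   # unseen fruit: [weight, bumpiness]

def fit_min_max(rows):
    """Return per-column minimums and maximums."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def transform(row, mins, maxs):
    """Scale one row with previously fitted min/max values."""
    return [(v - lo) / (hi - lo) for v, lo, hi in zip(row, mins, maxs)]

mins, maxs = fit_min_max(train_rows)          # learned from training data only
for row in new_rows:
    print(row, "->", [round(v, 3) for v in transform(row, mins, maxs)])

# Wrong way, shown only for contrast: refitting on the new rows themselves.
bad_mins, bad_maxs = fit_min_max(new_rows)
print("refit:", transform([210, 7], bad_mins, bad_maxs))
```

With the training-set scaler, new values can land slightly outside 0-1 (the 210 g fruit scales to about 1.167), and that is fine. Refitting instead squashes the same fruit to [1.0, 1.0], a position the trained weights were never calibrated for.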

❌ Trusting accuracy on imbalanced data

With 98% legitimate transactions, predicting "always legitimate" scores 98% and catches zero fraud.

✅ Fix: report precision, recall, and F1 — not accuracy alone.
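
The 98% example can be checked with a few lines of arithmetic. The sketch assumes 1,000 hypothetical transactions with 20 fraud cases and a model that always answers "legitimate"; undefined ratios are reported as 0.0 by convention here.

```python
# Sketch: accuracy vs precision/recall/F1 for an "always legitimate" model.
# Assumption: 1,000 invented transactions, 2% fraud (fraud = 1 is positive).
y_true = [1] * 20 + [0] * 980
y_pred = [0] * 1000            # the model never flags anything

tp = sum(1 for a, p in zip(y_true, y_pred) if a == 1 and p == 1)
fp = sum(1 for a, p in zip(y_true, y_pred) if a == 0 and p == 1)
fn = sum(1 for a, p in zip(y_true, y_pred) if a == 1 and p == 0)
tn = sum(1 for a, p in zip(y_true, y_pred) if a == 0 and p == 0)

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp) if (tp + fp) else 0.0   # nothing flagged -> report 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
f1 = (2 * precision * recall / (precision + recall)
      if (precision + recall) else 0.0)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

Accuracy comes out at 0.98 while recall and F1 are both 0.0, which is the whole problem in one line of output.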

❌ Expecting a perceptron to learn non-linear data

A single perceptron can only draw one straight line. If two classes can't be split by a line, it will never converge.

✅ Fix: cap the epochs, or switch to k-NN or a multi-layer network for tangled data.
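
The classic case of data no single line can split is XOR. The sketch below trains the same kind of perceptron on XOR and, for contrast, on AND; learning_rate=1 and max_epochs=50 are assumptions picked to keep the arithmetic in whole numbers.

```python
# Sketch: a perceptron with an epoch cap on separable (AND) vs non-separable (XOR) data.
def predict(features, weights, bias):
    total = bias + sum(f * w for f, w in zip(features, weights))
    return 1 if total > 0 else 0

def train(X, y, learning_rate=1, max_epochs=50):
    """Return (weights, bias, converged); stop early after a mistake-free epoch."""
    weights, bias = [0] * len(X[0]), 0
    for _ in range(max_epochs):
        mistakes = 0
        for features, target in zip(X, y):
            error = target - predict(features, weights, bias)
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * f
                           for w, f in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:
            return weights, bias, True
    return weights, bias, False     # cap reached without a clean pass

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
_, _, and_converged = train(X, [0, 0, 0, 1])   # AND: one line separates the classes
_, _, xor_converged = train(X, [0, 1, 1, 0])   # XOR: no single line can

print("AND converged:", and_converged)
print("XOR converged:", xor_converged)
```

Without the cap, the XOR run would loop forever, because every pass contains at least one mistake. The cap turns an endless loop into a clear signal that a different model is needed.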

📋 Quick Reference — The ML Workflow

Step	What You Do	Library equivalent
1. Load & inspect	Read data, check size & class balance	`pd.read_csv()`, `df.head()` (pandas)
2. Preprocess & split	Encode labels, scale features, hold out test set	`MinMaxScaler`, `train_test_split`
3. Train	Fit the model's weights on training data	`model.fit(X_train, y_train)`
4. Evaluate	Confusion matrix, accuracy, precision, recall, F1	`accuracy_score`, `f1_score`
5. Predict	Scale new data, output a label	`model.predict(new_X)`

Where This Takes You — AI/ML Career Paths

Career Path	Core Skills	Typical Salary Range (USD)
ML Engineer	Python, PyTorch, MLOps, Cloud	$120K – $200K
Data Scientist	Statistics, ML, SQL, Visualisation	$100K – $170K
AI Research Scientist	Deep Learning, Math, Publications	$150K – $300K+
NLP Engineer	Transformers, LLMs, RAG, Embeddings	$130K – $220K
Computer Vision Engineer	CNNs, Detection, Segmentation	$120K – $190K
MLOps Engineer	Pipelines, Docker, Kubernetes, CI/CD	$110K – $180K

🪙 Business Opportunities You Can Build

• AI-powered SaaS tools for content generation, analytics, or automation
• Custom chatbots & RAG systems for enterprises and e-commerce
• Computer vision APIs for quality control, medical imaging, or security
• Recommendation engines for e-commerce, music, or content platforms
• Fraud detection services for fintech and banking

Turn This Into a Portfolio Project 🧠

📚 Structure Your ML Project

project/
 ┣ data/
 ┃ ┣ raw/              # Original datasets
 ┃ ┗ processed/         # Cleaned data
 ┣ notebooks/
 ┃ ┗ exploration.ipynb  # EDA and analysis
 ┣ src/
 ┃ ┣ data_pipeline.py   # Preprocessing
 ┃ ┣ features.py        # Feature engineering
 ┃ ┣ model.py           # Model architecture
 ┃ ┣ train.py           # Training loop
 ┃ ┗ evaluate.py        # Metrics + reporting
 ┣ api/
 ┃ ┗ main.py            # FastAPI inference server
 ┣ tests/               # Unit tests
 ┣ models/              # Saved model weights
 ┣ requirements.txt
 ┗ README.md

🧩 Start Small, Iterate

Begin with a baseline model (logistic regression or random forest), evaluate properly, then improve step by step. Track every experiment with MLflow.

🧪 Evaluate Thoroughly

Use appropriate metrics — F1 for imbalanced data, RMSE for regression, NDCG for rankings. Never rely on accuracy alone.

🌐 Deploy & Showcase

• Deploy with FastAPI + Docker on AWS/GCP/Azure
• Upload to GitHub with README, screenshots, and metrics
• Write a blog post or LinkedIn article about your approach
• Record a 2-minute demo video for your portfolio

❓ Frequently Asked Questions

Q: Why build a machine learning model from scratch instead of just using scikit-learn?

A: Writing a perceptron and k-NN in plain Python forces you to understand the maths — weighted sums, the update rule, distance, and the evaluation metrics. Once that clicks, scikit-learn stops being magic: you know exactly what fit() and predict() are doing under the hood, which makes you far better at debugging real models.

Q: What is the difference between a perceptron and k-nearest neighbours?

A: A perceptron learns a single straight decision boundary by adjusting weights during training, so prediction is instant afterwards. k-NN learns nothing up front — it stores all the data and, at prediction time, votes among the closest stored points. Perceptron is fast but only handles linearly separable data; k-NN is flexible but slow on large datasets.

Q: Why do I scale features before training?

A: Without scaling, a feature with large values (weight in grams, ~200) dwarfs a feature with small values (bumpiness, ~10), so the model effectively ignores the small one. Min-max normalisation puts every feature in the same 0-1 range so each contributes fairly. Crucially, you fit the scaler on training data only, then reuse those same min/max values on test and new data.

Q: Why isn't accuracy enough to evaluate a classifier?

A: If 98% of transactions are legitimate, a model that always predicts 'legitimate' scores 98% accuracy while catching zero fraud. Precision (how many flagged items were truly positive) and recall (how many real positives you caught) expose that failure. F1 combines both into one number, which is why imbalanced problems are judged on F1, not accuracy.

Q: What should I build next to turn this into a portfolio project?

A: Swap the toy fruit data for a real CSV (the Iris or Titanic datasets are classics), rewrite the model with scikit-learn, then wrap predict() in a small FastAPI endpoint and deploy it. Add a README with your metrics and a confusion matrix image. That end-to-end story — data, model, evaluation, deployment — is exactly what hiring managers look for.

🎉

Course Complete — you built a full ML system end-to-end!

You loaded data, preprocessed and split it, trained a perceptron from scratch, evaluated it with a confusion matrix, predicted on unseen fruit, and connected all of it to scikit-learn. That five-step loop — load, preprocess, train, evaluate, predict — is the backbone of every machine learning project, from spam filters to self-driving cars.

Where to go next

• Swap the toy data for a real dataset (Iris, Titanic, or a Kaggle competition).
• Rebuild it in scikit-learn, then try a random forest and compare F1 scores.
• Wrap predict() in a FastAPI endpoint and deploy it with Docker.
• Push it to GitHub with a README, metrics, and a confusion-matrix chart — your first portfolio project.

🏆 Congratulations — you've completed the entire AI & Machine Learning course. From your first linear regression to LLMs, computer vision, reinforcement learning, and production MLOps, you've mastered the full AI engineering stack. Now go build something real.
