Lesson 1 • Beginner
Introduction to AI & Machine Learning
By the end of this lesson you'll be able to explain what machine learning is, name the three types, walk through the ML workflow, and run your very first "learning" program in plain Python.
What You'll Learn in This Lesson
- ✓ Explain what ML is: learning from data vs hand-written rules
- ✓ Tell AI, ML, and Deep Learning apart correctly
- ✓ Name the 3 types: supervised, unsupervised, reinforcement
- ✓ Walk the ML workflow: data → features → train → evaluate → deploy
- ✓ Recognise the core Python ML stack: NumPy, pandas, scikit-learn
- ✓ Write a tiny model that learns a pattern from data
🌍 Real-World Analogy: Teaching a Child
Imagine teaching a child to tell cats from dogs. You don't hand them a rulebook ("if pointy ears and whiskers, then cat"). You just show them lots of labelled photos. After enough examples, they recognise animals they've never seen before.
That is machine learning. The "child" is a computer, the photos are data, and the recognition skill it builds is the model. The big shift from normal programming: instead of you writing the rules, the computer learns the rules from the data.
1. What Is Machine Learning?
Machine learning is a way of writing programs where, instead of coding the rules yourself, you feed the computer examples and let it discover the rules. You give it data; it gives you back a pattern it can reuse on new data.
📜 Traditional programming
You write the rules. The data follows them.
# YOU write the rule:
def is_spam(email):
    if "free money" in email:
        return True
    return False
🤖 Machine learning
You give examples. The computer finds the rule.
# DATA teaches the rule:
examples = [
    ("free money!", "spam"),
    ("lunch at 1pm", "not_spam"),
]
# model.fit(examples) -> learns it
2. AI vs ML vs Deep Learning
These three words get mixed up constantly. They are nested, not synonyms — picture three boxes, each inside the last:
- AI: the broad goal, any machine that mimics human intelligence (biggest box).
- ML: a way to reach AI, machines that learn from data (inside AI).
- Deep Learning: a kind of ML using neural networks with many layers (inside ML).
So every deep-learning system is ML, and every ML system is AI — but not the other way round. A simple if-statement chatbot is "AI" but isn't ML; a spam filter trained on emails is ML but isn't necessarily deep learning.
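The difference between a hand-written rule and a learned one can be sketched in a few lines of plain Python. Treat this as a minimal illustration, not a real spam filter: the function names (`rule_based_is_spam`, `train_word_counts`, `learned_is_spam`) and the four example emails are made up for this sketch, and the "learning" is nothing more than counting which words show up more often in spam than in normal mail.

```python
# Hand-written rule vs. a rule learned from examples (illustrative sketch only).
from collections import Counter


def rule_based_is_spam(email):
    # "AI" without ML: a person decided this rule in advance.
    return "free money" in email.lower()


def train_word_counts(examples):
    # "Training": count how often each word appears under each label.
    counts = {"spam": Counter(), "not_spam": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts


def learned_is_spam(email, counts):
    # Each word votes: +1 per spam sighting, -1 per normal-mail sighting.
    score = 0
    for word in email.lower().split():
        score += counts["spam"][word] - counts["not_spam"][word]
    return score > 0


# Made-up training examples (hypothetical data for this sketch).
examples = [
    ("free money now", "spam"),
    ("claim your free prize", "spam"),
    ("lunch at noon", "not_spam"),
    ("project meeting at noon", "not_spam"),
]
counts = train_word_counts(examples)

print(rule_based_is_spam("claim your prize today"))       # False
print(learned_is_spam("claim your prize today", counts))  # True
print(learned_is_spam("meeting at noon", counts))         # False
```

The hand-written rule misses the second spam-like message because nobody thought to code a rule for it, while the counting approach picks it up from words it saw in the examples. Neither function uses a neural network, so neither is deep learning.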
Worked Example: A Model That Learns
Read every comment, then press Run. This uses plain Python only (no libraries) so it works right here in the browser. Watch it figure out a pattern nobody coded.
Worked Example: Your First ML Model
A line-fitting model that learns the link between study hours and exam scores — plain Python.
# Your first taste of machine learning — PLAIN Python, no libraries.
# The idea: instead of writing the rule, you let DATA reveal the rule.
# Data we observed: hours studied -> exam score
hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [30, 35, 50, 55, 65, 70, 80, 85]
# A model fits a line: score = slope * hours + intercept
# This is the same maths scikit-learn does for you later — done by hand here.
n = len(hours)
sum_x = sum(hours)
sum_y = sum(scores)
sum_xy = sum(h * s for h, s in zip(ho
...
3. The Three Types of Machine Learning
Almost every ML problem falls into one of three buckets. Knowing which one you're in tells you what data you need.
📊 Supervised
Learn from labelled examples (input + correct answer). "Here are 1,000 emails marked spam / not-spam — learn the pattern."
🔍 Unsupervised
Find hidden groups in unlabelled data. "Here are 10,000 customers — group them by behaviour, no labels given."
🎮 Reinforcement
Learn from rewards by trial and error. "Play a million games — a win is good, a loss is bad — find the winning strategy."
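Supervised learning is the easiest of the three to picture, so here is a rough sketch of the other two in plain Python. Everything in it is an assumption made for illustration: the spending figures, the two-button "game", and the variable names are invented, and the reward is fully predictable so the output is repeatable (real reinforcement learning usually deals with randomness and far more exploration).

```python
# Unsupervised and reinforcement learning in miniature (illustrative sketch only).

# UNSUPERVISED: split unlabelled spending figures into two groups.
spend = [12, 15, 14, 90, 95, 88]                  # made-up data, no labels
low_centre, high_centre = min(spend), max(spend)  # starting guesses
for _ in range(5):  # a few refinement rounds are plenty for this data
    low = [s for s in spend if abs(s - low_centre) <= abs(s - high_centre)]
    high = [s for s in spend if abs(s - low_centre) > abs(s - high_centre)]
    low_centre = sum(low) / len(low)
    high_centre = sum(high) / len(high)
print("Groups:", low, high)

# REINFORCEMENT: learn which action pays off by trying actions and
# tracking the average reward each one earned.
rewards = {"left": 0, "right": 1}  # hidden from the agent; used only as feedback
totals = {"left": 0, "right": 0}   # reward collected per action
tries = {"left": 0, "right": 0}    # how often each action was tried


def average(action):
    return totals[action] / tries[action] if tries[action] else 0.0


for step in range(10):
    if step < 2:
        action = ["left", "right"][step]  # explore: try each action once
    else:
        action = max(tries, key=average)  # exploit: pick the best so far
    totals[action] += rewards[action]
    tries[action] += 1

best = max(tries, key=average)
print("Best action:", best, "tried", tries[best], "times")
```

Nobody told the first half which customers were "low" or "high" spenders, and nobody told the second half that "right" was the better button. The grouping came from distances between numbers, and the preference came from rewards.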
Worked Example: The Three Types in Code
Each ML paradigm modelled with plain lists and dicts — fully runnable.
# The three types of machine learning — modelled with plain lists and dicts.
# 1) SUPERVISED — learn from LABELLED examples (input + correct answer)
labelled_emails = [
    {"text": "Free money, claim now!", "label": "spam"},
    {"text": "Meeting at 3pm tomorrow", "label": "not_spam"},
    {"text": "You WON a prize!!!", "label": "spam"},
    {"text": "Project deadline Friday", "label": "not_spam"},
]
spam_count = sum(1 for e in labelled_emails if e["label"] == "spam")
print("SUPERVISED: l
...
4. The ML Workflow (5 Steps)
Most ML projects, from a hobby script to a self-driving car, follow the same five broad steps:
1. Data: gather raw observations (rows of measurements).
2. Features: pick the inputs (X) and the answer to predict (y).
3. Train: show the model the training data so it learns the pattern.
4. Evaluate: test it on data it never saw to get an honest score.
5. Deploy: reuse the trained model on brand-new, real-world input.
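As a rough sketch of how those five steps can fit together in one script, here is a plain-Python version using house sizes and prices. It is one possible shape under stated assumptions, not a definitive implementation: the helper names `fit_line` and `predict` are invented for this sketch, training is an ordinary least-squares line fit, and the evaluation metric (mean absolute error) is just one reasonable choice.

```python
# The five workflow steps as one small script (illustrative sketch only).

# 1) DATA: (square metres, price in thousands)
data = [(50, 150), (60, 180), (70, 205), (80, 240), (90, 265)]

# 2) FEATURES: input X and target y
X = [row[0] for row in data]
y = [row[1] for row in data]

# Hold back the last row so evaluation uses data the model never saw.
X_train, y_train = X[:-1], y[:-1]
X_test, y_test = X[-1:], y[-1:]


# 3) TRAIN: least-squares fit of price = slope * size + intercept
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (v - mean_y) for x, v in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(X_train, y_train)


def predict(size):
    return slope * size + intercept


# 4) EVALUATE: average absolute error on the held-back row(s)
errors = [abs(predict(x) - actual) for x, actual in zip(X_test, y_test)]
mean_abs_error = sum(errors) / len(errors)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, test error={mean_abs_error:.2f}")

# 5) DEPLOY: reuse the trained model on a new input
print(f"Predicted price for 100 m2: {predict(100):.1f}k")
```

With this data the fit comes out at a slope of 2.95 and an intercept of 2.00, and the held-back 90 m² house is predicted about 2.5 thousand too high. One test row is far too few for a trustworthy score, but it shows the habit of checking on data the model never saw.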
Worked Example: The Full Workflow
data → features → split → train → evaluate → deploy, end to end in plain Python.
# The ML workflow: data -> features -> train -> evaluate -> deploy
# Done with plain Python so it runs here. Same shape as a real project.
# 1) DATA: raw observations (square metres, price in thousands)
data = [(50, 150), (60, 180), (70, 205), (80, 240), (90, 265)]
# 2) FEATURES: split each row into input (X) and target (y)
X = [row[0] for row in data] # the feature we learn from
y = [row[1] for row in data] # the answer we want to predict
# 3) TRAIN/TEST SPLIT: hide the last row so we ca
...
🎯 Your Turn #1: Build Features (X and y)
Fill in the two blanks marked ___. Follow each # 👉 hint, run it, and check your output against the # ✅ Expected output at the bottom.
🎯 Your Turn: Split Data into Features and Target
Fill in the blanks to separate inputs (X) from labels (y) — the 'features' step.
# 🎯 YOUR TURN — split raw data into features (X) and target (y).
# This is the "features" step every ML project starts with.
# Each row is (hours_studied, passed?) — 1 means passed, 0 means failed.
data = [(1, 0), (2, 0), (4, 1), (6, 1), (8, 1)]
# 👉 Build X: a list of just the hours (the FIRST item in each row)
X = [row[___] for row in data] # 👉 replace ___ with the index 0
# 👉 Build y: a list of just the labels (the SECOND item in each row)
y = [row[___] for row in data] # 👉 r
...
🎯 Your Turn #2: Make a Train/Test Split
Now the most important habit in ML: holding data back for honest testing. Replace each ___, run, and confirm the expected output.
🎯 Your Turn: Train/Test Split
Slice a dataset into a training set and a hidden test set.
# 🎯 YOUR TURN — make a train/test split.
# You train on most of the data and keep some HIDDEN to test honestly.
scores = [55, 62, 70, 81, 90, 44, 67, 73]
# 👉 Use the first 6 scores for TRAINING (slice from the start up to 6)
train = scores[:___] # 👉 replace ___ so train has the first 6 items
# 👉 Use the remaining scores for TESTING (slice from index 6 onward)
test = scores[___:] # 👉 replace ___ so test has the rest
print("Train set:", train, "->", len(train), "rows")
print("T
...
5. The Python ML Ecosystem
You did the maths by hand above to understand it. In real projects, three libraries do the heavy lifting so you write two lines instead of twenty:
🔢 NumPy
Fast numeric arrays and maths. The number-crunching engine everything else is built on.
🐼 pandas
Loads and cleans table-shaped data — like a spreadsheet you control with code (the DataFrame).
🧪 scikit-learn
Ready-made models you train with .fit() and use with .predict().
pip install numpy pandas scikit-learn (you'll set this up in the next lesson).
Read-Only Example: The Real Library Stack
Here's the same study-hours model written the professional way. Read it to see how little code real ML takes — the # Expected output comments show what it prints when you run it locally.
# THE REAL ML ECOSYSTEM — read-only worked example.
# NOTE: numpy, pandas and scikit-learn may NOT be installed in this
# in-browser editor. Read this to see what real ML code looks like;
# run it later on your own machine after "pip install numpy pandas scikit-learn".
import numpy as np # fast numeric arrays + maths
import pandas as pd # table-shaped data (like a spreadsheet)
from sklearn.linear_model import LinearRegression # a ready-made model
# pandas: load data into a DataFrame (rows and named columns)
df = pd.DataFrame({"hours": [1, 2, 3, 4, 5, 6, 7, 8],
"score": [30, 35, 50, 55, 65, 70, 80, 85]})
# numpy: shape the feature column the way scikit-learn expects (2D)
X = df[["hours"]].values # features -> shape (8, 1)
y = df["score"].values # target -> shape (8,)
# scikit-learn: the entire "training" is two lines
model = LinearRegression()
model.fit(X, y) # learns slope + intercept for you
print(f"Slope: {model.coef_[0]:.1f}, Intercept: {model.intercept_:.1f}")
print(f"Predict 10 hours: {model.predict(np.array([[10]]))[0]:.0f}")
# Expected output (approximately):
# Slope: 8.1, Intercept: 22.3
# Predict 10 hours: 103
6. Common Mistakes (And How to Avoid Them)
These four trip up almost every beginner. Spotting them early saves hours of confusion.
❌ Confusing AI, ML, and Deep Learning
Treating them as the same thing, or calling every chatbot "deep learning".
✅ Fix: remember the nesting — Deep Learning ⊂ ML ⊂ AI. A rule-based bot is AI but not ML; a spam model is ML but may not be deep learning.
❌ No train/test split
Testing the model on the exact data it trained on. The score looks better than the model really is, and the model then flops on real data.
# Bad: trained AND tested on the same rows
model.fit(X, y)
print(model.score(X, y))  # flattering score, not an honest one
✅ Fix: hold data back. Train on one slice, test on a hidden slice (you did this in Your Turn #2).
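To see why the hidden slice matters, consider a deliberately silly "model" that only memorises its training rows. This is a sketch with made-up data (hours studied, passed or not) and invented helper names (`predict`, `accuracy`); real models fail less dramatically, but the gap between the two scores is the same kind of problem.

```python
# Why testing on training data misleads (illustrative sketch only).
# Rows are (hours_studied, passed); the values are made up for this demo.
data = [(1, 0), (2, 0), (3, 0), (4, 1), (6, 1), (7, 1), (5, 1), (8, 1)]
train, test = data[:6], data[6:]

# "Training" here is pure memorisation of the rows it was shown.
memory = {hours: passed for hours, passed in train}


def predict(hours):
    # Known input: return the memorised answer. Unknown input: guess 0.
    return memory.get(hours, 0)


def accuracy(rows):
    correct = sum(1 for hours, passed in rows if predict(hours) == passed)
    return correct / len(rows)


print("Accuracy on training rows:", accuracy(train))    # 1.0
print("Accuracy on hidden test rows:", accuracy(test))  # 0.0
```

Scored on its own training rows the memoriser looks flawless. Scored on the two hidden rows it gets nothing right, which is the number that actually predicts how it would behave on new data.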
❌ Garbage in, garbage out
Expecting a great model from messy, biased, or tiny data. The model can only be as good as what you feed it.
✅ Fix: clean and check your data first. Five clean rows beat five thousand wrong ones. Most real ML time is spent on data, not models.
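A first data check can be as simple as filtering out rows that cannot be right. The sketch below assumes rows of (hours studied, exam score); the sample values, the `is_valid` helper, and the limits (0 to 24 hours, scores from 0 to 100) are assumptions chosen for illustration, so adjust them to whatever your own data should look like.

```python
# Basic sanity checks before training (illustrative sketch only).
# Rows are (hours_studied, exam_score); the values and limits are made up.
rows = [(2, 35), (5, 65), (None, 70), (4, 550), (-3, 40), (8, 85)]


def is_valid(row):
    hours, score = row
    if hours is None or score is None:
        return False          # missing value
    if not 0 <= hours <= 24:
        return False          # impossible number of hours in a day
    return 0 <= score <= 100  # score must be a valid percentage


clean = [row for row in rows if is_valid(row)]
rejected = [row for row in rows if not is_valid(row)]

print("Kept:", clean)
print("Rejected:", rejected)
```

Printing the rejected rows as well as the kept ones is a deliberate choice: it lets you see what was thrown away and notice when a "bad" row is really a sign of a bug further upstream.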
❌ Expecting magic
Believing ML "just knows" things or thinks like a human. It only finds statistical patterns in the numbers it was shown.
✅ Fix: treat models as pattern-matchers, not minds. They make mistakes, reflect their data's biases, and can't reason beyond what they saw.
🎯 Mini-Challenge: Predict Tomorrow
Time to fade the scaffolding. The starter has only a comment outline — no filled-in logic. Use plain Python (no libraries) to build a tiny trend predictor.
Mini-Challenge: Trend Predictor
Write it yourself from the outline — predict tomorrow's step count.
# 🎯 MINI-CHALLENGE: a tiny "predict tomorrow" model in plain Python
#
# Daily steps walked this week:
steps = [4000, 5000, 6000, 7000, 8000]
#
# 1. Compute the average DAILY INCREASE between consecutive days
# (hint: 5000-4000 is 1000; average those gaps)
# 2. Predict tomorrow = last day's steps + that average increase
# 3. Print: "Predicted tomorrow: 9000 steps"
#
# ✅ Expected output:
# Predicted tomorrow: 9000 steps
# your code here
📋 Quick Reference
| Term | What It Means | Example |
|---|---|---|
| AI | Machines mimicking intelligence | Siri, self-driving cars |
| ML | Learning the rules from data | Spam filter, recommendations |
| Deep Learning | ML with many-layer neural nets | Image recognition, chatbots |
| Supervised | Learns from labelled data | Email → spam / not-spam |
| Unsupervised | Finds groups, no labels | Customer segmentation |
| Reinforcement | Learns from rewards | Game playing, robotics |
| Feature (X) | An input the model learns from | Hours studied |
| Target (y) | The answer to predict | Exam score |
| Train/test split | Hold data back for honest testing | 6 rows train, 2 rows test |
| NumPy / pandas / scikit-learn | The core Python ML stack | Arrays / tables / models |
❓ Frequently Asked Questions
Q: What is the difference between AI, machine learning, and deep learning?
A: AI is the broad goal of making machines act smart. Machine learning (ML) is one way to get there — the machine learns patterns from data instead of following hand-written rules. Deep learning is a type of ML that uses neural networks with many layers. Picture three nested boxes: AI contains ML, and ML contains deep learning.
Q: What are the three types of machine learning?
A: Supervised learning learns from labelled examples (input plus the correct answer), like emails tagged spam or not-spam. Unsupervised learning finds hidden groups in data that has no labels, like customer segments. Reinforcement learning learns by trial and error using rewards, like an AI mastering a game.
Q: Do I need advanced maths to start machine learning?
A: No. To begin you only need basic Python — lists, dicts, loops, and simple arithmetic. The maths (statistics and linear algebra) helps later when you want to understand models deeply, but libraries like scikit-learn handle the hard maths for you so you can build working models first.
Q: What is a train/test split and why does it matter?
A: You split your data into a training set (used to teach the model) and a test set (kept hidden, used to check it). Testing on data the model already saw gives a falsely high score. The test set is the only honest measure of how the model performs on new, unseen data.
Q: Which Python libraries do I need for machine learning?
A: Three form the core. NumPy gives fast numeric arrays and maths. pandas loads and cleans table-shaped data (think a spreadsheet in code). scikit-learn provides ready-made models you train in two lines. You will install and use all three later in this course.
Lesson 1 complete — you speak the language of ML!
You can now explain learning-from-data vs hand-written rules, separate AI / ML / Deep Learning, name the three types of ML, walk the five-step workflow, and recognise the NumPy / pandas / scikit-learn stack. You even built a model that learned a pattern by itself — in plain Python.
🚀 Up next: Python for ML — install NumPy, pandas, and scikit-learn, and turn the read-only example above into code you can run for real.