
    Lesson 7 • Intermediate

    Neural Networks Introduction

    By the end of this lesson you'll be able to compute a single neuron's output by hand, write ReLU and sigmoid in plain Python, and explain how layers learn by adjusting weights.

    What You'll Learn in This Lesson

    • You'll be able to describe a neuron as weighted sum + bias + activation
    • You'll be able to compute a neuron's output in plain Python with lists
    • You'll be able to write the ReLU and sigmoid activation functions yourself
    • You'll be able to explain what a layer is and how layers stack
    • You'll be able to trace a forward pass from inputs to prediction
    • You'll be able to explain how learning nudges weights to cut error

    🧠 Real-World Analogy: Brain Neurons

    Your brain has billions of neurons. Each one receives little electrical signals from its neighbours through connections called synapses. Some connections are strong and some are weak — they decide how much each incoming signal counts. When the combined signal crosses a threshold, the neuron fires and passes a signal on to the next neurons.

    An artificial neuron copies this idea with arithmetic. The "synapse strengths" become numbers called weights, the firing threshold becomes a bias, and the "fire or not" decision becomes an activation function. Learning is just gradually turning the strength of each connection up or down until the whole network responds the way you want.

    1. The Neuron: Weighted Sum + Bias + Activation

    A neuron is the smallest building block of a neural network. (A single neuron with a step activation is the classic perceptron.) It takes some inputs and produces a single number, in three small steps:

    1. Weighted sum — multiply each input by its weight and add the results together.
    2. Add a bias — add one extra number that shifts the total up or down.
    3. Activation — pass the total through a function that decides the output.

    z = (x1·w1 + x2·w2 + … + xn·wn) + bias

    output = activation(z)

    The weights and bias are the neuron's knowledge. Everything a network learns ends up stored as weights and biases. Here is a single neuron written in plain Python — read each comment, then run it.

    Worked Example: One Neuron in Plain Python

    Compute weighted sum + bias + a step activation by hand

    Python
    # A single neuron (a "perceptron") — plain Python, no libraries
    # A neuron does 3 tiny steps:
    #   1. weighted sum:  multiply each input by its weight and add them up
    #   2. add a bias:    a number that shifts the result up or down
    #   3. activation:    squash the result through a function (here: a step)
    
    # The neuron's "knowledge" lives in these numbers.
    inputs  = [1.0, 0.0, 1.0]      # 3 features going in (e.g. yes/no signals)
    weights = [0.7, 0.3, 0.9]      # how much each input matters
    bias   
    ...
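
A minimal, complete sketch of the three steps, for reference. It reuses the inputs and weights shown above. The bias of -1.0 and the helper names `step` and `neuron` are assumptions made for illustration (the bias value matches the numpy reference example later in this lesson).

```python
# Sketch of a single neuron in plain Python (no libraries).
# Assumption: bias = -1.0, chosen for illustration.

def step(z):
    # Step activation: output 1 when z is positive, otherwise 0
    return 1 if z > 0 else 0

def neuron(inputs, weights, bias, activation):
    # 1) weighted sum: multiply each input by its weight and add them up
    z = sum(x * w for x, w in zip(inputs, weights))
    # 2) add the bias
    z = z + bias
    # 3) the activation decides the output
    return z, activation(z)

inputs  = [1.0, 0.0, 1.0]
weights = [0.7, 0.3, 0.9]
bias    = -1.0

z, output = neuron(inputs, weights, bias, step)
print("z =", round(z, 3))     # z = 0.6
print("output =", output)     # output = 1
```

By hand: 1.0·0.7 + 0.0·0.3 + 1.0·0.9 = 1.6, minus 1.0 gives z = 0.6, which is positive, so the step activation returns 1.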

    2. Activation Functions: ReLU and Sigmoid

    The activation function is what makes a network powerful. It introduces non-linearity — a fancy way of saying "the output can bend and curve instead of being one straight line." Two activations cover almost everything a beginner needs:

    • ReLU (Rectified Linear Unit): returns the input if it's positive, otherwise 0. It's fast and is the default choice for hidden layers.
    • Sigmoid: squashes any number into the range 0 to 1 using an S-shaped curve. Perfect for an output that represents a probability ("how likely is this a cat?").

    Both are just a couple of lines of plain Python — the only thing you need from the standard library is math.exp for sigmoid.

    Worked Example: ReLU and Sigmoid

    See how each activation reshapes the same inputs

    Python
    import math   # only the standard library — no numpy
    
    # Activation functions decide what a neuron "fires".
    # Without them, stacking neurons would just be one big straight line.
    
    def relu(x):
        # ReLU = "Rectified Linear Unit": keep positives, zero out negatives
        return x if x > 0 else 0.0
    
    def sigmoid(x):
        # Sigmoid squashes ANY number into the range 0..1 (an S-curve)
        return 1 / (1 + math.exp(-x))
    
    # Try them on a few numbers so you can see the shape
    for x in [-2.0, -0.5, 0.0, 0.5, 2
    ...
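
One practical caveat, offered as a hedged aside: `math.exp(-x)` raises `OverflowError` for very negative `x` (for example -1000), because the result does not fit in a float. A common workaround, sketched below under the hypothetical name `stable_sigmoid`, picks an algebraically equivalent form based on the sign of `x`.

```python
import math

def sigmoid(x):
    # Textbook form used in this lesson
    return 1 / (1 + math.exp(-x))

def stable_sigmoid(x):
    # Hypothetical helper: same curve, but it only ever exponentiates a
    # non-positive number, so math.exp cannot overflow.
    if x >= 0:
        return 1 / (1 + math.exp(-x))
    e = math.exp(x)          # x < 0, so e is between 0 and 1
    return e / (1 + e)

# Both versions agree on ordinary inputs
for x in [-2.0, -0.5, 0.0, 0.5, 2.0]:
    assert abs(sigmoid(x) - stable_sigmoid(x)) < 1e-12

print(stable_sigmoid(0.0))      # 0.5
print(stable_sigmoid(-1000.0))  # 0.0 (the textbook form raises OverflowError here)
```

For the small inputs used in this lesson the textbook form is fine; the guarded version only matters once z can become very negative.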

    🎯 Your Turn: Finish the Neuron's Forward Pass

    Fill in the blanks to compute z and the neuron's output

    Python
    # 🎯 YOUR TURN — finish the neuron's forward pass
    # Fill in each ___ . Expected output is at the bottom so you can self-check.
    
    inputs  = [2.0, 3.0]
    weights = [0.5, -1.0]
    bias    = 1.0
    
    # 1) Start z at the bias value
    z = ___                       # 👉 replace ___ with the bias variable
    
    # 2) Add input * weight for each pair
    for x, w in zip(inputs, weights):
        z = z + ___               # 👉 replace ___ with  x * w
    
    # 3) Apply a step activation: 1 if z is positive, else 0
    output = 1 if z ___ 0 e
    ...

    🎯 Your Turn: Implement ReLU and Sigmoid

    Write the two activation functions from scratch

    Python
    import math
    
    # 🎯 YOUR TURN — implement the two activations from scratch.
    # Fill in the ___ blanks, then run to compare against the expected output.
    
    def relu(x):
        # Return x when it is positive, otherwise return 0.0
        return ___ if x > 0 else 0.0     # 👉 replace ___ with  x
    
    def sigmoid(x):
        # The S-curve: 1 / (1 + e^(-x)).  math.exp(n) computes e^n.
        return 1 / (1 + math.exp(___))   # 👉 replace ___ with  -x
    
    print("relu(-3)   =", relu(-3))
    print("relu(4)    =", relu(4))
    print("sig
    ...

    3. Layers and the Forward Pass

    One neuron can only draw a single straight boundary, which is too weak for most real problems. The fix is to use many neurons arranged in layers:

    • Input layer — your raw features (e.g. pixel values, sensor readings).
    • Hidden layer(s) — neurons that each look for a different pattern in the inputs.
    • Output layer — produces the final prediction.

    The forward pass is simply running data through the network from left to right: every neuron in a layer computes weighted sum + bias + activation, and the layer's outputs become the inputs to the next layer. With non-linear activations and enough hidden neurons, a network can approximate a very wide range of functions, which is where its power comes from.

    inputs → [hidden layer: many neurons] → [output layer] → prediction
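
The diagram above can be traced in plain Python. This is a minimal sketch: the helper `layer_forward` and all weight and bias values are hypothetical, hand-picked so the arithmetic is easy to follow (a trained network would learn its own values).

```python
import math

def relu(x):
    return x if x > 0 else 0.0

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def layer_forward(inputs, weights, biases, activation):
    # One layer = several neurons; each neuron has its own weight list and bias.
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = bias + sum(x * w for x, w in zip(inputs, neuron_weights))
        outputs.append(activation(z))
    return outputs

# Hypothetical parameters, chosen by hand for illustration
hidden_weights = [[1.0, -1.0], [0.5, 0.5]]   # 2 hidden neurons, 2 inputs each
hidden_biases  = [0.0, -0.5]
output_weights = [[1.0, 2.0]]                # 1 output neuron, 2 hidden inputs
output_biases  = [-1.0]

inputs = [2.0, 1.0]
hidden = layer_forward(inputs, hidden_weights, hidden_biases, relu)
prediction = layer_forward(hidden, output_weights, output_biases, sigmoid)

print("hidden     =", hidden)                    # [1.0, 1.0]
print("prediction =", round(prediction[0], 3))   # 0.881
```

The hidden layer's output list is passed straight in as the next layer's input, which is all "stacking layers" means at the code level.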

    4. How Learning Adjusts Weights

    A fresh network starts with random weights, so its first predictions are basically guesses. Training fixes that with a repeating loop:

    1. Forward pass: run an example through the network to get a prediction.
    2. Measure error: compare the prediction to the correct answer (the "loss").
    3. Backpropagation: work out how much each weight contributed to the error.
    4. Update: nudge every weight and bias a little in the direction that lowers the error.

    That nudge size is controlled by the learning rate. Repeat this loop over thousands of examples and the weights slowly settle into values that make good predictions. You don't have to compute the gradients by hand — libraries like TensorFlow and PyTorch do it for you. Here's the same neuron idea written the professional way, plus a tiny Keras network, shown as a read-only reference.

    Worked Example: numpy & Keras Version (reference)

    The same maths the way professionals write it — read it, the expected output is in comments

    Python
    # The SAME idea, written the way professionals do it.
    # numpy does the weighted sum for a whole layer in one line.
    import numpy as np
    
    inputs  = np.array([1.0, 0.0, 1.0])
    weights = np.array([0.7, 0.3, 0.9])
    bias    = -1.0
    
    z = np.dot(inputs, weights) + bias   # weighted sum + bias, vectorised
    output = 1 / (1 + np.exp(-z))        # sigmoid activation
    print(round(float(output), 3))       # 0.646
    
    # Expected output:
    # 0.646
    
    
    # A whole network in a few lines with Keras (TensorFlow).
    # This builds 2
    ...
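
The four-step training loop from this section can also be sketched for a single sigmoid neuron in plain Python. Several things here are assumptions not stated in the lesson: the task (learning logical OR), the cross-entropy loss, the zero starting weights (used instead of random ones so the run is repeatable), and the learning rate of 0.5. With only one neuron there are no layers to propagate through, so the gradient is written out by hand.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# Tiny dataset: the logical OR of two inputs (an assumed example task)
examples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
            ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

weights = [0.0, 0.0]     # started at zero to keep the run repeatable
bias = 0.0
learning_rate = 0.5

for epoch in range(1000):
    for inputs, target in examples:
        # 1) forward pass
        z = bias + sum(x * w for x, w in zip(inputs, weights))
        prediction = sigmoid(z)
        # 2) measure error: for sigmoid + cross-entropy loss, the slope of
        #    the loss with respect to z is (prediction - target)
        error = prediction - target
        # 3) each weight's share of the error is error * its input
        # 4) update: step against that slope, scaled by the learning rate
        weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
        bias = bias - learning_rate * error

for inputs, target in examples:
    z = bias + sum(x * w for x, w in zip(inputs, weights))
    print(inputs, "->", round(sigmoid(z), 2), "| target:", target)
```

After training, each printed prediction should sit close to its target. Libraries such as PyTorch and TensorFlow automate step 3 for networks with many layers.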

    5. Common Errors (And How to Fix Them)

    These four mistakes trip up nearly everyone who builds their first network.

    ❌ No non-linearity = a glorified linear model

    If every layer uses no activation (or only a linear one), stacking layers collapses into a single straight line. The network can't learn curves and will fail on problems like XOR.

    ✅ Fix: put a non-linear activation (ReLU) on every hidden layer:

    # ❌ hidden = weighted_sum            # linear — no power
    # ✅ hidden = relu(weighted_sum)       # adds the curve the network needs
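
To see the collapse concretely, here is a small sketch with one neuron per layer and no activation. The numbers are hypothetical; the point is that the two-layer version and a single linear neuron give identical outputs.

```python
# Two stacked layers with NO activation, one neuron each (hypothetical numbers)
w1, b1 = 3.0, 1.0      # layer 1: h = w1*x + b1
w2, b2 = -2.0, 4.0     # layer 2: y = w2*h + b2

def two_linear_layers(x):
    h = w1 * x + b1
    return w2 * h + b2

# The same mapping as ONE linear neuron: y = (w2*w1)*x + (w2*b1 + b2)
w, b = w2 * w1, w2 * b1 + b2    # -6.0 and 2.0

def one_linear_layer(x):
    return w * x + b

for x in [-1.0, 0.0, 2.5]:
    print(x, two_linear_layers(x), one_linear_layer(x))
# -1.0 8.0 8.0
# 0.0 2.0 2.0
# 2.5 -13.0 -13.0
```

Putting `relu` between the two layers breaks this equivalence, which is exactly what gives the extra layer something to contribute.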

    ❌ Vanishing gradients

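    Sigmoid and tanh flatten out for inputs far from zero (large positive or large negative), so their slope (gradient) becomes almost 0. In deep networks these small slopes multiply together layer after layer, so the update signal shrinks toward nothing and early layers stop learning.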

    ✅ Fix: use ReLU in hidden layers; keep sigmoid for the final output only.

    # hidden layers -> relu(z)            # gradient stays healthy
    # output layer  -> sigmoid(z)         # 0..1 probability is fine here
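
A rough numerical sketch of why this happens. The helper names `sigmoid_slope` and `relu_slope` are hypothetical, and the calculation ignores the weights, which also multiply into the real gradient; it only shows how quickly the activation slopes alone can shrink the signal.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def sigmoid_slope(x):
    # Derivative of sigmoid: s * (1 - s); its largest value is 0.25 at x = 0
    s = sigmoid(x)
    return s * (1 - s)

def relu_slope(x):
    # Derivative of ReLU: 1 for positive inputs, 0 otherwise
    return 1.0 if x > 0 else 0.0

print(sigmoid_slope(0.0))             # 0.25
print(round(sigmoid_slope(5.0), 4))   # 0.0066

# Best case for sigmoid: slope 0.25 at every one of 10 layers
print(0.25 ** 10)                     # 9.5367431640625e-07
print(relu_slope(5.0) ** 10)          # 1.0
```

Even in sigmoid's best case the signal is about a millionth of its original size after ten layers, while ReLU passes it through unchanged for positive inputs.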

    ❌ Not normalizing inputs

    Feeding raw values on wildly different scales (e.g. age 0–100 next to salary 0–100000) makes training unstable — the big numbers dominate the weighted sum.

    ✅ Fix: scale features to a similar range before training.

    # scale each feature to 0..1 (min-max scaling)
    low, high = min(feature), max(feature)   # assumes high > low
    normalized = [(v - low) / (high - low) for v in feature]
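
As a runnable sketch of the same idea applied to the age and salary example, the hypothetical helper `min_max_scale` below also guards against a constant feature, where high equals low and the division would fail. The sample values are made up for illustration.

```python
def min_max_scale(feature):
    # Hypothetical helper: rescale a list of numbers into the range 0..1
    low, high = min(feature), max(feature)
    if high == low:
        # Constant feature: avoid dividing by zero, map everything to 0.0
        return [0.0 for _ in feature]
    return [(v - low) / (high - low) for v in feature]

ages     = [20, 40, 60, 100]                  # made-up sample values
salaries = [20000, 40000, 60000, 100000]

print(min_max_scale(ages))       # [0.0, 0.25, 0.5, 1.0]
print(min_max_scale(salaries))   # [0.0, 0.25, 0.5, 1.0]
```

After scaling, both features live on the same 0..1 range, so neither dominates the weighted sum just because of its units.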

    ❌ Learning rate too big

    If each weight update is too large, the network overshoots the good values and the loss bounces around or explodes to nan instead of going down.

    ✅ Fix: start small (e.g. 0.01) and only increase if learning is too slow.

    # learning_rate = 10.0    # ❌ loss jumps around / becomes nan
    learning_rate = 0.01      # ✅ steady, reliable improvement
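
The overshoot is easy to reproduce on a toy problem. The sketch below assumes a made-up one-weight loss, loss(w) = w², and a hypothetical helper `descend`; the specific rates 0.25 and 1.5 only make sense for this toy loss and are not recommendations for real networks.

```python
def descend(learning_rate, steps=5):
    # Minimise the toy loss  loss(w) = w**2, whose slope is 2*w
    w = 1.0
    history = []
    for _ in range(steps):
        gradient = 2 * w
        w = w - learning_rate * gradient   # step against the slope
        history.append(w * w)              # loss after this update
    return history

print(descend(0.25))   # [0.25, 0.0625, 0.015625, 0.00390625, 0.0009765625]
print(descend(1.5))    # [4.0, 16.0, 64.0, 256.0, 1024.0]
```

With the small rate the loss shrinks on every step; with the large one each update jumps past the minimum and lands further away than it started, so the loss grows instead.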

    📋 Quick Reference

    Term             | What it is                       | In code / formula
    Weight           | How much an input matters        | x * w
    Bias             | Shifts the sum up/down           | z = ... + bias
    Weighted sum (z) | Inputs·weights + bias            | sum(x*w) + bias
    ReLU             | Keep positives, zero negatives   | x if x > 0 else 0
    Sigmoid          | Squash into 0..1                 | 1/(1+math.exp(-x))
    Layer            | A group of neurons               | input / hidden / output
    Forward pass     | Inputs → prediction              | layer by layer
    Learning rate    | Size of each weight nudge        | w -= lr * gradient

    ❓ Frequently Asked Questions

    Q: What is a neuron (perceptron) in a neural network?

    A: It is a tiny function that multiplies each input by a weight, adds those products together, adds a bias number, and passes the result through an activation function. The output is the neuron's decision.

    Q: What is the bias for?

    A: The bias shifts the weighted sum up or down before activation, so the neuron can fire at a different threshold. Without it, z would always be 0 when every input is 0, so the neuron's decision boundary would be forced through the origin, which limits what the network can learn.

    Q: Why do neural networks need activation functions?

    A: Activation functions add non-linearity. If you remove them, stacking layers collapses into a single straight-line (linear) model that can only solve linearly separable problems. ReLU and sigmoid let the network bend and curve to fit complex data.

    Q: What is the difference between ReLU and sigmoid?

    A: ReLU returns the input if it is positive and 0 otherwise; it is fast and the usual default for hidden layers. Sigmoid squashes any number into 0..1, which is ideal for an output that represents a probability, but it can cause vanishing gradients when used in deep hidden layers.

    Q: What is a forward pass?

    A: The forward pass is running inputs through the network layer by layer to produce a prediction: for each neuron you compute weighted sum + bias, apply the activation, then feed the results into the next layer.

    Q: How does a neural network learn?

    A: It compares its prediction to the correct answer to measure error, then nudges every weight and bias a little in the direction that reduces that error. Repeating this over many examples (using backpropagation and gradient descent) is what we call training.

    🎯 Mini-Challenge: A Neuron with a Real Activation

    Time to fly with less support. Build a 2-input neuron that ends with a sigmoid activation. Only a comment outline is given — fill in the logic yourself, then check against the expected output in the comments.

    Mini-Challenge

    Write the whole neuron from the brief — no scaffolding

    Python
    import math
    
    # 🎯 MINI-CHALLENGE: a 2-input neuron with a real activation
    # Brief:
    #   1. Define sigmoid(x)  ->  1 / (1 + math.exp(-x))
    #   2. inputs  = [1.0, 1.0]   weights = [2.0, 2.0]   bias = -3.0
    #   3. Compute z = bias + sum of input*weight   (use a loop or zip)
    #   4. Pass z through sigmoid to get the output (a probability 0..1)
    #   5. print("z =", z)  and  print("output =", round(output, 3))
    #
    # ✅ Expected (inputs 1,1):
    # z = 1.0
    # output = 0.731
    
    # your code here

    🎉 Lesson complete: you understand how neurons think!

    You can now compute a neuron as weighted sum + bias + activation, write ReLU and sigmoid in plain Python, describe how layers stack into a forward pass, and explain how training nudges weights to shrink error. These are the exact foundations every deep learning model is built on.

    🚀 Up next: Deep Learning Fundamentals — stack many layers, train with backpropagation, and tackle real datasets.
