🧠 Feedforward Neural Networks
From neurons to layered predictions
Take your time with this one. The interactive parts are here to help you test the idea, not rush through it.
Pause and experiment as you go.
Before We Begin
What we are learning today
A feedforward neural network is a layered function builder. Each neuron computes a weighted combination of its inputs, applies an activation, and passes the result forward. When enough of these transformations are stacked together, the network can capture patterns that a simple linear model would miss entirely.
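This "weighted combination, then activation" step can be sketched for a single neuron. The numbers below are illustrative (they match the h1 row of the computation trace later in this lesson):

```python
import numpy as np

def neuron(x, w, b):
    """One neuron: weighted sum of inputs, plus bias, through a ReLU activation."""
    z = np.dot(w, x) + b      # weighted combination of the inputs
    return max(0.0, z)        # ReLU: pass positive values, suppress negative ones

x = np.array([0.80, -0.30])   # two input features (illustrative)
w = np.array([0.30, -0.90])   # this neuron's weights (illustrative)
b = 0.14                      # bias
print(neuron(x, w, b))        # a single activated output value
```

A full layer is just many of these neurons running on the same inputs, and a network stacks such layers in sequence.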
How this lesson fits
This module introduces the core architecture behind much of modern AI. Students follow information as it moves through layers, is transformed by weights and activations, and eventually becomes a prediction that can be improved through feedback.
The big question
How do large collections of simple numerical operations combine into a model that can recognize patterns humans struggle to hand-code?
Why You Should Care
Modern vision, speech, and language systems all depend on the basic idea that useful internal representations can be built layer by layer. Students who understand the forward pass will have a much easier time understanding later deep-learning architectures.
Where this is used today
- ✓ Digit and character recognition tasks where simple visual patterns must be mapped to labels
- ✓ Function approximation problems where the relationship between input and output is highly nonlinear
- ✓ Basic control and prediction systems in robotics, forecasting, and sensor processing
Think of it like this
Imagine an assembly line where each station refines the material a little further. The early stations do simple transformations, but the later ones combine those partial results into something much more meaningful.
Easy mistake to make
Neural networks borrow vocabulary from biology ("neurons", "firing", "activation"), and it is tempting to read too much into that. They are mathematical models, not faithful simulations of real brains.
By the end, you should be able to:
- Identify inputs, hidden layers, weights, biases, activations, and outputs in a simple network
- Explain why activation functions are necessary if we want networks to learn nonlinear relationships
- Trace a small forward pass numerically or conceptually from input to prediction
Think about this first
Why might several simple transformations stacked in sequence describe a pattern better than one single straight-line rule? Give a real-world example if you can.
Words we will keep using
Feedforward Neural Networks
A feedforward neural network is a bucket brigade of information. Each layer takes the data, mixes it up, transforms it, and hands it to the next layer. If you understand this forward flow, you understand the skeleton of deep learning.
Activation Functions
| Activation | Formula | Notes |
|---|---|---|
| relu | max(0, x) | Used in: ResNets, most modern CNNs |
| sigmoid | 1 / (1 + e^(−x)) | Squashes values into (0, 1) |
| tanh | tanh(x) | Squashes values into (−1, 1) |
| gelu | x · Φ(x) | Φ is the standard normal CDF |
| linear | x | No non-linearity at all |

Different activations change how flexible the network can be. Modern language models often use GELU because it behaves smoothly and trains well at scale.
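A quick NumPy sketch of these activations side by side. The GELU here uses the common tanh approximation rather than the exact normal CDF:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gelu(x):
    # tanh approximation of x * Phi(x), widely used in practice
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for name, f in [("relu", relu), ("sigmoid", sigmoid), ("tanh", np.tanh), ("gelu", gelu)]:
    print(f"{name:8s}", np.round(f(x), 3))
```

Notice that ReLU zeroes out every negative input, while sigmoid and tanh squash smoothly and GELU sits somewhere in between.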
Interactive Forward Pass
Node colour: Green = active (firing), Red = inactive (suppressed). Values shown inside.
Input values
Architecture
2 → 4 → 3 → 1 — Activation: relu
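The forward pass through this 2 → 4 → 3 → 1 architecture can be written in a few lines. The weights below are random placeholders for illustration, not the values used by the interactive widget:

```python
import numpy as np

rng = np.random.default_rng(0)   # random weights, for illustration only

sizes = [2, 4, 3, 1]             # the 2 -> 4 -> 3 -> 1 architecture
weights = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases  = [rng.standard_normal(m) for m in sizes[1:]]

def forward(x):
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.maximum(0.0, W @ a + b)    # hidden layers: weights, bias, ReLU
    return weights[-1] @ a + biases[-1]   # output layer: linear, no activation

print(forward(np.array([0.8, -0.3])))     # a single scalar prediction
```

Each loop iteration is one layer of the diagram: a matrix multiply, a bias, and an activation.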
Decision Boundary
The decision boundary is the line where the network changes its mind. On one side, it says "Yes"; on the other, "No". This is the best place to see why non-linearity matters—try switching to Linear and see how the boundary gets stuck as a straight line.
With a linear activation, the model can only draw straight boundaries, no matter how many layers you stack.
Nonlinear activations bend the model away from a straight line, which is why the network can handle richer patterns.
Depth alone is not enough. You need depth and non-linearity together.
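You can see why depth alone fails with a tiny worked example: two linear layers in a row collapse into a single linear layer, while a ReLU between them produces a genuinely different function. The matrices below are hand-picked for illustration:

```python
import numpy as np

W1 = np.array([[1.0, -1.0],
               [2.0,  0.5]])
W2 = np.array([[0.5, -2.0]])
x  = np.array([1.0, 2.0])

# Two linear layers compose into one linear layer:
deep    = W2 @ (W1 @ x)           # [-6.5]
shallow = (W2 @ W1) @ x           # [-6.5], identical: stacking gained nothing
print(deep, shallow)

# A ReLU between the layers breaks the collapse:
hidden    = np.maximum(0.0, W1 @ x)   # [-1, 3] becomes [0, 3]
nonlinear = W2 @ hidden               # [-6.0], a different function
print(nonlinear)
```

No matter how many linear layers you multiply together, the product is still one matrix, so it is still one straight-line rule. The activation is what makes depth pay off.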
Layer Computation Trace
This table shows the first hidden layer in slow motion. Each neuron multiplies the inputs by weights, adds them up, adds a bias, and then sends the result through the activation function.
| Neuron | w1·x1 | w2·x2 | +bias | = z |
|---|---|---|---|---|
| h1 | 0.30·0.80 | -0.90·-0.30 | 0.14 | 0.645 |
| h2 | -0.68·0.80 | -0.25·-0.30 | -0.01 | -0.477 |
| h3 | -0.80·0.80 | 0.71·-0.30 | -0.15 | -0.998 |
The shading shows how much each piece contributes. This is the arithmetic hidden inside the network diagram above.
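The whole trace table is one matrix-vector product. The sketch below uses the weights as displayed in the table; since those are rounded, the computed z values differ from the table's in the last decimal place:

```python
import numpy as np

# Inputs, weights, and biases as displayed in the trace table (rounded)
x = np.array([0.80, -0.30])
W = np.array([[ 0.30, -0.90],     # h1's weights
              [-0.68, -0.25],     # h2's weights
              [-0.80,  0.71]])    # h3's weights
b = np.array([0.14, -0.01, -0.15])

z = W @ x + b                     # pre-activations for h1, h2, h3
a = np.maximum(0.0, z)            # ReLU: h1 fires, h2 and h3 are suppressed
print(np.round(z, 3))             # close to the table's z column
print(np.round(a, 3))
```

This is why h1 shows green in the diagram while h2 and h3 show red: only h1's z is positive, so only h1 survives the ReLU.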