Regression & Classification
Predicting numbers and choosing categories
Take your time with this one. The interactive parts are here to help you test the idea, not rush through it.
Pause and experiment as you go.
Before We Begin
What we are learning today
A large share of introductory machine learning reduces to two questions. Regression asks 'how much?' and classification asks 'which category?'. Even when later models become more complex, they often still boil down to one of these two prediction styles.
How this lesson fits
This module is where the course shifts from explicit rules to learned patterns. Instead of telling the machine exactly what to do in every case, we give it examples, define success, and let it infer a decision rule from the data.
The big question
How can a machine study examples, extract useful patterns, and make predictions on cases it has never seen before?
Why You Should Care
Students who can clearly separate these two task types are much less likely to misuse models or misread outputs. It also creates a sturdy foundation for understanding loss functions, metrics, and later neural-network examples.
Where this is used today
- Predicting prices, temperatures, wait times, or energy usage as numerical outputs
- Classifying tumors, emails, images, or transactions into discrete categories
- Estimating probabilities for decision support before a final yes-no action is taken
Think of it like this
If you are estimating the selling price of a house, you are doing regression. If you are deciding whether an email is spam or not spam, you are doing classification. One output is continuous, the other is categorical.
Easy mistake to make
Logistic regression is confusingly named. In most practical settings it is used as a classification model because it estimates class probabilities rather than arbitrary numeric values.
By the end, you should be able to:
- Tell the difference between regression targets and classification labels without hesitation
- Interpret a fitted line, a score, and a decision boundary at a conceptual level
- Relate different output types to common evaluation metrics such as error, accuracy, and probability
Think about this first
Which task is regression and which is classification: predicting a student's exact exam score, or predicting whether they will pass the course? Explain the difference in the expected output.
Words we will keep using
Linear Regression
Linear regression asks: "What is the number?" (e.g., price, temperature). It tries to draw a straight line that passes as close as possible to all your data points.
To find the best line, the computer plays a game of "hot or cold." It nudges the line slightly, checks if the error gets smaller, and repeats. This process is called gradient descent.
Gradient Descent on MSE Loss
Try a large learning rate (α ≈ 0.04) and watch the loss. Too large and the loss oscillates; too small and convergence is slow.
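The "hot or cold" loop can be sketched in a few lines of plain Python. The data, learning rate, and step count here are illustrative choices, not the widget's exact setup:

```python
# Gradient descent fitting y = w*x + b by minimizing MSE.
# Data and hyperparameters are made up for illustration.

def mse(w, b, xs, ys):
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient_step(w, b, xs, ys, lr):
    n = len(xs)
    # Partial derivatives of MSE with respect to w and b.
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    # Nudge both parameters downhill, scaled by the learning rate.
    return w - lr * dw, b - lr * db

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]   # roughly y = 2x

w, b = 0.0, 0.0
for _ in range(500):
    w, b = gradient_step(w, b, xs, ys, lr=0.04)

print(w, b, mse(w, b, xs, ys))  # ends near the least-squares line
```

Raising `lr` past roughly 0.13 on this data makes the updates overshoot and the loss oscillate, which is the same behavior the slider demonstrates.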
Logistic Regression (Classification)
Logistic regression asks: "Yes or No?" (e.g., Spam or Not Spam). Instead of a raw number, it gives you a probability between 0% and 100%.
Logistic Regression β Decision Boundary
Drag the sliders and watch the decision boundary move. That boundary is the place where the model is exactly undecided, predicting a probability of 0.5.
- Blue points belong to one class, red points to the other.
- The background color shows what the model currently believes.
- The live score changes as soon as your boundary moves.
Notice the limitation: logistic regression can only draw a straight dividing line. If the pattern is curved, we need a more flexible model.
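A minimal sketch of what the widget computes. The weights below are made up; the point is that the probability comes from squashing a linear score through a sigmoid, so the 0.5 boundary is always a straight line:

```python
import math

# Hypothetical 2-feature logistic model; weights chosen for illustration only.
w1, w2, b = 1.5, -2.0, 0.25

def predict_proba(x1, x2):
    z = w1 * x1 + w2 * x2 + b        # a plain linear score
    return 1 / (1 + math.exp(-z))    # sigmoid squashes it into (0, 1)

# The decision boundary is wherever z = 0, a straight line in (x1, x2).
# On the line 1.5*x1 - 2*x2 + 0.25 = 0 the model outputs exactly 0.5.
print(predict_proba(2.0, 2.0))    # z = -0.75, below 0.5: class 0 side
print(predict_proba(2.0, 1.0))    # z =  1.25, above 0.5: class 1 side
print(predict_proba(1.0, 0.875))  # exactly on the boundary: 0.5
```

No setting of `w1`, `w2`, `b` can bend that line, which is why a curved class pattern needs a more flexible model.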
Model Evaluation Metrics
Accuracy is a trap. If 99% of emails are safe, a model that says "Safe" every time is 99% accurate but 100% useless at catching spam. We need better scoreboards.
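The trap is easy to reproduce on made-up data: 99 safe emails, 1 spam email, and a "model" that always answers Safe.

```python
# 99 safe emails (label 0) and 1 spam email (label 1) -- invented data.
labels = [0] * 99 + [1]
preds  = [0] * 100          # a model that predicts "Safe" every time

accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
caught_spam = sum(p == 1 and y == 1 for p, y in zip(preds, labels))

print(accuracy)      # 0.99 -- looks great on the scoreboard
print(caught_spam)   # 0 -- catches no spam at all
```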
The four cells
TP (True Positive) – correctly predicted positive
FP (False Positive) – predicted positive, actually negative (Type I error)
FN (False Negative) – predicted negative, actually positive (Type II error)
TN (True Negative) – correctly predicted negative
Accuracy = (TP+TN) / N. Fine when the classes are balanced, but risky when one class is rare.
Precision = TP / (TP+FP). When you say "positive," how often are you right?
Recall (TPR) = TP / (TP+FN). Of the real positives, how many did you actually catch?
F1 combines precision and recall into one score when both matter.
ROC-AUC measures ranking quality across many thresholds, not just one fixed cutoff.
Drag the threshold and watch the orange dot move along the curve.
Confusion Matrix
| | Pred + | Pred − |
|---|---|---|
| Actual + | TP = 17 | FN = 23 |
| Actual − | FP = 20 | TN = 20 |
Live metrics at t = 0.50

| Accuracy | Precision | Recall | F1 Score |
|---|---|---|---|
| 46% | 46% | 43% | 44% |
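You can recompute those live metrics directly from the four cells of the confusion matrix:

```python
# Cell counts read off the confusion matrix at threshold t = 0.50.
TP, FN = 17, 23
FP, TN = 20, 20
N = TP + FP + FN + TN   # 80 examples in total

accuracy  = (TP + TN) / N                            # 37/80 ~ 46%
precision = TP / (TP + FP)                           # 17/37 ~ 46%
recall    = TP / (TP + FN)                           # 17/40 ~ 43%
f1        = 2 * precision * recall / (precision + recall)  # ~ 44%

print(accuracy, precision, recall, f1)
```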
When classes are imbalanced
If one class is rare, accuracy can hide failure. In those cases, precision, recall, F1, and PR-AUC usually tell a more honest story.
Threshold trade-off
If you lower the threshold, the model says "positive" more often. That usually helps recall but hurts precision. You are trading one kind of mistake against another.
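A quick sweep over made-up scores makes the trade-off concrete. As the threshold drops, recall climbs while precision falls:

```python
# Invented model scores and true labels, just to show the trade-off.
scores = [0.95, 0.85, 0.75, 0.65, 0.55, 0.45, 0.35, 0.25, 0.15, 0.05]
labels = [1,    1,    0,    1,    0,    1,    0,    0,    0,    0]

def precision_recall(threshold):
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn)
    return precision, recall

for t in (0.8, 0.5, 0.2):
    print(t, precision_recall(t))
# At 0.8: precision 1.0, recall 0.5 (strict, misses half the positives)
# At 0.2: precision 0.5, recall 1.0 (lenient, half the alarms are false)
```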
Beyond binary classification
Different tasks need different scoreboards. There is no single metric that is best for every problem.
Regression Evaluation Metrics
When the output is a number, the question becomes: how far off were we? That is why regression uses error-based metrics instead of a confusion matrix.
Mean Absolute Error – robust to outliers, interpretable in original units
Mean Squared Error – penalizes large errors heavily; used as training loss
Root MSE – same units as target, more interpretable than MSE
R² (coefficient of determination) – proportion of variance explained. 1.0 = perfect, 0 = no better than predicting the mean
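All four metrics can be computed by hand on a toy example (the predictions and targets below are invented):

```python
import math

# Toy targets and predictions, each prediction off by exactly 0.5.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 6.5, 9.5]

n = len(y_true)
errors = [p - t for p, t in zip(y_pred, y_true)]

mae  = sum(abs(e) for e in errors) / n        # same units as the target
mse  = sum(e * e for e in errors) / n         # squaring punishes big misses
rmse = math.sqrt(mse)                         # back in the target's units

mean = sum(y_true) / n
ss_res = sum(e * e for e in errors)
ss_tot = sum((t - mean) ** 2 for t in y_true)
r2 = 1 - ss_res / ss_tot                      # 1.0 = perfect, 0 = predict-the-mean

print(mae, mse, rmse, r2)   # 0.5, 0.25, 0.5, 0.95
```

Note that with a constant error of 0.5, MAE and RMSE agree; a single large outlier would inflate RMSE much more than MAE.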