Machine Learning · Intermediate

✂️ Support Vector Machines

Finding the safest separating line

Take your time with this one. The interactive parts are here to help you test the idea, not rush through it.

25 min · Explore at your own pace

Before We Begin

What we are learning today

A support vector machine does not settle for just any separator. It looks for the boundary that leaves the widest possible buffer between classes, because that extra margin often leads to more stable performance on unseen data.

How this lesson fits

This module is where the course shifts from explicit rules to learned patterns. Instead of telling the machine exactly what to do in every case, we give it examples, define success, and let it infer a decision rule from the data.

The big question

How can a machine study examples, extract useful patterns, and make predictions on cases it has never seen before?

  • Distinguish supervised, unsupervised, and reward-driven learning setups
  • Interpret the output of common models in plain English instead of opaque jargon
  • Compare the tradeoffs between accuracy, interpretability, flexibility, and speed

Why You Should Care

This lesson builds geometric intuition about classification. You will see that a model can be right for the wrong reasons, and that separating the classes confidently often matters more than barely separating the training points.

Where this is used today

  • Handwriting and optical-character-recognition systems from the pre-deep-learning era
  • Image and text classification tasks with smaller, cleaner feature sets
  • Bioinformatics problems such as classifying proteins or gene-expression patterns

Think of it like this

Imagine two teams standing on opposite sides of a gym floor. You are not just trying to place tape between them; you want the tape positioned so the empty space on both sides is as generous as possible, reducing the chance of accidental overlap.

Easy mistake to make

SVMs are not universal winners. They can be elegant and strong on the right dataset, but they are not always the most scalable or flexible choice.

By the end, you should be able to say:

  • Define support vectors, margin, and separating hyperplane in simple language
  • Explain why a wider margin often improves generalization to new data
  • Compare the SVM mindset with logistic regression and other linear classifiers

Think about this first

If two groups are standing apart in a room, where would you place a divider so small mistakes or noise are least likely to cause confusion? Why not place it just barely between them?

Words we will keep using

margin · support vector · boundary · kernel · generalization

Support Vector Machines

An SVM is a perfectionist. It doesn't just want to separate the red dots from the blue dots; it wants to build the widest possible street between them. The wider the street (margin), the safer the model is from making mistakes.

$$\text{Maximize} \; \frac{2}{\|w\|} \quad \text{subject to} \quad y_i(w \cdot x_i + b) \geq 1$$
Margin: The safety zone. The empty space separating the two teams.
Support vectors: The VIPs. The specific points that touch the edge of the street. They decide everything.
Kernel trick: A math hack that lets the model draw curvy boundaries by pretending to be in higher dimensions.
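To make these ideas concrete, here is a minimal sketch using scikit-learn's `SVC` on a tiny made-up dataset (the points and the large `C` value are illustrative assumptions, not from the lesson). A linear kernel with a very large `C` approximates the hard-margin problem in the formula above, and the fitted model exposes exactly the objects we just named: the normal vector `w`, the offset `b`, and the support vectors.

```python
import numpy as np
from sklearn.svm import SVC

# Two small, linearly separable "teams" of points (toy data for illustration)
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],   # class -1
              [5.0, 5.0], [5.5, 4.0], [6.0, 5.5]])  # class +1
y = np.array([-1, -1, -1, 1, 1, 1])

# A linear kernel with a huge C approximates the hard-margin SVM
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

w = clf.coef_[0]                 # normal vector of the separating hyperplane
b = clf.intercept_[0]            # offset
margin = 2 / np.linalg.norm(w)   # width of the "street", as in the formula

print("support vectors:\n", clf.support_vectors_)  # only these points matter
print("margin width:", margin)
```

Try deleting a point far from the boundary and refitting: the solution does not move, because only the support vectors constrain it.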

Drag Points → Watch Boundary Adapt

🔴 Class -1   🔵 Class +1   🟡 Support vectors   Dashed lines = margin boundaries


Tip: Drag a red point close to the blue side and watch what changes. The SVM mostly cares about the points near the boundary, not the ones far away.
Example fit from the demo: w₁ = 0.57, w₂ = 0.32, b = −4.31, margin = 3.050
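You can sanity-check the demo's readout by hand: the margin is 2/‖w‖, so the displayed weights should reproduce the displayed margin (up to the rounding of the shown values).

```python
import math

# Weights and bias as displayed in the demo (rounded values)
w1, w2, b = 0.57, 0.32, -4.31

# Margin = 2 / ||w||, straight from the optimization objective
margin = 2 / math.hypot(w1, w2)
print(round(margin, 2))  # ≈ 3.06, matching the displayed 3.050 up to rounding
```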

SVM vs Logistic Regression

Logistic regression focuses on probabilities and works well when you want a simple, interpretable classifier.
SVM focuses on geometry and margin, and with the kernel trick it can handle problems where no straight line separates the classes.
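The contrast is easiest to see side by side. This sketch fits both models on the same synthetic blobs (the data and seed are invented for illustration): logistic regression reports a probability for each point, while the SVM reports a signed distance from the boundary.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Two well-separated Gaussian blobs (synthetic, for illustration only)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

logreg = LogisticRegression().fit(X, y)
svm = SVC(kernel="linear").fit(X, y)

# Logistic regression speaks in probabilities...
print("P(class 1):", logreg.predict_proba(X[:1])[0, 1])
# ...while the SVM speaks in margin distances (sign = predicted side)
print("decision value:", svm.decision_function(X[:1])[0])
```

Both draw a linear boundary here; they differ in what they optimize (log-likelihood vs. margin width) and in what their outputs mean.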