🌳 Decision Trees & Random Forests
Learning by asking better questions
Take your time with this one. The interactive parts are here to help you test the idea, not rush through it.
Pause and experiment as you go.
Before We Begin
What we are learning today
Decision trees work by repeatedly asking the question that best separates the data at the current step. Each split reduces uncertainty, and the final leaf gives a prediction. Random forests improve on this by averaging across many slightly different trees so one unlucky split does not dominate the result.
How this lesson fits
This module is where the course shifts from explicit rules to learned patterns. Instead of telling the machine exactly what to do in every case, we give it examples, define success, and let it infer a decision rule from the data.
The big question
How can a machine study examples, extract useful patterns, and make predictions on cases it has never seen before?
Why You Should Care
Trees are one of the best bridges between human reasoning and machine learning. Students can see the decision process, critique it, and understand why ensembling often beats a single overconfident model.
Where this is used today
- ✓ Loan and risk systems that split applicants by measurable financial factors
- ✓ Clinical triage workflows that route patients based on symptoms and severity
- ✓ Business analytics models where stakeholders need a relatively interpretable decision path
Think of it like this
Think of a triage nurse narrowing possibilities: Do you have a fever? Has it lasted more than two days? Are you having trouble breathing? Each answer rules some outcomes in and others out.
Easy mistake to make
A deeper tree is not automatically a better tree. If it keeps splitting until every edge case is memorized, it may fit the training data beautifully and still fail on new examples.
By the end, you should be able to:
- Explain how a tree chooses a split using impurity reduction or information gain
- Interpret branches, leaves, and terminal predictions with confidence
- Explain why averaging many trees can reduce variance and overfitting
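The first objective is easier to internalize with a few lines of Python. Here is a minimal sketch with invented loan outcomes (not data from this lesson): it computes the Gini impurity of a parent node, the weighted impurity of the two children after a hypothetical income split, and the difference between them, which is the impurity reduction the tree tries to maximize when choosing a question.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: the chance that two random draws from this node disagree."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

# Hypothetical loan outcomes before and after splitting on income
parent = ["repaid"] * 6 + ["default"] * 4
left   = ["repaid"] * 5 + ["default"] * 1   # e.g. income above some threshold
right  = ["repaid"] * 1 + ["default"] * 3   # e.g. income below it

n = len(parent)
weighted_child = len(left) / n * gini(left) + len(right) / n * gini(right)
gain = gini(parent) - weighted_child

print(f"parent impurity:      {gini(parent):.3f}")   # 0.480
print(f"impurity after split: {weighted_child:.3f}") # 0.317
print(f"impurity reduction:   {gain:.3f}")           # 0.163
```

A tree builder simply tries every candidate question, computes this reduction for each, and keeps the question with the largest one.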
Think about this first
If you were deciding whether to approve a loan, what first question would you ask, and why would that question separate applicants better than others?
How Decision Trees Work
A decision tree is just a game of "20 Questions." The computer learns which questions to ask to split the data into clean groups. It is one of the few AI models you can print out and read like a manual.
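To see what "read like a manual" means, here is a tree written out as plain Python. The features and thresholds are invented for illustration; a learned tree has exactly this shape, except the questions and cutoffs are chosen from data rather than by hand.

```python
def loan_decision(age, income, credit_score):
    """A tiny hand-written decision tree. Every prediction is a
    top-to-bottom walk through readable yes/no questions."""
    if income < 30_000:                 # first question: income
        return "deny"
    if credit_score >= 700:             # second question: credit history
        return "approve"
    if age < 30:                        # for younger applicants, income decides
        return "approve" if income >= 60_000 else "review"
    return "review"

print(loan_decision(age=25, income=65_000, credit_score=640))  # approve
print(loan_decision(age=45, income=40_000, credit_score=710))  # approve
print(loan_decision(age=35, income=20_000, credit_score=780))  # deny
```

Notice that you can audit every path: unlike most models, there is no step you cannot point at and question.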
Loan Approval Tree — Walk-through
Move the sliders and follow the highlighted path. You can literally watch the model reason its way to a decision.
Random Forests
A single tree can be shaky—change one data point, and the whole structure might flip. A Random Forest solves this by training hundreds of different trees and letting them vote.
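Here is a minimal sketch of that idea in pure Python, with two deliberate simplifications: the "trees" are depth-one stumps, and the data has a single feature. Each stump trains on its own bootstrap resample, and the forest predicts by majority vote, so one mislabeled point cannot flip the overall answer.

```python
import random
from collections import Counter

def train_stump(sample):
    """A one-split 'tree': pick the threshold with the fewest training errors."""
    best_thr, best_err = None, float("inf")
    for thr, _ in sample:
        err = sum((x >= thr) != y for x, y in sample)
        if err < best_err:
            best_thr, best_err = thr, err
    return best_thr

def forest_predict(thresholds, x):
    """Majority vote across all stumps."""
    return Counter(x >= t for t in thresholds).most_common(1)[0][0]

rng = random.Random(0)
# Toy data: the true rule is "x > 5", with one mislabeled point
data = [(x, x > 5) for x in range(11)]
data[3] = (3, True)  # noise that a single deep tree would memorize

# Each stump sees its own bootstrap resample of the data
stumps = [train_stump([rng.choice(data) for _ in data]) for _ in range(101)]

print(forest_predict(stumps, 8))  # True: the vote recovers the true rule
print(forest_predict(stumps, 2))  # False: the noisy point is outvoted
```

Any individual stump might land on a bad threshold because of the noisy point or an unlucky resample, but a majority of 101 of them almost never does, which is exactly the variance reduction the lesson describes.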