Advanced Topics

šŸ”’ Federated Learning

Training together without sharing raw data

Take your time with this one. The interactive parts are here to help you test the idea, not rush through it.

25 min · Explore at your own pace

Before We Begin

What we are learning today

Federated learning changes where training happens. Instead of sending all raw data to one central server, each device or organization trains locally and shares model updates that can be aggregated into a stronger global model.

How this lesson fits

This module looks beyond the standard supervised-learning workflow. Students explore systems that learn from delayed rewards and systems that train collaboratively while keeping raw data distributed, which introduces the real-world constraints of strategy, privacy, and deployment.

The big question

How can AI systems keep improving in realistic environments where feedback is delayed, data is sensitive, and decisions have long-term consequences?

  • Interpret reward-driven learning in terms of long-term payoff rather than immediate correctness
  • Explain the exploration-versus-exploitation tradeoff with concrete examples
  • Describe privacy-aware distributed training across many devices or organizations

Why You Should Care

Modern AI does not live only in ideal laboratory settings. It has to respect privacy, bandwidth, regulation, and uneven device quality. Federated learning is a practical response to those constraints, and it shows students how engineering tradeoffs shape model design.

Where this is used today

  • On-device predictive text systems that learn from typing behavior without uploading every message
  • Cross-hospital or cross-clinic modeling where patient data cannot be freely centralized
  • Distributed consumer devices that adapt locally while contributing to a shared global model

Think of it like this

It is like a study group where every student works problems at home, then the group compares lessons learned without photocopying everyone's notebook. Useful knowledge is shared, but the private raw material stays local.

Easy mistake to make

Federated learning improves privacy but does not solve every security or fairness problem by itself. Updates can still leak information, and participating clients may contribute unevenly.

By the end, you should be able to say:

  • Explain the core workflow of local training followed by global aggregation
  • Describe why privacy, regulation, and data ownership motivate federated approaches
  • Identify practical challenges such as non-identically distributed data, dropped clients, and device limitations

Think about this first

Why might a hospital, school, or phone user refuse to upload raw data to a central server even if doing so would make training simpler? What risks are they trying to avoid?

Words we will keep using

federated · local update · aggregation · privacy · client

Federated Learning

Imagine a hospital wants to train an AI to spot diseases, but it can't share patient records because of privacy laws. Federated Learning is the solution: bring the model to the data, not the data to the model.

Privacy First: Your data never leaves your device. Only the math (model updates) gets shared.
Teamwork: Thousands of devices work together to build one smart brain.
Messy Data: Everyone's data looks different, which makes training tricky but robust.

FedAvg Algorithm

FedAvg works like a potluck dinner. Everyone cooks a dish at home (trains on local data), brings it to the party, and mixes it all together into one giant feast (the global model).

  1. The server sends the current shared model to selected clients
  2. Each client trains on its own private data for a short time
  3. The clients send back updated model weights
  4. The server averages those updates into a new global model
w_{t+1} = \sum_{k=1}^{K} \frac{n_k}{n} \, w_{t+1}^{k}

where nₖ is the number of samples at client k, n = Σ nₖ, and w_{t+1}^k is client k's locally updated weights after round t.
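The four steps above boil down to a sample-weighted average of the returned client weights. Here is a minimal NumPy sketch of that aggregation step; the client weight vectors and sample counts below are made-up values for illustration:

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """FedAvg aggregation: average client weight vectors, weighted by n_k / n.

    client_weights: list of 1-D numpy arrays, one updated model per client
    client_sizes:   local sample counts n_k, one per client
    """
    n = sum(client_sizes)
    return sum((n_k / n) * w for w, n_k in zip(client_weights, client_sizes))

# Toy round: three clients send back updated weights after local training.
updates = [np.array([0.8, -0.5]), np.array([0.6, -0.4]), np.array([1.0, -0.6])]
sizes = [100, 300, 100]  # the middle client has more data, so it counts more

global_w = fedavg(updates, sizes)
print(global_w)  # → [ 0.72 -0.46], pulled toward the data-rich client
```

Because the weights are n_k / n, a client with three times the data moves the global model three times as much, which is exactly the potluck intuition: bigger dishes dominate the feast.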

Interactive Federated Training

[Interactive demo: select participating clients and step through training rounds. The global model starts at Round 0 with loss 1.0000 and weights w₁ = 0.800, wā‚‚ = āˆ’0.500, wā‚ƒ = 0.300, wā‚„ = 0.100, b = 0.600; a training loss curve tracks progress across rounds.]

Privacy Enhancements

Differential Privacy: Adding a little bit of noise so that even the math updates can't be traced back to a specific person.
Secure Aggregation: Mixing the updates in a lockbox so the server sees the total, but not who contributed what.
Homomorphic Encryption: Doing math on encrypted data without unlocking it first. Yes, that's possible.
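One common way to act on the differential-privacy idea is to clip each client's update to a fixed norm and then add Gaussian noise before it leaves the device. The sketch below shows that mechanic only; `clip_norm` and `noise_std` are illustrative numbers, not calibrated privacy parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def privatize(update, clip_norm=1.0, noise_std=0.1, rng=rng):
    """Clip an update's L2 norm, then add Gaussian noise (DP-SGD style)."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / norm)  # bound any one client's influence
    return clipped + rng.normal(0.0, noise_std, size=update.shape)

raw = np.array([3.0, -4.0])        # norm 5.0, well above the clip bound
noisy = privatize(raw)
print(noisy)                       # roughly [0.6, -0.8] plus small noise
```

Clipping bounds how much any single person can shift the model, and the noise blurs what remains, so the server learns the trend without learning the individual.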
Real-world deployments: keyboard prediction, voice assistants, and multi-hospital medical models are good examples because they all involve useful learning plus sensitive personal data.
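The secure-aggregation "lockbox" can be demonstrated with pairwise masking, a simplified toy version of the real protocol: each pair of clients agrees on a random mask, one adds it and the other subtracts it, so every individual update looks like noise but the masks cancel in the server's sum:

```python
import numpy as np

rng = np.random.default_rng(42)
updates = {c: rng.normal(size=3) for c in ["A", "B", "C"]}  # private client updates
clients = sorted(updates)

# Each unordered pair shares a random mask; the smaller-named client adds it,
# the larger-named client subtracts it.
masks = {(i, j): rng.normal(size=3) for i in clients for j in clients if i < j}

masked = {}
for c in clients:
    m = updates[c].copy()
    for (i, j), mask in masks.items():
        if c == i:
            m += mask
        elif c == j:
            m -= mask
    masked[c] = m  # what the server actually receives: update + noise

# The server only ever sees masked values, yet the pairwise masks cancel in the sum.
total = sum(masked.values())
true_total = sum(updates.values())
print(np.allclose(total, true_total))  # → True
```

Real deployments add key agreement and dropout recovery on top, but the core trick is exactly this cancellation: the server can compute the total without seeing who contributed what.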