Teach a Tiny AI — Lab
An interactive companion to Teach a Tiny AI: a real linear classifier learning in your browser. Drop coloured dots on the canvas, hit Train, and watch the decision boundary rotate as the weights update. It is the same gradient-descent loop that runs inside every modern neural network, stripped to its smallest possible form.
Before you start
Read this short panel first. It tells you what the lab is, what it is trying to make you see, and how you will know if you got there.
🎯 Purpose
This lab is the smallest possible piece of a real neural network: a linear classifier with two weights and one bias. You drop coloured dots on a 2D canvas, click Train, and watch the dashed decision boundary update on every step. The maths is identical to what runs inside every modern deep network; only the number of weights is different. You can change the learning rate, clear the data, and load preset scenarios (clean gap, overlapping clouds, imbalanced data, and a non-linear XOR case the model cannot solve).
💡 What it is trying to make you see
That modern AI is not coded with rules: it is nudged toward correctness by gradient descent, one tiny step at a time, until the random initial weights settle into a useful pattern. The companion article describes this in words; the lab lets you watch the same loop happen in front of you with three parameters instead of a trillion. The behaviour you see (smooth convergence when the data is clean, oscillation when the learning rate is too high, total failure on XOR) is the same behaviour you would see inside a trillion-parameter model, just at a scale you can perceive.
✅ What you should understand after playing
After a minute of clicking, you should be able to:
- Describe in one sentence what each training step does to the weights: nudge them in the direction that reduces error on the current batch.
- Predict what will happen when you set the learning rate very high (the boundary oscillates and the loss can explode) or very low (training slows to a crawl); see the sketch at the end of this panel.
- See first-hand why a linear model cannot solve XOR, and intuit why that single limitation eventually forced the invention of multi-layer neural networks.
If those three are true for you when you leave, the lab did its job. If not, re-read the worked example below and try one more scenario.
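To make the learning-rate intuition concrete before you touch the slider, here is a minimal sketch of gradient descent on a one-variable function. The function f(w) = w² and the specific rates are illustrative choices, not the lab's internals:

```python
# Gradient descent on f(w) = w^2, whose gradient is 2w.
# The minimum is at w = 0; watch how the learning rate changes the path.
def descend(lr, steps=6, w=1.0):
    path = [round(w, 3)]
    for _ in range(steps):
        w -= lr * 2 * w   # one gradient step: w <- w - lr * df/dw
        path.append(round(w, 3))
    return path

print(descend(0.1))   # small steps, slow but steady: 1.0, 0.8, 0.64, ...
print(descend(0.9))   # overshoots and oscillates across the minimum: 1.0, -0.8, 0.64, ...
print(descend(1.1))   # each step lands further away: the loss explodes
```

The lab's slider does exactly this to the boundary: the middle rate bounces the dashed line back and forth, and the highest rates fling it off the canvas.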
How to use it — 30 seconds
- Click on the canvas to drop dots. Toggle the active class (Red / Blue) on the right.
- Click Train (1 step or 100 steps). Watch the dashed boundary rotate. The accuracy bar fills as the model learns.
- Try a preset scenario on the right to see how data shape changes what happens. Move the learning-rate slider to break the model deliberately.
A worked example — try the XOR preset
Click the "Non-linear (XOR)" preset. The dots form four clusters in the corners — two red on one diagonal, two blue on the other.
Click Train 100. Then click it again. And again. Watch the accuracy hover around 50% and the boundary spin uselessly.
The model cannot solve it — no straight line separates the four corners correctly. This is a hard limit of linear classifiers, and it is the single observation that forced researchers to invent multi-layer neural networks in the first place.
Now switch to "Easy: clear gap" and Train 100. The boundary snaps into place in seconds and accuracy reaches 100%. Same algorithm. Same loop. Different problem shape — different result.
The classifier
Click anywhere on the canvas to drop a dot of the active colour. Place a few of each. Then hit Train. The dashed line is the model's current decision boundary — everything on one side is classified red, everything on the other is classified blue.
What you're actually looking at
The model has three numbers: two weights (w₁, w₂) and one bias (b). For every point (x, y), it computes w₁·x + w₂·y + b, passes that through a sigmoid to get a probability between 0 and 1, and compares it to the true label (0 for red, 1 for blue).
Each training step nudges the three numbers in the direction that would have reduced the error for every point at once. This is gradient descent. The dashed line is where the probability equals exactly 0.5 — the boundary the model uses to decide red vs blue.
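Here is that whole loop as a minimal Python sketch. The function names and the toy dataset are mine for illustration; the lab's own source may differ, but the arithmetic is the same three-parameter update:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(points, labels, w1, w2, b, lr=0.5):
    """One full-batch gradient-descent step for the three-parameter classifier.

    points: list of (x, y) pairs; labels: 0 for red, 1 for blue.
    With a sigmoid output and cross-entropy loss, the gradient contributed
    by each point is simply (prediction - label) scaled by that point's inputs.
    """
    g1 = g2 = gb = 0.0
    for (x, y), label in zip(points, labels):
        p = sigmoid(w1 * x + w2 * y + b)   # predicted probability of "blue"
        err = p - label                    # how wrong, and in which direction
        g1 += err * x
        g2 += err * y
        gb += err
    n = len(points)
    # Nudge all three numbers a small step against the averaged gradient.
    return w1 - lr * g1 / n, w2 - lr * g2 / n, b - lr * gb / n

# Two easily separable points: train for 100 steps from small initial weights.
points, labels = [(0.2, 0.3), (0.8, 0.9)], [0, 1]
w1, w2, b = 0.1, -0.1, 0.0
for _ in range(100):
    w1, w2, b = train_step(points, labels, w1, w2, b)
```

Note that `err * x` is the exact gradient of the cross-entropy loss through a sigmoid; no calculus happens at runtime, which is part of why each training step is so cheap.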
Try the XOR preset and train it. The model cannot solve it, no matter how long you train. The only boundary it can draw is a straight line, and no straight line separates the four corners correctly. That single observation is why we eventually had to invent multi-layer neural networks. A two-layer network with three hidden neurons solves XOR easily, but the algorithm doing the learning is still gradient descent, just on more parameters; the sketch below shows exactly that.
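To back up that claim, here is a minimal NumPy sketch of a 2-3-1 sigmoid network trained on the four XOR points. The architecture matches the text; the seed, learning rate, and step count are illustrative choices:

```python
import numpy as np

# Four XOR points: red (0) on one diagonal, blue (1) on the other.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(2, 3)), np.zeros(3)   # input -> 3 hidden neurons
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)   # hidden -> 1 output
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for _ in range(5000):
    h = sigmoid(X @ W1 + b1)                 # hidden activations: the "bend" a line lacks
    p = sigmoid(h @ W2 + b2)                 # predicted probability of blue
    d_out = (p - y) / len(X)                 # output gradient (sigmoid + cross-entropy)
    d_hid = (d_out @ W2.T) * h * (1.0 - h)   # chain rule back through the hidden layer
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_hid;  b1 -= lr * d_hid.sum(axis=0)

print(p.round(2).ravel())  # typically close to [0, 1, 1, 0]; a stubborn seed may need a re-run
```

The only new ingredient is the chain rule pushing the output error back through the hidden layer (backpropagation); the update itself is the same subtract-the-gradient step as in the three-parameter version.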
Lab #2. Companion to What is artificial intelligence? A complete intuitive guide. Feedback welcome.