The training loop: forward, loss, backward, step

Every training script — from a 10-line demo to a frontier model — runs the same four-beat loop, over and over:

  1. Forward. Run the model on the input: pred = w * x.
  2. Loss. Measure how wrong the prediction is: (pred - y)².
  3. Backward. Compute the gradient: how the loss changes as each parameter changes, which tells you which way to nudge it. In PyTorch this is loss.backward().
  4. Step. Move each parameter against its gradient: w = w - lr * grad. In PyTorch, optimizer.step().

Then repeat. Run the editor: the loss falls from 36 toward 0 as the loop repeats — that downward march is training.
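
A minimal by-hand sketch of such a loop follows. The values x = 2, y = 6, a starting w = 0, and lr = 0.1 are assumptions chosen so that the first loss is 36; the lesson's editor may use different numbers. The gradient comes from differentiating (w * x - y)² with respect to w, which gives 2 * (pred - y) * x.

```python
# By-hand training loop for a one-parameter model: pred = w * x.
# x, y, the starting w, and lr are illustrative assumptions.
x, y = 2.0, 6.0   # one training example; the ideal weight is y / x = 3
w = 0.0           # initial parameter
lr = 0.1          # learning rate

losses = []
for step in range(5):
    pred = w * x                 # 1. forward
    loss = (pred - y) ** 2       # 2. loss
    grad = 2 * (pred - y) * x    # 3. backward: d(loss)/dw, derived by hand
    w = w - lr * grad            # 4. step
    losses.append(loss)
    print(f"step {step}: loss={loss:.4f}  w={w:.4f}")
```

With these assumed numbers the first loss printed is 36, each later loss is smaller than the one before, and w moves toward 3.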

The order matters, and so does repeating

The four steps must run in that order, every iteration: you can't update before you've computed the gradient, and you must recompute pred each loop (a stale prediction is the classic "loss won't move" bug). In real PyTorch you'll also see optimizer.zero_grad() at the top of the loop. It clears the gradients left over from the previous iteration, because loss.backward() adds to the stored gradients rather than overwriting them. We skip it here because our by-hand grad is freshly computed each time, but recognize it when you read real code.
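
To see why clearing gradients matters, the sketch below imitates accumulation by hand. It is plain Python, not PyTorch: grad_buffer is a hypothetical stand-in for a stored gradient that each backward pass adds to, and the zero_grad flag plays the role of optimizer.zero_grad(). The values x = 2, y = 6, and lr = 0.1 are assumed for illustration.

```python
def train(zero_grad, steps=5, lr=0.1):
    """Run the four-beat loop; grad_buffer mimics a gradient that accumulates."""
    x, y = 2.0, 6.0        # assumed training example
    w = 0.0
    grad_buffer = 0.0      # hypothetical stand-in for a stored gradient
    losses = []
    for _ in range(steps):
        if zero_grad:
            grad_buffer = 0.0                # clear last iteration's gradient
        pred = w * x                         # forward
        loss = (pred - y) ** 2               # loss
        grad_buffer += 2 * (pred - y) * x    # backward: add to the buffer
        w = w - lr * grad_buffer             # step
        losses.append(loss)
    return losses

with_reset = train(zero_grad=True)
without_reset = train(zero_grad=False)
print("with reset:   ", [round(l, 4) for l in with_reset])
print("without reset:", [round(l, 4) for l in without_reset])
```

With the reset, the loss falls on every step. Without it, old gradients keep pushing w past the target, and the loss rises again on the third step.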

Why a builder reads loops, not writes them

You won't hand-roll training loops at work, but you'll read them constantly when an AI writes one, and the bugs are almost always loop bugs: wrong order, a missing zero_grad, a prediction computed once outside the loop, or a learning rate that's too big (the loss climbs or oscillates instead of falling). Knowing the four beats and that they repeat is enough to spot every one of them.
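
Two of these bugs are easy to reproduce in a few lines. The sketch below is illustrative only: the run helper, its stale_pred flag, and the values x = 2, y = 6, lr = 0.1, and lr = 0.3 are assumptions, not code from the lesson.

```python
def run(lr=0.1, stale_pred=False, steps=5):
    """Train pred = w * x by hand, optionally with the stale-prediction bug."""
    x, y = 2.0, 6.0            # assumed training example
    w = 0.0
    pred = w * x               # computed once, before the loop
    losses = []
    for _ in range(steps):
        if not stale_pred:
            pred = w * x       # correct: recompute the forward pass each loop
        loss = (pred - y) ** 2
        grad = 2 * (pred - y) * x
        w = w - lr * grad
        losses.append(loss)
    return losses

healthy = run()                  # loss shrinks toward 0
stale = run(stale_pred=True)     # loss is stuck at its first value
too_big = run(lr=0.3)            # loss grows on every step
for name, losses in [("healthy", healthy), ("stale", stale), ("too_big", too_big)]:
    print(name, [round(l, 4) for l in losses])
```

In the stale run, w keeps changing but the loss never reflects it, because the forward pass is never rerun. In the too_big run, every update overshoots the target by more than the previous error, so the loss grows.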