
Autograd, the mental model (no calculus required)

loss.backward() is the line every PyTorch training script runs, and it feels like magic. It isn't. Autograd automates one loop you can fully understand from a one-parameter example. Five words: forward, loss, gradient, backward, update.

Take the simplest model: pred = w * x. You have a known answer y, and you want to tune w so pred gets close to y.

  1. Forward: run the model: pred = w * x.
  2. Loss: measure how wrong it is. Squared error: (pred - y)². Bigger when you're more wrong, never negative.
  3. Gradient: the slope of the loss with respect to w: "if I nudge w up a hair, does the loss go up or down, and how fast?" For this model it works out to 2 * (pred - y) * x. A negative gradient means increasing w lowers the loss; a positive one means increasing w raises it.
  4. Backward: in real code, loss.backward() computes that gradient automatically for every parameter that requires one and stores each result on the parameter itself (w.grad for our w). That automation is autograd. You never differentiate by hand.
  5. Update: step the parameter against the gradient: w = w - learning_rate * grad. Going opposite the slope walks the loss downhill.
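
The steps above can be sketched in plain Python, with no PyTorch involved. This is a minimal illustration, not a reference implementation: the values of x, y, the starting w, and learning_rate are assumptions chosen to keep the arithmetic easy to follow, and loss_fn and numeric_grad are names made up for this sketch. The "nudge" check at the end estimates the slope numerically to show that the formula from step 3 really is the slope of the loss.

```python
# One training step for pred = w * x, done entirely by hand.
# All numbers here are illustrative assumptions, not values from a real dataset.
x, y = 2.0, 10.0       # one input and its known answer
w = 1.0                # starting guess for the parameter
learning_rate = 0.1


def loss_fn(w):
    """Forward pass followed by squared error, for a given w."""
    pred = w * x
    return (pred - y) ** 2


# 1. Forward
pred = w * x                        # 2.0
# 2. Loss
loss_before = loss_fn(w)            # (2 - 10)^2 = 64.0
# 3. Gradient, using the formula from the text
grad = 2 * (pred - y) * x           # 2 * (-8) * 2 = -32.0
# 4. Backward: in PyTorch, loss.backward() would compute grad for us.
# 5. Update: step against the gradient
w_new = w - learning_rate * grad    # about 4.2
loss_after = loss_fn(w_new)         # about 2.56

# Sanity check: nudge w a hair in each direction and measure the slope.
eps = 1e-6
numeric_grad = (loss_fn(w + eps) - loss_fn(w - eps)) / (2 * eps)

print("gradient (formula):", grad)
print("gradient (nudging):", numeric_grad)
print("w:", w, "->", w_new)
print("loss:", loss_before, "->", loss_after)
```

The gradient is negative, so the update raises w, and the loss falls from 64 to roughly 2.56 in a single step.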

Why a builder needs only this much

You will not implement autograd. You need to read training loops, know what each line is for, and debug the classic mistakes: a loss that never drops (often a sign-flipped update), or a learning rate so large that every step overshoots and the loss explodes. Run the editor: one forward/loss/gradient/update step, all by hand, so loss.backward() stops being a black box. In real code the same idea is loss.backward(); optimizer.step(). Autograd fills in the gradients, and the optimizer does the subtraction.
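
Both classic mistakes named above can be reproduced with the same one-parameter model. The sketch below is a hypothetical helper, not part of any library: train, its sign argument, and the specific data and learning rates are assumptions picked for illustration. For this particular x, any learning rate above 0.25 overshoots, so 0.3 is enough to make the loss grow.

```python
def train(learning_rate, steps=5, sign=-1.0):
    """Run a few hand-written steps on pred = w * x and return the loss history.

    sign=-1.0 is the correct update (step against the gradient).
    sign=+1.0 reproduces the sign-flipped bug (step along the gradient).
    """
    x, y = 2.0, 10.0   # illustrative data, assumed for this sketch
    w = 1.0
    losses = []
    for _ in range(steps):
        pred = w * x                          # forward
        losses.append((pred - y) ** 2)        # loss
        grad = 2 * (pred - y) * x             # gradient
        w = w + sign * learning_rate * grad   # update
    return losses


good = train(learning_rate=0.1)               # loss shrinks every step
flipped = train(learning_rate=0.1, sign=1.0)  # loss grows: update walks uphill
too_big = train(learning_rate=0.3)            # loss grows: each step overshoots

for name, history in [("good", good), ("flipped", flipped), ("too_big", too_big)]:
    print(name, [round(value, 3) for value in history])
```

Printing the loss history like this is often the quickest diagnostic: a loss that climbs from the very first step points at the update rule or the learning rate, not at the model.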