Write train_loop(w, x, y, lr, steps) for the model pred = w * x. Run
the four-beat loop steps times. Each iteration: forward (pred = w * x),
loss ((pred - y)**2), gradient (2*(pred-y)*x), step (w = w - lr*grad),
recomputing pred every iteration. Return the final loss (w*x - y)**2,
computed with the updated w and rounded to 2 decimals.
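
A minimal sketch of one possible solution in Python, assuming w, x, y, and lr are plain scalar floats and steps is a non-negative integer. Treating the loss computation as the fourth beat is an assumption; the per-iteration loss is not needed for the update and is computed here only to make that beat visible.

```python
def train_loop(w, x, y, lr, steps):
    """Run gradient descent on pred = w * x and return the final squared error."""
    for _ in range(steps):
        pred = w * x                 # forward: recomputed with the current w
        loss = (pred - y) ** 2       # loss: shown for completeness (assumed beat)
        grad = 2 * (pred - y) * x    # gradient of the loss with respect to w
        w = w - lr * grad            # step: move w against the gradient
    # Final loss uses the updated w, rounded to 2 decimals.
    return round((w * x - y) ** 2, 2)


if __name__ == "__main__":
    print(train_loop(1.0, 2.0, 6.0, 0.1, 1))
```

With w=1.0, x=2.0, y=6.0, lr=0.1, a single step gives pred = 2.0, grad = -16.0, and w = 2.6, so the returned loss is (5.2 - 6.0)**2, which rounds to 0.64.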