Broadcasting and vectorization
Two words show up everywhere in tensor code, and both are simpler than they sound.
Vectorization: one operation over the whole tensor
Instead of looping element-by-element, you express the operation once
and it applies to the entire tensor. In PyTorch you'd write prices * 1.1; here we model it with a comprehension. The payoff is twofold: the
code reads like the math (scores = features @ weights), and the real
libraries run it far faster than a Python for loop because the work
happens in optimized native code, not one element at a time.
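As a minimal sketch of the idea above, the snippet below contrasts an explicit loop with a single comprehension that models prices * 1.1. The sample prices and the helper name scale are hypothetical, chosen only for illustration; a real tensor library would do this work in native code rather than in Python.

```python
# Hypothetical sample data, for illustration only.
prices = [10.0, 20.0, 40.0]

# Element-by-element version: the loop machinery hides the intent.
raised_loop = []
for p in prices:
    raised_loop.append(p * 1.1)


def scale(values, factor):
    """Model `values * factor`: one expression applied to the whole list."""
    return [v * factor for v in values]


# "Vectorized" style: the operation is stated once for the entire list.
raised = scale(prices, 1.1)

# Both versions compute the same numbers; only the way we express them differs.
print(raised == raised_loop)
```

The comprehension still runs one element at a time under the hood, so it shows only the readability half of the payoff, not the speed half.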
Broadcasting: stretching a smaller shape to fit
You often want to combine tensors of different shapes — add one bias number to every feature, or add a row of biases to every row of a batch. Broadcasting is the rule that stretches the smaller operand to match the bigger one:
vector + scalar → the scalar is applied to every element.
matrix + row_vector → the row is added to every row of the matrix (the row's length must equal the matrix's column count).
    matrix = [[1, 2, 3],
              [4, 5, 6]]
    bias   = [10, 20, 30]

    matrix + bias -> [[11, 22, 33],
                      [14, 25, 36]]   # bias added to each row
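The two cases above can be modeled in plain Python with nested lists. This is a sketch under stated assumptions, not how a tensor library implements broadcasting: the function names add_scalar and add_row are hypothetical, and the length check stands in for the shape error a real library would raise.

```python
def add_scalar(vector, scalar):
    """Model vector + scalar: the scalar is applied to every element."""
    return [x + scalar for x in vector]


def add_row(matrix, row):
    """Model matrix + row_vector: the row is added to every row of the matrix."""
    for r in matrix:
        if len(r) != len(row):
            # The row's length must equal the matrix's column count.
            raise ValueError(
                f"cannot add a length-{len(row)} row to rows of length {len(r)}"
            )
    return [[x + b for x, b in zip(r, row)] for r in matrix]


matrix = [[1, 2, 3],
          [4, 5, 6]]
bias = [10, 20, 30]

result = add_row(matrix, bias)
print(result)  # [[11, 22, 33], [14, 25, 36]]
```

Calling add_row(matrix, [10, 20]) raises a ValueError, because a length-2 row cannot line up with three columns.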
Why a builder cares
Broadcasting is the rule behind both the magic ("why did adding a length-3 vector to a 2×3 matrix work?") and the bugs ("why did adding a length-2 vector to it crash?"). The shapes have to be compatible: line them up from the last dimension backward, and each pair of sizes must either be equal or include a 1. A length-3 vector fits a 2×3 matrix because the trailing sizes are both 3; a length-2 vector fails because 2 and 3 differ and neither is 1. When AI-written tensor code throws a broadcasting error, you're not debugging calculus. You're checking that the shapes line up, exactly like the last lesson.
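The compatibility check can be written down directly. The sketch below follows the trailing-dimension rule that NumPy-style libraries use, working only on shape tuples; the function name broadcast_shape is hypothetical and is not a library API.

```python
def broadcast_shape(a, b):
    """Return the combined shape of two shape tuples, or raise ValueError.

    Shapes are compared from the last dimension backward. A missing
    leading dimension is treated as size 1.
    """
    result = []
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da == db or db == 1:
            result.append(da)
        elif da == 1:
            result.append(db)
        else:
            raise ValueError(
                f"shapes {a} and {b} are incompatible: {da} vs {db}"
            )
    return tuple(reversed(result))


print(broadcast_shape((2, 3), (3,)))  # (2, 3): length-3 vector fits a 2x3 matrix
print(broadcast_shape((2, 3), ()))    # (2, 3): a scalar fits anything

try:
    broadcast_shape((2, 3), (2,))     # trailing sizes 3 and 2 do not match
except ValueError as err:
    print("broadcasting error:", err)
```

Running a check like this on the two shapes in a failing line is usually enough to see which dimension is out of place.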