CNNs and local patterns
A convolutional neural network (CNN) is built on one move: slide a small filter (a "kernel") across the input and compute a number at each position. The filter only sees a small local window at a time — that's the whole idea.
- On an image, a 3×3 filter might fire on edges, corners, or textures — small local patterns — and deeper layers combine them into shapes.
- The same filter weights are reused at every position ("weight sharing"), so a CNN needs far fewer parameters than a layer with separate weights for every position, and it finds a pattern wherever it appears (move the cat in the photo and it is still detected).
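Weight sharing can be sketched in a few lines of plain Python. This is a minimal illustration, not how a real CNN layer is implemented: the helper name `slide`, the "bump" kernel, and both signals are made up for the example. The same three weights are applied at every position, so moving the pattern simply moves the response.

```python
def slide(signal, kernel):
    """Apply the same kernel weights at every position where the window fits."""
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, signal[i:i + k]))
        for i in range(len(signal) - k + 1)
    ]

# Hypothetical detector for the local pattern "low, high, low" (a bump).
bump_detector = [-1, 2, -1]

# The same bump at two different positions (made-up data).
bump_early = [0, 3, 0, 0, 0, 0]
bump_late = [0, 0, 0, 0, 3, 0]

early = slide(bump_early, bump_detector)
late = slide(bump_late, bump_detector)

print(early)  # [6, -3, 0, 0]
print(late)   # [0, 0, -3, 6]

# The strongest response moves by exactly as far as the bump moved.
shift = late.index(max(late)) - early.index(max(early))
print(shift)  # 3
```

The detector has three weights no matter how long the signal is, which is the parameter saving the bullet above describes.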
Run the editor: a length-2 edge-detector kernel [1, -1] slides over the
signal and responds wherever neighboring values differ. The output
[0, -5, 0] says "the only edge is between positions 1 and 2" (counting
from 0): the values differ by 5 there and are equal everywhere else.
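For readers without the editor open, here is a minimal sketch of that computation. The input signal `[2, 2, 7, 7]` is an assumption chosen to reproduce the stated output (the text does not give the actual input), and `slide` is a hypothetical helper name. The kernel is applied as written, without flipping, which is the usual convention in deep-learning libraries.

```python
def slide(signal, kernel):
    """Slide the kernel over the signal; one output number per full window."""
    k = len(kernel)
    out = []
    for i in range(len(signal) - k + 1):
        window = signal[i:i + k]
        # Multiply each weight by the value under it, then add up.
        out.append(sum(w * x for w, x in zip(kernel, window)))
    return out

signal = [2, 2, 7, 7]  # assumed input, not taken from the lesson
kernel = [1, -1]       # edge detector from the text: left value minus right value

output = slide(signal, kernel)
print(output)  # [0, -5, 0]
```

Output index 1 is the only nonzero entry, and it comes from the window covering signal positions 1 and 2.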
The shape rule (you already know it)
For a signal of length n and a kernel of length k, the output has
length n - k + 1 (with no padding and a step of 1): you can only place
the window where it fully fits.
That's the same shape arithmetic from the tensors chapter, and a classic
CNN bug in AI-generated code is an off-by-one on that range.
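The range itself is worth seeing in code. This is a small sketch under the assumptions of no padding and a step of 1; `valid_positions` is a hypothetical helper, not a library function.

```python
def valid_positions(n, k):
    """Start indices where a length-k window fits fully inside a length-n signal."""
    if k > n:
        return range(0)  # the kernel is longer than the signal: nowhere to place it
    return range(n - k + 1)

n, k = 4, 2

positions = list(valid_positions(n, k))
print(positions)       # [0, 1, 2]
print(len(positions))  # 3, which is n - k + 1

# The off-by-one to watch for: range(n - k) silently drops the last window.
buggy = list(range(n - k))
print(buggy)           # [0, 1]
```

The opposite mistake, `range(n - k + 2)`, places a window that runs past the end of the signal. Python slicing will quietly return a shorter window there instead of raising an error, so the bug can go unnoticed.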
Why a builder cares
You won't hand-derive convolutions, but you'll read CNN code and need the
intuition: small filter, local window, same weights everywhere. When
a model is great at "is there a defect in this product photo?" and you
wonder why it's a CNN, this is why: local patterns, detected anywhere.
Real PyTorch spells it nn.Conv2d(...); the sliding-window idea is
identical.
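To connect the 1D picture to images, here is a plain-Python sketch of the same sliding window in two dimensions. It is an illustration only: `slide_2d`, the tiny image, and the vertical-edge kernel are all made up for the example. It corresponds roughly to what a single-channel nn.Conv2d with no padding and a step of 1 computes, leaving out bias, multiple channels, and batching.

```python
def slide_2d(image, kernel):
    """Slide a 2D kernel over a 2D image, keeping only fully fitting windows."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1     # same n - k + 1 rule, applied to rows
    out_w = len(image[0]) - kw + 1  # and to columns
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += kernel[i][j] * image[r + i][c + j]
            row.append(total)
        out.append(row)
    return out

# Made-up 4x6 image: dark on the left, bright on the right.
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

# Hypothetical 3x3 vertical-edge filter: left column minus right column.
vertical_edge = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

feature_map = slide_2d(image, vertical_edge)
for row in feature_map:
    print(row)
# [0, -27, -27, 0]
# [0, -27, -27, 0]
```

The output is 2x4, which is (4 - 3 + 1) by (6 - 3 + 1), and it is nonzero only where the window straddles the dark-to-bright boundary.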