Train/inference skew

Train/inference skew (a.k.a. training-serving skew) is the silent killer of ML systems: the model learns from one set of features at training time, but gets a different set at inference time. The accuracy you measured offline evaporates in production — and nothing errors.

For example, suppose the model trained with an age feature, but production can't supply it. The model now sees age as missing on every live request.

How features drift between train and serve

A feature is missing live — it was in your training table but the production request doesn't carry it (like age above).
A feature is renamed — country at train, region at serve. Same count, different name, so a naive "did the number of features change?" check misses it.
A value is computed differently — total is in dollars at training but cents at serving, or a default changed. Same name, skewed meaning.
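
To make the rename case concrete, here is a minimal sketch of why a count-only check passes when it should not. The rows, their values, and the `same_count` helper are hypothetical and only mirror the country/region and dollars/cents examples above.

```python
# Hypothetical feature rows; names and values are illustrative only.
train_row = {"country": "DE", "total": 19.99}  # total in dollars
serve_row = {"region": "DE", "total": 1999}    # renamed feature, total in cents


def same_count(a, b):
    """Naive check: compares only how many features each side has."""
    return len(a) == len(b)


count_ok = same_count(train_row, serve_row)     # True: both rows have 2 features
names_match = set(train_row) == set(serve_row)  # False: country vs region

print(f"count check passes: {count_ok}")
print(f"feature names match: {names_match}")
```

The count check reports no problem, while comparing the names exposes the rename. Neither check notices that total changed units, since its name is the same on both sides.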

The cheapest detector compares the feature sets (and ideally types) between training and serving: train - serve shows what's missing live, and serve - train shows unexpected extras. Real systems also compare value distributions, because a set comparison catches missing and renamed features but not a value that is computed differently.
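
The set comparison described above could be sketched as follows. This assumes each side's feature contract is available as a plain mapping from feature name to a type label; the `compare_contracts` function, the report keys, and the example schemas are all hypothetical, not part of any particular library.

```python
def compare_contracts(train, serve):
    """Compare two feature contracts given as {feature_name: type_label} dicts."""
    train_names, serve_names = set(train), set(serve)
    shared = train_names & serve_names
    return {
        # train - serve: features the model learned from but won't receive live
        "missing_at_serve": sorted(train_names - serve_names),
        # serve - train: features sent live that the model never saw
        "unexpected_at_serve": sorted(serve_names - train_names),
        # same name on both sides, but a different declared type
        "type_mismatch": sorted(n for n in shared if train[n] != serve[n]),
    }


# Illustrative schemas only: age is missing live, country was renamed to
# region, and total changed type.
train_schema = {"age": "int", "country": "str", "total": "float"}
serve_schema = {"region": "str", "total": "int"}

report = compare_contracts(train_schema, serve_schema)
for problem, features in report.items():
    print(f"{problem}: {features}")
```

A rename shows up as one entry in each direction (country missing, region unexpected). The type check flags total here only because its declared type happened to change; a units change that keeps the same type would still need a value-distribution comparison.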

Why a builder cares

"It worked in the notebook but the live model is garbage" is almost always skew. Comparing the train and serve feature contracts before you ship — and on a schedule after — turns a mysterious accuracy drop into a precise "age is missing at serve." You'll write that comparison next.