Prediction IDs and inference logs
When a model is live, someone will eventually say "it gave a wrong answer." Your ability to investigate depends entirely on what you logged at inference time. The key field is a prediction ID (or request ID): a unique handle for one specific call.
A useful inference log record has:
- id — the prediction/request id (so you can find this call),
- timestamp — when it happened,
- model version — which model produced it (v2? the new v3?),
- input (or a summary) — what the model actually saw,
- output — what it returned, plus maybe a score.
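As a rough illustration of the fields listed above, here is a minimal sketch of one log record in Python. The names InferenceRecord and make_record, the random UUID used as the id, and the one-JSON-object-per-line layout are all assumptions made for this example, not a prescribed format.

```python
import json
import uuid
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class InferenceRecord:
    """One log entry per model call (hypothetical schema for illustration)."""
    id: str                        # prediction/request id
    timestamp: str                 # when the call happened (ISO 8601, UTC)
    model_version: str             # which model produced the output, e.g. "v3"
    input: dict                    # what the model saw, or a summary of it
    output: str                    # what the model returned
    score: Optional[float] = None  # optional score attached to the output


def make_record(model_version, model_input, output, score=None):
    """Build a record with a fresh unique id and the current UTC time."""
    return InferenceRecord(
        id=str(uuid.uuid4()),
        timestamp=datetime.now(timezone.utc).isoformat(),
        model_version=model_version,
        input=model_input,
        output=output,
        score=score,
    )


# Example values are made up for the sketch.
record = make_record("v3", {"text": "refund request"}, "approve", score=0.91)
log_line = json.dumps(asdict(record))  # serialize as one JSON object per line
print(log_line)
```

Generating the id at the moment of the call, and returning it to the caller alongside the output, is what lets a later complaint be matched back to this exact record.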
Run the editor: a customer disputes prediction p2, and the id lets you pull exactly that record and see which model said what.
Why IDs make tracing possible
Without IDs, "the model was wrong yesterday" is unanswerable — you can't tie a complaint to a specific input and model version. With them, you follow the thread: id → the input it saw → the model version → the output. That's the difference between debugging and guessing. (This is the same trace-reading skill from this chapter, applied to production inference.)
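To make the thread concrete, the sketch below follows id, input, model version, and output for a disputed prediction p2. The log entries are invented sample data held in a plain list, and find_record and trace are hypothetical helper names; a real system would read the records from wherever its logs are stored.

```python
# Invented sample log: three calls, one of them served by an older model.
LOG = [
    {"id": "p1", "timestamp": "2024-05-01T09:00:00Z", "model_version": "v3",
     "input": {"text": "late delivery"}, "output": "refund"},
    {"id": "p2", "timestamp": "2024-05-01T09:00:05Z", "model_version": "v2",
     "input": {"text": "wrong item sent"}, "output": "reject"},
    {"id": "p3", "timestamp": "2024-05-01T09:00:09Z", "model_version": "v3",
     "input": {"text": "damaged box"}, "output": "refund"},
]


def find_record(log, prediction_id):
    """Return the record with this id, or None if the call was never logged."""
    for record in log:
        if record["id"] == prediction_id:
            return record
    return None


def trace(log, prediction_id):
    """Describe one call as: id -> input -> model version -> output."""
    record = find_record(log, prediction_id)
    if record is None:
        return f"{prediction_id}: no record found"
    return (
        f"{record['id']} -> input {record['input']} "
        f"-> model {record['model_version']} -> output {record['output']!r}"
    )


print(trace(LOG, "p2"))
```

In this made-up log, the trace for p2 shows the call went to v2 while its neighbours went to v3, which is exactly the kind of detail a complaint without an id could never surface.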
Why a builder cares
The first question in any "the AI got it wrong" incident is "show me that exact request." If you logged a prediction ID, timestamp, model version, input, and output, you can answer it in seconds, and you will often discover that the cause was an old model version or a malformed input rather than a flaw in the current model. You'll build and trace an inference record next.
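Once the record is in hand, the two common non-model explanations mentioned here can be checked mechanically. The following is a hedged sketch only: CURRENT_VERSION, REQUIRED_INPUT_FIELDS, and the triage function are assumptions for illustration, and what counts as a malformed input depends on the real input schema.

```python
CURRENT_VERSION = "v3"            # assumed: the version that should be serving
REQUIRED_INPUT_FIELDS = {"text"}  # assumed: fields a well-formed input must have


def triage(record):
    """List explanations for a disputed prediction that are not the current model."""
    findings = []

    # Was the call served by something other than the current version?
    if record["model_version"] != CURRENT_VERSION:
        findings.append(f"served by old model {record['model_version']}")

    # Did the model receive a well-formed input?
    model_input = record.get("input")
    if not isinstance(model_input, dict):
        findings.append("input is not a structured record")
    else:
        missing = REQUIRED_INPUT_FIELDS - set(model_input)
        if missing:
            findings.append(f"input missing fields: {sorted(missing)}")
        elif not str(model_input["text"]).strip():
            findings.append("input text is empty")

    return findings


# Invented record for the sketch: old model version and a blank input.
disputed = {"id": "p2", "model_version": "v2",
            "input": {"text": "   "}, "output": "reject"}
print(triage(disputed))
```

An empty list from triage means neither explanation applies, so the next step is to look at the current model's behaviour on that input.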