Structured logs and alerts

Once your tool is live, you operate it by what it tells you — its logs. A structured log is a record with named fields (level, event, latency_ms, status), not a blob of free text. Structure is the whole point: you can count errors, filter by event, and compute a p95 latency, which you can't do reliably by grepping prose.

Structured logs make "how many requests errored?" a one-line filter.
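As a minimal sketch of that one-line filter, assume the logs are stored as JSON Lines (one JSON object per line) and carry the fields named above. The storage format and the three sample records are assumptions made up for illustration.

```python
import json

# Hypothetical sample: three log records in JSON Lines form (one object per line).
raw_logs = """\
{"level": "info", "event": "predict", "latency_ms": 120, "status": "ok"}
{"level": "error", "event": "predict", "latency_ms": 950, "status": "error"}
{"level": "info", "event": "predict", "latency_ms": 80, "status": "ok"}
"""

# Parse each non-empty line into a dict so fields can be queried by name.
records = [json.loads(line) for line in raw_logs.splitlines() if line.strip()]

# "How many requests errored?" is now a one-line filter.
error_count = sum(1 for r in records if r["status"] == "error")
print(error_count)
```

The same question asked of free-text logs would depend on guessing every phrasing an error message might take.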

What to log (for an ML-ish service)

  • the request id and timestamp (trace a single call),
  • the outcome — prediction/label, or an error with its type,
  • latency in milliseconds (how long the call took),
  • the model version (so you can tell which model produced what).
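One way to put that list into practice is sketched below. The helper `make_log_record`, the stand-in `fake_predict`, the exact field names, and the version string are all hypothetical, chosen only to show each item from the list landing in a named field.

```python
import json
import time
import uuid
from datetime import datetime, timezone


def make_log_record(outcome, latency_ms, model_version, error_type=None):
    """Build one structured log record as a plain dict (hypothetical helper)."""
    failed = error_type is not None
    return {
        "request_id": str(uuid.uuid4()),                      # trace a single call
        "timestamp": datetime.now(timezone.utc).isoformat(),  # when it happened
        "level": "error" if failed else "info",
        "event": "predict",
        "status": "error" if failed else "ok",
        "outcome": outcome,                                   # prediction/label, or None on error
        "error_type": error_type,                             # e.g. "TimeoutError"
        "latency_ms": latency_ms,                             # how long the call took
        "model_version": model_version,                       # which model produced it
    }


def fake_predict(text):
    # Stand-in for a real model call; an assumption for this sketch.
    return "positive" if "good" in text else "negative"


# Time the call, then log the outcome together with its latency.
start = time.perf_counter()
label = fake_predict("a good day")
latency_ms = round((time.perf_counter() - start) * 1000, 3)

record = make_log_record(label, latency_ms, model_version="v1-hypothetical")
print(json.dumps(record))  # one JSON object per line is easy to filter later
```

Writing one JSON object per call keeps every field queryable without parsing prose.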

When to alert

An alert fires when a metric crosses a threshold you set in advance — e.g. error rate > 5%, or p95 latency > 1s. The rule is: compute the metric from the logs, compare to the threshold, page someone if it's exceeded. Alert on symptoms users feel (errors, slowness), not on every log line, or you'll train yourself to ignore the pager.
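The rule above (compute the metric, compare to the threshold, page if exceeded) can be sketched as follows. The thresholds are the example values from the text. The function names, the nearest-rank percentile method, and the sample window of 20 requests are assumptions for illustration.

```python
# Example thresholds from the text: error rate > 5%, p95 latency > 1 s.
ERROR_RATE_THRESHOLD = 0.05
P95_LATENCY_THRESHOLD_MS = 1000


def error_rate(records):
    """Fraction of records whose status is "error" (0.0 for an empty window)."""
    if not records:
        return 0.0
    return sum(1 for r in records if r["status"] == "error") / len(records)


def p95_latency_ms(records):
    """95th percentile latency by the nearest-rank method (an assumed choice)."""
    latencies = sorted(r["latency_ms"] for r in records)
    if not latencies:
        return 0
    rank = (95 * len(latencies) + 99) // 100  # integer ceil of 0.95 * n, 1-based
    return latencies[rank - 1]


def should_alert(records):
    """Page someone if either symptom crosses its threshold."""
    return (error_rate(records) > ERROR_RATE_THRESHOLD
            or p95_latency_ms(records) > P95_LATENCY_THRESHOLD_MS)


# Hypothetical window of 20 requests: 17 fast successes, 2 errors, 1 slow success.
window = [{"status": "ok", "latency_ms": 100} for _ in range(17)]
window += [{"status": "error", "latency_ms": 300},
           {"status": "error", "latency_ms": 400},
           {"status": "ok", "latency_ms": 1500}]

print(error_rate(window), p95_latency_ms(window), should_alert(window))
```

In this window the error rate is 2 out of 20, which exceeds the 5% threshold, so the alert fires even though the p95 latency stays under one second. A single slow request does not move the p95 here, which is one reason to alert on a percentile rather than the maximum.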

Why a builder cares

"Is the thing healthy right now?" is unanswerable without structured logs, and "we found out from an angry customer" is what happens without alerts. Logging queryable fields and alerting on a couple of real thresholds is the minimum that turns a deployed tool into an operated one. You'll compute an error rate and an alert decision next.