Five postmortems — three public, two composite — and the one fix that would have caught all of them (step 8/9) · debugging broken ai output


lesson 3 of 3 · step 8 / 9

Write classify_failure(case) that takes a postmortem dict with three fields:

user_saw: str (the complaint as the user reported it)
trace_snippet: str (the relevant trace excerpt)
system_state: str (one-line description of the system at failure)

It must return a dict with two keys, in this exact order: {"class": int, "fix": str}.

Apply these string-match heuristics on trace_snippet, in priority order. First match wins:

If "no retrieval" is in trace_snippet → class 3, fix "add retrieval-with-citations + out-of-domain refusal".
Else if "schema mismatch" is in trace_snippet → class 4, fix "add Pydantic validation at the consumer boundary".
Else if "stale chunk" is in trace_snippet → class 1, fix "add freshness metadata + superseded_by filter".
Else if "ambiguous" is in trace_snippet → class 2, fix "tighten prompt with explicit constraint + negative example".
Otherwise → class 0, fix "unclassified — read the full trace".

Two cases run for you. Expected output:

air_canada: {'class': 3, 'fix': 'add retrieval-with-citations + out-of-domain refusal'}
recruiter:  {'class': 4, 'fix': 'add Pydantic validation at the consumer boundary'}
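
One way the rules above could be implemented is sketched below: a minimal, table-driven version, assuming the match is a plain case-sensitive substring test, as the wording "is in trace_snippet" suggests. The RULES name and the two sample cases are hypothetical, made up for illustration; the real inputs behind air_canada and recruiter are not shown in this step.

```python
# Rules in priority order: (substring to look for, class id, suggested fix).
# RULES is a hypothetical helper name, not part of the exercise spec.
RULES = [
    ("no retrieval", 3, "add retrieval-with-citations + out-of-domain refusal"),
    ("schema mismatch", 4, "add Pydantic validation at the consumer boundary"),
    ("stale chunk", 1, "add freshness metadata + superseded_by filter"),
    ("ambiguous", 2, "tighten prompt with explicit constraint + negative example"),
]

# "\u2014" is the dash character used in the spec's fallback string.
FALLBACK_FIX = "unclassified \u2014 read the full trace"


def classify_failure(case):
    """Return {"class": int, "fix": str} for a postmortem dict.

    Only trace_snippet is inspected; user_saw and system_state are ignored.
    """
    trace = case["trace_snippet"]
    for needle, failure_class, fix in RULES:
        if needle in trace:  # first match wins, so list order is the priority
            return {"class": failure_class, "fix": fix}
    return {"class": 0, "fix": FALLBACK_FIX}


# Hypothetical inputs: the field values below are invented for illustration.
air_canada = {
    "user_saw": "chatbot described a refund policy that does not exist",
    "trace_snippet": "no retrieval step ran before generation",
    "system_state": "chatbot answering policy questions without a document store",
}
recruiter = {
    "user_saw": "candidate records arrived with missing fields",
    "trace_snippet": "schema mismatch between model output and consumer",
    "system_state": "downstream parser reading model JSON directly",
}

if __name__ == "__main__":
    print(f"air_canada: {classify_failure(air_canada)}")
    print(f"recruiter:  {classify_failure(recruiter)}")
```

Since Python 3.7, dicts preserve insertion order, so writing the literal with "class" before "fix" is enough to satisfy the key-order requirement. Keeping the rules in a list rather than an if/elif chain makes the priority visible in one place: a trace that mentions both "no retrieval" and "stale chunk" resolves to class 3 because that rule comes first.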

