The four breakage classes — sort any LLM failure before you touch the prompt — step 6 of 9
The classify_failure(trace) function below is the lazy default: it
returns "hallucination" for every failure. That's the worst
habit on a debugging team: "the model hallucinated" is what you
say when you don't want to read the trace.
Fix the function to check the trace fields in priority order:
- If retrieved_chunks is non-empty AND retrieved_chunks_match_query is False, return "retrieval".
- Else if raw_output != output_after_postprocess, return "parse".
- Else if retrieved_chunks is empty, return "hallucination".
- Else return "prompt".
The trace in the editor is a class-1 retrieval bug (retrieval
returned the wrong account's ticket). The fixed function should
return "retrieval".
Expected output:
class: retrieval
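One way the priority order above could be sketched in Python. This assumes the trace is a plain dict carrying the four fields named in the list; the sample trace values are hypothetical stand-ins for the one in the editor, not a copy of it.

```python
def classify_failure(trace: dict) -> str:
    """Sort a failed trace into one of four classes, checking in priority order.

    Assumes `trace` is a dict with the keys retrieved_chunks,
    retrieved_chunks_match_query, raw_output and output_after_postprocess.
    """
    chunks = trace["retrieved_chunks"]

    # 1. Retrieval returned something, but it does not match the query.
    if chunks and trace["retrieved_chunks_match_query"] is False:
        return "retrieval"

    # 2. Post-processing changed the model's raw output.
    if trace["raw_output"] != trace["output_after_postprocess"]:
        return "parse"

    # 3. Nothing was retrieved, so the model had no grounding to work from.
    if not chunks:
        return "hallucination"

    # 4. Retrieval and parsing look fine, so the prompt is the remaining suspect.
    return "prompt"


# Hypothetical trace: retrieval returned a ticket from the wrong account.
trace = {
    "retrieved_chunks": ["Ticket text belonging to a different account"],
    "retrieved_chunks_match_query": False,
    "raw_output": "Your refund was issued.",
    "output_after_postprocess": "Your refund was issued.",
}

print(f"class: {classify_failure(trace)}")  # prints "class: retrieval"
```

The order of the checks matters: the retrieval test runs first, so a trace with mismatched chunks is labelled "retrieval" even if post-processing also altered the output.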