promptdojo_

The triage code is hardcoded to blame "true_hallucination" no matter what the trace shows. But the trace says the agent's search tool ran AND returned data — so this is a retrieval problem (wrong data came back), not a hallucination (model inventing from nothing).

Fix the classifier to check trace[-1]["tools_called"]. If it's non-empty, this is "retrieval". Otherwise it's "true_hallucination".

Expected output:

class: retrieval
The break is on line 6 — but read the whole snippet first.

this step needs the editor

on desktop today; in the app (coming soon). save your spot and we'll bring you back here when you're ready.

open this same url on a laptop to keep going today.