The confusion matrix tells the truth faster than accuracy

You asked an AI to build a tool that flags refund requests as fraud or legit. The AI reports that the tool is "92% accurate." That single number hides the only thing that matters: which kind of mistake the tool is making.

A model that approved every refund would still be 95% accurate if only 5% of refunds were fraud, and it would catch zero fraud. Accuracy can't see that. A confusion matrix can.
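To make that arithmetic concrete, here is a minimal sketch of the "approve everything" model. The dataset is made up for illustration (1,000 refunds, 50 of them fraud), and the variable names are hypothetical.

```python
# Hypothetical dataset: 1,000 refunds, 5% of them fraud (True = fraud).
actual = [True] * 50 + [False] * 950

# A "model" that approves every refund, i.e. it never flags fraud.
predicted = [False] * len(actual)

# Accuracy only asks: how often did the prediction match reality?
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# Fraud caught: cases that were fraud AND were flagged.
fraud_caught = sum(a and p for a, p in zip(actual, predicted))

print(f"accuracy: {accuracy:.0%}")      # accuracy: 95%
print(f"fraud caught: {fraud_caught}")  # fraud caught: 0
```

The accuracy number looks healthy even though the model does nothing useful, which is exactly the blind spot described above.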

Four outcomes, not one

Every prediction lands in one of four buckets. Call the flagged class ("fraud") the positive:

  • True Positive (TP) — flagged fraud, was fraud. Caught it. (R1)
  • False Positive (FP) — flagged fraud, was actually legit. A false alarm that annoys a real customer. (R2)
  • True Negative (TN) — approved, was legit. Correct and quiet. (R3)
  • False Negative (FN) — approved, was actually fraud. The expensive miss — money walked out the door. (R4)

Run the editor. Each row is one of the four outcomes.
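Outside the editor, the same four-way tally can be sketched in a few lines of plain Python. This is an illustrative sketch, not the lesson's own code: the function name `confusion_counts` and the sample labels are assumptions, with `True` standing for "fraud" (the positive class).

```python
def confusion_counts(actual, predicted):
    """Tally TP, FP, TN, FN, treating True ("fraud") as the positive class."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for was_fraud, flagged in zip(actual, predicted):
        if flagged and was_fraud:
            counts["TP"] += 1   # flagged fraud, was fraud
        elif flagged and not was_fraud:
            counts["FP"] += 1   # flagged fraud, was legit (false alarm)
        elif not flagged and not was_fraud:
            counts["TN"] += 1   # approved, was legit
        else:
            counts["FN"] += 1   # approved, was fraud (the expensive miss)
    return counts

# Made-up labels for six refund requests.
actual    = [True, False, False, True,  False, True]
predicted = [True, True,  False, False, False, True]

counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 2, 'FP': 1, 'TN': 2, 'FN': 1}
```

Every prediction lands in exactly one bucket, so the four counts always add up to the number of predictions.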

Why a builder counts all four

"92% accurate" could mean the model nails the easy legit cases and quietly misses half the fraud. Counting TP/FP/TN/FN separately turns a vibe ("seems good") into evidence ("catches 8 of 10 fraud cases but false-alarms on 1 in 12 legit ones"). The rest of this chapter is built on these four numbers — precision and recall come straight out of them.
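As a rough sketch of how the four counts become that kind of evidence, the snippet below uses hypothetical counts chosen to match the example in the text: 10 fraud cases (8 caught, 2 missed) and 120 legit ones (10 false alarms). The names `catch_rate` and `false_alarm_rate` are illustrative; the catch rate is what is commonly called recall. With these particular counts the accuracy works out to roughly 91%.

```python
# Hypothetical counts, chosen to match the example in the text.
tp, fn = 8, 2      # fraud cases: 8 flagged, 2 approved by mistake
fp, tn = 10, 110   # legit cases: 10 flagged by mistake, 110 approved

total = tp + fp + tn + fn
accuracy = (tp + tn) / total           # the single headline number
catch_rate = tp / (tp + fn)            # share of real fraud that was flagged
false_alarm_rate = fp / (fp + tn)      # share of legit refunds that were flagged

print(f"accuracy: {accuracy:.1%}")                            # accuracy: 90.8%
print(f"fraud caught: {tp} of {tp + fn} ({catch_rate:.0%})")  # fraud caught: 8 of 10 (80%)
print(f"false alarms: about 1 in {round(1 / false_alarm_rate)} legit refunds")
```

The headline accuracy and the two rates come from the same four numbers, but only the rates say which kind of mistake the model is making.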