promptdojo_
Checkpoint

One last thing before we move on. This has the same surface as a write step, but the lesson doesn't complete until this passes.

Final drill. Build a pairwise judge that detects its OWN position bias and returns "tie" when it disagrees with itself across orders.

Write pairwise(question, output_a, output_b) that:

  • Calls fake_judge(question, first, second) twice:
    • Once with first=output_a, second=output_b, so output_a sits in slot A and output_b in slot B this round.
    • Once with first=output_b, second=output_a, so output_b is now in slot A and output_a in slot B.
  • Each call returns "A" or "B" (the slot that won).
  • Translate slot wins back to which physical output won that round.
  • If the SAME physical output won both rounds, return that output's label ("a" or "b").
  • If the rounds disagreed, return "tie".

Then run a multi-case suite. Expected output:

case 1: a wins  (consistent)
case 2: tie     (position-biased — judge disagrees with itself)
case 3: b wins  (consistent)
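
One possible shape for the drill, as a minimal sketch rather than the lesson's reference solution. The fake_judge below is a hypothetical stand-in (the lesson's own judge is not shown here): it picks the clearly longer answer and otherwise defaults to slot A, which is what makes it position-biased. The three test cases and the run_suite helper are also invented for illustration, and the note printed for the tie case is shortened.

```python
def fake_judge(question, first, second):
    """Hypothetical stand-in judge. Returns the winning SLOT: "A" or "B".

    Assumption: it prefers the clearly longer answer; when the lengths are
    close it falls back to slot A, i.e. it is position-biased.
    """
    margin = 5
    if len(second) - len(first) > margin:
        return "B"
    return "A"  # clear win for slot A, or a near-tie resolved by position


def pairwise(question, output_a, output_b):
    # Round 1: output_a sits in slot A, output_b in slot B.
    slot_1 = fake_judge(question, output_a, output_b)
    # Round 2: swapped, so output_b sits in slot A, output_a in slot B.
    slot_2 = fake_judge(question, output_b, output_a)

    # Translate slot wins back to the physical output that won each round.
    winner_1 = "a" if slot_1 == "A" else "b"
    winner_2 = "b" if slot_2 == "A" else "a"

    # Same physical winner in both orders: trust it. Otherwise call a tie.
    return winner_1 if winner_1 == winner_2 else "tie"


# Invented cases, chosen so the stand-in judge behaves differently on each.
CASES = [
    ("What is the capital of France?", "Paris is the capital of France.", "Paris"),
    ("Is 7 prime?", "Yes, it is.", "Yes it is!!"),
    ("What is the capital of France?", "Paris", "Paris is the capital of France."),
]


def run_suite(cases):
    lines = []
    for i, (question, out_a, out_b) in enumerate(cases, start=1):
        verdict = pairwise(question, out_a, out_b)
        if verdict == "tie":
            label, note = "tie", "position-biased"
        else:
            label, note = f"{verdict} wins", "consistent"
        # Pad the label to 8 characters so the notes line up.
        lines.append(f"case {i}: {label:<8}({note})")
    return lines


for line in run_suite(CASES):
    print(line)
```

The key design choice is that pairwise never trusts a single call: a verdict only counts if it survives the swap, so any win that depends on slot position collapses to "tie".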
