Evaluator-optimizer — write a draft, let a judge critique it, revise — step 9 of 9
Checkpoint
One last thing before we move on. Same surface as a write step — but the lesson doesn't complete until this passes.
Final drill. Build a production-grade evaluator-optimizer that
tracks the full revision history. Write
optimize(task, max_iters) that:
- Maintains
feedback = Noneandhistory = []. - Loops up to
max_iterstimes. Each iteration:- Calls
gen(task, feedback=feedback)to produce a draft. - Calls
judge(draft)to get a verdict dict. - Appends one entry to
history:{"iter": <1-based>, "draft": <draft>, "status": <verdict status>, "feedback": <verdict feedback>}. - On PASS: return
{"final_draft": draft, "iters": <iteration that passed>, "passed": True, "history": history}. - On REJECT: update
feedback = verdict["feedback"]and continue.
- Calls
- On cap exhaustion without a PASS: return
{"final_draft": <last draft>, "iters": max_iters, "passed": False, "history": history}.
The script will run two cases. Expected output:
passed=True iters=3 history_len=3 final=v3-good
passed=False iters=2 history_len=2 final=stub2
⌘↵ runs the editor.read, then continue.
Checkpoint
One last thing before we move on. Same surface as a write step — but the lesson doesn't complete until this passes.
Final drill. Build a production-grade evaluator-optimizer that
tracks the full revision history. Write
optimize(task, max_iters) that:
- Maintains
feedback = Noneandhistory = []. - Loops up to
max_iterstimes. Each iteration:- Calls
gen(task, feedback=feedback)to produce a draft. - Calls
judge(draft)to get a verdict dict. - Appends one entry to
history:{"iter": <1-based>, "draft": <draft>, "status": <verdict status>, "feedback": <verdict feedback>}. - On PASS: return
{"final_draft": draft, "iters": <iteration that passed>, "passed": True, "history": history}. - On REJECT: update
feedback = verdict["feedback"]and continue.
- Calls
- On cap exhaustion without a PASS: return
{"final_draft": <last draft>, "iters": max_iters, "passed": False, "history": history}.
The script will run two cases. Expected output:
passed=True iters=3 history_len=3 final=v3-good
passed=False iters=2 history_len=2 final=stub2
this step needs the editor
on desktop today; in the app (coming soon). save your spot and we'll bring you back here when you're ready.