promptdojo_

Write eval_readiness(team) that takes a team profile (dict) and returns a dict with two fields:

  • score: integer 0-100, higher means MORE eval discipline (good)
  • verdict: string, one of:
    • "eval-mature" if score >= 80
    • "eval-aware" if score >= 50
    • "eval-curious" if score >= 20
    • "vibes era" if score < 20

Score the team on these signals (each adds points to the readiness total):

  • has_test_set is True: add 25 (Hamel: you do not have a product without one)
  • has_judge_prompt is True: add 15 (rubric or LLM-as-judge defined somewhere)
  • ci_runs_evals is True: add 25 (the regression gate)
  • tracks_eval_history is True: add 15 (can compare runs over time)
  • eval_count_per_feature >= 20: add 20 (a generous floor; Anthropic's guidance is that ~50 is plenty)

Two teams run. Expected output:

EvalMatureCo: {'score': 100, 'verdict': 'eval-mature'}
VibeCo:       {'score': 0, 'verdict': 'vibes era'}
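One possible solution, as a sketch: sum the five signal bonuses, then map the total onto the verdict tiers from highest to lowest. Using dict.get keeps it safe for profiles that omit a key (an assumption; the exercise doesn't say whether every field is always present).

```python
def eval_readiness(team):
    """Score a team profile's eval discipline and assign a verdict."""
    score = 0
    if team.get("has_test_set"):
        score += 25  # Hamel: you do not have a product without one
    if team.get("has_judge_prompt"):
        score += 15  # rubric or LLM-as-judge defined somewhere
    if team.get("ci_runs_evals"):
        score += 25  # the regression gate
    if team.get("tracks_eval_history"):
        score += 15  # can compare runs over time
    if team.get("eval_count_per_feature", 0) >= 20:
        score += 20

    # Check tiers from highest to lowest so the first match wins.
    if score >= 80:
        verdict = "eval-mature"
    elif score >= 50:
        verdict = "eval-aware"
    elif score >= 20:
        verdict = "eval-curious"
    else:
        verdict = "vibes era"

    return {"score": score, "verdict": verdict}


# The two teams from the expected output:
mature = {
    "has_test_set": True,
    "has_judge_prompt": True,
    "ci_runs_evals": True,
    "tracks_eval_history": True,
    "eval_count_per_feature": 50,
}
print(eval_readiness(mature))  # {'score': 100, 'verdict': 'eval-mature'}
print(eval_readiness({}))      # {'score': 0, 'verdict': 'vibes era'}
```

Ordering the verdict checks from the top tier down matters: a score of 85 satisfies every threshold, and only the first branch taken should apply.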
