promptdojo_
Checkpoint

One last thing before we move on. This uses the same surface as a write step, but the lesson doesn't complete until this check passes.

Final drill. Invert the rubric. Given a product spec AND a fork someone picked, decide whether the fork fits the product.

Write evaluate_strategy(product, fork) that returns a tuple (ok: bool, reason: str).

The product spec has the same fields as in step 08: corpus_size_tokens, update_frequency_days, style_critical, cost_per_call_max, latency_budget_ms.

The fork is one of: "rag", "long_context", "fine_tune", "hybrid".

Apply these checks (first failure wins; if none fail, return ok=True):

  • If fork is "long_context" AND corpus_size_tokens > 2_000_000: return (False, "corpus too large for long-context window")
  • If fork is "fine_tune" AND update_frequency_days <= 7: return (False, "freshness too high for fine-tune")
  • If fork is "rag" AND corpus_size_tokens <= 200_000 AND update_frequency_days >= 90: return (False, "RAG overkill for small stable corpus")
  • If fork is "hybrid" AND NOT style_critical: return (False, "hybrid wasted without style demand")
  • Otherwise return (True, "fork fits product")

Three product-fork pairs run. Expected output:

Harvey + fine_tune:       (False, 'freshness too high for fine-tune')
Glean + long_context:     (False, 'corpus too large for long-context window')
Handbook bot + rag:       (False, 'RAG overkill for small stable corpus')
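The rubric above can be sketched directly as an ordered chain of guard clauses. This is a minimal sketch, assuming the product spec arrives as a plain dict keyed by the step-08 field names; the Harvey, Glean, and handbook-bot field values below are invented for illustration, chosen only so that each pair trips the expected check.

```python
# Sketch of the drill: ordered guard clauses, first failing check wins.
# Product spec is assumed to be a dict with the step-08 fields.

def evaluate_strategy(product: dict, fork: str) -> tuple[bool, str]:
    """Return (ok, reason): does this fork fit this product?"""
    if fork == "long_context" and product["corpus_size_tokens"] > 2_000_000:
        return (False, "corpus too large for long-context window")
    if fork == "fine_tune" and product["update_frequency_days"] <= 7:
        return (False, "freshness too high for fine-tune")
    if (fork == "rag"
            and product["corpus_size_tokens"] <= 200_000
            and product["update_frequency_days"] >= 90):
        return (False, "RAG overkill for small stable corpus")
    if fork == "hybrid" and not product["style_critical"]:
        return (False, "hybrid wasted without style demand")
    return (True, "fork fits product")

# Hypothetical product specs -- field values invented here, picked only
# to trigger the checkpoint's three expected failures.
harvey = {"corpus_size_tokens": 5_000_000, "update_frequency_days": 1,
          "style_critical": True, "cost_per_call_max": 0.50,
          "latency_budget_ms": 4000}
glean = {"corpus_size_tokens": 50_000_000, "update_frequency_days": 1,
         "style_critical": False, "cost_per_call_max": 0.10,
         "latency_budget_ms": 2000}
handbook_bot = {"corpus_size_tokens": 80_000, "update_frequency_days": 365,
                "style_critical": False, "cost_per_call_max": 0.01,
                "latency_budget_ms": 1500}

print(evaluate_strategy(harvey, "fine_tune"))
print(evaluate_strategy(glean, "long_context"))
print(evaluate_strategy(handbook_bot, "rag"))
```

Because the checks run in rubric order, a product that would fail several of them still reports only the first matching reason, which is what "first failure wins" asks for.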
