RAG vs long context vs fine-tune — the decision that's killed more AI startups than any model swap — step 8 of 9
Write pick_fork(spec) that takes a product spec (dict) and
returns one of: "rag", "long_context", "fine_tune",
"hybrid".
Spec fields you'll be given:
- corpus_size_tokens: int (how big the corpus is, in tokens)
- update_frequency_days: int (how often the corpus changes; 9999 means "essentially never")
- style_critical: bool (does the output need to read like a domain expert wrote it?)
- cost_per_call_max: float (USD ceiling per query)
- latency_budget_ms: int (max acceptable latency)
Apply the rubric in this order (first match wins):
- If corpus_size_tokens > 2_000_000: return "rag" (doesn't fit in any context window; RAG by physics).
- If style_critical AND update_frequency_days <= 7: return "hybrid" (style and freshness both matter).
- If style_critical: return "fine_tune" (style is the product, corpus is stable enough).
- If update_frequency_days <= 7: return "rag" (freshness pressure, no style demand).
- If corpus_size_tokens <= 2_000_000: return "long_context" (stable, fits in window).
- Fallback: return "rag".
Two products demoed. Expected output:
Glean (enterprise search): rag
Harvey (legal AI): hybrid
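One possible sketch of the rubric as code, checking the rules in order so the first match wins. The two demo specs use illustrative field values (assumed here, not real product numbers) chosen to trigger the expected branches:

```python
def pick_fork(spec: dict) -> str:
    """Pick an architecture for a product spec; first matching rule wins."""
    if spec["corpus_size_tokens"] > 2_000_000:
        return "rag"  # doesn't fit in any context window; RAG by physics
    if spec["style_critical"] and spec["update_frequency_days"] <= 7:
        return "hybrid"  # style and freshness both matter
    if spec["style_critical"]:
        return "fine_tune"  # style is the product, corpus is stable enough
    if spec["update_frequency_days"] <= 7:
        return "rag"  # freshness pressure, no style demand
    if spec["corpus_size_tokens"] <= 2_000_000:
        return "long_context"  # stable, fits in window
    return "rag"  # fallback

# Demo specs (illustrative values only)
glean = {"corpus_size_tokens": 50_000_000, "update_frequency_days": 1,
         "style_critical": False, "cost_per_call_max": 0.05,
         "latency_budget_ms": 800}
harvey = {"corpus_size_tokens": 1_500_000, "update_frequency_days": 3,
          "style_critical": True, "cost_per_call_max": 2.00,
          "latency_budget_ms": 5000}

print("Glean (enterprise search):", pick_fork(glean))   # rag
print("Harvey (legal AI):", pick_fork(harvey))          # hybrid
```

Note that cost_per_call_max and latency_budget_ms are carried in the spec but never consulted: this rubric decides on corpus size, freshness, and style alone.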