promptdojo_

Write pick_fork(spec) that takes a product spec (dict) and returns one of: "rag", "long_context", "fine_tune", "hybrid".

Spec fields you'll be given:

  • corpus_size_tokens: int (how big the corpus is in tokens)
  • update_frequency_days: int (how often the corpus changes; 9999 means "essentially never")
  • style_critical: bool (does the output need to read like a domain expert wrote it?)
  • cost_per_call_max: float (USD ceiling per query)
  • latency_budget_ms: int (max acceptable latency)

Apply the rubric in this order (first match wins):

  1. If corpus_size_tokens > 2_000_000: return "rag" (doesn't fit in any context window, RAG by physics).
  2. If style_critical AND update_frequency_days <= 7: return "hybrid" (style + freshness both matter).
  3. If style_critical: return "fine_tune" (style is the product, corpus is stable enough).
  4. If update_frequency_days <= 7: return "rag" (freshness pressure, no style demand).
  5. If corpus_size_tokens <= 2_000_000: return "long_context" (stable, fits in window).
  6. Fallback: return "rag".

Two products are demoed. Expected output:

Glean (enterprise search):       rag
Harvey (legal AI):               hybrid
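A minimal sketch of the rubric above, checking the rules in order so the first match wins. The spec values for Glean and Harvey are not given in the prompt; the dicts below are hypothetical values chosen to trigger rule 1 and rule 2 respectively, reproducing the expected output:

```python
def pick_fork(spec: dict) -> str:
    """Pick a serving strategy per the rubric; rules checked in order, first match wins."""
    if spec["corpus_size_tokens"] > 2_000_000:
        return "rag"           # rule 1: doesn't fit in any context window
    if spec["style_critical"] and spec["update_frequency_days"] <= 7:
        return "hybrid"        # rule 2: style + freshness both matter
    if spec["style_critical"]:
        return "fine_tune"     # rule 3: style is the product, corpus is stable
    if spec["update_frequency_days"] <= 7:
        return "rag"           # rule 4: freshness pressure, no style demand
    if spec["corpus_size_tokens"] <= 2_000_000:
        return "long_context"  # rule 5: stable, fits in window
    return "rag"               # rule 6: fallback (unreachable: rules 1 and 5 cover all sizes)

# Hypothetical specs (values assumed, not from the prompt):
glean = {"corpus_size_tokens": 50_000_000, "update_frequency_days": 1,
         "style_critical": False, "cost_per_call_max": 0.05, "latency_budget_ms": 800}
harvey = {"corpus_size_tokens": 1_500_000, "update_frequency_days": 3,
          "style_critical": True, "cost_per_call_max": 2.00, "latency_budget_ms": 5000}

print(pick_fork(glean))   # rag    (rule 1: 50M tokens exceeds 2M)
print(pick_fork(harvey))  # hybrid (rule 2: style-critical and updated weekly)
```

Note that rule 6 can never fire: every corpus size falls under either rule 1 or rule 5, so any implementation should treat it purely as a defensive default.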
