Prompt caching correctly — the variable input goes LAST — step 6 of 9
This message has the user's question (variable) BEFORE the few-shot examples (stable). The cache prefix tries to match starting from the user's question — which differs every call — so the cache never fires. Hit rate: 0%.
Fix the message so the stable few-shot examples come FIRST (with
cache_control), then the user's question comes last with no
cache_control. Now the prefix is stable and cache fires on every
call after the first.
Expected output:
first block: text? True, cached? True
second block: text? True, cached? False
This message has the user's question (variable) BEFORE the few-shot examples (stable). The cache prefix tries to match starting from the user's question — which differs every call — so the cache never fires. Hit rate: 0%.
Fix the message so the stable few-shot examples come FIRST (with
cache_control), then the user's question comes last with no
cache_control. Now the prefix is stable and cache fires on every
call after the first.
Expected output:
first block: text? True, cached? True
second block: text? True, cached? False
this step needs the editor
on desktop today; in the app (coming soon). save your spot and we'll bring you back here when you're ready.