Retrieval that finds the right thing — top-k, thresholds, and the rerank step everyone skips — step 9 of 9
Checkpoint
One last thing before we move on. Same surface as a write step — but the lesson doesn't complete until this passes.
Final drill. Build rag_retrieve(query_vec, indexed, k, threshold) that:
- Takes indexed as a list of (chunk_id, doc_vec) pairs.
- Computes cosine similarity between query_vec and each doc_vec (use cosine provided in the starter).
- Filters by score >= threshold.
- Dedupes by chunk_id, keeping the highest-scoring occurrence.
- Returns the top-k chunk_ids in descending score order.
Three cases run. Expected output:
['policy/refund_terms', 'policy/refund_eligibility']
[]
['policy/refund_terms']
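If you want something to compare your attempt against, here is a minimal sketch of the steps listed in the drill. It is one possible shape, not the lesson's reference solution. The cosine helper below is a stand-in written for this example, since the starter's version isn't shown here, and its signature cosine(a, b) is an assumption. The vectors and the 'blog/launch' chunk are made up for illustration; they are not the checkpoint's actual test cases.

```python
import math


def cosine(a, b):
    """Stand-in for the starter's cosine helper (assumed signature)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rag_retrieve(query_vec, indexed, k, threshold):
    # Best score seen so far for each chunk_id (this is the dedupe step).
    best = {}
    for chunk_id, doc_vec in indexed:
        score = cosine(query_vec, doc_vec)
        if score < threshold:
            continue  # keep only score >= threshold
        if chunk_id not in best or score > best[chunk_id]:
            best[chunk_id] = score
    # Sort by score, highest first, then cut to the top k.
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [chunk_id for chunk_id, _ in ranked[:k]]


# Hypothetical data: 'policy/refund_terms' appears twice on purpose.
query = [1.0, 0.0]
indexed = [
    ("policy/refund_terms", [0.9, 0.1]),
    ("policy/refund_eligibility", [0.7, 0.7]),
    ("policy/refund_terms", [0.5, 0.5]),
    ("blog/launch", [0.0, 1.0]),
]

print(rag_retrieve(query, indexed, k=2, threshold=0.5))
print(rag_retrieve(query, indexed, k=2, threshold=0.999))
print(rag_retrieve(query, indexed, k=1, threshold=0.5))
```

Filtering before deduping keeps the dictionary small, and since the dictionary only ever stores the highest score per chunk_id, the order of the two steps doesn't change the result. Ties between equal scores fall back to insertion order here, because Python's sort is stable; the drill doesn't say how ties should break, so treat that as a choice made for this sketch.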