Retrieval that finds the right thing — top-k, thresholds, and the rerank step everyone skips (step 9/9) · context and retrieval

Checkpoint

One last thing before we move on. This works like a write step, but the lesson doesn't complete until your code passes.

Final drill. Build rag_retrieve(query_vec, indexed, k, threshold) that:

Takes indexed as a list of (chunk_id, doc_vec) pairs.
Computes cosine similarity between query_vec and each doc_vec (use the cosine helper provided in the starter).
Filters by score >= threshold.
Dedupes by chunk_id, keeping the highest-scoring occurrence.
Returns the top-k chunk_ids in descending score order.

Three cases run. Expected output:

['policy/refund_terms', 'policy/refund_eligibility']
[]
['policy/refund_terms']
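
If the order of operations in the drill is unclear, the steps above can be sketched roughly as follows. This is one possible shape, not the reference solution: the cosine function here is a stand-in for the helper the starter provides (its exact signature is an assumption), and the chunk ids and vectors in the usage lines are made up for illustration, not the lesson's three test cases.

```python
import math


def cosine(a, b):
    # Stand-in for the starter's cosine helper (assumed signature: two
    # equal-length sequences of floats). Returns 0.0 for a zero vector.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rag_retrieve(query_vec, indexed, k, threshold):
    # best maps chunk_id -> highest score seen so far for that chunk.
    best = {}
    for chunk_id, doc_vec in indexed:
        score = cosine(query_vec, doc_vec)
        if score < threshold:
            continue  # filter: keep only score >= threshold
        if chunk_id not in best or score > best[chunk_id]:
            best[chunk_id] = score  # dedupe: keep the top occurrence
    # Sort by score, highest first, then cut to the first k ids.
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [chunk_id for chunk_id, _ in ranked[:k]]


# Hypothetical data: "doc/a" appears twice with different vectors.
indexed = [
    ("doc/a", [1.0, 0.0]),
    ("doc/b", [0.0, 1.0]),
    ("doc/a", [0.6, 0.8]),
    ("doc/c", [0.8, 0.6]),
]
query = [1.0, 0.0]

print(rag_retrieve(query, indexed, 2, 0.5))  # ['doc/a', 'doc/c']
print(rag_retrieve(query, indexed, 2, 1.5))  # []
print(rag_retrieve(query, indexed, 2, 0.9))  # ['doc/a']
```

Filtering before or after the dedupe gives the same result here, because the dedupe keeps the maximum score per chunk_id. Python's sorted is stable, so chunks with equal scores stay in first-seen order; the drill does not say how ties should break, so treat that as a choice of this sketch.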
