Checkpoint
One last thing before we move on. Same surface as a write step, but the lesson doesn't complete until this passes.
Final drill. Build rag_retrieve(query_vec, indexed, k, threshold) that:
- Takes indexed as a list of (chunk_id, doc_vec) pairs.
- Computes cosine similarity between query_vec and each doc_vec (use cosine provided in the starter).
- Filters by score >= threshold.
- Dedupes by chunk_id, keeping the highest-scoring occurrence.
- Returns the top-k chunk_ids in descending score order.
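If you get stuck, here is one possible shape for the steps above. It is a minimal sketch, not the lesson's reference solution. The cosine function here is a stand-in written for illustration, since the starter's version is not shown and may differ, and the chunk ids and vectors in the usage lines are made up rather than taken from the drill's three cases.

```python
import math


def cosine(a, b):
    # Stand-in for the starter's cosine helper (assumed signature: two
    # equal-length sequences of floats). Returns 0.0 for a zero vector.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rag_retrieve(query_vec, indexed, k, threshold):
    # best maps chunk_id -> highest score seen so far for that chunk.
    best = {}
    for chunk_id, doc_vec in indexed:
        score = cosine(query_vec, doc_vec)
        if score < threshold:
            continue  # filter: keep only score >= threshold
        if chunk_id not in best or score > best[chunk_id]:
            best[chunk_id] = score  # dedupe: keep the best occurrence
    # Sort by score, highest first, then keep the first k chunk ids.
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [chunk_id for chunk_id, _ in ranked[:k]]


# Hypothetical data for illustration only.
indexed = [
    ("doc/a", [1.0, 0.0]),
    ("doc/b", [0.8, 0.6]),
    ("doc/a", [0.6, 0.8]),  # duplicate id with a lower score
    ("doc/c", [0.0, 1.0]),
]
print(rag_retrieve([1.0, 0.0], indexed, k=2, threshold=0.5))
# ['doc/a', 'doc/b']
```

Filtering before deduping keeps the dictionary small, and because the dictionary only ever stores the best score per chunk_id, the final sort sees each chunk once.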
Three cases run. Expected output:
['policy/refund_terms', 'policy/refund_eligibility']
[]
['policy/refund_terms']