Embedding that fits the budget — pick a model that matches your corpus (step 8/9) · context and retrieval


Write rank_by_similarity(query_vec, doc_vecs) that returns a list of doc INDICES sorted by descending cosine similarity to the query. The highest-scoring doc comes first.

Input: a query vector and a list of doc vectors (all with the same dimension).
Output: list of indices into doc_vecs, sorted best-first.
Use cosine similarity (dot product / product of norms).
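
Written out for a query vector q and a doc vector d of the same dimension n (the symbols q, d, and n are notation chosen here, not names from the exercise), the parenthetical above corresponds to the standard definition:

```latex
\cos(q, d) = \frac{q \cdot d}{\lVert q \rVert \, \lVert d \rVert}
           = \frac{\sum_{i=1}^{n} q_i d_i}
                  {\sqrt{\sum_{i=1}^{n} q_i^2}\,\sqrt{\sum_{i=1}^{n} d_i^2}}
```

The value lies between -1 and 1, and it is undefined when either vector has zero norm, so an implementation has to decide how to treat that case.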

Your function is run for you on a query and four docs. The query points heavily in the first dimension. Doc 1 matches it closely, and doc 3 is the next closest. Docs 2 and 0 point elsewhere.

Expected output:

[1, 3, 2, 0]
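
One possible solution is sketched below in plain Python, using only the standard library. The exercise does not show its actual test vectors, so the `query` and `docs` values here are hypothetical stand-ins chosen to match the description (query heavy in the first dimension, doc 1 closest, then doc 3). The helper `cosine_similarity` and the choice to score a zero vector as 0.0 are also assumptions, not part of the exercise.

```python
import math


def cosine_similarity(a, b):
    """Dot product divided by the product of the two norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: treat a zero vector as having no similarity
        # instead of raising ZeroDivisionError.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return indices into doc_vecs, sorted best-first by cosine similarity."""
    scores = [cosine_similarity(query_vec, doc) for doc in doc_vecs]
    # Sort the indices, not the vectors, so callers can look up the docs.
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Hypothetical data matching the description in the exercise.
query = [1.0, 0.1, 0.0]
docs = [
    [0.0, 0.1, 1.0],  # doc 0: points mostly along the third dimension
    [0.9, 0.1, 0.0],  # doc 1: nearly parallel to the query
    [0.2, 0.9, 0.1],  # doc 2: points mostly along the second dimension
    [0.7, 0.5, 0.0],  # doc 3: partly aligned with the query
]

print(rank_by_similarity(query, docs))  # [1, 3, 2, 0]
```

Because cosine similarity divides by both norms, the ranking depends only on direction: scaling any doc vector by a positive constant leaves its position unchanged.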

