promptdojo_

Chunking that respects structure — don't shred your own documents — step 7 of 9

The text is split cleanly on sentence boundaries, but the chunks have no overlap. Suppose the user asks "What does Acme make, and when was it founded?" The answer spans both chunks, so retrieval might pull only chunk 0 ("founded in 1958") or only chunk 1 ("makes running shoes") and miss half the answer.

Fix the chunking so that the last 10 characters of chunk 0 are repeated at the start of chunk 1. That way a retriever that matches chunk 1 also sees the founding year.
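For reference, here is a minimal sketch of sentence chunking with character overlap. It is separate from the exercise snippet in the editor and is not the lesson's own solution. The function name `chunk_with_overlap` and the naive regex sentence splitter are assumptions made for illustration; the splitter treats every `.`, `!`, or `?` as a sentence end, so it would mis-split abbreviations and decimals.

```python
import re


def chunk_with_overlap(text: str, overlap: int = 10) -> list[str]:
    """One chunk per sentence; every chunk after the first also carries
    the last `overlap` characters of the sentence before it."""
    # Naive sentence spans (assumption): start at a non-space character
    # and run up to the next '.', '!' or '?'.
    spans = [(m.start(), m.end()) for m in re.finditer(r"\S[^.!?]*[.!?]", text)]

    chunks = []
    prev_end = None
    for start, end in spans:
        if prev_end is None:
            # First chunk: nothing precedes it, so no overlap.
            chunks.append(text[start:end])
        else:
            # Reach back `overlap` characters from where the previous
            # sentence ended, keeping the whitespace in between.
            chunks.append(text[max(0, prev_end - overlap):end])
        prev_end = end
    return chunks


text = "Acme was founded in 1958. The company makes running shoes."
print(chunk_with_overlap(text, overlap=10))
```

Slicing the original string, rather than gluing strings together, keeps the overlap byte-for-byte identical to the source text, including the space between sentences.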

Expected output:

['Acme was founded in 1958.', 'd in 1958. The company makes running shoes.']

The bug is on line 7 of the exercise snippet, but read the whole snippet first.
