Chunking that respects structure — don't shred your own documents — step 6 of 9
The naive splitter is cutting mid-sentence. Two sentences fit in one budget, but the cut lands in the middle of the second. If the user later asks "what does the second sentence say?", retrieval returns only half of it.
Fix the split to break on ". " (sentence boundary) instead of a
hard character cut. The two resulting chunks should each contain
exactly one complete sentence.
Expected output:
['Acme was founded in 1958.', 'The company makes running shoes.']
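One possible way to get there, as a minimal sketch rather than the reference solution for this step: split on ". " and put the period back on every piece that lost it to the delimiter. The function name split_sentences and the input string are assumptions, chosen to match the expected output above.

```python
def split_sentences(text: str) -> list[str]:
    """Split text on ". " and keep each sentence's closing period.

    Hypothetical helper for illustration; not part of the original exercise.
    """
    parts = text.split(". ")
    chunks = []
    for i, part in enumerate(parts):
        part = part.strip()
        if not part:
            # Skip empty pieces, e.g. when the text ends with ". "
            continue
        # Every piece except the last had its period consumed by the delimiter.
        if i < len(parts) - 1:
            part += "."
        chunks.append(part)
    return chunks


# Assumed input, reconstructed from the expected output.
text = "Acme was founded in 1958. The company makes running shoes."
chunks = split_sentences(text)
print(chunks)
```

Splitting on ". " is enough for this example, but it will also break after abbreviations such as "Dr. Smith", so real documents usually need a more careful sentence splitter.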