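How a document is split into chunks strongly affects RAG retrieval quality:

import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';

// Basic chunking: try paragraph breaks first, then line breaks, then spaces, then single characters
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,     // maximum chunk length, in characters
  chunkOverlap: 200,   // characters shared between consecutive chunks
  separators: ['\n\n', '\n', ' ', '']
});
const chunks = await splitter.splitText(document);

Semantic Chunking

async function semanticChunk(text, maxTokens = 500) {
  // Split into sentences; fall back to the whole text if no sentence boundary is found
  const sentences = text.match(/[^.!?]+[.!?]+/g) || [text];
  const chunks…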
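To illustrate the sentence-based idea that semanticChunk starts from, here is a minimal sketch that packs whole sentences into chunks under a token budget. It is not the original function: packSentences and estimateTokens are hypothetical names, the 4-characters-per-token estimate is a rough assumption rather than a real tokenizer, and the regex is a slight variant of the one above so that trailing text without end punctuation is kept.

```javascript
// Hypothetical helper: rough token estimate, assuming about 4 characters per token.
// A real tokenizer would give different counts.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Hypothetical sketch: group whole sentences into chunks that stay within maxTokens.
// A single sentence longer than the budget is emitted as its own oversized chunk.
function packSentences(text, maxTokens = 500) {
  // Variant of the sentence regex above: the trailing punctuation is optional,
  // so a final fragment without ".", "!" or "?" is not dropped.
  const sentences = text.match(/[^.!?]+[.!?]*/g) || [text];
  const chunks = [];
  let current = '';

  for (const raw of sentences) {
    const sentence = raw.trim();
    if (!sentence) continue;

    const candidate = current ? `${current} ${sentence}` : sentence;
    if (current && estimateTokens(candidate) > maxTokens) {
      // Adding this sentence would exceed the budget: close the current chunk.
      chunks.push(current);
      current = sentence;
    } else {
      current = candidate;
    }
  }

  if (current) chunks.push(current);
  return chunks;
}
```

Calling packSentences(text, 500) returns an array of strings, each ending on a sentence boundary where possible. Unlike the character-based splitter, this sketch adds no overlap between chunks; that would need to be layered on separately.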
Text Chunking Strategies for RAG Applications
