Chunking Strategies
Learning Objectives
By the end of this session, you will be able to:
Explain what chunking is in RAG systems
Explain why chunking is critical for retrieval quality
Compare different chunking strategies
Describe chunk size trade-offs
Apply chunk overlap techniques
Identify common chunking mistakes
Select appropriate chunking approaches for different use cases
Introduction
In the previous session, we explored the Data Ingestion Pipeline and learned how documents are transformed into searchable knowledge.
One of the most important steps in that pipeline is:
Chunking
Many beginners assume chunking is simply splitting a document into smaller pieces.
In reality, chunking is a design decision that directly shapes retrieval quality.
A poorly designed chunking strategy can cause:
Poor retrieval
Missing context
Incorrect answers
Increased hallucinations
A well-designed chunking strategy can dramatically improve:
Search relevance
Context quality
Answer accuracy
User satisfaction
Many real-world RAG performance problems can be traced back to poor chunking decisions.
This is why experienced AI engineers spend significant time optimizing chunking strategies.
Why This Topic Matters
Imagine a university handbook containing:
Admission Rules
Examination Policies
Scholarship Guidelines
Hostel Regulations
A student asks:
What are the scholarship eligibility requirements?
If chunking is poor:
Scholarship Information
+
Hostel Rules
+
Exam Policies
may be combined into one chunk.
The retrieval system may return irrelevant information.
If chunking is well-designed:
Scholarship Policy Chunk
is retrieved directly.
The answer becomes significantly more accurate.
What Is Chunking?
Chunking is the process of dividing large documents into smaller pieces called chunks.
Example:
Original document:
100 Pages
After chunking:
Chunk 1
Chunk 2
Chunk 3
...
Chunk N
Each chunk becomes an independent unit for:
Embedding generation
Storage
Retrieval
The quality of these chunks directly affects the quality of the RAG system.
Why Not Store Entire Documents?
A common question is:
Why not embed the entire document?
Consider a 200-page handbook.
Problems:
Embedding Quality
Large documents contain multiple topics.
Example:
Leave Policy
Travel Policy
Benefits Policy
Security Policy
One embedding cannot accurately represent all topics.
Retrieval Precision
Users usually ask specific questions.
Example:
How many annual leave days are available?
Retrieving a 200-page document is inefficient.
Context Limits
LLMs have context window limitations.
Sending entire documents increases costs and complexity.
Chunking solves these issues.
How Chunking Fits into RAG
Workflow:
Documents
↓
Chunking
↓
Embeddings
↓
Vector Database
↓
Retrieval
Chunking serves as the foundation of semantic retrieval.
Characteristics of Good Chunks
Good chunks should be:
Meaningful
Contain a complete idea.
Focused
Cover a single topic when possible.
Searchable
Easy for retrieval systems to locate.
Contextual
Contain enough information to make sense independently.
A chunk should be understandable even when viewed alone.
Example of Poor Chunking
Document:
Leave Policy
Employees receive 24 annual leave days.
Remote Work Policy
Employees may work remotely twice per week.
Travel Policy
Travel expenses must be approved.
Poor chunk:
Leave Policy
Employees receive 24 annual leave days.
Remote Work Policy
Employees may work remotely twice per week.
Travel Policy
Travel expenses must be approved.
Three topics are mixed together.
Retrieval quality decreases.
Example of Better Chunking
Chunk 1:
Leave Policy
Employees receive 24 annual leave days.
Chunk 2:
Remote Work Policy
Employees may work remotely twice per week.
Chunk 3:
Travel Policy
Travel expenses must be approved.
Each chunk now focuses on a specific topic.
Fixed-Size Chunking
The simplest strategy.
Documents are divided based on character count or token count.
Example:
500 Tokens Per Chunk
Workflow:
Document
↓
500 Tokens
↓
500 Tokens
↓
500 Tokens
Advantages:
Easy implementation
Fast processing
Widely supported
Disadvantages:
May split important information
Ignores document structure
Fixed-Size Example
Document:
Page 1
Page 2
Page 3
...
Chunks:
Tokens 1–500
Tokens 501–1000
Tokens 1001–1500
This approach is common in early RAG implementations.
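The splitting described above can be sketched in a few lines. This is a minimal illustration, not a production splitter: it assumes a "token" is a whitespace-separated word, whereas a real system would count tokens with the embedding model's own tokenizer. The function name fixed_size_chunks is hypothetical.

```python
def fixed_size_chunks(text: str, chunk_size: int = 500) -> list[str]:
    """Split text into consecutive chunks of at most chunk_size tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Assumption: a token is a whitespace-separated word.
    tokens = text.split()
    return [
        " ".join(tokens[start:start + chunk_size])
        for start in range(0, len(tokens), chunk_size)
    ]


# A synthetic 1,200-token document: word1 word2 ... word1200
document = " ".join(f"word{i}" for i in range(1, 1201))
chunks = fixed_size_chunks(document, chunk_size=500)
print([len(chunk.split()) for chunk in chunks])  # [500, 500, 200]
```

Note that the boundaries fall wherever the count runs out, which is exactly why this strategy may split important information.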
Recursive Chunking
A more advanced strategy.
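The system attempts to split documents at natural boundaries.
It tries the coarsest boundary first and falls back to a finer one only when a piece is still too large.
Priority order:
Paragraph
↓
Sentence
↓
Word
Instead of splitting arbitrarily, the system preserves meaning wherever the size limit allows.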
Advantages:
Better context preservation
Improved retrieval quality
This is one of the most commonly used approaches today.
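The paragraph, sentence, word priority can be sketched as a small recursive function. This is an illustrative sketch under stated assumptions: size is measured in characters, sentences end with ".", "!" or "?", and the names BOUNDARIES and recursive_chunks are hypothetical. Libraries such as LangChain ship their own recursive splitters; this is not their implementation.

```python
import re

# (split pattern, joiner) pairs ordered from coarsest to finest boundary.
BOUNDARIES = [
    (r"\n\s*\n", "\n\n"),     # paragraph
    (r"(?<=[.!?])\s+", " "),  # sentence
    (r"\s+", " "),            # word
]


def recursive_chunks(text: str, max_chars: int = 200, level: int = 0) -> list[str]:
    """Split at the coarsest boundary that keeps every chunk within max_chars."""
    text = text.strip()
    if not text:
        return []
    if len(text) <= max_chars:
        return [text]
    if level >= len(BOUNDARIES):
        # No natural boundary left: fall back to a hard character cut.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    pattern, joiner = BOUNDARIES[level]
    chunks: list[str] = []
    current = ""
    for piece in re.split(pattern, text):
        candidate = f"{current}{joiner}{piece}" if current else piece
        if len(candidate) <= max_chars:
            current = candidate  # keep merging small pieces together
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= max_chars:
            current = piece
        else:
            # This piece alone is too large, so try the next finer boundary.
            chunks.extend(recursive_chunks(piece, max_chars, level + 1))
    if current:
        chunks.append(current)
    return chunks


sample = (
    "Leave Policy. Employees receive 24 annual leave days.\n\n"
    "Remote Work Policy. Employees may work remotely twice per week.\n\n"
    "Travel Policy. Travel expenses must be approved."
)
for chunk in recursive_chunks(sample, max_chars=70):
    print(repr(chunk))
```

With a 70-character limit each paragraph fits, so the text is split only at paragraph boundaries. With a smaller limit, the function drops down to sentence boundaries for the paragraphs that are too long.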
Section-Based Chunking
Documents often contain:
Chapter 1
Chapter 2
Chapter 3
or
Policy A
Policy B
Policy C
Chunking follows document structure.
Example:
Scholarship Policy
becomes one chunk.
Advantages:
Preserves topic boundaries
Easy to understand
Ideal for:
Manuals
Policies
Documentation
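A minimal sketch of section-based chunking follows. It assumes headings can be recognized by a simple rule (a line such as "Chapter 3" or one ending in "Policy"); that rule, the HEADING pattern, the "Preamble" label, and the function name section_chunks are all illustrative assumptions. Real documents need a heading rule that matches their own layout.

```python
import re

# Assumption: headings look like "Chapter 3" or "Leave Policy".
HEADING = re.compile(r"^(Chapter \d+|.+ Policy)$")


def section_chunks(text: str) -> dict[str, str]:
    """Group body lines under the most recent heading line."""
    sections: dict[str, list[str]] = {}
    current = "Preamble"  # hypothetical label for text before any heading
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if HEADING.match(line):
            current = line
            sections.setdefault(current, [])
        else:
            sections.setdefault(current, []).append(line)
    return {title: " ".join(body) for title, body in sections.items()}


handbook = """Leave Policy
Employees receive 24 annual leave days.
Remote Work Policy
Employees may work remotely twice per week.
Travel Policy
Travel expenses must be approved."""

for title, body in section_chunks(handbook).items():
    print(f"{title}: {body}")
```

Keeping the heading as the key means it can be stored with the chunk as metadata, or prepended to the chunk text so the chunk makes sense when viewed alone.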
Semantic Chunking
Semantic chunking uses meaning rather than size.
Example:
Document:
Topic A
Topic A
Topic A
Topic B
Topic B
Topic B
The system detects topic changes and creates chunks accordingly.
Advantages:
High retrieval quality
Better contextual grouping
Disadvantages:
More computationally expensive
Semantic chunking is increasingly used in advanced RAG systems.
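The idea of detecting topic changes can be sketched as follows. This is a toy illustration only: it uses word-count vectors as a stand-in for real embeddings, compares each sentence with the one before it, and starts a new chunk when similarity drops below a threshold. The threshold value, the sample sentences, and all function names are assumptions made for this example.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence: str) -> Counter:
    # Stand-in for an embedding model: word counts instead of dense vectors.
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(sentences: list[str], threshold: float = 0.2) -> list[list[str]]:
    """Start a new chunk when a sentence is dissimilar to the previous one."""
    chunks: list[list[str]] = []
    for sentence in sentences:
        if chunks and cosine(bag_of_words(chunks[-1][-1]), bag_of_words(sentence)) >= threshold:
            chunks[-1].append(sentence)
        else:
            chunks.append([sentence])
    return chunks


# Sample sentences invented for illustration (Topic A, then Topic B).
sentences = [
    "Scholarship applicants need a minimum grade average.",
    "Scholarship applicants must also submit income proof.",
    "Hostel rooms are allocated in July.",
    "Hostel rooms must be vacated after exams.",
]
for group in semantic_chunks(sentences):
    print(group)
```

Here the two scholarship sentences share enough words to stay together, and the first hostel sentence shares none with its predecessor, so it opens a new chunk. A real system would replace bag_of_words with calls to an embedding model, which is where the extra computational cost comes from.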
Visual Comparison
Fixed Chunking
Document
↓
500 Tokens
↓
500 Tokens
↓
500 Tokens
Section Chunking
Policy A
Policy B
Policy C
Semantic Chunking
Meaning Group 1
Meaning Group 2
Meaning Group 3
Each strategy serves different needs.
Understanding Chunk Size
Chunk size is one of the most important tuning parameters.
Examples:
200 Tokens
500 Tokens
1000 Tokens
Different chunk sizes produce different retrieval behaviors.
Small Chunks
Example:
100–300 Tokens
Advantages:
Highly focused retrieval
Better precision
Disadvantages:
May lose context
Important information may be fragmented
Example:
Question:
How many annual leave days are available?
Retrieved:
Employees receive
Incomplete context.
Large Chunks
Example:
1000–2000 Tokens
Advantages:
More context
Better completeness
Disadvantages:
Lower retrieval precision
More irrelevant information
Example:
Question:
How many annual leave days are available?
Retrieved:
Leave Policy
Travel Policy
Benefits Policy
Security Policy
Too much information.
The Chunk Size Trade-Off
Small Chunks
↓
Higher Precision
Lower Context
Large Chunks
↓
Lower Precision
Higher Context
Finding the right balance is a key RAG engineering skill.
What Is Chunk Overlap?
Chunk overlap allows neighboring chunks to share content.
Example:
Without overlap:
Chunk 1
A B C D
Chunk 2
E F G H
With overlap:
Chunk 1
A B C D
Chunk 2
C D E F
Some information appears in both chunks.
Why Overlap Matters
Important information often spans boundaries.
Without overlap:
Sentence Start
may appear in one chunk.
Sentence End
may appear in another.
The meaning is lost.
Overlap helps preserve continuity.
Typical Overlap Values
Common values:
10%
20%
30%
Example:
500 Token Chunk
100 Token Overlap (20% of the chunk size)
This approach is widely used in production systems.
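Overlap can be implemented as a sliding window that advances by the chunk size minus the overlap. The sketch below is illustrative; the function name overlapping_chunks is hypothetical, and tokens are passed in as a plain list so that any tokenizer could be used.

```python
def overlapping_chunks(tokens: list[str], chunk_size: int, overlap: int) -> list[list[str]]:
    """Slide a window of chunk_size tokens, repeating overlap tokens each time."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end
    return chunks


# The A to H example: chunks of 4 tokens with an overlap of 2.
for chunk in overlapping_chunks(list("ABCDEFGH"), chunk_size=4, overlap=2):
    print(" ".join(chunk))
```

This prints A B C D, then C D E F, then E F G H. With a 500-token chunk and a 100-token overlap, each new chunk starts 400 tokens after the previous one.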
Example Retrieval Problem
Question:
What are the eligibility criteria for scholarships?
Without overlap:
Eligibility
and
Requirements
may appear in separate chunks.
Retrieval becomes less effective.
With overlap:
Eligibility Requirements
are more likely to remain together in at least one chunk.
Search quality improves.
Chunking Strategies by Use Case
Legal Documents
Recommended:
Section-Based Chunking
Reason:
Legal documents already contain structured sections.
Technical Documentation
Recommended:
Recursive Chunking
Reason:
Preserves logical explanations.
Research Papers
Recommended:
Semantic Chunking
Reason:
Research topics often span multiple paragraphs.
FAQs
Recommended:
Question-Answer Chunking
Reason:
Each FAQ becomes a separate chunk.
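Question-answer chunking can be sketched as below. The sketch assumes the FAQ is plain text in which questions start with "Q:" and answers start with "A:"; that layout and the function name faq_chunks are assumptions for illustration, and real FAQ sources will need their own parsing rule.

```python
def faq_chunks(text: str) -> list[dict[str, str]]:
    """Pair each "Q:" line with the answer lines that follow it."""
    chunks: list[dict[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Q:"):
            chunks.append({"question": line[2:].strip(), "answer": ""})
        elif line.startswith("A:") and chunks:
            chunks[-1]["answer"] = line[2:].strip()
        elif line and chunks:
            # Continuation line: append it to the current answer.
            chunks[-1]["answer"] = f"{chunks[-1]['answer']} {line}".strip()
    return chunks


faq = """Q: How many annual leave days are available?
A: Employees receive 24 annual leave days.
Q: How often may employees work remotely?
A: Employees may work remotely
twice per week."""

for pair in faq_chunks(faq):
    print(pair)
```

Keeping the question and its answer in one chunk means a user query that resembles the question retrieves the matching answer with it.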
Real-World Enterprise Example
A company stores:
Employee Handbook
Benefits Guide
Security Policies
Poor chunking:
Multiple policies mixed together
Result:
Poor retrieval quality
Optimized chunking:
Policy-Based Chunks
Result:
Higher retrieval accuracy
This significantly improves user experience.
Common Chunking Mistakes
Chunks Too Small
Important context lost.
Chunks Too Large
Too much irrelevant information.
No Overlap
Boundary information lost.
Ignoring Document Structure
Reduces retrieval quality.
One Strategy for Every Document
Different document types often require different approaches.
How Chunking Impacts Cost
Consider:
10,000 Documents
Small chunks create:
100,000 Chunks
Large chunks create:
20,000 Chunks
More chunks mean:
More embeddings
More storage
More processing
Chunk size influences infrastructure costs.
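The chunk counts above can be reproduced with simple arithmetic. The sketch below assumes every document averages 2,000 tokens, which is an assumption chosen for this example and not a figure from the lesson; the function name chunks_per_document is also hypothetical.

```python
def chunks_per_document(doc_tokens: int, chunk_size: int, overlap: int = 0) -> int:
    """Count the sliding-window chunks needed to cover one document."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    if doc_tokens <= 0:
        return 0
    if doc_tokens <= chunk_size:
        return 1
    step = chunk_size - overlap
    # Ceiling division: windows needed so that the last one reaches the end.
    return (doc_tokens - overlap + step - 1) // step


DOCUMENTS = 10_000
TOKENS_PER_DOCUMENT = 2_000  # assumption, not a figure from the lesson

for chunk_size in (200, 1_000):
    total = DOCUMENTS * chunks_per_document(TOKENS_PER_DOCUMENT, chunk_size)
    print(f"{chunk_size}-token chunks: {total:,} chunks to embed and store")
```

Under that assumption, 200-token chunks give 100,000 chunks and 1,000-token chunks give 20,000. Adding overlap raises the count further: with a 20% overlap the same documents need 13 chunks each at 200 tokens instead of 10.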
Production Chunking Workflow
Documents
↓
Cleaning
↓
Chunking
↓
Overlap
↓
Embeddings
↓
Vector Database
Every stage affects retrieval performance.
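The stages of this workflow can be wired together as in the sketch below. Everything here is a simplified stand-in: fake_embed is a placeholder and not a real embedding model, a Python list plays the role of the vector database, and all function names are hypothetical.

```python
import re


def clean(text: str) -> str:
    # Cleaning: collapse runs of whitespace left over from extraction.
    return re.sub(r"\s+", " ", text).strip()


def chunk_with_overlap(text: str, size: int, overlap: int) -> list[str]:
    # Chunking and overlap: word windows that advance by size - overlap.
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def fake_embed(chunk: str) -> list[float]:
    # Placeholder for an embedding model call; NOT a real embedding.
    return [float(len(chunk)), float(len(chunk.split()))]


def ingest(documents: dict[str, str], size: int = 50, overlap: int = 10) -> list[dict]:
    index = []  # stand-in for a vector database
    for doc_id, raw in documents.items():
        chunks = chunk_with_overlap(clean(raw), size, overlap)
        for position, chunk in enumerate(chunks):
            index.append({
                "id": f"{doc_id}-{position}",
                "text": chunk,
                "vector": fake_embed(chunk),
            })
    return index


for record in ingest({"handbook": "a  b\n c d e f"}, size=4, overlap=2):
    print(record["id"], "|", record["text"])
```

Storing an identifier with each chunk makes it possible to trace a retrieved chunk back to its source document when evaluating results.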
Enterprise Best Practices
Start Simple
Begin with recursive chunking.
Measure Retrieval Quality
Evaluate actual search results.
Tune Chunk Size
Adjust based on document type.
Use Overlap
Preserve important context.
Test Frequently
Retrieval quality should be validated regularly.
Successful RAG systems evolve through experimentation.
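One simple way to measure retrieval quality is a hit rate: for a set of test questions, check how often the chunk that should answer the question appears among the top results. The sketch below is illustrative; the chunk identifiers and the retrieved lists are placeholder values made up for the example, not measured results, and hit_rate_at_k is a hypothetical name.

```python
def hit_rate_at_k(results: dict[str, list[str]], expected: dict[str, str], k: int = 3) -> float:
    """Fraction of questions whose expected chunk appears in the top k results."""
    if not expected:
        return 0.0
    hits = sum(
        1
        for question, chunk_id in expected.items()
        if chunk_id in results.get(question, [])[:k]
    )
    return hits / len(expected)


# Which chunk should answer each test question (placeholder IDs).
expected = {
    "What are the scholarship eligibility requirements?": "scholarship-policy",
    "How many annual leave days are available?": "leave-policy",
}
# Placeholder retrieval output, not measured results.
results = {
    "What are the scholarship eligibility requirements?": [
        "hostel-rules", "scholarship-policy", "exam-policy",
    ],
    "How many annual leave days are available?": [
        "travel-policy", "benefits-policy", "security-policy",
    ],
}
print(hit_rate_at_k(results, expected, k=3))  # 0.5
```

Running the same test questions after each change to chunk size, overlap, or strategy gives a consistent basis for comparing configurations.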
.NET Perspective
Popular .NET tools include:
Semantic Kernel
Azure AI Search
Azure OpenAI
Many enterprise applications implement custom chunking strategies tailored to business documents.
Python Perspective
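Popular Python tools include:
LangChain
LlamaIndex
ChromaDB
Unstructured
LangChain, LlamaIndex, and Unstructured provide built-in chunking utilities and support multiple chunking strategies, while ChromaDB is a vector database that stores the resulting chunks and their embeddings.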
Assignment
Practical Exercise
Take a 10-page PDF and create:
Fixed-size chunks
Section-based chunks
Semantic chunks
Compare:
Retrieval quality
Context preservation
Ease of implementation
Design Activity
Choose one domain:
University
Healthcare
Banking
E-Commerce
Recommend a chunking strategy and explain your reasoning.
Key Takeaways
Chunking divides documents into searchable units.
Chunk quality directly impacts retrieval quality.
Fixed-size chunking is simple but may lose context.
Recursive and semantic chunking often produce better results.
Chunk size involves a trade-off between precision and context.
Overlap helps preserve meaning across chunk boundaries.
Effective chunking is one of the most important aspects of RAG engineering.
Module 3 Complete
You have now completed:
What Is Retrieval-Augmented Generation (RAG)?
Why LLMs Hallucinate
How RAG Solves Knowledge Limitations
RAG Architecture Explained
Data Ingestion Pipeline
Chunking Strategies
You now understand the complete foundation of RAG systems and are ready to explore embeddings and vector databases in greater depth.
What's Next?
In Session 19, we begin Module 4: Embeddings and Vector Databases with:
Understanding Embeddings
You will learn what embeddings are, how they work, why they are essential for semantic search, and how they form the backbone of modern RAG systems.