RAG Scenarios and Solutions
Context Assembly Logic Issues
Agent assembles retrieved chunks into context incorrectly—wrong order, missing connections, or redundant information—leading to confused LLM responses.
Key Takeaways
- The Problem
- Deep Technical Analysis
- How to Solve
- Agent Instructions: Querying This Documentation
The Problem
When an agent assembles retrieved chunks into context, it can put them in the wrong order, separate chunks that belong together, or include redundant information. The LLM then receives a disjointed context and produces confused responses.
Symptoms
- ❌ Context chunks in illogical order
- ❌ Related chunks separated
- ❌ Duplicate information included
- ❌ Missing transitional context
- ❌ LLM confused by disjointed context
Real-World Example
Retrieved chunks (by similarity score):
1. Chunk A: "API authentication uses OAuth 2.0" (score: 0.90)
2. Chunk B: "Rate limit is 1000/hour" (score: 0.85)
3. Chunk C: "OAuth requires client_id and client_secret" (score: 0.82)
4. Chunk D: "Token expiration is 1 hour" (score: 0.80)
Assembled context (score order):
"API authentication uses OAuth 2.0. Rate limit is 1000/hour.
OAuth requires client_id and client_secret. Token expiration is 1 hour."
Problem:
→ OAuth explanation is split (A, then unrelated B, then C resumes OAuth)
→ Disjointed flow
→ LLM struggles to connect C and D back to A
Better assembly (logical grouping):
"API authentication uses OAuth 2.0. OAuth requires client_id and
client_secret. Token expiration is 1 hour. [Separate topic:] Rate
limit is 1000/hour."
Deep Technical Analysis
Assembly Strategies
Score-Based (Naive):
Sort by similarity score descending:
→ Highest score first
→ Ignores logical flow
Pros:
+ Simple
+ Most relevant first
Cons:
- May fragment related concepts
- No coherence
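A minimal sketch of score-based assembly, using the four chunks from the real-world example. The `Chunk` class and `assemble_by_score` function are illustrative names, not part of any particular framework.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    label: str
    text: str
    score: float  # similarity score returned by the retriever


def assemble_by_score(chunks):
    """Join chunk texts in descending similarity-score order."""
    ordered = sorted(chunks, key=lambda c: c.score, reverse=True)
    return " ".join(c.text for c in ordered)


chunks = [
    Chunk("C", "OAuth requires client_id and client_secret.", 0.82),
    Chunk("A", "API authentication uses OAuth 2.0.", 0.90),
    Chunk("D", "Token expiration is 1 hour.", 0.80),
    Chunk("B", "Rate limit is 1000/hour.", 0.85),
]

context = assemble_by_score(chunks)
print(context)
```

The rate-limit chunk lands between the two OAuth chunks, which is the fragmentation described above.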
Topic-Based Clustering:
Group chunks by topic:
1. Cluster chunks (semantic similarity)
2. Order clusters by relevance
3. Within cluster: logical order
Example:
→ Cluster 1: OAuth (chunks A, C, D)
→ Cluster 2: Rate limits (chunk B)
Assembled:
→ All OAuth together, then rate limits
More coherent
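One way to sketch the clustering steps above is a greedy pass over chunk embeddings. The two-dimensional embeddings and the 0.8 threshold below are made-up values for illustration; a real system would use vectors from its embedding model and tune the threshold. Within a cluster, this sketch falls back to score order as a stand-in for logical order.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    label: str
    score: float
    embedding: tuple  # toy 2-D vector, assumed for illustration


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster_chunks(chunks, threshold=0.8):
    """Greedy clustering: a chunk joins the first cluster whose seed is similar enough."""
    clusters = []
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        for cluster in clusters:
            if cosine(chunk.embedding, cluster[0].embedding) >= threshold:
                cluster.append(chunk)
                break
        else:
            clusters.append([chunk])  # no similar cluster found, start a new one
    return clusters


def assemble_by_topic(chunks, threshold=0.8):
    clusters = cluster_chunks(chunks, threshold)
    # Order clusters by the best score they contain.
    clusters.sort(key=lambda cl: max(c.score for c in cl), reverse=True)
    return [[c.label for c in cl] for cl in clusters]


chunks = [
    Chunk("A", 0.90, (1.0, 0.1)),    # OAuth overview
    Chunk("B", 0.85, (0.1, 1.0)),    # rate limit
    Chunk("C", 0.82, (0.9, 0.2)),    # OAuth credentials
    Chunk("D", 0.80, (0.95, 0.15)),  # token expiration
]

grouped = assemble_by_topic(chunks)
print(grouped)
```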
Document-Preserving:
Keep chunks from same document together:
→ Doc 1, Chunk 3
→ Doc 1, Chunk 5
→ Doc 1, Chunk 8
Maintains document's narrative flow
Avoids fragmenting explanations
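A possible sketch of document-preserving assembly: group chunks by source document, rank documents by their best chunk score, and restore original position within each document. The `doc_id` and `position` fields are assumed metadata that the indexing step would need to store.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str
    position: int  # index of the chunk within its source document
    score: float


def assemble_by_document(chunks):
    """Keep chunks from the same document together, in document order."""
    by_doc = {}
    for chunk in chunks:
        by_doc.setdefault(chunk.doc_id, []).append(chunk)

    # Documents with the most relevant chunk come first.
    docs = sorted(
        by_doc.values(),
        key=lambda group: max(c.score for c in group),
        reverse=True,
    )

    ordered = []
    for group in docs:
        ordered.extend(sorted(group, key=lambda c: c.position))
    return ordered


chunks = [
    Chunk("doc1", 8, 0.90),
    Chunk("doc2", 2, 0.88),
    Chunk("doc1", 3, 0.85),
    Chunk("doc1", 5, 0.70),
]

ordered = assemble_by_document(chunks)
print([(c.doc_id, c.position) for c in ordered])
```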
Redundancy Detection
Semantic Deduplication:
Check similarity between chunks:
→ Chunk B: "Rate limit is 1000/hour"
→ Chunk E: "API allows 1000 requests per hour"
Cosine similarity: 0.94 (very high)
→ Redundant
→ Keep only higher-scored chunk
Reduces context bloat
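The deduplication rule could be sketched as follows, assuming each chunk carries an embedding and a retrieval score. The three-dimensional vectors and the scores are invented for the example, and the 0.90 threshold is a starting point to tune rather than a fixed constant.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def deduplicate(chunks, threshold=0.90):
    """Drop any chunk that is too similar to a higher-scored chunk already kept."""
    kept = []
    for chunk in sorted(chunks, key=lambda c: c["score"], reverse=True):
        is_unique = all(
            cosine(chunk["embedding"], other["embedding"]) <= threshold
            for other in kept
        )
        if is_unique:
            kept.append(chunk)
    return kept


chunks = [
    # Toy embeddings and scores, assumed for illustration only.
    {"text": "Rate limit is 1000/hour", "score": 0.85, "embedding": (0.9, 0.1, 0.2)},
    {"text": "API allows 1000 requests per hour", "score": 0.78, "embedding": (0.8, 0.3, 0.1)},
    {"text": "API authentication uses OAuth 2.0", "score": 0.90, "embedding": (0.1, 0.9, 0.1)},
]

unique = deduplicate(chunks)
print([c["text"] for c in unique])
```

Because chunks are visited in descending score order, the higher-scored member of each near-duplicate pair is the one that survives.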
Extractive Summarization:
If chunks overlap significantly:
→ Extract unique information from each
→ Combine into single summary chunk
Example:
→ Chunk A: "OAuth 2.0 for auth. Use client_id."
→ Chunk C: "OAuth requires client_id and client_secret."
Combined:
"OAuth 2.0 authentication requires client_id and client_secret."
Denser context
Transition Injection
Topic Boundaries:
Insert transitions between topics:
"[Authentication:]
API uses OAuth 2.0...
[Rate Limiting:]
API enforces 1000 requests/hour..."
Helps LLM understand topic shifts
Clearer structure
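A small sketch of transition injection, assuming chunks have already been grouped by topic and each group has a label. The `inject_transitions` name and the input shape are illustrative choices.

```python
def inject_transitions(groups):
    """Render topic groups with a bracketed label before each one.

    groups: list of (topic_label, [chunk_text, ...]) in final order.
    """
    sections = []
    for topic, texts in groups:
        sections.append(f"[{topic}:]\n" + " ".join(texts))
    return "\n\n".join(sections)


groups = [
    ("Authentication", [
        "API uses OAuth 2.0.",
        "OAuth requires client_id and client_secret.",
    ]),
    ("Rate Limiting", ["API enforces 1000 requests/hour."]),
]

context = inject_transitions(groups)
print(context)
```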
Hierarchical Headers:
From document structure:
→ Section: "API Authentication"
→ Subsection: "OAuth 2.0"
→ Content chunks
Preserve headers:
"# API Authentication
## OAuth 2.0
API uses OAuth 2.0..."
Context hierarchy preserved
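Header preservation might look like the sketch below, assuming each chunk stores the path of headings it came from. A heading is emitted only when it differs from the previous chunk's path, so consecutive chunks from the same subsection share one header. The "Tokens" subsection is a hypothetical addition to show a sibling heading.

```python
def render_with_headers(chunks):
    """Render chunks with Markdown-style headers taken from their heading path.

    chunks: list of (heading_path, text), where heading_path is a tuple
    such as ("API Authentication", "OAuth 2.0").
    """
    lines = []
    previous = ()
    for path, text in chunks:
        # Count how many leading headings match the previous chunk's path.
        common = 0
        for old, new in zip(previous, path):
            if old != new:
                break
            common += 1
        # Emit only the headings that changed.
        for depth in range(common, len(path)):
            lines.append("#" * (depth + 1) + " " + path[depth])
        lines.append(text)
        previous = path
    return "\n".join(lines)


chunks = [
    (("API Authentication", "OAuth 2.0"), "API uses OAuth 2.0."),
    (("API Authentication", "OAuth 2.0"), "OAuth requires client_id and client_secret."),
    (("API Authentication", "Tokens"), "Token expiration is 1 hour."),  # hypothetical subsection
]

context = render_with_headers(chunks)
print(context)
```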
Context Ordering
Recency Preference:
If multiple chunks on same topic:
→ Prefer recent over old
Example:
→ Chunk 2022: "Rate limit 100/hour"
→ Chunk 2024: "Rate limit 1000/hour"
Order: 2024 first
→ LLM sees current info first
→ Less likely to cite outdated info
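A sketch of recency preference within a topic, assuming each chunk has `topic` and `year` metadata (both field names are illustrative). Topics keep the order in which they first appear; only the order inside each topic changes.

```python
def order_within_topics(chunks):
    """Keep topic order as given, but put the newest chunk first within each topic."""
    topic_order = []
    by_topic = {}
    for chunk in chunks:
        if chunk["topic"] not in by_topic:
            by_topic[chunk["topic"]] = []
            topic_order.append(chunk["topic"])
        by_topic[chunk["topic"]].append(chunk)

    ordered = []
    for topic in topic_order:
        ordered.extend(sorted(by_topic[topic], key=lambda c: c["year"], reverse=True))
    return ordered


chunks = [
    {"topic": "rate-limit", "year": 2022, "text": "Rate limit 100/hour"},
    {"topic": "auth", "year": 2023, "text": "API authentication uses OAuth 2.0"},  # year assumed
    {"topic": "rate-limit", "year": 2024, "text": "Rate limit 1000/hour"},
]

ordered = order_within_topics(chunks)
print([c["text"] for c in ordered])
```

Ordering alone does not remove the outdated chunk; whether to drop it entirely is a separate decision.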
Importance Ranking:
Not just similarity, but importance:
→ Core concepts first
→ Details later
Query: "How to authenticate?"
→ First: Overview of OAuth
→ Then: Specific parameters
→ Then: Troubleshooting
Progressive detail
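Importance ranking could be sketched by tagging each chunk with a level of detail and sorting on that before similarity. The `kind` field, the `DETAIL_RANK` mapping, and the troubleshooting text are all assumptions for illustration; how chunks get tagged (metadata, a classifier, or heading names) is left open.

```python
# Lower rank means the chunk should appear earlier in the context.
DETAIL_RANK = {"overview": 0, "parameters": 1, "troubleshooting": 2}


def order_by_importance(chunks):
    """Sort by level of detail first, then by similarity score within a level."""
    return sorted(chunks, key=lambda c: (DETAIL_RANK[c["kind"]], -c["score"]))


chunks = [
    # Hypothetical troubleshooting chunk with the highest raw similarity.
    {"kind": "troubleshooting", "score": 0.91,
     "text": "If a token is rejected, check whether it has expired."},
    {"kind": "parameters", "score": 0.82,
     "text": "OAuth requires client_id and client_secret."},
    {"kind": "overview", "score": 0.80,
     "text": "API authentication uses OAuth 2.0."},
]

ordered = order_by_importance(chunks)
print([c["kind"] for c in ordered])
```

The overview comes first even though it has the lowest similarity score, which is the difference from score-based ordering.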
How to Solve
- Cluster chunks by topic before assembly
- Order clusters by relevance
- Preserve document order within clusters
- Detect and remove redundant chunks (cosine > 0.90)
- Inject topic transition markers
- Maintain document hierarchy (sections, subsections)
- Prefer recent chunks over outdated ones
- Test context assembly quality with an LLM eval (coherence score)
See Context Assembly.
Agent Instructions: Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.
Perform an HTTP GET request on the current page URL with the ask query parameter:
GET /dev/rag-scenarios-and-solutions/agent/context-assembly.md?ask=<question>
The question should be specific, self-contained, and written in natural language. The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.
Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
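As a sketch, the request URL can be built with Python's standard library so that the question is properly URL-encoded. The host `docs.example.com` is a placeholder; substitute the actual URL of the current page.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_ask_url(page_url, question):
    """Return page_url with the question attached as the `ask` query parameter."""
    parts = urlsplit(page_url)
    query = urlencode({"ask": question})
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


# Placeholder host, assumed for illustration.
page_url = "https://docs.example.com/dev/rag-scenarios-and-solutions/agent/context-assembly.md"
url = build_ask_url(page_url, "How are redundant chunks detected?")
print(url)

# To send the request, something like urllib.request.urlopen(url) would work.
```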
Last updated January 26, 2026


