Hallucination Despite Retrieved Context

The Problem

The LLM adds fabricated details even when relevant context is provided, mixing real retrieved information with invented facts.

Symptoms

❌ Adds details not in context
❌ Embellishes with plausible but false info
❌ Correct facts + wrong details combined
❌ No way to tell which claims come from the context
❌ Confident delivery of mixed truth/fiction

Real-World Example

Retrieved context:
"Premium plan includes 5 team members and 100GB storage"

User query: "What's in premium plan?"

AI response: "Premium plan includes 5 team members, 100GB storage,
priority email support (24h response), and access to beta features."

Context ONLY mentioned: 5 members, 100GB
AI INVENTED: Priority support, beta access

Deep Technical Analysis

Retrieval-Generation Gap

Incomplete Context:

Context silent on some aspects:
→ User asks about support
→ Context doesn't mention support
→ LLM fills gap with "typical" support model
→ Hallucinates based on training data patterns

The Helpful Assistant Dilemma:

LLM trained to:
→ Be complete and helpful
→ Answer fully
→ Avoid "I don't know"

Conflicts with:
→ "Only use retrieved context"
→ Admit knowledge gaps

Helpfulness bias → hallucination

Pattern Completion

Training Data Influence:

LLM saw thousands of "Premium plan" descriptions:
→ Usually include: Support, features, storage
→ Pattern: Premium = better support

Applies pattern even if not in YOUR docs:
→ Invents "priority support"
→ Sounds plausible
→ But factually wrong for your product

Weak Grounding

Instruction Adherence Limits:

System prompt: "Only use provided context"

But:
→ LLM follows the instruction roughly 85-90% of the time
→ The remaining 10-15% drifts to training knowledge
→ Prompting alone cannot guarantee 100% grounding

Stronger models (e.g., GPT-4) ground better than weaker ones (e.g., GPT-3.5)
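
As a rough illustration of what a grounding prompt can look like, the sketch below labels each retrieved chunk and gives the model an explicit fallback phrase, so that "only use provided context" comes with a permitted way to decline. The function name `build_grounded_prompt`, the chunk id `chunk_12`, and the prompt wording are assumptions for illustration, not a prescribed template; the fallback phrase is borrowed from the How to Solve section.

```python
from typing import Dict

# Fallback phrase the model is told to use when the context is silent.
REFUSAL = "not available in documentation"


def build_grounded_prompt(chunks: Dict[str, str], question: str) -> str:
    """Assemble a prompt that labels each chunk and offers an explicit refusal path.

    `chunks` maps a hypothetical chunk id to its retrieved text.
    """
    context = "\n".join(f"[{chunk_id}] {text}" for chunk_id, text in chunks.items())
    return (
        "Only use provided context.\n"
        f"If the answer is not in the context, say '{REFUSAL}'.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


prompt = build_grounded_prompt(
    {"chunk_12": "Premium plan includes 5 team members and 100GB storage"},
    "What's in premium plan?",
)
print(prompt)
```

A prompt like this raises the odds of grounded answers but does not enforce them, which is why the later checks are still needed.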

Citation as Constraint:

Forcing citations helps:
"For each claim, cite source: [chunk_id]"

AI must justify each fact:
→ "5 members [chunk_12]"
→ "100GB storage [chunk_12]"
→ Invented facts have no chunk to cite
→ Reduces hallucination
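
The citation requirement above is only useful if something checks it. The following is a minimal sketch of such a check, assuming the model was asked to put `[chunk_id]` markers inside each sentence. The function name `find_citation_problems`, the marker format, and the naive sentence splitter are illustrative assumptions. Note that a model can still attach a real chunk id to an invented claim, so this only catches missing or unknown citations.

```python
import re
from typing import Dict, List

# Matches markers such as [chunk_12]; the id format is an assumption.
CITATION = re.compile(r"\[([A-Za-z0-9_]+)\]")


def find_citation_problems(answer: str, chunks: Dict[str, str]) -> List[str]:
    """Return uncited sentences and citations that point at unknown chunks."""
    problems = []
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    for sentence in sentences:
        cited = CITATION.findall(sentence)
        if not cited:
            problems.append(f"uncited: {sentence}")
        for chunk_id in cited:
            if chunk_id not in chunks:
                problems.append(f"unknown chunk {chunk_id}: {sentence}")
    return problems


chunks = {"chunk_12": "Premium plan includes 5 team members and 100GB storage"}
answer = (
    "The plan includes 5 team members [chunk_12]. "
    "It includes 100GB storage [chunk_12]. "
    "It includes priority email support."
)
for problem in find_citation_problems(answer, chunks):
    print(problem)
```

Sentences flagged here can be dropped, regenerated, or routed to a stricter verification step.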

How to Solve

Combine several mitigations:
→ Require citations for all claims
→ Use explicit prompts: "If not in context, say 'not available in documentation'"
→ Implement a two-stage flow: extract facts first, then answer using only the extracted facts
→ Use models fine-tuned for RAG (instruction-following)
→ Apply post-generation fact-checking against the context
→ Penalize hallucination in eval metrics

See Hallucination Prevention.
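
To make the post-generation fact-checking idea concrete, here is a deliberately crude sketch that flags answer terms which never appear in the retrieved context, using the premium plan example from earlier on this page. Plain word overlap is an assumption standing in for a real checker (for example an entailment model or a second LLM pass); the names `unsupported_terms` and `STOPWORDS` are hypothetical.

```python
import re
from typing import List, Set

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "in", "is", "for", "with", "or"}


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def unsupported_terms(answer: str, context: str) -> List[str]:
    """Return answer terms that are absent from the context, sorted."""
    return sorted(tokenize(answer) - tokenize(context) - STOPWORDS)


context = "Premium plan includes 5 team members and 100GB storage"
answer = (
    "Premium plan includes 5 team members, 100GB storage, "
    "priority email support (24h response), and access to beta features."
)
print(unsupported_terms(answer, context))
```

On this example the flagged terms correspond to the invented support and beta claims. Exact word matching will also flag harmless paraphrases, so treat a non-empty result as a signal to verify rather than proof of hallucination.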

