RAG Scenarios and Solutions
Agent-Level Data Isolation
Multiple AI agents share the same knowledge base without proper isolation, causing agents to access data they shouldn't see.
TL;DR
When multiple AI agents share one knowledge base without isolation, any agent can retrieve data meant for another. Tag chunks with agent metadata and filter before retrieval.
Key Takeaways
- The Problem
- Deep Technical Analysis
- How to Solve
The Problem
When several AI agents query the same knowledge base and nothing restricts retrieval by agent, each agent can surface data it should never see.
Symptoms
- ❌ Agent A sees Agent B's private data
- ❌ Cross-agent data leakage
- ❌ Cannot restrict knowledge by agent
- ❌ Shared vector DB exposes all data
- ❌ No tenant isolation
Real-World Example
A company has two agents:
→ HR Agent: Access to employee records
→ Customer Support Agent: Access to help docs
Shared vector DB with all data:
→ Customer asks Support Agent: "What's the CEO's salary?"
→ Retrieval finds HR document with salary info
→ Support Agent responds with CEO salary
Result: a data isolation failure
Deep Technical Analysis
Shared Knowledge Base Risks
No Filtering Layer:
All chunks in one vector DB:
→ HR docs embedded
→ Customer docs embedded
→ No metadata distinguishing them
Any query retrieves anything:
→ Agent identity not checked
→ Data access unrestricted
→ Privacy violation
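To make the failure concrete, here is a minimal sketch of an unfiltered shared store. It is a toy in-memory stand-in, not any particular vector database; the names `SHARED_STORE` and `search_unfiltered` and the three-dimensional vectors are assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical shared store: every chunk sits in one list with no owner metadata.
SHARED_STORE = [
    {"text": "CEO salary record", "vector": [0.9, 0.1, 0.0]},
    {"text": "How to reset your password", "vector": [0.1, 0.9, 0.0]},
]

def search_unfiltered(query_vector, top_k=1):
    # No agent identity is consulted, so any caller can reach any chunk.
    ranked = sorted(
        SHARED_STORE,
        key=lambda chunk: cosine(query_vector, chunk["vector"]),
        reverse=True,
    )
    return [chunk["text"] for chunk in ranked[:top_k]]

# A salary question sent through the support agent lands nearest the HR chunk.
leaked = search_unfiltered([1.0, 0.0, 0.0])
```

Because nothing in the search path knows which agent is asking, the HR chunk is returned purely on similarity.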
Metadata Filtering:
Solution: Tag chunks with access control:
    {
      vector: [0.234, ...],
      metadata: {
        agent_id: "hr_agent",
        department: "hr",
        sensitivity: "confidential"
      }
    }
Query with filter:
→ agent_id = "support_agent"
→ Only chunks tagged support_agent are retrieved
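The tagging and filtering described above might be sketched as follows. This is an in-memory illustration rather than a specific vector database API; the `Chunk` class, the `search` function, and the `support`/`public` metadata values on the second chunk are assumptions, while `agent_id`, `department`, and `sensitivity` come from the record shown earlier.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Chunk:
    text: str
    vector: list
    metadata: dict = field(default_factory=dict)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(chunks, query_vector, metadata_filter, top_k=3):
    # Keep only chunks whose metadata matches every key in the filter,
    # then rank the survivors by similarity.
    allowed = [
        c for c in chunks
        if all(c.metadata.get(k) == v for k, v in metadata_filter.items())
    ]
    ranked = sorted(allowed, key=lambda c: cosine(query_vector, c.vector), reverse=True)
    return ranked[:top_k]

chunks = [
    Chunk("CEO salary record", [0.9, 0.1],
          {"agent_id": "hr_agent", "department": "hr", "sensitivity": "confidential"}),
    Chunk("How to reset your password", [0.2, 0.8],
          {"agent_id": "support_agent", "department": "support", "sensitivity": "public"}),
]

# The query vector is closest to the HR chunk, but the filter excludes it.
results = search(chunks, [1.0, 0.0], {"agent_id": "support_agent"})
result_texts = [c.text for c in results]
```

Most hosted vector databases accept a comparable filter argument on their query call; the exact syntax differs per product and should be taken from its documentation.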
Multi-Tenancy Patterns
Namespace Isolation:
Pinecone (namespaces) or Weaviate (multi-tenancy):
→ Create separate namespaces per agent
→ hr_agent namespace
→ support_agent namespace
Queries scoped to namespace:
→ Cannot cross namespace boundary
→ Strong isolation
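The idea of namespace scoping can be illustrated with a toy store that partitions data by namespace. The `NamespacedStore` class below is a hypothetical stand-in and does not reproduce the client API of Pinecone or Weaviate; it only shows why a query scoped to one namespace cannot reach another.

```python
from collections import defaultdict

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

class NamespacedStore:
    """Toy stand-in for a vector DB that partitions data by namespace."""

    def __init__(self):
        self._namespaces = defaultdict(list)

    def upsert(self, namespace, text, vector):
        self._namespaces[namespace].append((text, vector))

    def query(self, namespace, query_vector, top_k=3):
        # Only the named partition is scanned; other namespaces are never touched.
        candidates = self._namespaces.get(namespace, [])
        ranked = sorted(candidates, key=lambda item: dot(query_vector, item[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]

store = NamespacedStore()
store.upsert("hr_agent", "CEO salary record", [0.9, 0.1])
store.upsert("support_agent", "How to reset your password", [0.2, 0.8])

# The support agent's query is bound to its own namespace.
support_hits = store.query("support_agent", [1.0, 0.0])
```

In this layout the namespace is chosen by the calling service for each agent, so the agent's prompt or query text has no way to widen the search.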
Separate Indexes:
One index per agent:
→ hr_agent_index
→ support_agent_index
Complete separation:
+ Strongest isolation
+ Independent scaling
- Higher infrastructure cost
- More operational complexity
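One way to picture the one-index-per-agent pattern is a small registry that hands each agent its own index object and refuses unknown agents. The `VectorIndex` class and `index_for` function are hypothetical; in a real deployment each entry would be a separately provisioned index.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

class VectorIndex:
    """Toy stand-in for one physically separate index."""

    def __init__(self, name):
        self.name = name
        self._items = []

    def add(self, text, vector):
        self._items.append((text, vector))

    def search(self, query_vector, top_k=3):
        ranked = sorted(self._items, key=lambda item: dot(query_vector, item[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]

# One index per agent, named as in the text above.
INDEXES = {
    "hr_agent": VectorIndex("hr_agent_index"),
    "support_agent": VectorIndex("support_agent_index"),
}

def index_for(agent_id):
    # Fail closed: an agent without a provisioned index gets nothing.
    if agent_id not in INDEXES:
        raise PermissionError(f"no index provisioned for {agent_id!r}")
    return INDEXES[agent_id]

INDEXES["hr_agent"].add("CEO salary record", [0.9, 0.1])
INDEXES["support_agent"].add("How to reset your password", [0.2, 0.8])

support_hits = index_for("support_agent").search([1.0, 0.0])
```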
Row-Level Security:
PostgreSQL + pgvector:
→ Use database roles
→ Row-level security policies
→ Policy: "Show only rows where agent_id = current_user"
Database-enforced isolation
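A sketch of how such a policy could be applied from Python is shown below. The table name `chunks` and policy name `agent_isolation` are assumptions, as is the convention that each agent connects with a database role named after its `agent_id`. The SQL statements use standard PostgreSQL row-level security syntax.

```python
# Hypothetical table and policy names; adjust to the actual schema.
RLS_STATEMENTS = [
    # Turn on row-level security for the table holding embedded chunks.
    "ALTER TABLE chunks ENABLE ROW LEVEL SECURITY",
    # Apply policies to the table owner too, who is exempt by default.
    "ALTER TABLE chunks FORCE ROW LEVEL SECURITY",
    # Each database role sees only rows whose agent_id equals its role name.
    "CREATE POLICY agent_isolation ON chunks USING (agent_id = current_user)",
]

def enable_agent_isolation(cursor):
    """Run the statements on any DB-API cursor (for example, one from psycopg)."""
    for statement in RLS_STATEMENTS:
        cursor.execute(statement)
```

With the policy in place, an ordinary similarity query issued over the support agent's connection only ever ranks that agent's rows. Superusers and roles with the `BYPASSRLS` attribute are not restricted, so agents should connect with ordinary roles.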
Access Control Logic
Pre-Retrieval Filtering:
Before vector search:
1. Identify requesting agent
2. Add metadata filter:
WHERE metadata.agent_id = 'support_agent'
3. Execute search with filter
Ensures:
→ Only authorized chunks retrieved
→ No leakage
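The three steps above can be sketched as a single retrieval function. The token-to-agent mapping, the `CHUNKS` list, and the function names are all hypothetical; the point is that the filter is derived from the caller's identity on the server side rather than supplied by the agent itself.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical credential mapping used to identify the requesting agent.
AGENT_BY_TOKEN = {"token-hr": "hr_agent", "token-support": "support_agent"}

CHUNKS = [
    {"text": "CEO salary record", "vector": [0.9, 0.1], "metadata": {"agent_id": "hr_agent"}},
    {"text": "How to reset your password", "vector": [0.2, 0.8], "metadata": {"agent_id": "support_agent"}},
    {"text": "Refund policy", "vector": [0.6, 0.4], "metadata": {"agent_id": "support_agent"}},
]

def identify_agent(token):
    agent_id = AGENT_BY_TOKEN.get(token)
    if agent_id is None:
        raise PermissionError("unknown caller")
    return agent_id

def retrieve(token, query_vector, top_k=3):
    # Step 1: identify the requesting agent.
    agent_id = identify_agent(token)
    # Step 2: build the metadata filter from that identity.
    metadata_filter = {"agent_id": agent_id}
    # Step 3: search only among chunks that pass the filter.
    candidates = [
        c for c in CHUNKS
        if c["metadata"].get("agent_id") == metadata_filter["agent_id"]
    ]
    ranked = sorted(candidates, key=lambda c: dot(query_vector, c["vector"]), reverse=True)
    return [c["text"] for c in ranked[:top_k]]

support_results = retrieve("token-support", [1.0, 0.0])
```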
Post-Retrieval Filtering:
Alternative: Filter after retrieval:
1. Retrieve top-K chunks (e.g., 20)
2. Check each chunk's agent_id
3. Remove unauthorized
4. Return remaining (e.g., 12)
Problem:
→ Reduces effective K
→ May not have enough results
→ Prefer pre-retrieval
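The shrinking-K effect is easy to reproduce with a small example. The data and function names below are made up for illustration: both functions ask for four results, but the post-retrieval version discards most of them after ranking.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical mixed store: HR chunks happen to sit closest to the query.
CHUNKS = [
    {"text": "hr-1", "vector": [1.0, 0.0], "agent_id": "hr_agent"},
    {"text": "hr-2", "vector": [0.9, 0.1], "agent_id": "hr_agent"},
    {"text": "hr-3", "vector": [0.8, 0.2], "agent_id": "hr_agent"},
    {"text": "support-1", "vector": [0.7, 0.3], "agent_id": "support_agent"},
    {"text": "support-2", "vector": [0.6, 0.4], "agent_id": "support_agent"},
    {"text": "support-3", "vector": [0.5, 0.5], "agent_id": "support_agent"},
]

def rank(chunks, query_vector):
    return sorted(chunks, key=lambda c: dot(query_vector, c["vector"]), reverse=True)

def post_filter_search(agent_id, query_vector, top_k):
    # Retrieve top-K first, then drop chunks the agent may not see.
    top = rank(CHUNKS, query_vector)[:top_k]
    return [c["text"] for c in top if c["agent_id"] == agent_id]

def pre_filter_search(agent_id, query_vector, top_k):
    # Restrict to authorized chunks first, then take top-K.
    allowed = [c for c in CHUNKS if c["agent_id"] == agent_id]
    return [c["text"] for c in rank(allowed, query_vector)[:top_k]]

query = [1.0, 0.0]
post_results = post_filter_search("support_agent", query, top_k=4)
pre_results = pre_filter_search("support_agent", query, top_k=4)
```

Here the post-filtered call keeps one of its four slots while the pre-filtered call returns all three authorized chunks.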
How to Solve
Tag all chunks with agent_id/tenant_id metadata, implement pre-retrieval filtering (metadata.agent_id = current_agent), and use namespace isolation (separate vector DB namespaces). Consider separate indexes where strong isolation is required, and apply row-level security if using PostgreSQL. See Data Isolation.
Last updated January 26, 2026


