Cypress - Advanced
Cypress is the most sophisticated RAG strategy, combining query expansion, tier-based source retrieval, and automatic reranking to deliver the highest accuracy.
Overview
Cypress implements multiple optimization techniques:
- Query Expansion: Adds synonyms and related terms for better semantic matching
- Tier-Based Retrieval: Organizes sources by priority while treating all equally in reranking
- Automatic Reranking: Uses a cross-encoder model to improve precision
- Higher Retrieval Volume: Fetches more candidates (50 vs 10) before filtering
Performance: ~3-4 seconds
Ideal for: Complex queries requiring maximum accuracy, diverse terminology, high-stakes decisions
How Cypress Works
Processing Flow
User Query: "password reset"
↓
[1] Analyze Conversation Memory (if available)
↓
[2] Query Expansion for Vector Retrieval
→ "password reset, change password, recover account,
reset credentials, account recovery, password recovery"
↓
[3] Embed Expanded Query → Vector
↓
[4] Tier 1 Retrieval (Official docs, high-priority)
→ Fetch top 50 results per source
↓
[5] Tier 2 Retrieval (Community content, secondary)
→ Fetch top 50 results per source
↓
[6] Combine All Results (up to 100 documents)
↓
[7] Automatic Reranking (bge-reranker-v2-m3)
→ Rerank to top 10 most relevant
↓
[8] Build Context from Top 10
↓
[9] Rewrite Prompt with Context (for LLM)
↓
[10] LLM Completion
↓
[11] Clean Response (remove artifacts)
↓
Response: "To reset your password..."
Unique Features
1. Query Expansion for Retrieval
Cypress expands queries before vector search:
Original Query:
"reset password"
Expanded Query:
"reset password, change password, recover account access,
password reset process, account recovery, reset credentials,
forgotten password, password recovery, unlock account"
How it works:
- Uses gpt-4o-mini for fast expansion
- Adds synonyms and related terms
- Includes alternative phrasings
- Adds domain-specific terminology
Why it matters: Improves recall by matching documents that use different terminology than the user's query.
Example Impact:
User Query: "API authentication"
Without Expansion:
→ Matches: "API authentication" (exact)
→ Results: 5 documents
With Expansion:
→ "API authentication, API auth, API security, API keys,
bearer tokens, OAuth, authentication methods"
→ Matches: All variations
→ Results: 50 documents → Reranked to best 10
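The expansion step can be sketched as a prompt plus a small normalization pass. This is a minimal illustration, not Cypress's actual implementation: `expand_query`, the prompt wording, and the `complete` callable (a stand-in for the gpt-4o-mini call) are all assumptions, and a canned `fake_model` replaces the real model so the block runs offline.

```python
from typing import Callable


def expand_query(query: str, complete: Callable[[str], str]) -> str:
    """Ask a model for related terms and merge them into one query string.

    `complete` is a hypothetical callable: prompt text in, model text out.
    """
    prompt = (
        "List synonyms, alternative phrasings, and domain-specific terms "
        f"for this search query, separated by commas: {query}"
    )
    original = query.strip()
    terms = [original]            # keep the user's wording first
    seen = {original.lower()}     # case-insensitive duplicate filter
    for raw in complete(prompt).replace("\n", " ").split(","):
        term = " ".join(raw.split())  # collapse stray whitespace
        if term and term.lower() not in seen:
            seen.add(term.lower())
            terms.append(term)
    return ", ".join(terms)


def fake_model(prompt: str) -> str:
    """Canned output standing in for the expansion model."""
    return (
        "change password, recover account access,\n"
        "forgotten password, Reset Password"
    )


expanded = expand_query("reset password", fake_model)
print(expanded)
```

The expanded string, not the original query, is what gets embedded for vector search in the flow described above.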
2. Tier-Based Source Retrieval
Cypress organizes data sources into tiers:
Tier Structure:
Tier 1 (High Priority)
├─ Official Documentation
├─ Product Knowledge Base
├─ API Reference
└─ Admin-Approved Content
Tier 2 (Secondary)
├─ Community Content
├─ Blog Posts
├─ External References
└─ Supplementary Materials
Retrieval Process:
- Query Tier 1 sources (topK = 50 per source)
- Query Tier 2 sources (topK = 50 per source)
- Combine all results
- Rerank all together (both tiers treated equally)
- Top 10 most relevant selected
Important: Both tiers receive equal treatment in reranking. The tier organization is for source management, not quality weighting.
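The retrieval steps above could look roughly like the following. `retrieve_by_tier` and the `search(source_id, top_k)` callable are hypothetical names standing in for the vector search; the sketch only illustrates that tier decides query order and travels with each document as metadata, without acting as a ranking weight.

```python
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]


def retrieve_by_tier(
    sources: List[Dict[str, Any]],
    search: Callable[[str, int], List[Doc]],
    top_k_per_source: int = 50,
) -> List[Doc]:
    """Query Tier 1 sources first, then Tier 2, and pool every result."""
    pooled: List[Doc] = []
    for source in sorted(sources, key=lambda s: s["tier"]):
        for doc in search(source["id"], top_k_per_source):
            # Record the tier for later reporting; it is not a score.
            pooled.append({**doc, "tier": source["tier"]})
    return pooled


def fake_search(source_id: str, top_k: int) -> List[Doc]:
    """Toy search that returns three documents per source."""
    docs = [{"id": f"{source_id}-{i}", "text": f"doc {i}"} for i in range(3)]
    return docs[:top_k]


sources = [{"id": "ds-2", "tier": 2}, {"id": "ds-1", "tier": 1}]
candidates = retrieve_by_tier(sources, fake_search)
print([doc["id"] for doc in candidates])
```

The pooled list is then handed to the reranker as a single candidate set.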
Configuration:
{
  "dataSources": [
    {
      "id": "ds-1",
      "name": "Official Docs",
      "tier": 1 // High priority
    },
    {
      "id": "ds-2",
      "name": "Community Forums",
      "tier": 2 // Secondary
    }
  ]
}
3. Automatic Reranking
After retrieval, Cypress reranks the combined candidates with a cross-encoder model:
Reranking Model: bge-reranker-v2-m3
- Type: Cross-encoder (more accurate than vector similarity)
- Input: Query + full document text
- Output: Relevance score (0-1)
- Method: Considers full semantic relationship
Vector Search vs Reranking:
Vector Search:
→ Fast approximate similarity
→ Based on embeddings only
→ Good recall
Reranking:
→ Slower but more accurate
→ Analyzes full query-document relationship
→ Excellent precision
Performance Impact:
Before Reranking:
Top 50 results, relevance: 0.65-0.85
After Reranking:
Top 10 results, relevance: 0.85-0.98
→ 20-30% improvement in precision
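A minimal sketch of the reranking step: every (query, document) pair is scored jointly, then only the best few are kept. The `rerank` function and its `score` parameter are hypothetical; `overlap_score` is a toy token-overlap scorer used only so the block runs without a model, and it is not how bge-reranker-v2-m3 computes relevance.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    docs: List[str],
    score: Callable[[str, str], float],
    top_n: int = 10,
) -> List[Tuple[float, str]]:
    """Score each (query, document) pair and keep the top_n best."""
    scored = [(score(query, doc), doc) for doc in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


def overlap_score(query: str, doc: str) -> float:
    """Toy stand-in for a cross-encoder: Jaccard overlap of tokens."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q | d) if q | d else 0.0


docs = [
    "billing and invoices",
    "reset your password",
    "password reset for admin accounts",
]
top = rerank("password reset", docs, overlap_score, top_n=2)
print([doc for _, doc in top])
```

In a real deployment the `score` callable would wrap the cross-encoder, which reads the query and the full document text together instead of comparing precomputed embeddings.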
4. Higher Retrieval Volume
Cypress retrieves more candidates:
| Mode | topKPerSource | Total Retrieved | Final Output |
|---|---|---|---|
| Standard | 50 | Up to 100 | Top 10 after rerank |
| Agentic | Agent.topK (default 5) | Variable | All reranked results |
Why more is better:
- More candidates for reranking = better final selection
- Captures edge cases and variations
- Reduces chance of missing relevant content
5. Response Cleaning
Cypress includes specialized response cleaning:
Removes:
- Original prompt text (if echoed)
- Markdown code block markers
- Extra whitespace
- Formatting artifacts
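A minimal sketch of these removal rules, assuming simple line and regex based cleanup. `clean_response` is a hypothetical helper, and dropping any line that contains the prompt is a rough heuristic chosen for illustration; the source does not describe the actual cleaning logic.

```python
import re

FENCE = "`" * 3  # Markdown code fence marker


def clean_response(raw: str, prompt: str = "") -> str:
    """Strip an echoed prompt, code fence markers, and extra whitespace."""
    text = raw
    if prompt:
        # Heuristic: drop lines that echo the user's prompt back.
        text = "\n".join(
            line for line in text.splitlines() if prompt not in line
        )
    # Remove fence markers, with or without a language tag.
    text = re.sub(re.escape(FENCE) + r"[A-Za-z0-9_-]*", "", text)
    # Collapse runs of blank lines, then trim both ends.
    text = re.sub(r"\n\s*\n", "\n\n", text)
    return text.strip()


raw = (
    f"{FENCE}html\n"
    'User asked: "What is pricing?"\n'
    "Our pricing plans are...\n"
    f"{FENCE}"
)
cleaned = clean_response(raw, prompt="What is pricing?")
print(cleaned)
```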
Example:
LLM Output (raw):
    ```html
    User asked: "What is pricing?"
    Our pricing plans are...
    ```
Cleaned Output:
    Our pricing plans are...
Performance Characteristics
Latency Breakdown
Memory Analysis:          ~100ms
Query Expansion:          ~500ms  ← Unique to Cypress
Query Embedding:          ~100ms
Tier 1 Retrieval (50):    ~300ms  ← More than Redwood/Cedar
Tier 2 Retrieval (50):    ~300ms  ← Additional tier
Reranking (100→10):       ~500ms  ← Unique to Cypress
Context Building:         ~50ms
Prompt Rewriting:         ~400ms
LLM Completion:           ~800ms
Response Cleaning:        ~50ms
─────────────────────────────────
Total:                    ~3.1s
Token Usage
| Component | Tokens | Notes |
|-----------|--------|-------|
| Query Expansion | 50-100 | Expansion prompt |
| Memory Context | 100-300 | Conversation history |
| System Prompt | 150-300 | Agent instructions |
| Retrieved Context | 1200-2000 | Top 10 (higher quality) |
| User Query | 10-50 | Original question |
| Rewriting Prompt | 50-100 | Context-aware rewriting |
| Response | 150-400 | Generated answer |
| **Total** | **~2,200-3,000** | Per request |
Cost Implications
Per 1,000 Requests (GPT-4o):
- Embedding: ~$0.01
- Query Expansion: ~$0.15
- Prompt Rewriting: ~$0.10
- LLM Completion: ~$0.50
- Vector Search: ~$0.08 (higher volume)
- Reranking: ~$0.06
- Total: ~$0.90 (+150% vs Redwood, +60% vs Cedar)
When to Use Cypress
✅ Ideal Use Cases
1. Medical/Legal Q&A (High Accuracy Critical)
Query: "contraindications for medication X"
→ Cannot afford mistakes
→ Diverse medical terminology
→ Need highest precision
✅ Use Cypress
2. Complex Technical Documentation
Query: "configure OAuth with SAML SSO"
→ Multiple concepts
→ Various terminology (OAuth 2.0, OAuth2, etc.)
→ Need comprehensive results
✅ Use Cypress
3. Multi-Domain Knowledge Bases
Sources:
- Official API docs (Tier 1)
- Engineering blog (Tier 2)
- Community tutorials (Tier 2)
Query: "best practices for API rate limiting"
→ Benefits from tier organization
→ Reranking selects best across all sources
✅ Use Cypress
4. Compliance-Sensitive Queries
Query: "GDPR data retention requirements"
→ High-stakes information
→ Must be accurate and cited
→ Regulatory compliance
✅ Use Cypress
5. Queries with Diverse Terminology
Query: "machine learning model training"
Also needs to match:
- "ML model development"
- "training neural networks"
- "model fine-tuning"
→ Query expansion helps significantly
✅ Use Cypress
❌ Not Ideal For
1. Simple FAQ Queries
Query: "What are your business hours?"
→ Straightforward question
→ No terminology variations
→ Redwood is 2x faster
❌ Cypress is overkill
2. High-Volume, Cost-Sensitive
100,000+ queries/day, tight budget
→ Cypress costs 2.5x more than Redwood
→ Consider hybrid approach
❌ Use Cypress selectively
3. Real-Time Requirements
Need: < 1 second response
→ Cypress averages 3-4s
→ Too slow for real-time
❌ Use Redwood instead
Configuration
Agent Settings
```typescript
{
  "strategyCode": "CYPRESS",
  "topK": 5,                 // For agentic mode
  "topKPerSource": 50,       // Standard mode retrieval
  "temperature": 0.7,
  "maxTokens": 500,
  "model": "gpt-4o",
  "rewritingModel": "gpt-4o-mini",
  "enableQueryExpansion": true,
  "enableReranking": true,
  "rerankingModel": "bge-reranker-v2-m3",
  "tierBased": true
}
```
Data Source Tier Assignment
// Assign tiers to data sources
{
  "dataSources": [
    {
      "id": "official-docs",
      "tier": 1, // Official documentation
      "priority": "HIGH"
    },
    {
      "id": "api-reference",
      "tier": 1, // API docs
      "priority": "HIGH"
    },
    {
      "id": "blog-posts",
      "tier": 2, // Supplementary content
      "priority": "MEDIUM"
    },
    {
      "id": "community-content",
      "tier": 2, // User-generated
      "priority": "MEDIUM"
    }
  ]
}
Optimization Tips
1. Tune Retrieval Volume
Too low (20): May miss relevant docs
Sweet spot (50): Good candidate pool
Too high (100): Slower reranking, no benefit
2. Query Expansion Quality
Good Expansion:
"Include synonyms, related terms, and common phrasings.
Focus on terminology variations used in documentation."
Bad Expansion:
"Expand the query" ← Too vague
3. Tier Organization
Tier 1: Authoritative, frequently updated
Tier 2: Supplementary, less critical
Don't: Put everything in Tier 1
Do: Thoughtfully organize by importance
4. Reranking Threshold
Keep top 10 after reranking (default)
Consider top 5 for very high precision
Consider top 15 for comprehensive coverage
Comparison with Other Strategies
Complete Comparison Table
| Feature | Redwood | Cedar | Cypress |
|---|---|---|---|
| Performance | | | |
| Average Latency | 1-2s | 2-3s | 3-4s |
| Cost per 1k (GPT-4o) | $0.50 | $0.70 | $0.90 |
| Token Usage | 1,500 | 2,000 | 2,500 |
| Accuracy | | | |
| Simple Queries | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Complex Queries | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Terminology Variations | ⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Features | | | |
| Query Rewriting | ❌ | ✅ | ✅ |
| Query Expansion | ❌ | ❌ | ✅ |
| Reranking | ❌ | ❌ | ✅ |
| Tier-Based | ❌ | ❌ | ✅ |
| Memory Support | ✅ | ✅ | ✅ |
| Best For | | | |
| Clear questions | ✅ Best | ✅ Good | ⚠️ Overkill |
| Conversational | ❌ Poor | ✅ Best | ✅ Excellent |
| High accuracy | ❌ Adequate | ⚠️ Good | ✅ Best |
| High volume | ✅ Best | ⚠️ Good | ❌ Expensive |
| Complex terminology | ❌ Limited | ⚠️ Good | ✅ Best |
Migration Paths
Redwood → Cedar: When conversational queries increase
Cedar → Cypress: When accuracy becomes critical
Cypress → Cedar: When cost/speed matters more than max accuracy
Real-World Performance
Case Study: Medical Knowledge Base
Setup:
- 50,000 medical articles
- Complex terminology (anatomical, pharmaceutical)
- High accuracy requirements
- Average query: "symptoms of X" or "treatment for Y"
Cedar Results (Before):
- Latency: 2.3s ✅
- Accuracy: 82% ⚠️
- User complaints: "Missing alternative terms"
- Cost: $0.68/1k
Cypress Results (After):
- Latency: 3.6s (acceptable)
- Accuracy: 96% ✅
- User satisfaction: +41%
- Cost: $0.89/1k
- ROI: Worth it for medical accuracy
Key Insight: Query expansion captured terminology variations like:
- "heart attack" → "myocardial infarction", "cardiac arrest", "MI"
- "fever" → "pyrexia", "elevated temperature", "hyperthermia"
Case Study: Enterprise Software Documentation
Setup:
- 10,000 technical docs
- API references + guides + tutorials
- 3 document tiers (official, community, archived)
- Queries: Mix of simple and complex
Strategy Mix (Optimized):
Simple queries (40%): Redwood → 1.2s avg
Conversational (30%): Cedar → 2.1s avg
Complex/Critical (30%): Cypress → 3.5s avg
Blended Performance:
→ Average latency: 2.1s
→ Average cost: $0.65/1k
→ Overall accuracy: 91%
Key Insight: Hybrid approach optimizes for both cost and quality.
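The blended figures are traffic-weighted averages of the per-strategy numbers. As a small worked check (the `blended` helper is hypothetical), the latency mix above comes to about 2.16 s, in line with the reported ~2.1 s:

```python
def blended(mix):
    """Weighted average over (traffic_share, value) pairs."""
    total_share = sum(share for share, _ in mix)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    return sum(share * value for share, value in mix)


# (share of traffic, average latency in seconds) from the strategy mix
latency = blended([(0.40, 1.2), (0.30, 2.1), (0.30, 3.5)])
print(round(latency, 2))
```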
Advanced Configuration
Dynamic Strategy Selection
Automatically choose strategy based on query:
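def select_strategy(query, conversation_history):
    """Return the strategy code to use for a single query.

    is_simple, is_ambiguous, is_complex, and requires_high_accuracy are
    application-defined predicates; they are not provided here.
    """
    # Simple, clear query with no prior conversation
    if is_simple(query) and not conversation_history:
        return "REDWOOD"
    # Conversational or follow-up
    if conversation_history and is_ambiguous(query):
        return "CEDAR"
    # Complex or high-stakes
    if is_complex(query) or requires_high_accuracy(query):
        return "CYPRESS"
    # Default
    return "CEDAR"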
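The predicates used by `select_strategy` are not defined on this page. One hypothetical way to fill them in is with keyword and length heuristics, sketched below; the word lists and cutoffs are arbitrary assumptions, and the routing function is repeated only so the block runs on its own.

```python
HIGH_STAKES_TERMS = {"gdpr", "compliance", "contraindications", "legal"}
FOLLOW_UP_WORDS = {"it", "that", "this", "those", "them"}


def requires_high_accuracy(query: str) -> bool:
    """Flag queries that mention a high-stakes keyword."""
    return any(term in query.lower() for term in HIGH_STAKES_TERMS)


def is_simple(query: str) -> bool:
    """Short and not high-stakes (the 6-word cutoff is arbitrary)."""
    return len(query.split()) <= 6 and not requires_high_accuracy(query)


def is_ambiguous(query: str) -> bool:
    """Pronouns suggest the query depends on earlier turns."""
    return any(word in FOLLOW_UP_WORDS for word in query.lower().split())


def is_complex(query: str) -> bool:
    """Long queries count as complex (the 12-word cutoff is arbitrary)."""
    return len(query.split()) > 12


def select_strategy(query, conversation_history):
    if is_simple(query) and not conversation_history:
        return "REDWOOD"
    if conversation_history and is_ambiguous(query):
        return "CEDAR"
    if is_complex(query) or requires_high_accuracy(query):
        return "CYPRESS"
    return "CEDAR"


print(select_strategy("What are your business hours?", []))
print(select_strategy("GDPR data retention requirements", []))
```

In practice a lightweight classifier or per-agent rules would likely replace these heuristics; the point is only that routing happens before retrieval, so cheap signals are preferable.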
Custom Reranking Parameters
{
  "reranking": {
    "model": "bge-reranker-v2-m3",
    "topN": 10,              // Results to keep
    "scoreThreshold": 0.7,   // Minimum relevance
    "truncation": "END",     // How to handle long docs
    "batchSize": 50          // Rerank in batches
  }
}
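To make the interplay of these parameters concrete, here is a sketch under one assumed reading: `scoreThreshold` filters first, `topN` caps what survives, and `batchSize` only controls how candidates are chunked for scoring. The helper names are hypothetical and the actual order of operations in Cypress is not documented here.

```python
from typing import Dict, List, Tuple


def batches(items: List, size: int) -> List[List]:
    """Split candidates into batchSize-sized chunks for the reranker."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def apply_rerank_config(
    scored: List[Tuple[str, float]], config: Dict
) -> List[Tuple[str, float]]:
    """Drop results below scoreThreshold, then keep the best topN."""
    kept = [
        (doc_id, score)
        for doc_id, score in scored
        if score >= config["scoreThreshold"]
    ]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[: config["topN"]]


config = {"topN": 2, "scoreThreshold": 0.7, "batchSize": 50}
scored = [("a", 0.95), ("b", 0.55), ("c", 0.81), ("d", 0.72)]
final = apply_rerank_config(scored, config)
chunk_sizes = [len(chunk) for chunk in batches(list(range(87)), config["batchSize"])]
print(final, chunk_sizes)
```

Under this reading, a high threshold can return fewer than `topN` results, which is worth handling explicitly when building context.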
Query Expansion Control
{
  "queryExpansion": {
    "enabled": true,
    "maxTerms": 15,               // Max expansion terms
    "includeSynonyms": true,
    "includeAbbreviations": true,
    "includeDomainTerms": true,
    "expansionPrompt": "..."      // Custom prompt
  }
}
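As an illustration of how these flags might translate into behavior, the sketch below turns them into a prompt instruction and enforces `maxTerms` on the model's output. Both helpers and the instruction wording are assumptions, as is the choice to count every comma-separated term (including the original query) toward the limit.

```python
from typing import Dict


def build_expansion_instruction(config: Dict) -> str:
    """Turn the queryExpansion flags into an instruction sentence."""
    kinds = []
    if config.get("includeSynonyms"):
        kinds.append("synonyms")
    if config.get("includeAbbreviations"):
        kinds.append("common abbreviations")
    if config.get("includeDomainTerms"):
        kinds.append("domain-specific terms")
    return f"Add up to {config['maxTerms']} terms: " + ", ".join(kinds) + "."


def cap_terms(expanded_query: str, max_terms: int) -> str:
    """Enforce maxTerms, since a model may ignore the instruction."""
    terms = [t.strip() for t in expanded_query.split(",") if t.strip()]
    return ", ".join(terms[:max_terms])


config = {
    "enabled": True,
    "maxTerms": 3,
    "includeSynonyms": True,
    "includeAbbreviations": False,
    "includeDomainTerms": True,
}
instruction = build_expansion_instruction(config)
capped = cap_terms("API auth, API keys, OAuth, bearer tokens", config["maxTerms"])
print(instruction)
print(capped)
```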
Monitoring Cypress
Key Metrics
1. Reranking Effectiveness
Metric: Improvement in relevance scores post-reranking
Target: +20% improvement
Track: Average score before vs after
2. Query Expansion Impact
Metric: % increase in retrieved candidates
Target: 2-3x more candidates than unexpanded
Track: Results with vs without expansion
3. Tier Distribution
Metric: % of final results from Tier 1 vs Tier 2
Target: Varies by use case
Track: Ensure both tiers contributing
4. Overall Performance
Latency: < 4s (p95)
Accuracy: > 90% (user ratings)
Cost: Within budget
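The metrics above are straightforward to compute from logged requests. A minimal sketch, assuming each log entry exposes relevance scores before and after reranking, the tier of each final result, and a latency in seconds (all field and function names are hypothetical):

```python
from statistics import mean


def rerank_improvement(before, after):
    """Relative change in mean relevance score, e.g. 0.20 for +20%."""
    return (mean(after) - mean(before)) / mean(before)


def tier_share(final_results):
    """Fraction of final results that came from each tier."""
    counts = {}
    for result in final_results:
        counts[result["tier"]] = counts.get(result["tier"], 0) + 1
    return {tier: n / len(final_results) for tier, n in counts.items()}


def p95(latencies):
    """Nearest-rank 95th percentile."""
    if not latencies:
        raise ValueError("no latencies recorded")
    ordered = sorted(latencies)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


improvement = rerank_improvement([0.7, 0.8], [0.9, 0.9])
shares = tier_share([{"tier": 1}] * 7 + [{"tier": 2}] * 3)
slow = p95([3.0] * 19 + [5.2])
print(round(improvement, 2), shares, slow)
```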
Common Issues
Very Slow Responses (> 5s):
Cause: Retrieving too many documents
Fix: Reduce topKPerSource to 30-40
Poor Reranking:
Cause: Reranking model mismatch
Fix: Verify model compatibility with your content type
High Costs:
Cause: Query expansion generating too many tokens
Fix: Limit maxTerms to 10-12
Best Practices
1. Tier Organization
✅ Tier 1: Official, verified content
✅ Tier 2: Supplementary, community content
✅ Review tier assignments quarterly
❌ Don't put everything in Tier 1
2. Query Expansion
✅ Focus on domain-specific terminology
✅ Include common abbreviations
✅ Test expansion quality
❌ Don't over-expand (diminishing returns)
3. Performance Monitoring
✅ Track reranking effectiveness
✅ Monitor latency by query type
✅ Review cost vs quality trade-offs
✅ A/B test against Cedar
❌ Don't assume it's working optimally
4. Hybrid Approach
✅ Use Cypress for critical queries
✅ Use Cedar for conversational
✅ Use Redwood for simple queries
✅ Route intelligently based on context
❌ Don't use one strategy for everything
Code Examples
Using Cypress via API
curl -X POST https://api.twig.so/api/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "explain OAuth 2.0 flow",
    "agentId": "agent-123",
    "strategyCode": "CYPRESS",
    "sessionId": "session-456",
    "enableQueryExpansion": true,
    "enableReranking": true
  }'
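The same request can be built from Python with only the standard library. This sketch mirrors the curl call field for field; `build_chat_request` is a hypothetical helper, and the request is constructed but not sent so the block runs without network access or a real API key.

```python
import json
import urllib.request


def build_chat_request(
    api_key: str, prompt: str, agent_id: str, session_id: str
) -> urllib.request.Request:
    """Build the POST request shown in the curl example."""
    payload = {
        "prompt": prompt,
        "agentId": agent_id,
        "strategyCode": "CYPRESS",
        "sessionId": session_id,
        "enableQueryExpansion": True,
        "enableReranking": True,
    }
    return urllib.request.Request(
        "https://api.twig.so/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "YOUR_API_KEY", "explain OAuth 2.0 flow", "agent-123", "session-456"
)
print(request.get_method(), request.full_url)
```

To send it, pass the request to `urllib.request.urlopen(request, timeout=30)` and decode the body with `json.load`; a generous timeout is sensible given Cypress's 3-4 second typical latency.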
Response Format
{
  "response": "OAuth 2.0 is an authorization framework...",
  "expandedQuery": "OAuth 2.0 flow, OAuth2 authorization, authentication flow, access token flow, authorization code grant",
  "sources": [
    {
      "title": "OAuth 2.0 Specification",
      "url": "https://example.com/oauth",
      "relevance": 0.97,
      "tier": 1,
      "rerankScore": 0.95
    }
  ],
  "metadata": {
    "strategy": "CYPRESS",
    "latency": 3.4,
    "tokensUsed": 2687,
    "retrievedCount": 87,
    "rerankedCount": 10,
    "expansionLatency": 0.52,
    "rerankingLatency": 0.48
  }
}
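The metadata block is enough to derive a few per-request health numbers. A sketch, assuming the latency fields are in seconds (consistent with the ~3-4 s figures above) and using a hypothetical `summarize` helper on a trimmed copy of the sample response:

```python
def summarize(response: dict) -> dict:
    """Pull a few monitoring-friendly numbers out of a Cypress response."""
    meta = response["metadata"]
    overhead = meta["expansionLatency"] + meta["rerankingLatency"]
    best = max(response["sources"], key=lambda s: s["rerankScore"])
    return {
        "top_source": best["title"],
        # Share of retrieved candidates that survived reranking
        "kept_ratio": meta["rerankedCount"] / meta["retrievedCount"],
        # Share of total latency spent on expansion plus reranking
        "overhead_share": overhead / meta["latency"],
    }


sample = {
    "sources": [
        {"title": "OAuth 2.0 Specification", "tier": 1, "rerankScore": 0.95}
    ],
    "metadata": {
        "latency": 3.4,
        "retrievedCount": 87,
        "rerankedCount": 10,
        "expansionLatency": 0.52,
        "rerankingLatency": 0.48,
    },
}
summary = summarize(sample)
print(summary)
```

For the sample response, expansion and reranking together account for roughly 29% of total latency, and about one in nine retrieved candidates reaches the final context.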
Next Steps
- Redwood Strategy - Fastest option for clear queries
- Cedar Strategy - Balanced conversational strategy
- Performance Tuning - Optimize for your use case
- Cost Optimization - Reduce expenses while maintaining quality
Last updated January 25, 2026


