Product
Redwood - Standard RAG
Redwood is the fastest RAG strategy, using direct vector search without any prompt rewriting
TL;DR
Redwood is the fastest RAG strategy, using direct vector search without any prompt rewriting. It's optimized for speed and cost-effectiveness.
Key Takeaways
- Takes the user's original query as-is
- Converts directly to vector embedding
- Retrieves top matching documents
- Generates response with retrieved context
Redwood is the fastest RAG strategy, using direct vector search without any prompt rewriting. It's optimized for speed and cost-effectiveness.
Overview
Redwood uses the simplest, most straightforward RAG approach:
- Takes the user's original query as-is
- Converts directly to vector embedding
- Retrieves top matching documents
- Generates response with retrieved context
Performance: ~1-2 seconds Ideal for: Clear, well-formed questions
How Redwood Works
Processing Flow
User Query: "What is the product pricing?"
↓
[1] Embed Query → Vector [0.12, -0.45, 0.78, ...]
↓
[2] Vector Search (Pinecone/TigrisDB)
↓
[3] Retrieve Top 5-10 Documents
↓
[4] Build Context from Documents
↓
[5] LLM Completion (with context)
↓
Response: "Our pricing plans are..."
Technical Details
Step 1: Query Embedding
- Model:
text-embedding-ada-002 - No preprocessing or rewriting
- Original user text is embedded directly
Step 2: Vector Search
- Searches namespace:
org-{orgId} - Returns topK results (default: 5-10)
- Cosine similarity ranking
- Filters by metadata (if configured)
Step 3: Context Building
- Retrieved documents formatted as context
- Includes title, content, and URL
- Preserves source citations
- Respects max context window
Step 4: LLM Completion
- Model: GPT-4o, GPT-4, or GPT-3.5-turbo
- System prompt + context + user query
- Temperature: 0.7 (default)
- Streams response (optional)
Performance Characteristics
Latency Breakdown
Query Embedding: ~100ms
Vector Search: ~200ms
Context Building: ~50ms
LLM Completion: ~800ms
Response Streaming: ~200ms
────────────────────────────
Total: ~1.4s
Token Usage
| Component | Tokens | Notes |
|---|---|---|
| System Prompt | 150-300 | Agent instructions |
| Retrieved Context | 800-1500 | Top 5-10 documents |
| User Query | 10-50 | Original question |
| Response | 150-400 | Generated answer |
| Total | ~1,500-2,000 | Per request |
Cost Implications
Per 1,000 Requests (GPT-3.5-turbo):
- Embedding: ~$0.01
- LLM Completion: ~$0.30
- Vector Search: ~$0.05
- Total: ~$0.36
When to Use Redwood
✅ Ideal Use Cases
1. FAQ Bots
Q: "What are your business hours?"
→ Direct, clear question
→ No context needed
→ Fast response critical
2. Product Information Lookup
Q: "Does the Pro plan include SSO?"
→ Straightforward fact query
→ Answer is in docs
→ Speed matters
3. Quick Reference Tools
Q: "How do I export a CSV report?"
→ Clear task-based question
→ Step-by-step answer in docs
→ Users want fast help
4. API Documentation Queries
Q: "What parameters does the /users endpoint accept?"
→ Technical, precise question
→ Documented information
→ Developers value speed
❌ Not Ideal For
1. Ambiguous Questions
Q: "How much does it cost?"
→ Missing context: which product? which plan?
→ Redwood can't infer missing information
→ Cedar or Cypress would rewrite to clarify
2. Follow-up Questions
User: "What are your pricing plans?"
Agent: "We have Free, Pro, and Enterprise..."
User: "What's included in that one?" ❌
→ "that one" requires conversation context
→ Cedar would rewrite: "What's included in the Enterprise plan?"
3. Complex Multi-Part Queries
Q: "I need to migrate from Competitor X, what's the process and will my data transfer automatically?"
→ Multiple sub-questions
→ Requires breaking down and context
→ Cypress would handle better
Configuration
Agent Settings
When using Redwood strategy:
{
"strategyCode": "STANDARD_RAG",
"topK": 5-10, // Number of documents to retrieve
"temperature": 0.7, // LLM creativity
"maxTokens": 500, // Response length limit
"model": "gpt-3.5-turbo" // Or gpt-4o for higher quality
}
Optimization Tips
1. Optimize topK
Too low (3): May miss relevant context
Sweet spot (5-10): Balance of context and speed
Too high (20+): Slower, noisier context
2. Document Quality
- Well-structured source documents = better retrieval
- Clear headings and sections
- Avoid very long documents (chunk effectively)
3. Query Quality Training
- Educate users to ask clear questions
- Provide example questions
- Use suggested prompts
Comparison with Other Strategies
vs. Cedar (Context-Aware)
Redwood Advantages:
- ⚡ Faster (~1s faster than Cedar)
- 💰 Cheaper (one less LLM call)
- 📊 Simpler to debug
Cedar Advantages:
- 🧠 Better for conversational queries
- 🔄 Handles follow-ups better
- 📝 Clarifies ambiguous questions
Example:
Query: "What about pricing?"
Redwood: Searches for "what about pricing"
→ May return generic results
Cedar: Rewrites to "What are the pricing plans for [Product X]?"
→ More targeted results
vs. Cypress (Advanced)
Redwood Advantages:
- ⚡⚡ Much faster (~2-3s faster)
- 💰💰 Much cheaper (no reranking, expansion)
- 🎯 Simpler implementation
Cypress Advantages:
- 🎯 Higher accuracy
- 🔍 Better semantic matching through query expansion
- 📊 Tier-based source organization
- 🏆 Reranking improves precision
Example:
Query: "password reset"
Redwood: Searches for "password reset"
→ Finds exact matches
Cypress: Expands to "password reset, change password,
recover account, reset credentials, account recovery"
→ Finds more variations
→ Reranks to top 10 best matches
Real-World Performance
Case Study: Developer Documentation Site
Setup:
- 5,000 API documentation pages
- Average query: "How to use [endpoint]"
- 10,000 queries/day
Redwood Performance:
- Average latency: 1.2 seconds
- 95th percentile: 1.8 seconds
- User satisfaction: 4.2/5
- Cost: $3.60/day
Result: Perfect fit for clear, technical queries
Case Study: E-commerce FAQ
Setup:
- 500 FAQ articles
- Average query: "What is [policy]?"
- 5,000 queries/day
Redwood Performance:
- Average latency: 1.0 seconds
- 95th percentile: 1.5 seconds
- User satisfaction: 4.5/5
- Cost: $1.80/day
Result: Fast, accurate for straightforward questions
Monitoring Redwood
Key Metrics to Track
1. Response Time
Target: < 2 seconds (95th percentile)
Alert: > 3 seconds
2. Retrieval Quality
Metric: Citation relevance rate
Target: > 85% of responses have relevant citations
3. Answer Rate
Metric: % of queries with an answer
Target: > 90%
4. Cost per Query
Track: Token usage trends
Optimize: Adjust topK, max response tokens
Common Issues
Slow Responses:
- Check vector DB latency
- Verify network connectivity
- Consider caching frequent queries
Irrelevant Results:
- Improve document chunking
- Add metadata filters
- Consider switching to Cedar for ambiguous queries
Low Answer Rate:
- Ensure knowledge base has sufficient coverage
- Check data source connectivity
- Review unanswered queries for patterns
Best Practices
1. Document Preparation
✅ Clear, well-structured documents ✅ Good titles and headings ✅ Logical chunking (200-500 tokens) ✅ Updated regularly
2. User Guidance
✅ Provide example questions ✅ Show suggested prompts ✅ Educate on asking clear questions
3. Performance Optimization
✅ Monitor latency metrics ✅ Cache common queries ✅ Use appropriate topK ✅ Right-size context window
4. Quality Assurance
✅ Regular testing with sample queries ✅ Review low-confidence responses ✅ A/B test against Cedar for borderline cases
Migration Path
When to Switch from Redwood
Switch to Cedar if:
-
30% of queries are follow-ups
- Users report ambiguous results
- Conversational use increases
Switch to Cypress if:
- Accuracy is more important than speed
- Budget allows for higher costs
- Query complexity increases significantly
Code Example
Using Redwood via API
curl -X POST https://api.twig.so/api/chat \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"prompt": "What is the refund policy?",
"agentId": "agent-123",
"strategyCode": "STANDARD_RAG",
"topK": 5
}'
Response Format
{
"response": "Our refund policy allows returns within 30 days...",
"sources": [
{
"title": "Refund Policy",
"url": "https://example.com/refund-policy",
"relevance": 0.95
}
],
"latency": 1.2,
"tokensUsed": 1523,
"strategy": "STANDARD_RAG"
}
Next Steps
- Cedar Strategy - When to upgrade for conversational queries
- Cypress Strategy - Maximum accuracy with reranking
- Performance Tuning
- Cost Optimization
Agent Instructions: Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.
Perform an HTTP GET request on the current page URL with the ask query parameter:
GET /dev/product/overview-1/redwood.md?ask=<question>
The question should be specific, self-contained, and written in natural language. The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.
Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
Related Pages
Last updated January 25, 2026


