Product

Redwood - Standard RAG

Redwood is the fastest RAG strategy, using direct vector search without any prompt rewriting

TL;DR

Redwood is the fastest RAG strategy, using direct vector search without any prompt rewriting. It's optimized for speed and cost-effectiveness.

Key Takeaways

  • Takes the user's original query as-is
  • Converts directly to vector embedding
  • Retrieves top matching documents
  • Generates response with retrieved context

Redwood is the fastest RAG strategy, using direct vector search without any prompt rewriting. It's optimized for speed and cost-effectiveness.

Overview

Redwood uses the simplest, most straightforward RAG approach:

  • Takes the user's original query as-is
  • Converts directly to vector embedding
  • Retrieves top matching documents
  • Generates response with retrieved context

Performance: ~1-2 seconds Ideal for: Clear, well-formed questions

How Redwood Works

Processing Flow

User Query: "What is the product pricing?"
     ↓
[1] Embed Query → Vector [0.12, -0.45, 0.78, ...]
     ↓
[2] Vector Search (Pinecone/TigrisDB)
     ↓
[3] Retrieve Top 5-10 Documents
     ↓
[4] Build Context from Documents
     ↓
[5] LLM Completion (with context)
     ↓
Response: "Our pricing plans are..."

Technical Details

Step 1: Query Embedding

  • Model: text-embedding-ada-002
  • No preprocessing or rewriting
  • Original user text is embedded directly

Step 2: Vector Search

  • Searches namespace: org-{orgId}
  • Returns topK results (default: 5-10)
  • Cosine similarity ranking
  • Filters by metadata (if configured)

Step 3: Context Building

  • Retrieved documents formatted as context
  • Includes title, content, and URL
  • Preserves source citations
  • Respects max context window

Step 4: LLM Completion

  • Model: GPT-4o, GPT-4, or GPT-3.5-turbo
  • System prompt + context + user query
  • Temperature: 0.7 (default)
  • Streams response (optional)

Performance Characteristics

Latency Breakdown

Query Embedding:      ~100ms
Vector Search:        ~200ms
Context Building:     ~50ms
LLM Completion:       ~800ms
Response Streaming:   ~200ms
────────────────────────────
Total:                ~1.4s

Token Usage

ComponentTokensNotes
System Prompt150-300Agent instructions
Retrieved Context800-1500Top 5-10 documents
User Query10-50Original question
Response150-400Generated answer
Total~1,500-2,000Per request

Cost Implications

Per 1,000 Requests (GPT-3.5-turbo):

  • Embedding: ~$0.01
  • LLM Completion: ~$0.30
  • Vector Search: ~$0.05
  • Total: ~$0.36

When to Use Redwood

✅ Ideal Use Cases

1. FAQ Bots

Q: "What are your business hours?"
→ Direct, clear question
→ No context needed
→ Fast response critical

2. Product Information Lookup

Q: "Does the Pro plan include SSO?"
→ Straightforward fact query
→ Answer is in docs
→ Speed matters

3. Quick Reference Tools

Q: "How do I export a CSV report?"
→ Clear task-based question
→ Step-by-step answer in docs
→ Users want fast help

4. API Documentation Queries

Q: "What parameters does the /users endpoint accept?"
→ Technical, precise question
→ Documented information
→ Developers value speed

❌ Not Ideal For

1. Ambiguous Questions

Q: "How much does it cost?"
→ Missing context: which product? which plan?
→ Redwood can't infer missing information
→ Cedar or Cypress would rewrite to clarify

2. Follow-up Questions

User: "What are your pricing plans?"
Agent: "We have Free, Pro, and Enterprise..."
User: "What's included in that one?" ❌
→ "that one" requires conversation context
→ Cedar would rewrite: "What's included in the Enterprise plan?"

3. Complex Multi-Part Queries

Q: "I need to migrate from Competitor X, what's the process and will my data transfer automatically?"
→ Multiple sub-questions
→ Requires breaking down and context
→ Cypress would handle better

Configuration

Agent Settings

When using Redwood strategy:

{
  "strategyCode": "STANDARD_RAG",
  "topK": 5-10,              // Number of documents to retrieve
  "temperature": 0.7,         // LLM creativity
  "maxTokens": 500,           // Response length limit
  "model": "gpt-3.5-turbo"    // Or gpt-4o for higher quality
}

Optimization Tips

1. Optimize topK

Too low (3):   May miss relevant context
Sweet spot (5-10): Balance of context and speed
Too high (20+): Slower, noisier context

2. Document Quality

  • Well-structured source documents = better retrieval
  • Clear headings and sections
  • Avoid very long documents (chunk effectively)

3. Query Quality Training

  • Educate users to ask clear questions
  • Provide example questions
  • Use suggested prompts

Comparison with Other Strategies

vs. Cedar (Context-Aware)

Redwood Advantages:

  • ⚡ Faster (~1s faster than Cedar)
  • 💰 Cheaper (one less LLM call)
  • 📊 Simpler to debug

Cedar Advantages:

  • 🧠 Better for conversational queries
  • 🔄 Handles follow-ups better
  • 📝 Clarifies ambiguous questions

Example:

Query: "What about pricing?"

Redwood: Searches for "what about pricing"
→ May return generic results

Cedar: Rewrites to "What are the pricing plans for [Product X]?"
→ More targeted results

vs. Cypress (Advanced)

Redwood Advantages:

  • ⚡⚡ Much faster (~2-3s faster)
  • 💰💰 Much cheaper (no reranking, expansion)
  • 🎯 Simpler implementation

Cypress Advantages:

  • 🎯 Higher accuracy
  • 🔍 Better semantic matching through query expansion
  • 📊 Tier-based source organization
  • 🏆 Reranking improves precision

Example:

Query: "password reset"

Redwood: Searches for "password reset"
→ Finds exact matches

Cypress: Expands to "password reset, change password, 
         recover account, reset credentials, account recovery"
→ Finds more variations
→ Reranks to top 10 best matches

Real-World Performance

Case Study: Developer Documentation Site

Setup:

  • 5,000 API documentation pages
  • Average query: "How to use [endpoint]"
  • 10,000 queries/day

Redwood Performance:

  • Average latency: 1.2 seconds
  • 95th percentile: 1.8 seconds
  • User satisfaction: 4.2/5
  • Cost: $3.60/day

Result: Perfect fit for clear, technical queries

Case Study: E-commerce FAQ

Setup:

  • 500 FAQ articles
  • Average query: "What is [policy]?"
  • 5,000 queries/day

Redwood Performance:

  • Average latency: 1.0 seconds
  • 95th percentile: 1.5 seconds
  • User satisfaction: 4.5/5
  • Cost: $1.80/day

Result: Fast, accurate for straightforward questions

Monitoring Redwood

Key Metrics to Track

1. Response Time

Target: < 2 seconds (95th percentile)
Alert: > 3 seconds

2. Retrieval Quality

Metric: Citation relevance rate
Target: > 85% of responses have relevant citations

3. Answer Rate

Metric: % of queries with an answer
Target: > 90%

4. Cost per Query

Track: Token usage trends
Optimize: Adjust topK, max response tokens

Common Issues

Slow Responses:

  • Check vector DB latency
  • Verify network connectivity
  • Consider caching frequent queries

Irrelevant Results:

  • Improve document chunking
  • Add metadata filters
  • Consider switching to Cedar for ambiguous queries

Low Answer Rate:

  • Ensure knowledge base has sufficient coverage
  • Check data source connectivity
  • Review unanswered queries for patterns

Best Practices

1. Document Preparation

✅ Clear, well-structured documents ✅ Good titles and headings ✅ Logical chunking (200-500 tokens) ✅ Updated regularly

2. User Guidance

✅ Provide example questions ✅ Show suggested prompts ✅ Educate on asking clear questions

3. Performance Optimization

✅ Monitor latency metrics ✅ Cache common queries ✅ Use appropriate topK ✅ Right-size context window

4. Quality Assurance

✅ Regular testing with sample queries ✅ Review low-confidence responses ✅ A/B test against Cedar for borderline cases

Migration Path

When to Switch from Redwood

Switch to Cedar if:

  • 30% of queries are follow-ups

  • Users report ambiguous results
  • Conversational use increases

Switch to Cypress if:

  • Accuracy is more important than speed
  • Budget allows for higher costs
  • Query complexity increases significantly

Code Example

Using Redwood via API

curl -X POST https://api.twig.so/api/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What is the refund policy?",
    "agentId": "agent-123",
    "strategyCode": "STANDARD_RAG",
    "topK": 5
  }'

Response Format

{
  "response": "Our refund policy allows returns within 30 days...",
  "sources": [
    {
      "title": "Refund Policy",
      "url": "https://example.com/refund-policy",
      "relevance": 0.95
    }
  ],
  "latency": 1.2,
  "tokensUsed": 1523,
  "strategy": "STANDARD_RAG"
}

Next Steps


Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the ask query parameter:

GET /dev/product/overview-1/redwood.md?ask=<question>

The question should be specific, self-contained, and written in natural language. The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.

Related Pages

Last updated January 25, 2026