Rate Limits

Understand API rate limits and best practices for optimizing your usage.

Overview

Rate limits prevent abuse and ensure fair usage for all customers. Twig AI enforces limits on:

API requests per minute
API requests per day
Tokens processed per day
Concurrent requests

Rate Limit Tiers

Plan	Requests/Minute	Requests/Day	Tokens/Day	Concurrent
Free	20	1,000	100,000	2
Pro	100	10,000	1,000,000	10
Enterprise	1,000+	Custom	Custom	50+
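
The per-minute and per-day limits interact: a client that sends at the full per-minute rate uses up its daily quota well before the day ends. The sketch below works through that arithmetic with the Free and Pro figures from the table. It assumes the daily quota covers a 24-hour period, which this page does not state explicitly, and the function names are illustrative rather than part of any Twig SDK.

```javascript
// Limits copied from the table above (Enterprise is custom, so it is omitted).
const PLAN_LIMITS = {
  free: { perMinute: 20, perDay: 1000 },
  pro: { perMinute: 100, perDay: 10000 },
};

// Assumption: the daily quota spans 24 hours.
const MINUTES_PER_DAY = 24 * 60;

// Minutes until the daily quota is gone when sending at the full per-minute rate.
function minutesToExhaustDailyQuota({ perMinute, perDay }) {
  return perDay / perMinute;
}

// Average requests per minute that can be sustained for the whole day.
function sustainableRatePerMinute({ perDay }) {
  return perDay / MINUTES_PER_DAY;
}

console.log(minutesToExhaustDailyQuota(PLAN_LIMITS.pro)); // 100
console.log(sustainableRatePerMinute(PLAN_LIMITS.pro).toFixed(2)); // 6.94
```

On Pro, bursting at 100 requests per minute exhausts the daily quota in 100 minutes, so long-running workloads need to average about 7 requests per minute.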

Rate Limit Headers

Every API response includes rate limit information:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1640995200
X-RateLimit-Used: 5

Header Meanings:

Limit: Total requests allowed in window
Remaining: Requests left in current window
Reset: Unix timestamp when limit resets
Used: Requests consumed in window
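
As a rough illustration of how a client might use these headers, the sketch below converts them to numbers and derives how long to wait until the window resets. `readRateLimit` is a hypothetical helper, not an SDK function; it accepts any object with a `get(name)` method, such as the `Headers` object on a `fetch` response.

```javascript
// Hypothetical helper: parses the four rate limit headers and derives the wait time.
function readRateLimit(headers, nowMs = Date.now()) {
  const num = (name) => Number(headers.get(name));
  const resetAtMs = num('X-RateLimit-Reset') * 1000; // header value is in Unix seconds
  return {
    limit: num('X-RateLimit-Limit'),
    remaining: num('X-RateLimit-Remaining'),
    used: num('X-RateLimit-Used'),
    // Never negative, even if the reset time has already passed.
    msUntilReset: Math.max(0, resetAtMs - nowMs),
  };
}

// Example using the header values shown above. A Map stands in for a real
// response's headers here; unlike fetch Headers, Map keys are case-sensitive.
const exampleHeaders = new Map([
  ['X-RateLimit-Limit', '100'],
  ['X-RateLimit-Remaining', '95'],
  ['X-RateLimit-Reset', '1640995200'],
  ['X-RateLimit-Used', '5'],
]);

// Pretend "now" is 30 seconds before the reset time.
const info = readRateLimit(exampleHeaders, 1640995170 * 1000);
console.log(info.remaining, info.msUntilReset); // 95 30000
```

A client could pause for `msUntilReset` once `remaining` reaches zero instead of waiting for a 429 response.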

Rate Limit Exceeded

When you exceed any of these limits, the API rejects the request with the following error:

Response:

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Retry after 45 seconds.",
    "retryAfter": 45
  },
  "status": 429
}

Status Code: 429 Too Many Requests

Handling Rate Limits

Exponential Backoff

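// Resolves after `ms` milliseconds; used to pause between retries.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retries fn on 429 errors, doubling the delay each attempt (1s, 2s, 4s, ...) up to 32s.
async function requestWithBackoff(fn, maxRetries = 5) {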
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (error) {
      if (error.status === 429 && i < maxRetries - 1) {
        const delay = Math.min(1000 * Math.pow(2, i), 32000);
        await sleep(delay);
        continue;
      }
      throw error;
    }
  }
}

// Usage
const response = await requestWithBackoff(() => 
  twig.chat.create({ prompt, agentId })
);

Respect Retry-After Header

async function handleRateLimit(error) {
  if (error.status === 429) {
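    // retryAfter is in seconds, as in the error body above; fall back to 60s if it is missing.
    const retryAfter = error.retryAfter || 60;
    console.log(`Rate limited. Waiting ${retryAfter}s...`);
    await new Promise((resolve) => setTimeout(resolve, retryAfter * 1000));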
    return true; // Retry
  }
  return false;
}
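
The two strategies can be combined: wait for the server-supplied `retryAfter` when it is present, and fall back to exponential backoff when it is not. The sketch below also adds random jitter to the fallback, a common addition that this page does not itself prescribe, so that many clients do not retry at the same instant. `retryDelayMs` is a hypothetical helper name.

```javascript
// Hypothetical helper: chooses how long to wait before the next retry.
// `attempt` is zero-based; `random` is injectable so the result can be tested.
function retryDelayMs(error, attempt, random = Math.random) {
  // Prefer the server's hint (seconds) when it is a positive number.
  if (Number.isFinite(error.retryAfter) && error.retryAfter > 0) {
    return error.retryAfter * 1000;
  }
  // Otherwise back off exponentially: 1s, 2s, 4s, ... capped at 32s.
  const backoff = Math.min(1000 * 2 ** attempt, 32000);
  // Full jitter: pick a random point in [0, backoff).
  return Math.floor(random() * backoff);
}

console.log(retryDelayMs({ status: 429, retryAfter: 45 }, 0)); // 45000
console.log(retryDelayMs({ status: 429 }, 3, () => 0.5)); // 4000
```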

Queue-Based Processing

import Queue from 'bull'; // Bull exposes Queue as its default export

const chatQueue = new Queue('twig-chat', {
  limiter: {
    max: 100,        // Max jobs per interval
    duration: 60000  // 1 minute
  }
});

// Add job
await chatQueue.add('chat', {
  prompt: 'What is pricing?',
  agentId: 'agent-123'
});

// Process with rate limiting
chatQueue.process('chat', async (job) => {
  return await twig.chat.create(job.data);
});

Optimization Strategies

1. Caching

Cache responses for common queries:

const cache = new Map();
const CACHE_TTL = 300000; // 5 minutes

async function getCachedResponse(prompt, agentId) {
  const key = `${agentId}:${prompt}`;
  const cached = cache.get(key);
  
  if (cached && Date.now() - cached.timestamp < CACHE_TTL) {
    return cached.response;
  }
  
  const response = await twig.chat.create({ prompt, agentId });
  cache.set(key, { response, timestamp: Date.now() });
  
  return response;
}
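
The `Map` in the example above never removes entries, so it grows with every distinct prompt. One way to bound it is sketched below, assuming that evicting the oldest-inserted entry is acceptable. `TtlCache` and its options are hypothetical names, not part of any Twig SDK.

```javascript
// Hypothetical bounded cache: entries expire after ttlMs and the entry count is capped.
class TtlCache {
  constructor({ ttlMs, maxEntries, now = Date.now }) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now; // injectable clock, useful in tests
    this.entries = new Map(); // Map iterates in insertion order, oldest first
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.timestamp >= this.ttlMs) {
      this.entries.delete(key); // drop expired entries on read
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.delete(key); // re-inserting moves the key to the newest position
    if (this.entries.size >= this.maxEntries) {
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey); // evict the oldest-inserted entry
    }
    this.entries.set(key, { value, timestamp: this.now() });
  }
}

const responseCache = new TtlCache({ ttlMs: 300000, maxEntries: 1000 });
responseCache.set('agent-123:What is pricing?', { text: 'example response' });
console.log(responseCache.get('agent-123:What is pricing?').text); // example response
```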

2. Batching

Batch multiple requests:

const requests = [
  { prompt: 'Query 1', agentId: 'agent-123' },
  { prompt: 'Query 2', agentId: 'agent-123' },
  { prompt: 'Query 3', agentId: 'agent-123' }
];

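// Process in batches. Promise.all sends a whole batch at once, so keep
// batchSize at or below your plan's concurrent request limit (10 on Pro).
const batchSize = 10;
for (let i = 0; i < requests.length; i += batchSize) {
  const batch = requests.slice(i, i + batchSize);
  await Promise.all(batch.map((req) => twig.chat.create(req)));

  // Wait one second between batches
  if (i + batchSize < requests.length) {
    await new Promise((resolve) => setTimeout(resolve, 1000));
  }
}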
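
Fixed batches wait for the slowest request in each batch before the next batch starts. An alternative, sketched here under the assumption that results should keep their input order, is to keep a fixed number of requests in flight at all times. `mapWithConcurrency` is a hypothetical helper, not an SDK function.

```javascript
// Runs fn over items with at most `limit` calls in flight; results keep input order.
async function mapWithConcurrency(items, limit, fn) {
  const results = new Array(items.length);
  let next = 0;

  // Each worker repeatedly claims the next unprocessed index until none remain.
  async function worker() {
    while (next < items.length) {
      const index = next++; // safe: no await between the check and the increment
      results[index] = await fn(items[index], index);
    }
  }

  const workers = Array.from({ length: Math.min(limit, items.length) }, worker);
  await Promise.all(workers); // rejects as soon as any call to fn rejects
  return results;
}

// Usage, assuming the `twig` client and `requests` array from the example above,
// with the limit set to the Free plan's 2 concurrent requests:
// const responses = await mapWithConcurrency(requests, 2, (req) => twig.chat.create(req));
```

This caps concurrency only. It does not pace requests per minute, so it still needs to be paired with retry handling for 429 responses.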

3. Request Prioritization

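class PriorityQueue {
  jobs = { HIGH: [], LOW: [] };
  running = false;
  // request is a function that returns a promise; add() resolves with its result.
  add(request, { priority = 'LOW' } = {}) {
    return new Promise((resolve, reject) => {
      this.jobs[priority].push({ request, resolve, reject });
      if (!this.running) this.drain();
    });
  }
  // Runs one job at a time, HIGH before LOW; pace calls here to stay within rate limits.
  async drain() {
    this.running = true;
    let job;
    while ((job = this.jobs.HIGH.shift() ?? this.jobs.LOW.shift())) {
      await Promise.resolve().then(() => job.request()).then(job.resolve, job.reject);
    }
    this.running = false;
  }
}

const queue = new PriorityQueue();
await queue.add(() => twig.chat.create(userRequest), { priority: 'HIGH' }); // Critical user-facing request
await queue.add(() => twig.chat.create(analyticsRequest), { priority: 'LOW' }); // Background analytics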

4. Distributed Rate Limiting

For multi-server deployments:

import Redis from 'ioredis';

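const redis = new Redis();

// Thrown by checkRateLimit when a user exceeds 100 requests in the current window.
class RateLimitError extends Error {}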

async function checkRateLimit(userId) {
  const key = `ratelimit:${userId}`;
  const current = await redis.incr(key);
  
  if (current === 1) {
    await redis.expire(key, 60); // 1 minute window
  }
  
  if (current > 100) {
    throw new RateLimitError('Limit exceeded');
  }
  
  return current;
}

Monitoring Usage

Track Consumption

const usage = await twig.usage.get({
  startDate: '2024-01-01',
  endDate: '2024-01-31'
});

console.log(usage.totalRequests);
console.log(usage.totalTokens);
console.log(usage.avgRequestsPerDay);

Set Alerts

Configure alerts for high usage. The thresholds below are set to 80% of the Pro plan's daily and per-minute limits:

{
  "alerts": {
    "dailyRequests": {
      "threshold": 8000,
      "notify": "team@company.com"
    },
    "minuteRequests": {
      "threshold": 80,
      "notify": "alerts@company.com"
    }
  }
}

Upgrading Plans

Need higher limits?

View current usage: Settings → Usage
Compare your usage with your plan's limits
Upgrade plan: Settings → Billing
New limits apply immediately

Enterprise Custom Limits:

Contact sales@twig.so
Custom rate limits
Dedicated infrastructure
SLA guarantees

Best Practices

1. Stay Within Limits

✅ Monitor usage regularly
✅ Implement retry logic
✅ Cache when possible
✅ Use exponential backoff
❌ Don't spam the API

2. Optimize Requests

✅ Batch related operations
✅ Use streaming for long responses
✅ Cache frequent queries
✅ Filter unnecessary requests
❌ Don't poll excessively

3. Plan for Growth

✅ Monitor usage trends
✅ Upgrade proactively
✅ Implement queue systems
✅ Use distributed rate limiting
❌ Don't wait until you hit limits

Troubleshooting

Frequently Hitting Limits

Solutions:

Upgrade plan
Implement caching
Optimize request patterns
Use batch processing
Contact support for custom limits

Unexpected Rate Limit Errors

Check:

Are you within the stated limits for your plan?
Are multiple servers sharing one API key?
Is retry logic causing retry loops?
Are background jobs consuming quota?

Next Steps

REST API Overview - API basics
Authentication - Secure access
SDKs - Client libraries with built-in rate limiting
Cost Optimization - Reduce costs
