Rate Limits

API rate limits prevent abuse and ensure fair usage across all customers. The Agoralia API enforces rate limits per IP address and per API key.

Overview

Rate limits are enforced:

Per IP address (via middleware)
Per API key (via tenant limits)
Per endpoint (some endpoints may have stricter limits)

Rate limit information is included in every API response via headers, allowing you to monitor your usage and implement proper backoff strategies.

Default Limits

Global Rate Limit (IP-based)

The API enforces a global rate limit of 100 requests per minute per IP address. This limit applies to all endpoints and is enforced by middleware.

Note: This is a default limit. Actual limits may vary based on your plan and usage patterns.

Tenant Rate Limits (API Key-based)

Each tenant has rate limits that are returned in the /me endpoint response:

{
  "tenant_id": 1234,
  "workspace_name": "My Workspace",
  "plan": "pro",
  "rate_limits": {
    "rpm": 120,
    "rpd": 20000
  }
}

rpm: Requests per minute
rpd: Requests per day

Current Default Values:

rpm: 120 requests per minute
rpd: 20,000 requests per day

Note: These values are currently hardcoded defaults. Plan-based limits may be implemented in the future.
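
The rpm and rpd values can be turned into a client-side pacing interval. The following is a minimal sketch, assuming the /me payload has the shape shown above; the helper name min_request_interval is hypothetical and not part of the Agoralia API.

```python
def min_request_interval(rate_limits: dict) -> float:
    """Return the smallest gap (in seconds) between requests that
    respects both the per-minute and the per-day limit."""
    per_minute_gap = 60.0 / rate_limits["rpm"]
    per_day_gap = 86_400.0 / rate_limits["rpd"]
    # The stricter limit is the one that demands the larger gap.
    return max(per_minute_gap, per_day_gap)


# Payload shaped like the /me response shown above.
me = {
    "tenant_id": 1234,
    "plan": "pro",
    "rate_limits": {"rpm": 120, "rpd": 20000},
}
interval = min_request_interval(me["rate_limits"])
print(f"Wait at least {interval:.2f}s between requests")  # 4.32s
```

With the default values, the daily limit is the tighter one for sustained traffic: 20,000 requests per day averages to roughly 13.9 requests per minute, well below the 120 allowed in any single minute. Short bursts can still run at the per-minute rate as long as the daily total stays within rpd.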

Rate Limit Headers

Every API response includes rate limit headers:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1640995200

Header Fields

Header	Description	Example
`X-RateLimit-Limit`	Maximum requests allowed per window	`100`
`X-RateLimit-Remaining`	Requests remaining in current window	`45`
`X-RateLimit-Reset`	Unix timestamp when limit resets	`1640995200`

Note: The reset time is calculated as current time + 60 seconds (1 minute window).
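
The three headers can be read into one small structure before deciding whether to slow down. This is a sketch only: the names RateLimitStatus and parse_rate_limit_headers are hypothetical, and the code assumes the header values are plain integers as in the example above.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit: int      # X-RateLimit-Limit
    remaining: int  # X-RateLimit-Remaining
    reset_at: int   # X-RateLimit-Reset (Unix timestamp)

    def seconds_until_reset(self, now: Optional[float] = None) -> float:
        """Seconds left in the current window, never negative."""
        now = time.time() if now is None else now
        return max(0.0, float(self.reset_at) - now)


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Optional[RateLimitStatus]:
    """Return the parsed status, or None if a header is missing or malformed."""
    try:
        return RateLimitStatus(
            limit=int(headers["X-RateLimit-Limit"]),
            remaining=int(headers["X-RateLimit-Remaining"]),
            reset_at=int(headers["X-RateLimit-Reset"]),
        )
    except (KeyError, ValueError):
        return None


status = parse_rate_limit_headers({
    "X-RateLimit-Limit": "100",
    "X-RateLimit-Remaining": "45",
    "X-RateLimit-Reset": "1640995200",
})
print(status.remaining, status.seconds_until_reset(now=1640995190.0))  # 45 10.0
```

A plain dict is case-sensitive, so the keys must match exactly. The headers object on a requests response is case-insensitive and can be passed in directly.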

Handling Rate Limits

429 Too Many Requests

When you exceed the rate limit, you'll receive a 429 Too Many Requests response:

Status Code: 429

Response Body:

{
  "detail": "Too many requests. Please try again later."
}

Headers:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1640995200

Exponential Backoff

Implement exponential backoff for retries when you receive a 429 response:

Python Example:

import time
import random
import requests

def make_request_with_retry(url, headers, max_retries=5):
    """Make a GET request, retrying 429 responses with backoff and jitter"""
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)

        if response.status_code != 429:
            return response

        if attempt == max_retries - 1:
            break  # Out of retries

        # Prefer the server-provided reset time; otherwise back off exponentially
        reset_header = response.headers.get('X-RateLimit-Reset')
        if reset_header is not None:
            wait_time = max(0, int(reset_header) - int(time.time()))
        else:
            wait_time = 2 ** attempt

        # Add jitter to prevent thundering herd
        time.sleep(wait_time + random.uniform(0, 1))

    raise Exception("Rate limit exceeded, max retries reached")

Node.js Example:

const axios = require('axios');

async function makeRequestWithRetry(url, headers, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      const response = await axios.get(url, { headers });
      return response;
    } catch (error) {
      if (error.response?.status === 429) {
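        // Fall back to a 60-second wait if the reset header is missing or malformed
        const nowSeconds = Math.floor(Date.now() / 1000);
        const parsedReset = parseInt(error.response.headers['x-ratelimit-reset'], 10);
        const resetTime = Number.isNaN(parsedReset) ? nowSeconds + 60 : parsedReset;
        const waitTime = Math.max(0, resetTime - nowSeconds);
        const jitter = Math.random();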
        
        if (attempt < maxRetries - 1) {
          await new Promise(resolve => setTimeout(resolve, (waitTime + jitter) * 1000));
          continue;
        }
      }
      throw error;
    }
  }
}

Retry-After Header

Some rate limit responses may include a Retry-After header indicating when to retry:

Retry-After: 2

This header specifies the number of seconds to wait before retrying.
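
A client can prefer Retry-After when it is present and fall back to X-RateLimit-Reset otherwise. The sketch below is one way to do that; the function name retry_delay is hypothetical. The documentation above describes Retry-After as a number of seconds. The HTTP specification also permits an HTTP-date in this header, so the sketch accepts both forms as a precaution, and the final 60-second fallback mirrors the one-minute window described earlier.

```python
import time
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def retry_delay(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how many seconds to wait before retrying a 429 response."""
    now = time.time() if now is None else now

    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        value = retry_after.strip()
        if value.isascii() and value.isdigit():
            return float(value)  # delay given directly in seconds
        try:
            # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
            return max(0.0, parsedate_to_datetime(value).timestamp() - now)
        except (TypeError, ValueError):
            pass  # unparseable value: fall through to the reset header

    reset = headers.get("X-RateLimit-Reset")
    if reset is not None and reset.strip().isdigit():
        return max(0.0, float(reset) - now)

    return 60.0  # assumed default: one full window


print(retry_delay({"Retry-After": "2"}))  # 2.0
```

Lookups on a plain dict are case-sensitive; the headers object of a requests response is case-insensitive and works as-is.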

Best Practices

✅ Monitor Rate Limit Headers

Always check rate limit headers to monitor your usage:

import time
import requests

response = requests.get(url, headers=headers)
# Default to the limit when a header is missing so we don't throttle needlessly
limit = int(response.headers.get('X-RateLimit-Limit', 100))
remaining = int(response.headers.get('X-RateLimit-Remaining', limit))

if remaining < 10:
    # Slow down requests
    time.sleep(1)

✅ Implement Exponential Backoff

Use exponential backoff with jitter to avoid overwhelming the API:

import time
import random

def backoff(attempt):
    """Exponential backoff with jitter"""
    base_delay = 2 ** attempt
    jitter = random.uniform(0, 1)
    return base_delay + jitter

# Usage (make_request and RateLimitError are placeholders for your own code)
max_retries = 5
for attempt in range(max_retries):
    try:
        response = make_request()
        break
    except RateLimitError:
        if attempt == max_retries - 1:
            raise  # Out of retries: surface the error
        time.sleep(backoff(attempt))

✅ Batch Operations

When possible, batch multiple operations into a single request:

# ❌ Bad: Multiple requests
for lead in leads:
    create_lead(lead)  # 100 requests

# ✅ Good: Batch request
create_leads(leads)  # 1 request
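
Large collections usually have to be split into several batch requests. The sketch below shows a generic chunking helper; chunked is a hypothetical name, and the batch size of 100 is an assumption for illustration, not a documented limit of any Agoralia endpoint.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


leads = [{"name": f"Lead {i}"} for i in range(250)]
batches = list(chunked(leads, 100))
print([len(batch) for batch in batches])  # [100, 100, 50]
```

Each batch would then go out as a single call, for example create_leads(batch), so 250 leads cost 3 requests instead of 250.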

✅ Cache Responses

Cache responses to reduce API calls:

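import time
import requests

BASE_URL = "https://api.agoralia.app/api/v1"
_agent_cache = {}  # agent_id -> (fetched_at, data)

def get_agent(agent_id, headers, cache_time=300):
    """Get an agent, reusing a cached copy for up to cache_time seconds"""
    cached = _agent_cache.get(agent_id)
    if cached and time.time() - cached[0] < cache_time:
        return cached[1]
    data = requests.get(f"{BASE_URL}/agents/{agent_id}", headers=headers).json()
    _agent_cache[agent_id] = (time.time(), data)
    return data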

✅ Use Webhooks Instead of Polling

Instead of polling for updates, use webhooks when available:

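# ❌ Bad: Polling every 5 seconds
while True:
    check_for_updates()
    time.sleep(5)

# ✅ Good: Use webhooks (Flask shown here)
from flask import Flask, request

app = Flask(__name__)

@app.route('/webhook', methods=['POST'])
def handle_webhook():
    process_update(request.json)
    return '', 204  # Acknowledge receipt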

Rate Limit Strategies

1. Preemptive Rate Limiting

Monitor your usage and slow down before hitting limits:

import time

class RateLimiter:
    def __init__(self, max_per_minute=100):
        self.max_per_minute = max_per_minute
        self.requests = []
    
    def wait_if_needed(self):
        now = time.time()
        # Remove requests older than 1 minute
        self.requests = [t for t in self.requests if now - t < 60]
        
        if len(self.requests) >= self.max_per_minute:
            # Wait until oldest request expires
            sleep_time = 60 - (now - self.requests[0])
            if sleep_time > 0:
                time.sleep(sleep_time)
        
        # Record the actual send time, which is later than `now` if we slept
        self.requests.append(time.time())

2. Request Throttling

Implement request throttling to stay within rate limits:

import time
from collections import deque

class RequestThrottler:
    def __init__(self, max_per_minute=100):
        self.max_per_minute = max_per_minute
        self.requests = deque()
    
    def wait_if_needed(self):
        now = time.time()
        # Remove requests older than 1 minute
        while self.requests and self.requests[0] < now - 60:
            self.requests.popleft()
        
        if len(self.requests) >= self.max_per_minute:
            # Wait until oldest request expires
            sleep_time = 60 - (now - self.requests[0])
            if sleep_time > 0:
                time.sleep(sleep_time)
        
        self.requests.append(time.time())

Increasing Limits

Current Status

Rate limits are currently set to default values. Plan-based limits may be implemented in the future.

Requesting Higher Limits

If you need higher rate limits:

1. Contact Support: Reach out to Agoralia support with your use case
2. Provide Details:
   - Expected request volume (requests per minute/day)
   - Use case description
   - Integration type (webhook, polling, batch processing)
3. Wait for Approval: Custom limits are typically reviewed within 24-48 hours

Troubleshooting

Constantly Hitting Rate Limits

Problem: Frequently receiving 429 Too Many Requests errors.

Solutions:

Implement exponential backoff
Reduce request frequency
Batch operations when possible
Cache responses
Use webhooks instead of polling
Contact support to request higher limits

Rate Limit Headers Missing

Problem: Rate limit headers not present in responses.

Possible causes:

Using an older API version
Headers are only added when rate limiting is active

Solution: Headers should be present in all responses. If missing, contact support.

Inconsistent Rate Limit Behavior

Problem: Rate limits seem inconsistent or unpredictable.

Possible causes:

Multiple IP addresses making requests
The global limit is counted per IP address, not per API key, so all keys behind one IP share the same budget
Shared IP addresses (e.g., corporate proxy)

Solution:

Use a consistent IP address or API key
Monitor rate limit headers
Implement proper backoff strategies

Rate Limit Examples

Python Client with Rate Limiting

import time
import requests

class AgoraliaClient:
    def __init__(self, api_key: str):
        self.api_key = api_key
        self.base_url = "https://api.agoralia.app/api/v1"
        self.last_request_time = 0
        self.min_request_interval = 0.5  # 500ms between requests
    
    def _make_request(self, method: str, endpoint: str, **kwargs):
        """Make request with rate limiting"""
        # Enforce minimum interval
        elapsed = time.time() - self.last_request_time
        if elapsed < self.min_request_interval:
            time.sleep(self.min_request_interval - elapsed)
        
        url = f"{self.base_url}{endpoint}"
        headers = {
            "X-API-Key": self.api_key,
            "Content-Type": "application/json"
        }
        
        response = requests.request(method, url, headers=headers, **kwargs)
        self.last_request_time = time.time()
        
        if response.status_code == 429:
            # Wait for the window to reset, then retry once
            reset_time = int(response.headers.get('X-RateLimit-Reset', time.time() + 60))
            time.sleep(max(0, reset_time - int(time.time())))
            response = requests.request(method, url, headers=headers, **kwargs)
            self.last_request_time = time.time()
        
        # Raise for any error status, including a second 429
        response.raise_for_status()
        
        # Slow down if approaching the limit
        remaining = int(response.headers.get('X-RateLimit-Remaining', 100))
        if remaining < 10:
            time.sleep(1)
        
        return response
    
    def get_agents(self):
        return self._make_request("GET", "/agents").json()
    
    def create_campaign(self, data):
        return self._make_request("POST", "/campaigns", json=data).json()

Next Steps

Error Handling - Handle API errors gracefully
Idempotency - Ensure safe retries
Endpoints Reference - See available endpoints
