Rate Limits
API rate limits prevent abuse and ensure fair usage across all customers. The Agoralia API enforces rate limits per IP address and per API key.
Overview
Rate limits are enforced:
- Per IP address (via middleware)
- Per API key (via tenant limits)
- Per endpoint (some endpoints may have stricter limits)
Rate limit information is included in every API response via headers, allowing you to monitor your usage and implement proper backoff strategies.
Default Limits
Global Rate Limit (IP-based)
The API enforces a global rate limit of 100 requests per minute per IP address. This limit applies to all endpoints and is enforced by middleware.
Note: This is a default limit. Actual limits may vary based on your plan and usage patterns.
Tenant Rate Limits (API Key-based)
Each tenant has rate limits that are returned in the /me endpoint response:
{
  "tenant_id": 1234,
  "workspace_name": "My Workspace",
  "plan": "pro",
  "rate_limits": {
    "rpm": 120,
    "rpd": 20000
  }
}
- rpm: Requests per minute
- rpd: Requests per day
Current Default Values:
- rpm: 120 requests per minute
- rpd: 20,000 requests per day
Note: These values are currently hardcoded defaults. Plan-based limits may be implemented in the future.
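To see how these limits interact, the sketch below reads a payload in the shape of the /me response above and works out the steady request rate a single client could hold all day. It assumes the global IP limit of 100 requests per minute also applies and that the daily limit covers a 24-hour window; the helper name `sustained_rpm` is hypothetical and not part of the API.

```python
import json

# Example payload in the shape of the /me response shown above.
ME_RESPONSE = json.loads("""
{
  "tenant_id": 1234,
  "workspace_name": "My Workspace",
  "plan": "pro",
  "rate_limits": {"rpm": 120, "rpd": 20000}
}
""")

MINUTES_PER_DAY = 24 * 60


def sustained_rpm(me: dict, ip_rpm: int = 100) -> float:
    """Highest steady requests-per-minute rate that respects every limit.

    ip_rpm is the default global IP limit; treat it as an assumption.
    """
    limits = me["rate_limits"]
    daily_as_rpm = limits["rpd"] / MINUTES_PER_DAY  # daily budget spread evenly
    return min(ip_rpm, limits["rpm"], daily_as_rpm)


rate = sustained_rpm(ME_RESPONSE)
print(f"Sustained rate: {rate:.1f} requests/minute")
```

With the default values, the daily limit is the binding one for continuous traffic: 20,000 requests spread over 1,440 minutes is about 13.9 requests per minute, well under either per-minute limit.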
Rate Limit Headers
Every API response includes rate limit headers:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1640995200
Header Fields
| Header | Description | Example |
|---|---|---|
| X-RateLimit-Limit | Maximum requests allowed per window | 100 |
| X-RateLimit-Remaining | Requests remaining in current window | 45 |
| X-RateLimit-Reset | Unix timestamp when limit resets | 1640995200 |
Note: The reset time is calculated as current time + 60 seconds (1 minute window).
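A small helper can turn these headers into values that are easier to act on. The sketch below is illustrative: the `RateLimitStatus` class and `parse_rate_limit_headers` function are hypothetical names, and it assumes the headers arrive as a plain dictionary with the capitalization shown above (HTTP libraries such as `requests` expose a case-insensitive mapping instead).

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit: int
    remaining: int
    reset_at: int  # Unix timestamp

    def seconds_until_reset(self, now: Optional[float] = None) -> float:
        """Seconds left in the current window, never negative."""
        now = time.time() if now is None else now
        return max(0.0, float(self.reset_at - now))


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Optional[RateLimitStatus]:
    """Return the parsed headers, or None if any header is missing or malformed."""
    try:
        return RateLimitStatus(
            limit=int(headers["X-RateLimit-Limit"]),
            remaining=int(headers["X-RateLimit-Remaining"]),
            reset_at=int(headers["X-RateLimit-Reset"]),
        )
    except (KeyError, ValueError):
        return None


status = parse_rate_limit_headers({
    "X-RateLimit-Limit": "100",
    "X-RateLimit-Remaining": "45",
    "X-RateLimit-Reset": "1640995200",
})
```

Returning `None` for missing headers lets the caller decide on a fallback, such as waiting a fixed 60 seconds.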
Handling Rate Limits
429 Too Many Requests
When you exceed the rate limit, you'll receive a 429 Too Many Requests response:
Status Code: 429
Response Body:
{
  "detail": "Too many requests. Please try again later."
}
Headers:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1640995200
Exponential Backoff
Retry with a delay when you receive a 429 response. The examples below wait until the time given in X-RateLimit-Reset, plus random jitter, before trying again:
Python Example:
import time
import random
import requests

def make_request_with_retry(url, headers, max_retries=5):
    """Make a request, waiting for the rate limit window to reset on 429"""
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)
        if response.status_code == 429:
            # Get reset time from header (fall back to 60 seconds from now)
            reset_time = int(response.headers.get('X-RateLimit-Reset', time.time() + 60))
            wait_time = max(0, reset_time - int(time.time()))
            # Add jitter to prevent thundering herd
            wait_time += random.uniform(0, 1)
            if attempt < max_retries - 1:
                time.sleep(wait_time)
                continue
            else:
                raise Exception("Rate limit exceeded, max retries reached")
        return response
    raise Exception("Max retries exceeded")
Node.js Example:
const axios = require('axios');

async function makeRequestWithRetry(url, headers, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      const response = await axios.get(url, { headers });
      return response;
    } catch (error) {
      if (error.response?.status === 429) {
        // Fall back to 60 seconds from now if the reset header is missing
        const resetTime = parseInt(error.response.headers['x-ratelimit-reset'] || Date.now() / 1000 + 60, 10);
        const waitTime = Math.max(0, resetTime - Math.floor(Date.now() / 1000));
        const jitter = Math.random();
        if (attempt < maxRetries - 1) {
          await new Promise(resolve => setTimeout(resolve, (waitTime + jitter) * 1000));
          continue;
        }
      }
      throw error;
    }
  }
}
Retry-After Header
Some rate limit responses may include a Retry-After header indicating when to retry:
Retry-After: 2
This header specifies the number of seconds to wait before retrying.
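One way to combine this header with the reset timestamp is to prefer Retry-After when it is present and fall back to X-RateLimit-Reset otherwise. The sketch below assumes Retry-After carries a number of seconds, as described above; the function name `seconds_to_wait` and the 60-second default are illustrative choices, not documented behavior.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(headers: Mapping[str, str],
                    now: Optional[float] = None,
                    default: float = 60.0) -> float:
    """Pick a wait time from Retry-After, then X-RateLimit-Reset, then a default."""
    now = time.time() if now is None else now

    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a number of seconds; fall through to the reset header

    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        try:
            return max(0.0, float(reset) - now)
        except ValueError:
            pass  # malformed timestamp; use the default below

    return default
```

For example, `seconds_to_wait({"Retry-After": "2"})` yields a 2-second wait, while an empty header mapping yields the 60-second default.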
Best Practices
✅ Monitor Rate Limit Headers
Always check rate limit headers to monitor your usage:
import time
import requests

# url and headers are defined by your application
response = requests.get(url, headers=headers)
remaining = int(response.headers.get('X-RateLimit-Remaining', 0))
limit = int(response.headers.get('X-RateLimit-Limit', 100))
if remaining < 10:
    # Slow down requests
    time.sleep(1)
✅ Implement Exponential Backoff
Use exponential backoff with jitter to avoid overwhelming the API:
import time
import random

def backoff(attempt):
    """Exponential backoff with jitter"""
    base_delay = 2 ** attempt
    jitter = random.uniform(0, 1)
    return base_delay + jitter

# Usage (make_request, RateLimitError and max_retries are placeholders
# for your own request function, exception type and retry budget)
for attempt in range(max_retries):
    try:
        response = make_request()
        break
    except RateLimitError:
        if attempt < max_retries - 1:
            time.sleep(backoff(attempt))
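The base delay above doubles on every attempt (1, 2, 4, 8, 16 seconds for attempts 0 through 4), so long retry loops usually benefit from an upper bound. A minimal sketch of a capped variant follows; the `capped_backoff` name, the `max_delay` parameter, and its 60-second default are assumptions chosen to match the one-minute window, not values from the API.

```python
import random


def capped_backoff(attempt: int, max_delay: float = 60.0) -> float:
    """Exponential backoff with jitter, never exceeding max_delay seconds."""
    base_delay = min(max_delay, 2 ** attempt)
    jitter = random.uniform(0, 1)
    return min(max_delay, base_delay + jitter)


# Delays for attempts 0..7; from attempt 6 onward the cap applies.
delays = [capped_backoff(n) for n in range(8)]
```

Capping at the window length means a client never waits longer than one full reset cycle between attempts.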
✅ Batch Operations
When possible, batch multiple operations into a single request:
# ❌ Bad: Multiple requests
for lead in leads:
    create_lead(lead)  # 100 requests

# ✅ Good: Batch request
create_leads(leads)  # 1 request
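If a batch endpoint caps the number of items per request, a list can be split into chunks first. The sketch below is generic Python; the `chunked` helper is a hypothetical name, the batch size of 50 is an arbitrary placeholder rather than a documented limit, and `create_leads` in the comment refers to the same illustrative batch call used above.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


leads = [{"name": f"Lead {i}"} for i in range(120)]
batches = list(chunked(leads, 50))  # 3 batches: 50, 50, and 20 items
# for batch in batches:
#     create_leads(batch)  # hypothetical batch call, one request per batch
```

Here 120 leads cost 3 requests instead of 120.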
✅ Cache Responses
Cache responses to reduce API calls:
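import time
from functools import lru_cache
import requests

@lru_cache(maxsize=100)
def _fetch_agent(agent_id, time_bucket):
    # time_bucket changes every cache_time seconds, so old entries stop matching
    return requests.get(f"https://api.agoralia.app/api/v1/agents/{agent_id}").json()

def get_agent(agent_id, cache_time=300):
    """Get agent, reusing a cached response for up to cache_time seconds"""
    return _fetch_agent(agent_id, int(time.time() // cache_time))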
✅ Use Webhooks Instead of Polling
Instead of polling for updates, use webhooks when available:
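# ❌ Bad: Polling every 5 seconds
while True:
    check_for_updates()
    time.sleep(5)

# ✅ Good: Use webhooks (Flask shown as an example framework)
from flask import Flask, request

app = Flask(__name__)

@app.route('/webhook', methods=['POST'])
def handle_webhook():
    process_update(request.json)
    return '', 204  # acknowledge receipt with an empty response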
Rate Limit Strategies
1. Preemptive Rate Limiting
Monitor your usage and slow down before hitting limits:
import time

class RateLimiter:
    def __init__(self, max_per_minute=100):
        self.max_per_minute = max_per_minute
        self.requests = []

    def wait_if_needed(self):
        now = time.time()
        # Remove requests older than 1 minute
        self.requests = [t for t in self.requests if now - t < 60]
        if len(self.requests) >= self.max_per_minute:
            # Wait until oldest request expires
            sleep_time = 60 - (now - self.requests[0])
            if sleep_time > 0:
                time.sleep(sleep_time)
        # Record the actual send time, which is later than `now` if we slept
        self.requests.append(time.time())
2. Request Throttling
Implement request throttling to stay within rate limits:
import time
from collections import deque

class RequestThrottler:
    def __init__(self, max_per_minute=100):
        self.max_per_minute = max_per_minute
        self.requests = deque()

    def wait_if_needed(self):
        now = time.time()
        # Remove requests older than 1 minute
        while self.requests and self.requests[0] < now - 60:
            self.requests.popleft()
        if len(self.requests) >= self.max_per_minute:
            # Wait until oldest request expires
            sleep_time = 60 - (now - self.requests[0])
            if sleep_time > 0:
                time.sleep(sleep_time)
        self.requests.append(time.time())
Increasing Limits
Current Status
Rate limits are currently set to default values. Plan-based limits may be implemented in the future.
Requesting Higher Limits
If you need higher rate limits:
- Contact Support: Reach out to Agoralia support with your use case
- Provide Details:
  - Expected request volume (requests per minute/day)
  - Use case description
  - Integration type (webhook, polling, batch processing)
- Wait for Approval: Custom limits are typically reviewed within 24-48 hours
Troubleshooting
Constantly Hitting Rate Limits
Problem: Frequently receiving 429 Too Many Requests errors.
Solutions:
- Implement exponential backoff
- Reduce request frequency
- Batch operations when possible
- Cache responses
- Use webhooks instead of polling
- Contact support to request higher limits
Rate Limit Headers Missing
Problem: Rate limit headers not present in responses.
Possible causes:
- Using an older API version
- Headers are only added when rate limiting is active
Solution: Headers should be present in all responses. If missing, contact support.
Inconsistent Rate Limit Behavior
Problem: Rate limits seem inconsistent or unpredictable.
Possible causes:
- Multiple IP addresses making requests
- The global limit is counted per IP address, not per API key
- Shared IP addresses (e.g., corporate proxy)
Solution:
- Use a consistent IP address or API key
- Monitor rate limit headers
- Implement proper backoff strategies
Rate Limit Examples
Python Client with Rate Limiting
import time
import requests

class AgoraliaClient:
    def __init__(self, api_key: str):
        self.api_key = api_key
        self.base_url = "https://api.agoralia.app/api/v1"
        self.last_request_time = 0
        self.min_request_interval = 0.5  # 500ms between requests

    def _make_request(self, method: str, endpoint: str, **kwargs):
        """Make request with rate limiting"""
        # Enforce minimum interval
        elapsed = time.time() - self.last_request_time
        if elapsed < self.min_request_interval:
            time.sleep(self.min_request_interval - elapsed)

        url = f"{self.base_url}{endpoint}"
        headers = {
            "X-API-Key": self.api_key,
            "Content-Type": "application/json"
        }
        response = requests.request(method, url, headers=headers, **kwargs)
        self.last_request_time = time.time()

        # Check rate limit headers
        remaining = int(response.headers.get('X-RateLimit-Remaining', 100))
        if remaining < 10:
            # Slow down if approaching limit
            time.sleep(1)

        if response.status_code == 429:
            reset_time = int(response.headers.get('X-RateLimit-Reset', time.time() + 60))
            wait_time = max(0, reset_time - int(time.time()))
            time.sleep(wait_time)
            # Retry once; a second failure is raised below like any other error
            response = requests.request(method, url, headers=headers, **kwargs)
            self.last_request_time = time.time()

        response.raise_for_status()
        return response

    def get_agents(self):
        return self._make_request("GET", "/agents").json()

    def create_campaign(self, data):
        return self._make_request("POST", "/campaigns", json=data).json()
Next Steps
- Error Handling - Handle API errors gracefully
- Idempotency - Ensure safe retries
- Endpoints Reference - See available endpoints