Rate Limits
Overview
The gateway enforces limits per API key in three layers:
- Burst limit over a 10-second window
- Sustained requests-per-minute limit
- Daily and monthly cost budgets
The current public KB query endpoint records 1 cent of estimated cost per successful request.
Tiers
| Tier | Burst (10s) | RPM | Daily Budget | Monthly Budget |
|---|---|---|---|---|
| Free | 20 | 60 | 500 cents | 2,500 cents |
| Professional | 60 | 300 | 5,000 cents | 25,000 cents |
| Enterprise | 200 | 1,000 | 50,000 cents | 250,000 cents |
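Because the public KB query endpoint records 1 cent per successful request, each budget in the table translates directly into a request count. The sketch below works through that arithmetic; `TIERS`, `daily_requests`, and `minutes_to_exhaust_daily` are illustrative names, not part of the API, and the figures assume every request is a successful KB query.

```python
# Tier values copied from the table above; all names here are illustrative.
COST_PER_REQUEST_CENTS = 1  # current public KB query endpoint

TIERS = {
    "free": {"rpm": 60, "daily_cents": 500, "monthly_cents": 2_500},
    "professional": {"rpm": 300, "daily_cents": 5_000, "monthly_cents": 25_000},
    "enterprise": {"rpm": 1_000, "daily_cents": 50_000, "monthly_cents": 250_000},
}


def daily_requests(tier: str) -> int:
    """Successful requests that fit into one daily budget."""
    return TIERS[tier]["daily_cents"] // COST_PER_REQUEST_CENTS


def minutes_to_exhaust_daily(tier: str) -> float:
    """Minutes until the daily budget is spent when sending at the full RPM limit."""
    return daily_requests(tier) / TIERS[tier]["rpm"]


for name in TIERS:
    print(name, daily_requests(name), round(minutes_to_exhaust_daily(name), 1))
```

At the full sustained rate, a Free key spends its daily budget in 500 / 60 minutes, a little over 8 minutes. In every tier the monthly budget equals five daily budgets, so a key that exhausts its daily budget every day reaches the monthly cap on the fifth day.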
Budget windows are tracked in UTC:
- daily budgets reset at the next UTC midnight
- monthly budgets reset on the first day of the next UTC month
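A client can compute both reset points locally from the two rules above. This is a minimal sketch using only the standard library; the function names are hypothetical, and the input must be a timezone-aware `datetime`.

```python
from datetime import datetime, timedelta, timezone


def next_daily_reset(now: datetime) -> datetime:
    """Next UTC midnight strictly after `now` (which must be timezone-aware)."""
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(days=1)


def next_monthly_reset(now: datetime) -> datetime:
    """First day of the next UTC month at 00:00."""
    now = now.astimezone(timezone.utc)
    if now.month == 12:
        return datetime(now.year + 1, 1, 1, tzinfo=timezone.utc)
    return datetime(now.year, now.month + 1, 1, tzinfo=timezone.utc)


now = datetime.now(timezone.utc)
print("daily budget resets in", int((next_daily_reset(now) - now).total_seconds()), "s")
print("monthly budget resets at", next_monthly_reset(now).isoformat())
```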
429 Responses
When enforcement blocks a request, the API returns 429 Too Many Requests and includes these headers:
| Header | Meaning |
|---|---|
| Retry-After | Seconds until the caller should retry |
| X-RateLimit-Limit | Limit for the exhausted window |
| X-RateLimit-Remaining | Always 0 on the blocked response |
| X-RateLimit-Reset | Unix timestamp for the reset point |
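The four headers can be read into a small structure before deciding how long to wait. The sketch below assumes the headers arrive as a plain string mapping; `RateLimitInfo` and `parse_rate_limit_headers` are hypothetical names, and missing or malformed values are returned as `None` rather than guessed.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    retry_after: Optional[int]  # seconds to wait before retrying
    limit: Optional[int]        # limit for the exhausted window
    remaining: Optional[int]    # 0 on a blocked response
    reset_at: Optional[int]     # Unix timestamp of the reset point


def _to_int(value: Optional[str]) -> Optional[int]:
    """Convert a header value to int, or None if absent or not numeric."""
    try:
        return int(value) if value is not None else None
    except ValueError:
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitInfo(
        retry_after=_to_int(lowered.get("retry-after")),
        limit=_to_int(lowered.get("x-ratelimit-limit")),
        remaining=_to_int(lowered.get("x-ratelimit-remaining")),
        reset_at=_to_int(lowered.get("x-ratelimit-reset")),
    )
```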
Example burst-limit response:
{
  "detail": {
    "error": "rate_limit_exceeded",
    "type": "burst",
    "retry_after": 10,
    "limit": 20,
    "window_seconds": 10
  }
}
Example daily-budget exhaustion response:
{
  "detail": {
    "error": "daily_budget_exceeded",
    "type": "cost",
    "retry_after": 86399,
    "daily_limit_cents": 500,
    "current_cost_cents": 500
  }
}
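The two bodies above differ in their `type` field, which lets a client tell throttling apart from budget exhaustion. A minimal sketch follows; `classify_429` is a hypothetical helper, and because only the `burst` and `cost` values appear in the examples, any type other than `cost` is treated as throttling by assumption.

```python
import json
from typing import Optional, Tuple


def classify_429(body: str) -> Tuple[str, Optional[int]]:
    """Return ("budget" | "throttle", retry_after_seconds) for a 429 body."""
    detail = json.loads(body).get("detail", {})
    # Assumption: every non-"cost" type (e.g. "burst") is a short-lived throttle.
    kind = "budget" if detail.get("type") == "cost" else "throttle"
    return kind, detail.get("retry_after")


burst_body = '{"detail": {"error": "rate_limit_exceeded", "type": "burst", "retry_after": 10}}'
print(classify_429(burst_body))
```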
Success Responses
Successful requests currently return X-Request-Id, but they do not yet expose remaining quota headers. Build integrations around 429 handling rather than assuming success-side rate-limit headers are present.
Fail-Open Behavior
If Upstash Redis is disabled, unconfigured, or temporarily unavailable, the gateway fails open: it allows all requests instead of blocking traffic. Usage accounting is independent of Redis, so the gateway still attempts to write the daily usage row in PostgreSQL.
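The fail-open pattern itself is easy to illustrate. The sketch below is not the gateway's code; `allow_request` and the `check` callable are hypothetical stand-ins for whatever limiter lookup sits in front of Redis.

```python
from typing import Callable, Optional


def allow_request(check: Optional[Callable[[str], bool]], api_key: str) -> bool:
    """Return True if the request may proceed, failing open on limiter errors."""
    if check is None:
        # Limiter disabled or unconfigured: allow everything.
        return True
    try:
        return check(api_key)
    except Exception:
        # Limiter backend unavailable: allow rather than block traffic.
        return True


def broken_backend(api_key: str) -> bool:
    raise ConnectionError("limiter backend unreachable")


print(allow_request(broken_backend, "key-123"))  # allowed despite the error
```

The trade-off is availability over strict enforcement: during a limiter outage no caller is blocked, so limits are temporarily not applied.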
Retry Guidance
For burst and sustained throttling, wait exactly the number of seconds given in Retry-After before retrying. For budget exhaustion, back off until the budget window resets instead of retrying in a tight loop.
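One way to apply this guidance is a small retry wrapper. The sketch below is transport-agnostic: `send` is a hypothetical callable that returns a `(status, headers, body)` tuple, and `BudgetExhausted` and `call_with_retry` are illustrative names. It assumes budget errors carry `"type": "cost"` as in the daily-budget example, and it falls back to a 1-second wait if Retry-After is missing.

```python
import json
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


class BudgetExhausted(Exception):
    """Raised when a cost budget is spent; waiting in-process is not useful."""

    def __init__(self, retry_after: int):
        super().__init__(f"budget exhausted; retry in {retry_after}s")
        self.retry_after = retry_after


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        status, headers, body = response
        if status != 429:
            return response
        lowered = {name.lower(): value for name, value in headers.items()}
        retry_after = int(lowered.get("retry-after", "1"))  # assumed fallback: 1s
        try:
            parsed = json.loads(body)
        except ValueError:
            parsed = {}
        detail = parsed.get("detail", {}) if isinstance(parsed, dict) else {}
        if isinstance(detail, dict) and detail.get("type") == "cost":
            # Budget exhaustion: surface it so the caller can schedule a later retry.
            raise BudgetExhausted(retry_after)
        if attempt < max_attempts:
            sleep(retry_after)  # burst or sustained throttle: honor Retry-After
    return response  # still 429 after the final attempt
```

Raising on budget exhaustion instead of sleeping keeps a worker from blocking for hours; the caller can use `retry_after` from the exception to schedule the next attempt.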