Rate limits
/api/v1 is rate-limited per API key. Quotas are generous for normal dashboard- and CI-shaped traffic and exist primarily to protect the upstream trademark databases from accidental hot loops.
Quotas
The MVP enforces a single per-key quota — 60 requests per minute, fixed window:
| Tier | Steady-state |
|---|---|
| free | API access not available; see plan tiers. |
| solo | API access not available; see plan tiers. |
| team | 60 req/min |
| enterprise | 60 req/min (raise on request) |
Tier-differentiated quotas (Team 60/min, Enterprise 600/min, Enterprise burst, …) are not enforced in the MVP — both Team and Enterprise are gated at the same 60 req/min. Quotas will diverge in a follow-up issue once a customer needs the headroom; there is no behaviour change planned for current keys.
The implementation is an in-process fixed-window counter that resets at the top of each wall-clock minute. Because each instance counts independently, horizontal scaling on Fly raises the effective ceiling with instance count: N instances allow up to N × 60 req/min in the worst case. For precise enforcement, the limiter will move to a shared store (Redis / pg) in a follow-up issue.
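To make the fixed-window behaviour concrete, here is a minimal sketch of a per-key counter in bash. It is not the service's implementation; `allow_request`, `LIMIT`, `WINDOW_SECONDS`, and `window_counts` are illustrative names, and the sketch assumes bash 4+ for associative arrays. It also never evicts old windows, which a real limiter would need to do.

```bash
#!/usr/bin/env bash
# Sketch of an in-process fixed-window counter (assumes bash 4+).
# All names here are hypothetical, chosen for illustration only.
LIMIT=60
WINDOW_SECONDS=60
declare -A window_counts   # "<key>:<window index>" -> requests seen so far

# allow_request KEY EPOCH_SECONDS
# Returns 0 if the request fits in the current window, 1 if the quota is used up.
allow_request() {
  local key=$1 now=$2
  # Integer division maps every second of a wall-clock minute to one window.
  local slot="${key}:$(( now / WINDOW_SECONDS ))"
  local seen=${window_counts[$slot]:-0}
  (( seen >= LIMIT )) && return 1
  window_counts[$slot]=$(( seen + 1 ))
  return 0
}
```

Each process holds its own `window_counts`, so two processes running this sketch would each admit 60 requests for the same key in the same minute. That is the N × 60 worst case described above.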
Response headers
Every authenticated /api/v1 response — success or failure — carries quota headers so clients can self-throttle without trial-and-error:
| Header | Type | Meaning |
|---|---|---|
| X-RateLimit-Limit | integer | The steady-state quota for this key (60). |
| X-RateLimit-Remaining | integer | Requests left in the current window at the time the response was produced. May be 0 immediately before a 429. |
| Retry-After | integer | Only on 429 responses. Seconds to wait before the window resets. Always honour this. |
X-RateLimit-Reset (UNIX seconds at which the window resets) is not emitted in the MVP. After a 429, derive the reset time by adding Retry-After to the time the response was received, or rely on Retry-After directly. X-RateLimit-Reset will be added in the same follow-up that introduces tiered limits.
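As a rough illustration of deriving the reset time on the client, the sketch below reads a header from a `curl -D` dump and adds Retry-After to the receive time. The helper names `header_value` and `reset_epoch` are hypothetical, and the parsing assumes a single space after the colon in each header line.

```bash
#!/usr/bin/env bash
# header_value FILE NAME
# Prints the value of header NAME (matched case-insensitively) from a
# header dump written by `curl -D FILE`. Strips the trailing carriage return.
header_value() {
  local wanted
  wanted="$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]'):"
  awk -v name="$wanted" 'tolower($1) == name { print $2 }' "$1" | tr -d '\r'
}

# reset_epoch RETRY_AFTER RECEIVED_AT
# Prints the UNIX time (seconds) at which the window is expected to reset.
reset_epoch() {
  echo $(( $2 + $1 ))
}
```

After a 429, `reset_epoch "$(header_value /tmp/h Retry-After)" "$(date +%s)"` would give an approximate reset time; clock skew and network latency make it an estimate rather than an exact value.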
429 rate_limited
When the bucket is empty and a request arrives, the server returns:
```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 7
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0

{
  "error": {
    "code": "rate_limited",
    "message": "Rate limit exceeded. Retry in 7s."
  }
}
```

Handling a 429
- Honour Retry-After. Sleep for that many seconds, then retry the same request. Do not retry sooner.
- Add jitter if you have parallel workers sharing a key. Identical sleeps across many workers reconverge into the same spike. Sleep Retry-After + uniform(0, 0.25 * Retry-After).
- Cap retries. Three or four attempts is plenty; keep going past that and you are masking a bug somewhere upstream of the API client.
- Watch X-RateLimit-Remaining. When it drops below ~10% of X-RateLimit-Limit, slow down voluntarily rather than racing to the cliff.
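The jitter and voluntary slow-down rules above can be sketched as two small bash helpers. The names `jittered_delay` and `should_slow_down` are hypothetical. The sketch uses bash's `$RANDOM` (an integer in 0..32767) as the uniform draw and assumes a `sleep` that accepts fractional seconds, as the GNU and BSD versions do.

```bash
#!/usr/bin/env bash
# jittered_delay RETRY_AFTER [DRAW]
# Prints RETRY_AFTER + uniform(0, 0.25 * RETRY_AFTER) in seconds, 3 decimals.
# DRAW is an integer in 0..32767; it defaults to $RANDOM and is accepted as an
# argument only so the function can be exercised deterministically.
jittered_delay() {
  local retry_after=$1
  local draw=${2:-$RANDOM}
  awk -v r="$retry_after" -v d="$draw" \
    'BEGIN { printf "%.3f\n", r + (d / 32768) * 0.25 * r }'
}

# should_slow_down REMAINING LIMIT
# Succeeds (exit 0) when REMAINING is below 10% of LIMIT.
should_slow_down() {
  (( $1 * 10 < $2 ))
}
```

In a retry loop this might be used as `sleep "$(jittered_delay "$retry")"`, with `should_slow_down "$remaining" "$limit"` deciding whether to insert an extra pause between ordinary requests.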
```bash
# Minimal bash retry loop honouring Retry-After.
attempt=0
while true; do
  attempt=$((attempt + 1))
  response=$(curl -sS -D /tmp/h \
    https://api.trademarksentinel.app/api/v1/watches \
    -H "Authorization: Bearer ts_REPLACE_ME")
  status=$(awk 'NR==1 {print $2}' /tmp/h)
  case "$status" in
    429)
      [ "$attempt" -ge 4 ] && { echo "giving up after $attempt attempts" >&2; exit 1; }
      retry=$(awk 'tolower($1)=="retry-after:" {print $2}' /tmp/h | tr -d "\r")
      echo "rate limited; sleeping ${retry}s" >&2
      sleep "$retry"
      ;;
    2*) echo "$response"; break ;;
    *) echo "non-retryable status: $status" >&2; exit 1 ;;
  esac
done
```

Designing around the quota
- Batch list reads. A single GET /alerts?limit=200&since=... is one request and gives you up to 200 records; one-record-at-a-time polling is the pathological pattern.
- Use since for incremental sync. Re-listing every alert you have ever seen on every poll wastes the bucket.
- One key per system, not per call site. A shared key serialised through a per-process limiter is easier to reason about than a key per microservice with no shared budget.
- Contact support for an Enterprise quota increase if your steady state genuinely exceeds 60 req/min. We would rather raise the quota than have customers re-implement client-side request coalescing badly.
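One possible shape for the per-process limiter mentioned above is simple pacing: space requests at least 60 / limit seconds apart so a single process cannot exceed the quota on its own. The helper below is a sketch under that assumption; `pace_wait` is a hypothetical name, and it only helps if every call site in the process goes through it.

```bash
#!/usr/bin/env bash
# pace_wait LAST_SENT NOW [LIMIT]
# Prints how many seconds to wait (3 decimals) before sending the next request,
# given the UNIX time of the previous request and the current time.
# LIMIT is requests per minute and defaults to 60.
pace_wait() {
  local last=$1 now=$2 limit=${3:-60}
  awk -v last="$last" -v now="$now" -v limit="$limit" 'BEGIN {
    gap = last + 60 / limit - now   # time left until the next send slot
    if (gap < 0) gap = 0            # already past the slot: send immediately
    printf "%.3f\n", gap
  }'
}
```

A caller would record the send time after each request and run `sleep "$(pace_wait "$last_sent" "$(date +%s)")"` before the next one. Pacing does not coordinate across processes, so several processes sharing a key still need to split the budget between them.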