Rate limits

/api/v1 is rate-limited per API key. Quotas are generous for normal dashboard- and CI-shaped traffic and exist primarily to protect the upstream trademark databases from accidental hot loops.

Quotas

The MVP enforces a single per-key quota — 60 requests per minute, fixed window:

Tier	Steady-state
`free`	API access not available — see plan tiers.
`solo`	API access not available — see plan tiers.
`team`	60 req/min
`enterprise`	60 req/min (raise on request)

Tier-differentiated quotas (Team 60/min, Enterprise 600/min, Enterprise burst, …) are not enforced in the MVP — both Team and Enterprise are gated at the same 60 req/min. Quotas will diverge in a follow-up issue once a customer needs the headroom; there is no behaviour change planned for current keys.

The implementation is an in-process fixed-window counter that resets at the top of each wall-clock minute. Because each instance counts independently, horizontal scaling on Fly raises the effective ceiling with instance count: N instances allow up to N × 60 req/min in the worst case. For precise enforcement, the limiter will move to a shared store (Redis or Postgres) in a follow-up issue.
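The window arithmetic can be reproduced client-side. The sketch below assumes windows are aligned to UNIX-epoch minutes (which coincide with wall-clock minutes); the function names are illustrative and not part of the API.

```shell
# window_reset NOW
# Prints the UNIX second at which the one-minute window containing NOW ends.
window_reset() {
  echo $(( ($1 / 60 + 1) * 60 ))
}

# seconds_until_reset NOW
# Prints how many seconds remain before the counter resets.
seconds_until_reset() {
  echo $(( $(window_reset "$1") - $1 ))
}

# worst_case_ceiling INSTANCES
# Prints the worst-case effective quota (req/min) when each of INSTANCES
# server processes keeps its own 60 req/min counter.
worst_case_ceiling() {
  echo $(( $1 * 60 ))
}
```

For example, `seconds_until_reset "$(date +%s)"` estimates the wait until the next window, and `worst_case_ceiling 3` gives the 180 req/min worst case for three instances.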

Response headers

Every authenticated /api/v1 response — success or failure — carries quota headers so clients can self-throttle without trial-and-error:

Header	Type	Meaning
`X-RateLimit-Limit`	integer	The steady-state quota for this key (`60`).
`X-RateLimit-Remaining`	integer	Requests left in the current window at the time the response was produced. May be `0` immediately before a `429`.
`Retry-After`	integer	Only on `429` responses. Seconds to wait before the window resets. Always honour this.

X-RateLimit-Reset (UNIX seconds at which the window resets) is not emitted in the MVP. Derive the reset time by adding Retry-After to the time the response was received, or rely on Retry-After directly. The header will be added in the same follow-up that introduces tiered limits.
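A minimal sketch of that derivation, assuming the response headers were saved with `curl -D FILE`. The helper names are hypothetical, and the result is only as accurate as the local clock.

```shell
# retry_after_from_headers FILE
# Prints the Retry-After value from a curl header dump (empty if absent).
# Header names are matched case-insensitively; the trailing CR is stripped.
retry_after_from_headers() {
  awk 'tolower($1) == "retry-after:" { print $2 }' "$1" | tr -d '\r' | tail -n 1
}

# derive_reset RESPONSE_TIME RETRY_AFTER
# Prints the approximate UNIX second at which the window resets.
derive_reset() {
  echo $(( $1 + $2 ))
}
```

A caller might run `derive_reset "$(date +%s)" "$(retry_after_from_headers /tmp/h)"` right after receiving a `429`.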

`429 rate_limited`

When a key has used up its quota for the current window and another request arrives, the server returns:

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 7
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0

{
  "error": {
    "code": "rate_limited",
    "message": "Rate limit exceeded. Retry in 7s."
  }
}

Handling a 429

Honour Retry-After. Sleep for that many seconds, then retry the same request. Do not retry sooner.
Add jitter if you have parallel workers sharing a key: identical sleeps make every worker wake at the same moment and recreate the spike. Sleep for Retry-After + uniform(0, 0.25 * Retry-After) seconds.
Cap retries. Three or four attempts is plenty; keep going past that and you are masking a bug somewhere upstream of the API client.
Watch X-RateLimit-Remaining. When it drops below ~10% of X-RateLimit-Limit, slow down voluntarily rather than racing to the cliff.

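# Minimal bash retry loop honouring Retry-After.
headers=$(mktemp)                 # per-run header dump instead of a shared /tmp/h
trap 'rm -f "$headers"' EXIT
attempt=0
while true; do
  attempt=$((attempt + 1))
  response=$(curl -sS -D "$headers" \
    https://api.trademarksentinel.app/api/v1/watches \
    -H "Authorization: Bearer ts_REPLACE_ME") || exit 1   # transport failure: no status to inspect
  # Status code from the last header block in the dump.
  status=$(awk '/^HTTP\// {code=$2} END {print code}' "$headers")
  case "$status" in
    429)
      [ "$attempt" -ge 4 ] && { echo "giving up after $attempt attempts" >&2; exit 1; }
      retry=$(awk 'tolower($1)=="retry-after:" {print $2}' "$headers" | tr -d "\r" | tail -n 1)
      retry=${retry:-1}           # fall back to 1s if the header is unexpectedly missing
      echo "rate limited; sleeping ${retry}s" >&2
      sleep "$retry" ;;
    2*)
      echo "$response"; break ;;
    *)
      echo "non-retryable status: $status" >&2; exit 1 ;;
  esac
done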
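The jitter and voluntary slow-down advice above can be sketched as two small helpers. The function names are illustrative, and the random source (`/dev/urandom` read through `od`) is an assumption about the host rather than anything the API requires.

```shell
# jittered_delay RETRY_AFTER
# Prints Retry-After + uniform(0, 0.25 * Retry-After), in seconds.
jittered_delay() {
  retry_after=$1
  # Two random bytes give an integer in 0..65535.
  rand=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
  LC_ALL=C awk -v r="$retry_after" -v n="$rand" \
    'BEGIN { printf "%.2f\n", r + 0.25 * r * n / 65535 }'
}

# should_slow_down REMAINING LIMIT
# Succeeds (exit 0) when REMAINING is below 10% of LIMIT.
should_slow_down() {
  [ $(( $1 * 10 )) -lt "$2" ]
}
```

In a retry loop, `sleep "$(jittered_delay "$retry")"` would replace the plain `sleep "$retry"`; fractional sleeps work with GNU and BSD `sleep` but are not guaranteed by POSIX. `should_slow_down` takes the values of X-RateLimit-Remaining and X-RateLimit-Limit from the most recent response.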

Designing around the quota

Batch list reads. A single GET /alerts?limit=200&since=... is one request and gives you up to 200 records; one-record-at-a-time polling is the pathological pattern.
Use since for incremental sync. Re-listing every alert you have ever seen on every poll wastes quota.
One key per system, not per call site. A shared key serialised through a per-process limiter is easier to reason about than a key per microservice with no shared budget.
Contact support for Enterprise bumps if your steady state genuinely exceeds the 60 req/min quota. We would rather raise the quota than have customers re-implement client-side request coalescing badly.
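As a rough sketch of the batching and incremental-sync advice above, the helpers below build one batched list request per poll and remember where the last poll started. Several details are assumptions, not documented behaviour: that `since` accepts an ISO-8601 UTC timestamp, that omitting `since` lists from the beginning, and the `TS_API_KEY` variable and cursor file, which are hypothetical.

```shell
API_BASE="https://api.trademarksentinel.app/api/v1"

# alerts_url CURSOR
# Prints the URL for one batched read; CURSOR may be empty on the first sync.
alerts_url() {
  if [ -n "$1" ]; then
    echo "$API_BASE/alerts?limit=200&since=$1"
  else
    echo "$API_BASE/alerts?limit=200"
  fi
}

# sync_alerts CURSOR_FILE
# Fetches alerts newer than the stored cursor, then advances the cursor.
# Assumes since takes an ISO-8601 UTC timestamp (not confirmed by the docs).
sync_alerts() {
  cursor_file=$1
  cursor=""
  if [ -f "$cursor_file" ]; then
    cursor=$(cat "$cursor_file")
  fi
  # Record the start time before the request so nothing is skipped;
  # a few records may be returned twice and should be de-duplicated.
  started=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  curl -sS --fail "$(alerts_url "$cursor")" \
    -H "Authorization: Bearer $TS_API_KEY" || return 1
  echo "$started" > "$cursor_file"
}
```

Each poll then costs one request regardless of how many alerts arrived, for example `sync_alerts "$HOME/.ts_alerts_cursor"` from a cron job.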