Rate Limit Enforcement

Rate limits control burst throughput. Quotas control long-window totals. You usually want both.

Where enforcement happens

Rate limiting runs in the API gateway. No domain services are invoked on rejection.

Decision flow (gateway)

The gateway decision path is:

  1. Classify the route (product + endpoint category).

  2. Compute request cost (flat or weighted).

  3. Evaluate counters for the active window(s).

  4. Allow or reject.
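The four steps above can be sketched as a single function. This is a minimal in-memory illustration, not Acceso's implementation: the `Gateway` class, the `product:endpoint` category naming, and the per-window counter dictionary are all assumptions made for the example, and a real gateway would keep the counters in a shared store and update them atomically.

```python
from dataclasses import dataclass, field


@dataclass
class Gateway:
    """Hypothetical gateway holding one limit and per-category weights."""
    limit: int                       # max units per window (illustrative value)
    weights: dict[str, int]          # category -> cost; unlisted categories cost 1
    counters: dict[tuple[str, str], int] = field(default_factory=dict)

    def classify(self, product: str, endpoint: str) -> str:
        # Step 1: classify the route (assumed naming scheme: "product:endpoint").
        return f"{product}:{endpoint}"

    def decide(self, api_key: str, product: str, endpoint: str, window_id: str) -> bool:
        category = self.classify(product, endpoint)
        # Step 2: compute request cost (flat cost of 1 unless a weight is set).
        cost = self.weights.get(category, 1)
        # Step 3: evaluate the counter for the active window.
        used = self.counters.get((api_key, window_id), 0)
        # Step 4: allow or reject. A rejected request is not charged here.
        if used + cost > self.limit:
            return False
        self.counters[(api_key, window_id)] = used + cost
        return True
```

Because rejection happens before any counter update or downstream call, a rejected request never reaches a domain service in this sketch.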

Enforcement model

Acceso typically uses Redis-backed counters with atomic increments. Implementations commonly use:

  • Atomic increments (Lua script or equivalent) to avoid race conditions.

  • Rolling windows or leaky-bucket style smoothing for bursts.

  • Server-time windowing to avoid client clock skew.

  • Weighted accounting, where 1 request can cost N units.
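As a rough sketch of a Redis-backed counter with an atomic, weighted increment, the snippet below uses a fixed window keyed by server time. It is an illustration under assumptions, not Acceso's actual code: the key format, the `consume` function, and the choice of a fixed window (rather than a rolling window or leaky bucket) are hypothetical. `client` is assumed to be a redis-py `Redis` instance, or any object with a compatible `eval(script, numkeys, *keys_and_args)` method.

```python
import time

# Lua runs atomically inside Redis: increment by the request cost and,
# when the key was just created, set its expiry to the window length.
INCR_SCRIPT = """
local current = redis.call('INCRBY', KEYS[1], ARGV[1])
if current == tonumber(ARGV[1]) then
  redis.call('EXPIRE', KEYS[1], ARGV[2])
end
return current
"""


def consume(client, api_key, cost, limit, window_seconds, now=None):
    """Charge `cost` units to the current window and report whether the request fits.

    The window is derived from the gateway's own clock, never from the client.
    Note that this sketch charges the counter even when the request is rejected.
    """
    now = time.time() if now is None else now
    window_id = int(now // window_seconds)
    key = f"ratelimit:{api_key}:{window_id}"  # hypothetical key layout
    used = int(client.eval(INCR_SCRIPT, 1, key, cost, window_seconds))
    return used <= limit
```

Doing the increment and the expiry in one script avoids the race where two gateway instances both read the counter, both see room, and both allow a request that should have been rejected.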

What can change the limit

Limits are evaluated per API key. Policies can vary by:

  • Endpoint category (cheap vs expensive).

  • HTTP method (read vs write, when applicable).

  • Request cost weight (route-specific).

  • Tier / plan (key-specific).

Client-visible headers

Rate limit state is surfaced via response headers. Header names may vary by deployment, but the semantics are stable.

Common headers:

  • RateLimit-Limit: max units per window.

  • RateLimit-Remaining: units left in the active window.

  • RateLimit-Reset: seconds until the window resets.

  • Retry-After: seconds the client should wait before retrying (best-effort).

Example: headers on a successful response
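As an illustration of how a client might read these headers, the sketch below parses them into a small structure. The header values are made up for the example and do not come from a real deployment. `RateLimitState` and `parse_rate_limit_headers` are hypothetical names, and since header names may vary by deployment, the lookup keys may need adjusting.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitState:
    limit: Optional[int]          # max units per window
    remaining: Optional[int]      # units left in the active window
    reset_seconds: Optional[int]  # seconds until the window resets


def _to_int(value: Optional[str]) -> Optional[int]:
    # Missing or malformed headers become None instead of raising.
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitState:
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitState(
        limit=_to_int(lowered.get("ratelimit-limit")),
        remaining=_to_int(lowered.get("ratelimit-remaining")),
        reset_seconds=_to_int(lowered.get("ratelimit-reset")),
    )


# Illustrative values only.
example_headers = {
    "RateLimit-Limit": "100",
    "RateLimit-Remaining": "42",
    "RateLimit-Reset": "17",
}
state = parse_rate_limit_headers(example_headers)
```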

Over-limit response (429)
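A client can treat a 429 as a signal to wait rather than as a hard failure. The sketch below shows one way to extract a wait hint, assuming `Retry-After` and `RateLimit-Reset` carry a number of seconds as described above. The `wait_hint` function and the one-second fallback are assumptions for illustration.

```python
from typing import Mapping, Optional

DEFAULT_WAIT_SECONDS = 1.0  # hypothetical fallback when the response carries no hint


def _seconds(value: Optional[str]) -> Optional[float]:
    # Only the numeric form is handled; anything else is ignored.
    try:
        seconds = float(value)
    except (TypeError, ValueError):
        return None
    return seconds if seconds >= 0 else None


def wait_hint(status: int, headers: Mapping[str, str]) -> Optional[float]:
    """Return how long to wait after an over-limit response, or None if not a 429."""
    if status != 429:
        return None
    lowered = {name.lower(): value for name, value in headers.items()}
    # Prefer Retry-After, then fall back to the window reset time.
    for name in ("retry-after", "ratelimit-reset"):
        seconds = _seconds(lowered.get(name))
        if seconds is not None:
            return seconds
    return DEFAULT_WAIT_SECONDS
```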

1) Throttle proactively

Use RateLimit-Remaining to cap concurrency. Reduce parallel work as the remaining budget drops.

2) Back off correctly on 429

Respect Retry-After when present. Add jitter to prevent synchronized retries.
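Both practices, capping concurrency from the remaining budget and backing off with jitter, can be sketched in a few lines. The proportional scaling rule, the default `base` and `cap` values, and the function names are assumptions chosen for illustration, not recommended settings.

```python
import math
import random
from typing import Callable, Optional


def concurrency_cap(remaining: int, limit: int, max_workers: int) -> int:
    """Scale the number of parallel workers with the remaining budget."""
    if remaining <= 0 or limit <= 0:
        return 0  # budget exhausted: pause until the window resets
    scaled = math.ceil(max_workers * remaining / limit)
    # Never run more workers than there are units left in the window.
    return max(1, min(max_workers, remaining, scaled))


def backoff_delay(
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = 0.5,
    cap: float = 30.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Return seconds to sleep before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # Respect the server hint, then add jitter on top so that clients
        # which were rejected together do not all retry at the same instant.
        return retry_after + rng() * base
    # No hint: capped exponential backoff with full jitter.
    return rng() * min(cap, base * (2 ** attempt))
```

Adding jitter on top of `Retry-After`, rather than randomizing below it, keeps the client from retrying earlier than the server asked.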

3) Avoid high-frequency polling

Failure modes and safety rails

If the rate-limit store (Redis in a typical deployment) is unavailable, the gateway must choose a safety mode.

Common defaults:

  • Fail-closed to protect upstream dependencies.

  • Controlled fail-open only for non-critical routes (deployment-specific).

Acceso can also apply operational controls, which often include:

  • Per-route weights to cap expensive endpoints.

  • Burst allowances to smooth spiky workloads.

  • Temporary overrides during incident response.
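The safety-mode choice described in this section can be sketched as a thin wrapper around the limit check. Everything here is hypothetical: the `StoreUnavailable` exception, the `check` callable, and the allow-list of fail-open routes stand in for whatever a given deployment actually uses.

```python
from typing import Callable, Collection


class StoreUnavailable(Exception):
    """Raised by the (hypothetical) limit check when the counter store is down."""


def check_with_failure_mode(
    check: Callable[[], bool],
    route: str,
    fail_open_routes: Collection[str],
) -> bool:
    """Run the limit check; on store failure, fail closed unless the route is allow-listed."""
    try:
        return check()
    except StoreUnavailable:
        # Fail-closed by default to protect upstream dependencies.
        # Only explicitly listed non-critical routes are allowed through.
        return route in fail_open_routes
```

Keeping the fail-open list explicit and empty by default means that a new route is protected unless someone deliberately opts it out.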
