
Usage Tracking & Quotas

Acceso meters usage at the infrastructure layer. Metering is associated with API keys. It is aggregated into quota windows for enforcement and billing.

What counts as “usage”

The default unit is “request cost”. Not all endpoints need to cost the same.

Common models:

  • Flat: every accepted request costs 1.

  • Weighted: expensive routes cost N units.

  • Hybrid: weight by endpoint category and response size.
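The flat, weighted, and hybrid models can be sketched as three small cost functions. This is only an illustration under stated assumptions: the route names, the weight tables, the 64 KiB size step, and the choice to add (rather than multiply) the size component are all hypothetical and not taken from Acceso.

```python
import math

# Hypothetical tables; a real deployment would define its own.
ROUTE_WEIGHTS = {"/v1/balances": 1, "/v1/history": 5}
CATEGORY_WEIGHTS = {"metadata": 1, "history": 3}
SIZE_STEP_BYTES = 64 * 1024  # assumed: one extra unit per started 64 KiB


def flat_cost() -> int:
    """Flat: every accepted request costs 1 unit."""
    return 1


def weighted_cost(route: str) -> int:
    """Weighted: expensive routes cost N units; unlisted routes cost 1."""
    return ROUTE_WEIGHTS.get(route, 1)


def hybrid_cost(category: str, response_bytes: int) -> int:
    """Hybrid: category weight plus a component that grows with response size."""
    base = CATEGORY_WEIGHTS.get(category, 1)
    size_units = math.ceil(response_bytes / SIZE_STEP_BYTES)
    return base + size_units
```

Under these assumed tables, a 100,000-byte response in the "history" category costs 3 units for the category plus 2 for size.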

Quota windows

Rate limits control short bursts. Quotas control long-window totals. You usually want both.

Quotas can be enforced on:

  • Daily windows (good for abuse containment).

  • Monthly windows (good for billing plans).

  • Custom windows (enterprise contracts).
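One way to picture a quota window is as a counter bucket named after the API key, the window kind, and the current period. The sketch below assumes windows roll over on UTC calendar boundaries, which the text does not state. The function name and key format are hypothetical, and custom windows are left out because their boundaries depend on the contract.

```python
from datetime import datetime, timezone


def window_key(api_key: str, kind: str, now: datetime) -> str:
    """Return the counter bucket for one API key and one window kind."""
    now = now.astimezone(timezone.utc)  # assumed: UTC boundaries
    if kind == "daily":
        bucket = now.strftime("%Y-%m-%d")
    elif kind == "monthly":
        bucket = now.strftime("%Y-%m")
    else:
        raise ValueError(f"unsupported window kind: {kind}")
    return f"{api_key}:{kind}:{bucket}"
```

Because the API key is part of the bucket name, two keys can never draw from the same counter.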

Practical patterns to reduce quota pressure

Prefer coarse queries over many small queries

Use pagination and filters. Avoid N+1 account fan-outs when possible.

Cache aggressively on the client

Cache stable metadata (token symbol/decimals). Cache historical lookups. Revalidate only when needed.

Control concurrency

Quota burn is usually concurrency-driven. Cap parallel requests per key.
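Of the patterns above, client-side caching is the simplest to sketch. The class below is a minimal in-memory cache with an optional expiry, not a recommended library: `TTLCache`, `get_or_fetch`, and the `fetch` callback are hypothetical names. A TTL of `None` models data that never needs revalidation, such as token decimals or historical lookups.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache; entries expire after ttl_seconds (None = never)."""

    def __init__(self, ttl_seconds: Optional[float],
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        """Return the cached value, calling fetch() only on a miss or expiry."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None:
            stored_at, value = hit
            if self._ttl is None or now - stored_at < self._ttl:
                return value  # served locally, no quota spent
        value = fetch()  # one metered request
        self._entries[key] = (now, value)
        return value
```

Each cache hit is a request that never reaches the gateway, so it costs nothing under any of the cost models.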

What gets billed / counted

Counting rules are policy-driven. A common default:

  • 401 / 403 are rejected early and do not count.

  • 429 responses do not count.

  • Successful requests count.

  • Some deployments count 5xx only if domain work started.

If you need strict semantics, treat them as contract-level. Do not infer from one-off observations.
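The common default above can be written as a small decision function. This is a sketch of that default only, not of any particular deployment's policy. The function and parameter names are hypothetical, and statuses the list does not mention (for example 400 or 404) return `None` rather than a guess.

```python
from typing import Optional


def counts_toward_quota(status: int,
                        domain_work_started: bool = False,
                        count_5xx_after_work: bool = False) -> Optional[bool]:
    """Apply the common default; None means the default does not say."""
    if status in (401, 403):
        return False  # rejected early, before metering
    if status == 429:
        return False  # throttled or over-quota responses are not counted
    if 200 <= status < 300:
        return True  # successful requests count
    if 500 <= status < 600:
        # Only some deployments count 5xx, and only once domain work started.
        return count_5xx_after_work and domain_work_started
    return None  # not covered by the default list
```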

Usage headers (client feedback loop)

Responses can include headers that describe current usage state. Use them to self-throttle without guesswork.

Common fields:

  • RateLimit-Limit: allowed units for the current window.

  • RateLimit-Remaining: remaining units in the window.

  • RateLimit-Reset: seconds until the window resets.

Example: reading quota headers with curl
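Running curl with `-i` prints the response headers ahead of the body, so the RateLimit-* fields listed above can be read straight from its output. The sketch below parses such a capture. The sample response, the URL, and the `X-API-Key` header name in the comment are hypothetical placeholders, not documented Acceso values.

```python
from typing import Dict, Optional

# Hypothetical capture of:
#   curl -sS -i -H "X-API-Key: $KEY" https://api.example.com/v1/status
SAMPLE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: application/json\r\n"
    "RateLimit-Limit: 10000\r\n"
    "RateLimit-Remaining: 9421\r\n"
    "RateLimit-Reset: 3600\r\n"
    "\r\n"
    '{"ok": true}'
)


def parse_quota_headers(raw_response: str) -> Dict[str, Optional[int]]:
    """Extract the three RateLimit-* fields from a raw HTTP response."""
    head = raw_response.split("\r\n\r\n", 1)[0]
    found: Dict[str, str] = {}
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep:
            found[name.strip().lower()] = value.strip()  # names are case-insensitive

    def as_int(name: str) -> Optional[int]:
        value = found.get(name)
        return int(value) if value is not None and value.isdigit() else None

    return {
        "limit": as_int("ratelimit-limit"),
        "remaining": as_int("ratelimit-remaining"),
        "reset_seconds": as_int("ratelimit-reset"),
    }
```

Missing fields come back as `None`, since responses can include these headers but are not guaranteed to.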

Enforcement behavior

Windows are evaluated per API key. One key cannot spend another key’s quota.

Quota-exceeded requests return 429. They fail fast at the gateway.

Typical error code:

  • quota_exceeded
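A client can treat a 429 carrying the quota_exceeded code as a signal to stop sending on that key until the window resets. The sketch below assumes the code arrives in a JSON body field named `code` and that `RateLimit-Reset` is present on the rejection. Neither is confirmed by the text, and the 60-second fallback is arbitrary.

```python
from typing import Mapping, Optional


def quota_pause_seconds(status: int,
                        body: Mapping[str, object],
                        headers: Mapping[str, str],
                        default_pause: float = 60.0) -> Optional[float]:
    """Seconds to stop sending on this key, or None if not a quota rejection."""
    if status != 429 or body.get("code") != "quota_exceeded":  # assumed body shape
        return None
    lowered = {name.lower(): value for name, value in headers.items()}
    reset = lowered.get("ratelimit-reset", "")
    # Retrying before the window resets would only fail fast again.
    return float(reset) if reset.isdigit() else default_pause
```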

Example: adaptive concurrency (pseudocode)
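The idea of capping parallel requests per key can be made adaptive by feeding response status and the RateLimit-* values back into the cap. The following is a runnable Python sketch in an additive-increase, multiplicative-decrease style. The class name, the starting values, and the 10% low-water mark are all assumptions chosen for illustration.

```python
from typing import Optional


class AdaptiveConcurrency:
    """Grow the cap slowly on success, cut it quickly under quota pressure."""

    def __init__(self, start: int = 4, floor: int = 1, ceiling: int = 32,
                 low_water: float = 0.1):
        self.limit = start
        self.floor = floor
        self.ceiling = ceiling
        self.low_water = low_water  # shrink when under 10% of the window remains

    def observe(self, status: int, remaining: Optional[int],
                limit: Optional[int]) -> int:
        """Update and return the parallel-request cap after one response."""
        headers_present = remaining is not None and limit is not None and limit > 0
        pressured = headers_present and remaining / limit < self.low_water
        if status == 429 or pressured:
            self.limit = max(self.floor, self.limit // 2)  # multiplicative decrease
        elif 200 <= status < 300:
            self.limit = min(self.ceiling, self.limit + 1)  # additive increase
        return self.limit
```

The returned value would typically size a worker pool or semaphore for that key. Keeping one instance per API key matches the per-key scope of quota windows.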
