
Usage Tracking & Quotas
Acceso meters usage at the infrastructure layer. Metering is associated with API keys. It is aggregated into quota windows for enforcement and billing.
What counts as “usage”
The default unit is “request cost”. Not all endpoints need to cost the same.
Common models:
Flat: every accepted request costs 1.
Weighted: expensive routes cost N units.
Hybrid: weight by endpoint category and response size.
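The three cost models above could be expressed roughly as follows. This is a minimal sketch, not Acceso's actual pricing logic: the category names, the weights, and the 64 KiB size step are all hypothetical values chosen for illustration.

```python
# Hypothetical weights; real values come from the deployment's policy.
CATEGORY_WEIGHTS = {"metadata": 1, "history": 5, "trace": 20}

# Assumed size step: one extra unit per additional 64 KiB of response body.
BYTES_PER_EXTRA_UNIT = 64 * 1024


def flat_cost() -> int:
    """Flat model: every accepted request costs 1 unit."""
    return 1


def weighted_cost(category: str) -> int:
    """Weighted model: expensive routes cost N units; unknown categories cost 1."""
    return CATEGORY_WEIGHTS.get(category, 1)


def hybrid_cost(category: str, response_bytes: int) -> int:
    """Hybrid model: category weight plus a surcharge based on response size."""
    # The first 64 KiB is covered by the category weight itself.
    extra_units = max(0, response_bytes - 1) // BYTES_PER_EXTRA_UNIT
    return weighted_cost(category) + extra_units
```

Under these assumed numbers, a 200,000-byte response on a "history" route would cost 5 units for the category plus 3 for size.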
Quota windows
Quotas can be enforced on:
Daily windows (good for abuse containment).
Monthly windows (good for billing plans).
Custom windows (enterprise contracts).
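To make the window idea concrete, here is a small sketch of how daily and monthly window boundaries and a per-key counter name might be derived. It assumes windows align to UTC midnight and to the first day of the month, which the text does not specify. The function names and the key format are hypothetical, and custom windows are left out because their boundaries depend on the contract.

```python
from datetime import datetime, timedelta, timezone


def window_start(now: datetime, kind: str) -> datetime:
    """Return the UTC start of the daily or monthly window containing `now`.

    `now` should be timezone-aware; UTC alignment is an assumption.
    """
    now = now.astimezone(timezone.utc)
    if kind == "daily":
        return now.replace(hour=0, minute=0, second=0, microsecond=0)
    if kind == "monthly":
        return now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported window kind: {kind}")


def window_end(now: datetime, kind: str) -> datetime:
    """Return the instant at which the current window resets."""
    start = window_start(now, kind)
    if kind == "daily":
        return start + timedelta(days=1)
    # Monthly: roll over to the first day of the next month.
    if start.month == 12:
        return start.replace(year=start.year + 1, month=1)
    return start.replace(month=start.month + 1)


def counter_key(api_key_id: str, now: datetime, kind: str) -> str:
    """Build a counter name scoped to one API key and one window (hypothetical format)."""
    return f"{api_key_id}:{kind}:{window_start(now, kind).isoformat()}"
```

Because the key identifier is part of the counter name, usage recorded for one key never lands in another key's window.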
Practical patterns to reduce quota pressure
Prefer coarse queries over many small queries
Use pagination and filters. Avoid N+1 account fan-outs when possible.
Cache aggressively on the client
Cache stable metadata (token symbol/decimals). Cache historical lookups. Revalidate only when needed.
Control concurrency
Quota burn is usually concurrency-driven. Cap parallel requests per key.
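One way to cap parallel requests per key on the client is a semaphore around each call. The sketch below uses Python's standard `asyncio.Semaphore`. The helper name `run_capped` and the default limit of 4 are illustrative assumptions, and `worker` stands in for whatever coroutine actually performs the request.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_capped(
    items: Iterable[T],
    worker: Callable[[T], Awaitable[R]],
    max_in_flight: int = 4,
) -> List[R]:
    """Run `worker` over `items` with at most `max_in_flight` calls active at once."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def guarded(item: T) -> R:
        # Each call waits here until one of the slots is free.
        async with semaphore:
            return await worker(item)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(guarded(item) for item in items))
```

Use one such cap per API key, since quota windows are evaluated per key.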
What gets billed / counted
Counting rules are policy-driven. A common default:
401/403 are rejected early and do not count.
429 responses do not count.
Successful requests count.
Some deployments count 5xx only if domain work started.
If you need strict semantics, treat them as contract-level. Do not infer from one-off observations.
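The common default above can be read as a small decision function. This sketch is illustrative only, since the real rules are policy-driven. The flag names are hypothetical, and the handling of statuses the text does not mention (for example other 4xx codes) is an explicit assumption exposed as a parameter.

```python
def counts_toward_quota(
    status: int,
    domain_work_started: bool = False,
    count_5xx_after_work: bool = False,
    count_unlisted: bool = False,
) -> bool:
    """Decide whether a response is counted, following the common default."""
    # Auth failures are rejected early, and 429 responses do not count.
    if status in (401, 403, 429):
        return False
    # Successful requests count.
    if 200 <= status < 300:
        return True
    # Some deployments count 5xx, but only if domain work had started.
    if 500 <= status < 600:
        return count_5xx_after_work and domain_work_started
    # ASSUMPTION: the text gives no rule for other statuses; caller decides.
    return count_unlisted
```

Treat the output as a model for estimating usage, not as a statement of what a given deployment bills.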
Usage headers (client feedback loop)
Responses can include headers that describe current usage state. Use them to self-throttle without guesswork.
Common fields:
RateLimit-Limit: allowed units for the current window.
RateLimit-Remaining: remaining units in the window.
RateLimit-Reset: seconds until the window resets.
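A client could turn these headers into a pacing delay along the following lines. This is a sketch under stated assumptions: it takes the header values to be plain numbers, spreads the remaining budget evenly over the time left in the window, and returns `None` when the headers are missing, since responses only "can" include them. The function name `pacing_delay` is hypothetical.

```python
from typing import Mapping, Optional


def pacing_delay(headers: Mapping[str, str], cost: int = 1) -> Optional[float]:
    """Suggest how many seconds to wait before the next request.

    Returns None when the usage headers are absent or malformed.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        remaining = int(lowered["ratelimit-remaining"])
        reset_seconds = float(lowered["ratelimit-reset"])
    except (KeyError, ValueError):
        return None
    if remaining < cost:
        # Not enough units left for this request: wait for the window to reset.
        return reset_seconds
    # Spread the requests that still fit evenly across the rest of the window.
    return reset_seconds / (remaining // cost)
```

Even spacing is a conservative choice. For long windows, such as monthly plans, a client may prefer to cap the delay or throttle only when the remaining budget runs low.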
Enforcement behavior
Quota-exceeded requests return 429. They fail fast at the gateway.
Typical error code:
quota_exceeded
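On the client side, a quota rejection is worth telling apart from other failures, because retrying before the window resets only produces more 429s. The sketch below shows one possible check. The JSON body shape (`{"error": {"code": ...}}`) is an assumption, as the text only names the `quota_exceeded` code. The 60-second fallback and the function name are likewise hypothetical.

```python
import json
from typing import Mapping, Optional

# ASSUMPTION: fallback pause when no RateLimit-Reset header is present.
DEFAULT_WAIT_SECONDS = 60.0


def quota_wait_seconds(
    status: int, headers: Mapping[str, str], body: str
) -> Optional[float]:
    """Return how long to pause if this response is a quota rejection, else None."""
    if status != 429:
        return None
    try:
        payload = json.loads(body)
    except ValueError:
        return None
    # ASSUMPTION: the error code lives at payload["error"]["code"].
    error = payload.get("error") if isinstance(payload, dict) else None
    code = error.get("code") if isinstance(error, dict) else None
    if code != "quota_exceeded":
        return None
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return float(lowered["ratelimit-reset"])
    except (KeyError, ValueError):
        return DEFAULT_WAIT_SECONDS
```

A caller would stop sending requests on that key for the returned number of seconds instead of retrying immediately.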
Windows are evaluated per API key. One key cannot spend another key’s quota.
If your workload is inherently high-frequency, polling will burn quota quickly. Prefer streaming, webhooks, or push-based patterns where available.