Unkey’s rate limiting is designed for global, low-latency enforcement across distributed systems.
Architecture
When you call limiter.limit(identifier):
- Request hits the nearest Unkey location
- Counter is checked and updated
- Decision returned in ~30ms globally
Sliding window algorithm
Unkey uses a sliding window algorithm that provides smooth rate limiting without the “burst at window start” problem of fixed windows.

Fixed window problem:
- Limit: 100/minute
- User sends 100 requests at 0:59
- Window resets at 1:00
- User sends 100 more at 1:01
- Result: 200 requests in 2 seconds ❌

Sliding window solution:
- Limit: 100/minute
- Considers requests from the past 60 seconds at any point
- No burst exploitation possible
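
The difference between the two approaches can be replayed in a few lines. The following is a minimal in-memory sketch, not Unkey's implementation: `FixedWindowLimiter`, `SlidingWindowLimiter`, and `replay` are hypothetical names, and the sliding variant shown is a simple timestamp log, which is only one of several ways to build a sliding window.

```typescript
// Illustrative only: NOT Unkey's implementation. All names are hypothetical.
const LIMIT = 100;
const WINDOW_MS = 60_000;

// Fixed window: the counter resets at every window boundary.
class FixedWindowLimiter {
  private windowStart = 0;
  private count = 0;

  allow(nowMs: number): boolean {
    const currentWindow = Math.floor(nowMs / WINDOW_MS) * WINDOW_MS;
    if (currentWindow !== this.windowStart) {
      this.windowStart = currentWindow;
      this.count = 0;
    }
    if (this.count >= LIMIT) return false;
    this.count++;
    return true;
  }
}

// Sliding window (log variant): counts requests in the trailing 60 seconds.
class SlidingWindowLimiter {
  private timestamps: number[] = [];

  allow(nowMs: number): boolean {
    // Drop requests that have aged out of the trailing window.
    this.timestamps = this.timestamps.filter((t) => t > nowMs - WINDOW_MS);
    if (this.timestamps.length >= LIMIT) return false;
    this.timestamps.push(nowMs);
    return true;
  }
}

// Replays the scenario above: 100 requests at 0:59, then 100 more at 1:01.
function replay(limiter: { allow(nowMs: number): boolean }): number {
  let allowed = 0;
  for (let i = 0; i < 100; i++) if (limiter.allow(59_000)) allowed++;
  for (let i = 0; i < 100; i++) if (limiter.allow(61_000)) allowed++;
  return allowed;
}

const fixedAllowed = replay(new FixedWindowLimiter());
const slidingAllowed = replay(new SlidingWindowLimiter());
console.log({ fixedAllowed, slidingAllowed });
```

In this sketch the fixed window admits both bursts, while the sliding window rejects the second burst because the first 100 requests are still inside the trailing 60 seconds.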
Global consistency
Rate limits are enforced consistently across all regions. A user can’t bypass limits by hitting different geographic endpoints.
Cross-region denial propagation
When an identifier crosses its limit in any region, every other region picks up the denial within a few seconds and starts rejecting the same identifier locally, even before that region sees any of the abusive traffic firsthand. The window is honored end to end: as the offending window decays, every region releases the identifier at the same time.

This means a single attacker hitting your API from multiple geographies can’t multiply their effective limit by the number of regions they hit. Once any region denies them, every region denies them. You don’t have to enable or configure anything: propagation runs automatically for every namespace.

Cross-region enforcement applies to limits with a window of at least 1 minute. Shorter windows (for example, per-second burst limits) are enforced per region only, because the propagation roundtrip takes longer than the window itself.
Response fields
Every rate limit check returns:

| Field | Type | Description |
|---|---|---|
| success | boolean | true if request is allowed |
| limit | number | The configured limit |
| remaining | number | Requests left in current window |
| reset | number | Unix timestamp (ms) when window resets |
Handling the response
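As one possible way to act on the four response fields, the sketch below maps a result to an HTTP status and headers. The `RatelimitResponse` interface simply mirrors the field table; `toHttpResult`, the `X-RateLimit-*` header names, and the use of `Retry-After` are assumptions chosen for illustration (they are common conventions, not something Unkey prescribes).

```typescript
// Mirrors the documented response fields. The helper below is hypothetical.
interface RatelimitResponse {
  success: boolean;
  limit: number;
  remaining: number;
  reset: number; // Unix timestamp in milliseconds
}

interface HttpResult {
  status: number;
  headers: Record<string, string>;
}

// Translates a rate limit decision into a status code and headers.
// Header names are conventional choices, assumed for this example.
function toHttpResult(res: RatelimitResponse, nowMs: number = Date.now()): HttpResult {
  const headers: Record<string, string> = {
    "X-RateLimit-Limit": String(res.limit),
    "X-RateLimit-Remaining": String(Math.max(0, res.remaining)),
    "X-RateLimit-Reset": String(res.reset),
  };
  if (res.success) return { status: 200, headers };

  // Tell the client how many whole seconds remain until the window resets.
  const retryAfterSec = Math.max(0, Math.ceil((res.reset - nowMs) / 1000));
  headers["Retry-After"] = String(retryAfterSec);
  return { status: 429, headers };
}

const denied = toHttpResult(
  { success: false, limit: 100, remaining: 0, reset: 90_500 },
  60_000,
);
console.log(denied.status, denied.headers["Retry-After"]);
```

Passing `nowMs` explicitly keeps the helper deterministic in tests; in a request handler the default of `Date.now()` is usually enough.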
Cost-based limiting
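Not all requests are equal. Use cost to deduct more from the limit for expensive operations. For example, with a limit of 100 per window and a cost of 5 for expensive operations, a user can make:
- 100 normal requests, OR
- 20 expensive requests, OR
- Mix of both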
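
The arithmetic behind that trade-off can be checked with a small in-memory budget. This is a sketch under stated assumptions, not Unkey's implementation: `WindowBudget` and `countAllowed` are hypothetical names, the limit of 100 and the expensive cost of 5 are inferred from the 100-versus-20 figures above, and denied requests are assumed not to consume any budget.

```typescript
// Hypothetical single-window budget used to illustrate cost-based limiting.
class WindowBudget {
  private used = 0;
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Deducts `cost` if it fits in the remaining budget; otherwise denies
  // without deducting anything (an assumption made for this sketch).
  take(cost: number = 1): { success: boolean; remaining: number } {
    if (this.used + cost > this.limit) {
      return { success: false, remaining: this.limit - this.used };
    }
    this.used += cost;
    return { success: true, remaining: this.limit - this.used };
  }
}

// Counts how many requests from a sequence of costs are allowed.
function countAllowed(limit: number, costs: number[]): number {
  const budget = new WindowBudget(limit);
  return costs.filter((cost) => budget.take(cost).success).length;
}

const repeat = (cost: number, times: number): number[] =>
  new Array<number>(times).fill(cost);

// 25 expensive requests are attempted, but only 20 fit into a budget of 100.
const expensiveAllowed = countAllowed(100, repeat(5, 25));
// A mix: 50 normal requests (50 tokens) plus 10 expensive ones (50 tokens).
const mixedAllowed = countAllowed(100, [...repeat(1, 50), ...repeat(5, 10)]);
console.log({ expensiveAllowed, mixedAllowed });
```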
Track token consumption in the dashboard
When you use cost-based limiting, the rate limit overview in your Unkey dashboard surfaces token usage per identifier alongside request counts. Each row in the namespace logs table includes:

| Column | What it shows |
|---|---|
| Passed Requests | Number of requests that were allowed in the selected window |
| Blocked Requests | Number of requests that were denied in the selected window |
| Passed Tokens | Sum of cost deducted from the limit by allowed requests |
| Blocked Tokens | Sum of cost that would have been required by denied requests |

If you don’t set a cost value, every request counts as 1 token, so the token columns mirror the request columns.
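
To make the four columns concrete, here is a small sketch that computes them from a list of logged checks. The `LoggedCheck` and `TokenSummary` shapes and the `summarize` function are hypothetical and exist only for this example; they are not the dashboard's data model.

```typescript
// Hypothetical log entry: whether the check passed and the cost it requested.
interface LoggedCheck {
  passed: boolean;
  cost?: number; // defaults to 1 token when omitted
}

interface TokenSummary {
  passedRequests: number;
  blockedRequests: number;
  passedTokens: number;
  blockedTokens: number;
}

// Aggregates request and token counts the way the column descriptions read.
function summarize(checks: LoggedCheck[]): TokenSummary {
  const summary: TokenSummary = {
    passedRequests: 0,
    blockedRequests: 0,
    passedTokens: 0,
    blockedTokens: 0,
  };
  for (const check of checks) {
    const cost = check.cost ?? 1;
    if (check.passed) {
      summary.passedRequests += 1;
      summary.passedTokens += cost;
    } else {
      summary.blockedRequests += 1;
      summary.blockedTokens += cost;
    }
  }
  return summary;
}

const summary = summarize([
  { passed: true },
  { passed: true, cost: 5 },
  { passed: false, cost: 10 },
  { passed: false },
]);
console.log(summary);
```

With no `cost` on any entry, the token totals equal the request totals, which matches the default of 1 token per request.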
Timeout and fallback
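One generic way to reason about a timeout with a fallback is as a race between the rate limit check and a timer. The sketch below is an illustration of that pattern only: `limitWithTimeout`, `FAIL_OPEN`, and the stand-in `check` functions are hypothetical names, not the SDK's configuration options, and whether to fail open (allow) or fail closed (deny) is a policy choice that depends on what the limit protects.

```typescript
interface RatelimitResponse {
  success: boolean;
  limit: number;
  remaining: number;
  reset: number; // Unix timestamp in milliseconds
}

// Hypothetical fail-open fallback: allow the request if no decision arrives.
const FAIL_OPEN: RatelimitResponse = { success: true, limit: 0, remaining: 0, reset: 0 };

// Races `check` against a timer. Resolves to `fallback` if the check is too
// slow or throws; otherwise resolves to the check's own result.
async function limitWithTimeout(
  check: () => Promise<RatelimitResponse>,
  timeoutMs: number,
  fallback: RatelimitResponse,
): Promise<RatelimitResponse> {
  const state: { timer?: ReturnType<typeof setTimeout> } = {};
  const timeout = new Promise<RatelimitResponse>((resolve) => {
    state.timer = setTimeout(() => resolve(fallback), timeoutMs);
  });
  try {
    return await Promise.race([check(), timeout]);
  } catch {
    // Network or service errors are treated like a timeout.
    return fallback;
  } finally {
    // Stop the timer so it does not keep the process alive.
    if (state.timer !== undefined) clearTimeout(state.timer);
  }
}

// Stand-in for a check that never answers (for example, an unreachable service).
const neverAnswers = (): Promise<RatelimitResponse> => new Promise(() => {});

limitWithTimeout(neverAnswers, 20, FAIL_OPEN).then((res) => {
  console.log("fell back:", res === FAIL_OPEN);
});
```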
You can configure how the limiter behaves when Unkey is unreachable.
Next steps
- Custom overrides: Give specific users different limits
- SDK reference: Full SDK documentation

