Rate Limiting
Sleeved API rate limits — fixed 15-minute windows, per-key quotas, 429 response handling, and recommendations for backoff strategies.
The Sleeved API enforces rate limits per API key to ensure fair use across integration partners.
Request Limits
| Scope | Limit |
|---|---|
| API requests | Typically 3,000 per 15-minute window (set per API key) |
| Share token generation (Sleeved users) | 20 tokens per hour per user account |
The per-key request limit may differ from the default depending on your agreement — see Authentication for contact details.
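As a rough illustration of what the default quota means for request pacing, the arithmetic below spreads a per-window limit evenly across the window. The function name `minIntervalMs` is hypothetical, and 3,000 is only the typical default from the table above; substitute the limit from your own agreement.

```javascript
// Hypothetical helper: the spacing between requests that would use up
// the quota exactly at the end of one window if requests were sent evenly.
function minIntervalMs(limit, windowMs = 15 * 60 * 1000) {
  if (!Number.isFinite(limit) || limit <= 0) {
    throw new RangeError("limit must be a positive number");
  }
  return windowMs / limit;
}

console.log(minIntervalMs(3000)); // 300 (ms), i.e. roughly 3.3 requests per second
```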
Window Behavior
The rate limit uses a fixed window: the window resets every 15 minutes on a fixed schedule, not 15 minutes after your first request. As a result, the next reset may arrive anywhere from a few seconds to a full 15 minutes after you start sending requests.
Fixed window behavior to keep in mind:
- A burst of requests near the end of a window may exhaust your quota, but the full quota becomes available again at the window boundary
- Unlike a sliding window, requests from early in the window do not "age out" gradually — the full window resets at once
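To make the fixed-window behavior concrete, the sketch below computes how long remains until the next reset. It assumes, purely for illustration, that windows are aligned to multiples of 15 minutes since the Unix epoch (:00, :15, :30, :45 UTC); the text above only says the schedule is fixed, so the actual alignment may differ. `msUntilNextWindow` is a hypothetical name.

```javascript
const WINDOW_MS = 15 * 60 * 1000;

// ASSUMPTION: window boundaries fall on multiples of WINDOW_MS since the Unix epoch.
function msUntilNextWindow(nowMs = Date.now(), windowMs = WINDOW_MS) {
  return windowMs - (nowMs % windowMs);
}

// A first request sent 14 minutes into a window sees a reset only 1 minute later.
const fourteenMinutesIn = 14 * 60 * 1000;
console.log(msUntilNextWindow(fourteenMinutesIn)); // 60000
```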
Rate Limit Exceeded
When you exceed the limit, the API returns 429 Too Many Requests:
{
  "error": "Rate limit exceeded"
}
A Retry-After header is not guaranteed. Wait until the next 15-minute boundary before retrying, or implement the backoff strategy below.
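Because the header is not guaranteed, a client can treat it as optional: use it when present and fall back to its own wait otherwise. The sketch below parses the two forms that HTTP defines for Retry-After (a number of seconds, or an HTTP date). `retryAfterMs` is a hypothetical helper name, and whether the Sleeved API ever sends this header is not stated here.

```javascript
// Hypothetical helper: convert a Retry-After header value to milliseconds.
// Returns null when the header is missing or unparseable,
// so the caller can fall back to its own backoff.
function retryAfterMs(headerValue, nowMs = Date.now()) {
  if (headerValue == null || headerValue.trim() === "") return null;
  const value = headerValue.trim();
  if (/^\d+$/.test(value)) {
    return Number(value) * 1000; // delay-seconds form
  }
  const dateMs = Date.parse(value); // HTTP-date form
  if (Number.isNaN(dateMs)) return null;
  return Math.max(0, dateMs - nowMs); // never return a negative wait
}

console.log(retryAfterMs("120")); // 120000
console.log(retryAfterMs(null)); // null
```

With `fetch`, the value would come from `response.headers.get("Retry-After")`, which returns `null` when the header is absent.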
Handling 429 Responses
Implement exponential backoff when you receive a 429:
- On the first 429: wait 1–2 seconds before retrying
- On the second consecutive 429: double the wait time
- Continue doubling, up to a reasonable ceiling (e.g., 60 seconds)
- Add a small random jitter (±10–20%) to each wait to prevent synchronized retries when multiple instances are running
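Example retry logic (JavaScript):
// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry on 429 with exponential backoff and jitter; any other response is returned as-is.
async function fetchWithBackoff(url, options, maxRetries = 5) {
  let delay = 1000; // start at 1 second
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const response = await fetch(url, options);
    if (response.status !== 429) {
      return response;
    }
    if (attempt === maxRetries) {
      throw new Error("Rate limit exceeded after max retries");
    }
    const jitter = delay * 0.1 * (Math.random() * 2 - 1); // ±10% of the current delay
    await sleep(delay + jitter);
    delay = Math.min(delay * 2, 60_000); // double, capped at 60 seconds
  }
}
Efficient Card Sync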
The card pagination endpoint is the most common source of rate limit pressure during initial sync. To minimize request count during a full index sync:
- Use the maximum hitsPerPage value of 1000 to minimize the number of requests required
- Check the sync metadata endpoint before starting a sync; if lastUpdatedAt has not changed since your last sync, skip the full pull
- Cache card data locally and re-sync incrementally rather than doing full pulls frequently
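The first two recommendations can be sketched as two small helpers. The names `requestsForFullSync` and `needsFullSync` are hypothetical, as is the assumption that `lastUpdatedAt` can be compared as an opaque value stored from the previous sync. The card count in the example is made up for illustration.

```javascript
// Number of paginated requests needed to pull `totalCards` records.
function requestsForFullSync(totalCards, hitsPerPage = 1000) {
  return Math.ceil(totalCards / hitsPerPage);
}

// ASSUMPTION: lastUpdatedAt is compared as an opaque value saved from the previous sync.
// A missing stored value means no sync has happened yet, so a full pull is needed.
function needsFullSync(remoteLastUpdatedAt, storedLastUpdatedAt) {
  return storedLastUpdatedAt == null || remoteLastUpdatedAt !== storedLastUpdatedAt;
}

console.log(requestsForFullSync(250000));      // 250
console.log(requestsForFullSync(250000, 100)); // 2500
console.log(needsFullSync("2024-01-01T00:00:00Z", "2024-01-01T00:00:00Z")); // false
```

For this made-up 250,000-card index, a page size of 1000 uses a small fraction of a typical 3,000-request window, while a page size of 100 would consume most of it.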