Rate Limiting

Sleeved API rate limits — fixed 15-minute windows, per-key quotas, 429 response handling, and recommendations for backoff strategies.

The Sleeved API enforces rate limits per API key to ensure fair use across integration partners.

Request Limits

Scope	Limit
API requests	Per 15-minute window (limit set per key, typically 3,000)
Share token generation (Sleeved users)	20 tokens per hour per user account

The per-key request limit may differ from the default depending on your agreement — see Authentication for contact details.

The rate limit uses a fixed window: the window resets every 15 minutes on a fixed schedule, not 15 minutes after your first request. As a result, the time between your first request and the next reset can be anywhere from a few seconds to a full 15 minutes.

Fixed window behavior to keep in mind:

A burst of requests near the end of a window may exhaust your quota, but the full quota becomes available again at the window boundary
Unlike a sliding window, requests from early in the window do not "age out" gradually — the full window resets at once
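
To make the fixed-window behavior concrete, the sketch below estimates how long a client would wait for the next reset. It assumes window boundaries are aligned to the Unix epoch (so they fall on :00, :15, :30, and :45 UTC); the documentation only says the schedule is fixed, so treat that alignment, and the helper name msUntilNextWindow, as illustrative assumptions.

```javascript
const WINDOW_MS = 15 * 60 * 1000; // length of one rate limit window

// Assumption: boundaries are aligned to the Unix epoch, i.e. they fall on
// :00, :15, :30, and :45 UTC. The real schedule may be offset from this.
function msUntilNextWindow(now = Date.now(), windowMs = WINDOW_MS) {
  // Time already elapsed in the current window is `now % windowMs`.
  return windowMs - (now % windowMs);
}

// A first request at 10:14:00 UTC is only 60 seconds away from the reset,
// not 15 minutes away as it would be if the window started at that request.
const firstRequestAt = Date.UTC(2024, 0, 1, 10, 14, 0);
console.log(msUntilNextWindow(firstRequestAt) / 1000); // 60
```

If the real boundaries turn out to be offset from the quarter hour, the same arithmetic applies after subtracting that offset from now.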

Rate Limit Exceeded

When you exceed the limit, the API returns 429 Too Many Requests:

{
  "error": "Rate limit exceeded"
}

The response is not guaranteed to include a Retry-After header. Wait until the next 15-minute boundary before retrying, or implement the backoff strategy below.
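
Because the header may or may not be present, a client can use it when it appears and fall back to its own delay otherwise. The following is a minimal sketch of that decision, not part of the Sleeved API: retryDelayMs and its fallbackMs parameter are hypothetical names, and the two accepted header forms (a number of seconds or an HTTP date) follow the general HTTP definition of Retry-After.

```javascript
// Sketch: choose how long to wait after a 429.
// `headers` is anything with a get(name) method, such as response.headers from fetch.
// `fallbackMs` (hypothetical) is the wait to use when the header is absent or unusable.
function retryDelayMs(headers, fallbackMs, now = Date.now()) {
  const raw = headers.get("retry-after");
  if (raw === null || raw === undefined || raw.trim() === "") {
    return fallbackMs; // header missing: rely on backoff or the window boundary
  }

  const value = raw.trim();

  // Form 1: a non-negative whole number of seconds.
  if (/^\d+$/.test(value)) {
    return Number(value) * 1000;
  }

  // Form 2: an HTTP date; wait until that moment, never a negative duration.
  const dateMs = Date.parse(value);
  if (!Number.isNaN(dateMs)) {
    return Math.max(0, dateMs - now);
  }

  return fallbackMs; // unparseable value: ignore it
}
```

A caller would typically pass the current backoff delay as fallbackMs, so the header only overrides the delay when the server actually sends it.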

Handling 429 Responses

Implement exponential backoff when you receive a 429:

On first 429: wait 1–2 seconds before retrying
On second consecutive 429: double the wait time
Continue doubling, up to a reasonable ceiling (e.g., 60 seconds)
Add a small random jitter (±10–20%) to each wait to prevent synchronized retries when multiple instances are running

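Example retry logic (JavaScript):

// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry a request on 429 with exponential backoff and jitter.
// Uses the global fetch available in browsers and Node.js 18+.
async function fetchWithBackoff(url, options, maxRetries = 5) {
  let delay = 1000; // start at 1 second

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const response = await fetch(url, options);

    // Any status other than 429 (success or a different error) goes back to the caller.
    if (response.status !== 429) {
      return response;
    }

    if (attempt === maxRetries) {
      throw new Error("Rate limit exceeded after max retries");
    }

    // Jitter of up to ±10% keeps parallel instances from retrying in lockstep.
    const jitter = delay * 0.1 * (Math.random() * 2 - 1);
    await sleep(delay + jitter);
    delay = Math.min(delay * 2, 60_000); // double the wait, capped at 60 seconds
  }
}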

Efficient Card Sync

The card pagination endpoint is the most common source of rate limit pressure during initial sync. To minimize request count during a full index sync:

Use the maximum hitsPerPage value of 1000 to minimize the number of requests required
Check the sync metadata endpoint before starting a sync — if lastUpdatedAt has not changed since your last sync, skip the full pull
Cache card data locally and re-sync incrementally rather than doing full pulls frequently
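
The effect of these recommendations on request count can be sketched with two small helpers. Both function names are hypothetical, the card count of 250,000 is an invented figure for illustration, and the comparison assumes lastUpdatedAt can be compared by strict equality (for example, as a timestamp string); none of this is taken from the Sleeved API itself.

```javascript
// Number of paginated requests a full pull needs for `totalCards` cards.
function requestsForFullSync(totalCards, hitsPerPage = 1000) {
  return Math.ceil(totalCards / hitsPerPage);
}

// Decide whether a full pull is needed, given the lastUpdatedAt value stored
// after the previous sync and the one just read from the sync metadata endpoint.
// Assumption: the two values can be compared with strict equality.
function needsFullSync(storedLastUpdatedAt, currentLastUpdatedAt) {
  if (storedLastUpdatedAt == null) return true; // never synced before
  return storedLastUpdatedAt !== currentLastUpdatedAt;
}

// Illustrative catalog size of 250,000 cards (made-up figure).
console.log(requestsForFullSync(250_000));      // 250
console.log(requestsForFullSync(250_000, 100)); // 2500
```

In this example, a page size of 100 would consume most of a typical 3,000-request window on a single full pull, while the maximum page size of 1000 uses a small fraction of it.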
