Rate Limiting

Rate limiting restricts how many requests a client (or your application) can make to an API within a given time window. API providers use it to ensure fair usage and system stability. Consumers use it to avoid overwhelming downstream services.

How Rate Limiting Works

An API sets a maximum number of allowed requests per time window. Common formats:

  • 100 requests per minute per API key
  • 10 requests per second per IP address
  • 1,000 requests per hour per account
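To make the idea of a "window" concrete, here is a minimal sketch of a fixed-window counter, one of several ways a provider might enforce limits like those above. The helper name `createFixedWindowLimiter` and its parameters are hypothetical, chosen only for illustration; real providers may use sliding windows or token buckets instead.

```javascript
// Fixed-window counter: allows at most `limit` requests per `windowMs` for each key.
// Hypothetical helper for illustration, not any specific provider's implementation.
function createFixedWindowLimiter(limit, windowMs) {
  const windows = new Map(); // key -> { start, count }

  return function allow(key, now = Date.now()) {
    const current = windows.get(key);
    // Start a fresh window if none exists yet or the previous one has expired.
    if (!current || now - current.start >= windowMs) {
      windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count < limit) {
      current.count += 1;
      return true;
    }
    return false; // over the limit: a server would answer with 429 here
  };
}

// Example: 100 requests per minute, tracked per API key.
const allow = createFixedWindowLimiter(100, 60_000);
```

Each key (API key, IP address, or account) gets its own counter, which is why one client hitting its limit does not affect another.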

When you exceed the limit, the API returns a 429 Too Many Requests response:

{
  "error": "rate_limit_exceeded",
  "retry_after": 30
}
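On the client side, a 429 can be handled by waiting and trying again. The sketch below assumes that `retry_after` in the body is a number of seconds and that `doRequest` is a hypothetical function you supply that resolves to `{ status, body }`; neither is guaranteed by any particular API, so check your provider's documentation.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `doRequest` and, on a 429, waits for the advertised delay before retrying.
// `doRequest` (hypothetical) must resolve to an object shaped like { status, body }.
async function callWithRetryAfter(doRequest, { maxAttempts = 3, sleep = defaultSleep } = {}) {
  let res;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    res = await doRequest();
    // Stop on any non-429 response, or when attempts are used up.
    if (res.status !== 429 || attempt === maxAttempts) break;
    // Assumption: retry_after is in seconds; fall back to 1 second if it is missing.
    const seconds = Number(res.body?.retry_after);
    const waitMs = Number.isFinite(seconds) && seconds > 0 ? seconds * 1000 : 1000;
    await sleep(waitMs);
  }
  return res;
}
```

The `sleep` option exists so the wait can be replaced in tests; in normal use the default timer is enough.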

Rate Limiting and Task Queues

Task queues are a natural fit for working within rate limits. Instead of sending 1,000 requests in a burst and having most of them rejected (900, if the limit is 100 per window), you queue all 1,000 and process them at a controlled pace:

// Without a queue: with a limit of 100 per window, about 900 of these fail with 429
for (const item of items) {
  await callAPI(item); // no pacing: requests go out back to back
}

// With AsyncQueue: all 1,000 succeed over time
for (const item of items) {
  await aq.tasks.create({
    callbackUrl: 'https://api.example.com/process',
    payload: item,
    retries: 3,
  });
}

AsyncQueue processes tasks at a sustainable rate. If an API returns 429, the task is retried automatically with exponential backoff.
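For readers unfamiliar with exponential backoff, the sketch below shows the general technique: each retry waits twice as long as the previous one, up to a cap. The function names, the 1-second base, and the 60-second cap are assumptions for illustration; this is not AsyncQueue's actual retry schedule, which is not described here.

```javascript
// Delay before retry number `attempt` (1-based): base, 2x base, 4x base, ... capped at maxMs.
// Illustrative values only; not taken from any specific queue's configuration.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 60_000) {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

// Optional "full jitter": pick a random delay in [0, delayMs) so that many
// clients retrying at the same moment do not all hit the API together again.
function withJitter(delayMs, random = Math.random) {
  return Math.floor(random() * delayMs);
}

// With retries: 3 and the defaults above, the un-jittered waits are 1s, 2s, 4s.
const delays = [1, 2, 3].map((attempt) => backoffDelayMs(attempt));
```

Doubling the delay gives a rate-limited API progressively more room to recover, while the cap keeps a long-failing task from waiting indefinitely between attempts.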

Common Rate Limit Headers

Header                   Meaning
X-RateLimit-Limit        Max requests allowed in the window
X-RateLimit-Remaining    Requests remaining in the current window
X-RateLimit-Reset        Typically a Unix timestamp for when the window resets
Retry-After              Seconds to wait before retrying (or an HTTP date)
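These headers can be turned into a wait time before the next request. The sketch below assumes `headers` is a plain object with lowercase header names and that `X-RateLimit-Reset` is a Unix timestamp in seconds; the `X-RateLimit-*` headers are a convention rather than a standard, so providers differ. `Retry-After` is standardized and may carry either a number of seconds or an HTTP date, and both are handled. The function name `waitFromHeaders` is hypothetical.

```javascript
// Returns how many milliseconds to wait before the next request, or 0 if none is needed.
// Assumes `headers` is a plain object keyed by lowercase header name.
function waitFromHeaders(headers, nowMs = Date.now()) {
  const retryAfter = headers['retry-after'];
  if (retryAfter !== undefined) {
    // Retry-After as a number of seconds.
    const seconds = Number(retryAfter);
    if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000);
    // Retry-After as an HTTP date.
    const dateMs = Date.parse(retryAfter);
    if (!Number.isNaN(dateMs)) return Math.max(0, dateMs - nowMs);
  }

  // No usable Retry-After: if the window is used up, wait until it resets.
  // Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
  const remaining = Number(headers['x-ratelimit-remaining']);
  const resetSec = Number(headers['x-ratelimit-reset']);
  if (remaining === 0 && Number.isFinite(resetSec)) {
    return Math.max(0, resetSec * 1000 - nowMs);
  }
  return 0;
}
```

Checking `X-RateLimit-Remaining` before each request lets a client slow down ahead of time instead of waiting for a 429.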