Rate Limiting

Rate limiting restricts how many requests a client (or your application) can make to an API within a given time window. API providers use it to ensure fair usage and system stability. Consumers use it to avoid overwhelming downstream services.

How Rate Limiting Works

An API sets a maximum number of allowed requests per time window. Common formats:

100 requests per minute per API key
10 requests per second per IP address
1,000 requests per hour per account
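
On the provider side, one simple way to enforce limits like these is a fixed-window counter. The following is a minimal sketch, assuming in-memory state in a single process. `FixedWindowLimiter` and `check` are hypothetical names, not part of any particular library, and real APIs may use sliding windows or token buckets instead.

```javascript
// Fixed-window counter: allows at most `limit` requests per `windowMs` for each key.
// Illustrative only; the class and method names are assumptions.
class FixedWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.windows = new Map(); // key -> { start, count }
  }

  // Returns { allowed, retryAfter }, where retryAfter is the number of
  // seconds until the current window resets (0 when the request is allowed).
  check(key, now = Date.now()) {
    let w = this.windows.get(key);
    // Start a fresh window if none exists or the previous one has expired.
    if (!w || now - w.start >= this.windowMs) {
      w = { start: now, count: 0 };
      this.windows.set(key, w);
    }
    if (w.count < this.limit) {
      w.count += 1;
      return { allowed: true, retryAfter: 0 };
    }
    const retryAfter = Math.ceil((w.start + this.windowMs - now) / 1000);
    return { allowed: false, retryAfter };
  }
}
```

A limit of "100 requests per minute per API key" would correspond to `new FixedWindowLimiter(100, 60000)` with the API key passed as `key`. A rejected check is where a server would send the 429 response.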

When you exceed the limit, the API typically responds with HTTP status 429 Too Many Requests, often with a body like this (field names vary by provider):

{
  "error": "rate_limit_exceeded",
  "retry_after": 30
}

Rate Limiting and Task Queues

Task queues are a natural fit for rate-limited APIs. Instead of sending 1,000 requests at once and having most of them rejected (say, 900 of them against a limit of 100 per minute), you queue all 1,000 and process them at a controlled pace:

// Without a queue: every request fires at once, so 900 of these fail with 429
// (a sequential `for` loop with `await` would not burst, so use Promise.all here)
await Promise.all(items.map((item) => callAPI(item)));

// With AsyncQueue: all 1,000 succeed over time
// (`aq` is assumed to be an already-initialized AsyncQueue client; setup not shown)
for (const item of items) {
  await aq.tasks.create({
    callbackUrl: 'https://api.example.com/process',
    payload: item,
    retries: 3,
  });
}

AsyncQueue processes tasks at a sustainable rate. If an API returns 429, the task is retried automatically with exponential backoff.
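
To make the idea of exponential backoff concrete, here is a generic sketch of retrying on 429 with doubling delays. It is not AsyncQueue's actual implementation. `backoffDelay`, `withRetries`, the 1-second base, and the 60-second cap are all assumptions chosen for illustration.

```javascript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelay(attempt, baseMs = 1000, maxMs = 60000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Default sleep helper; can be swapped out (for example in tests).
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `fn` until it returns a non-429 response or the retries run out.
// `fn` is any async function that resolves to an object with a numeric `status`.
async function withRetries(fn, retries = 3, sleep = defaultSleep) {
  for (let attempt = 0; ; attempt++) {
    const res = await fn();
    // Return on success, on any non-429 status, or when out of retries.
    if (res.status !== 429 || attempt >= retries) return res;
    await sleep(backoffDelay(attempt));
  }
}
```

With these defaults the waits are 1 s, 2 s, then 4 s. Production systems often add random jitter to each delay so that many clients do not retry at the same moment.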

Common Rate Limit Headers

Header	Meaning
`X-RateLimit-Limit`	Max requests allowed in the window
`X-RateLimit-Remaining`	Requests remaining in current window
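`X-RateLimit-Reset`	When the window resets, commonly a Unix timestamp (some APIs send seconds remaining instead)
`Retry-After`	How long to wait before retrying, in seconds (the HTTP spec also allows an HTTP date)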
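
A client can use these headers to decide how long to pause before its next request. The sketch below is one possible approach, not a standard algorithm. `waitMsFromHeaders` is a hypothetical helper, and it assumes `X-RateLimit-Reset` carries a Unix timestamp in seconds, which not every API does.

```javascript
// Returns how long to wait (in ms) before the next request, based on response headers.
// `headers` is anything with a get(name) method, such as the Fetch API's Headers object.
function waitMsFromHeaders(headers, now = Date.now()) {
  const retryAfter = headers.get('Retry-After');
  if (retryAfter != null) {
    const seconds = Number(retryAfter);
    if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000);
    // Retry-After may also be an HTTP date instead of a number of seconds.
    const date = Date.parse(retryAfter);
    if (!Number.isNaN(date)) return Math.max(0, date - now);
  }

  const remaining = headers.get('X-RateLimit-Remaining');
  const reset = headers.get('X-RateLimit-Reset');
  if (remaining != null && reset != null && Number(remaining) === 0) {
    // Assumption: reset is a Unix timestamp in seconds.
    const resetMs = Number(reset) * 1000;
    if (Number.isFinite(resetMs)) return Math.max(0, resetMs - now);
  }

  return 0; // quota remains, or no usable headers were present
}
```

`Retry-After` is checked first because it is the server's explicit instruction. The `X-RateLimit-*` headers are used as a fallback when the quota is exhausted.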
