Your API was fast in development. Now it crawls in production and nobody can figure out why. The logs look fine, the server isn’t maxed out, and the code hasn’t changed.
Slow API performance rarely has a single cause. Multiple factors compound under real-world conditions.
Here are the most common culprits — and what to do about each one.
1. N+1 Query Problem
The single most common cause of sluggish APIs. Your code fetches a list of items, then makes a separate database query for each item’s related data.
// N+1: 1 query for orders + N queries for customers
const orders = await db.orders.findAll();
for (const order of orders) {
  order.customer = await db.customers.findById(order.customerId);
}
With 100 orders, that’s 101 database round-trips. With 10,000 orders, the endpoint takes minutes.
The fix
Use eager loading or batch queries:
// 2 queries total, regardless of order count
const orders = await db.orders.findAll({
  include: [{ model: db.customers }],
});
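Eager loading is one option; the batch-query alternative mentioned above can also be written by hand: collect the foreign keys, fetch all related rows in one query, then join in memory. A sketch, assuming the same hypothetical `db` layer, where `findAll({ where: { id: [...] } })` compiles to a single `WHERE id IN (...)` query:

```javascript
// Batch alternative: 1 query for orders + 1 query for all customers.
// Assumes a Sequelize-style data layer where a `where: { id: [...] }`
// filter becomes WHERE id IN (...).
async function attachCustomers(orders, db) {
  const ids = [...new Set(orders.map((o) => o.customerId))]; // dedupe
  const customers = await db.customers.findAll({ where: { id: ids } });
  const byId = new Map(customers.map((c) => [c.id, c]));
  for (const order of orders) {
    order.customer = byId.get(order.customerId); // in-memory join
  }
  return orders;
}
```

This is what most ORMs' eager loading does under the hood, and it works even when your query layer has no `include` equivalent.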
2. Missing Database Indexes
A WHERE clause on an unindexed column forces a full table scan. On small tables the delay goes unnoticed; on tables with millions of rows, it's catastrophic.
-- Without an index on customer_email, this scans every row
SELECT * FROM orders WHERE customer_email = 'user@example.com';
-- With an index: near-instant
CREATE INDEX idx_orders_customer_email ON orders(customer_email);
How to detect
Most databases provide query analysis tools:
-- PostgreSQL
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_email = 'user@example.com';
-- Look for "Seq Scan" — that means no index is being used
3. Synchronous External API Calls
Your API calls a third-party service and blocks until it responds. If that service is slow — even occasionally — your API inherits the delay.
app.get('/dashboard', async (req, res) => {
  const user = await db.users.findById(req.userId); // 5ms
  const analytics = await fetch('https://analytics.io/data'); // 800ms
  const notifications = await fetch('https://notify.io/unread'); // 200ms
  // Total: 1,005ms — but only 5ms was your code
  res.json({ user, analytics, notifications });
});
The fix
Parallelize independent calls:
app.get('/dashboard', async (req, res) => {
  const [user, analytics, notifications] = await Promise.all([
    db.users.findById(req.userId),
    fetch('https://analytics.io/data'),
    fetch('https://notify.io/unread'),
  ]);
  // Total: max(5ms, 800ms, 200ms) = 800ms — 20% faster
  res.json({ user, analytics, notifications });
});
For calls that don’t need real-time results, offload them to a task queue and serve cached data instead.
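One way to sketch that pattern: return whatever is cached and enqueue a background refresh instead of blocking on the third-party call. `cacheGet` and `enqueue` here are assumed adapters (for example, an ioredis `get` and a job-queue producer), not a specific library's API:

```javascript
// Stale-while-refresh sketch: the request path never waits on the
// slow external service. A worker process consumes the queued job,
// calls the third party, and rewrites the cache entry.
async function staleWhileRefresh(cacheGet, enqueue, key) {
  const cached = await cacheGet(key);
  enqueue(key); // fire-and-forget: a background worker refreshes later
  return cached ? JSON.parse(cached) : null; // null until first refresh
}
```

The trade-off is freshness: clients may see data that is one refresh cycle old, which is usually acceptable for analytics-style payloads.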
4. No Response Caching
If the same data is requested repeatedly and rarely changes, recomputing it each time wastes resources.
// Called 1,000 times per minute, returns the same data
app.get('/api/pricing', async (req, res) => {
  const plans = await db.plans.findAll(); // 15ms
  const features = await db.features.findAll(); // 12ms
  const discounts = await calculateDiscounts(plans); // 25ms
  res.json({ plans, features, discounts }); // Total: 52ms × 1,000 = wasted
});
The fix
Cache at the appropriate layer:
import { Redis } from 'ioredis';
const redis = new Redis();
app.get('/api/pricing', async (req, res) => {
  const cached = await redis.get('pricing');
  if (cached) return res.json(JSON.parse(cached));
  const plans = await db.plans.findAll();
  const features = await db.features.findAll();
  const discounts = await calculateDiscounts(plans);
  const data = { plans, features, discounts };
  await redis.set('pricing', JSON.stringify(data), 'EX', 300); // Cache 5 min
  res.json(data);
});
5. Oversized Payloads
Returning more data than the client needs wastes both bandwidth and serialization time.
// Returns 50 fields per user, client only uses 3
app.get('/api/users', async (req, res) => {
  const users = await db.users.findAll(); // Returns everything
  res.json(users); // 2MB response
});
The fix
Select only the fields you need and paginate results:
app.get('/api/users', async (req, res) => {
  const page = parseInt(req.query.page, 10) || 1;
  const limit = Math.min(parseInt(req.query.limit, 10) || 20, 100);
  const users = await db.users.findAll({
    attributes: ['id', 'name', 'email'],
    limit,
    offset: (page - 1) * limit,
  });
  res.json({ users, page, limit });
});
6. Connection Pool Exhaustion
Every database or HTTP connection carries overhead. If your pool is too small, requests stack up waiting for an open slot.
Request 1: Using connection 1 ✓
Request 2: Using connection 2 ✓
...
Request 10: Using connection 10 ✓
Request 11: Waiting for a connection... (adds 500ms+)
The fix
Size your connection pool based on expected concurrency:
const pool = new Pool({
  max: 20, // Max connections
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000, // Fail fast if pool is exhausted
});
Monitor pool usage in production. If connections are maxed out consistently, increase the pool size or investigate why connections are held open for so long.
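With node-postgres, for example, the `Pool` instance exposes live counters (`totalCount`, `idleCount`, `waitingCount`) that you can sample. A minimal snapshot helper, which you would wire into whatever metrics system you use:

```javascript
// Snapshot the pg Pool's built-in counters. A sustained nonzero
// `waiting` value means requests are queuing for a connection.
function poolStats(pool) {
  return {
    total: pool.totalCount,     // connections currently open
    idle: pool.idleCount,       // open but not checked out
    waiting: pool.waitingCount, // requests queued for a connection
  };
}

// Example: sample every 10 seconds
// setInterval(() => console.log(poolStats(pool)), 10000);
```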
7. Expensive Middleware
Middleware runs on every request. Even minor overhead compounds fast.
// This runs on EVERY request
app.use(async (req, res, next) => {
  const user = await db.users.findById(req.userId); // 10ms
  const perms = await db.permissions.findByUser(user); // 8ms
  const team = await db.teams.findById(user.teamId); // 6ms
  req.context = { user, perms, team };
  next();
});
// 24ms overhead on every single request, even static assets
The fix
- Only run expensive middleware on routes that need it
- Cache middleware results per session or request
- Use lightweight token verification instead of full database lookups
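The first two fixes can be combined in one small sketch: build the context loader as route-scoped middleware, so public routes and static assets never pay for it, and memoize the result per request. `db` is the same hypothetical data layer used above:

```javascript
// Route-scoped, memoized context loader. Attach it per-route instead
// of with app.use(), so only the routes that need it run the lookups.
function makeContextLoader(db) {
  return async function loadContext(req, res, next) {
    if (!req.context) { // memoize: runs at most once per request
      const user = await db.users.findById(req.userId);
      const perms = await db.permissions.findByUser(user);
      req.context = { user, perms };
    }
    next();
  };
}

// Usage: only /api/projects pays the cost, not every request
// app.get('/api/projects', makeContextLoader(db), listProjects);
```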
8. DNS Resolution Overhead
Each unique hostname your API calls requires a Domain Name System (DNS) lookup. In containerized environments, DNS resolution can be a surprising bottleneck.
The fix
- Use connection keep-alive to reuse Transmission Control Protocol (TCP) connections
- Configure DNS caching at the operating system or application level
- Reduce the number of unique hostnames your API contacts per request
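In Node, the first fix is a keep-alive agent. A minimal sketch using the built-in `https` module (note that Node's built-in `fetch` is backed by undici, which needs its own keep-alive dispatcher rather than an `http.Agent`):

```javascript
import https from 'node:https';

// Keep-alive reuses established TCP connections, so repeated requests
// to the same host skip both the DNS lookup and the TCP/TLS handshake.
const agent = new https.Agent({
  keepAlive: true, // hold sockets open between requests
  maxSockets: 50,  // cap concurrent connections per host
});

// Usage with the classic https module:
// https.get('https://analytics.io/data', { agent }, (res) => { /* ... */ });
```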
9. Uncompressed Responses
JSON responses compress well. A 1MB response might shrink to 50KB with gzip.
import compression from 'compression';
app.use(compression()); // 10x–20x smaller responses
10. Long-Running Operations in Request Handlers
The most fundamental performance problem: cramming too much work into a single request.
// This endpoint does 5 minutes of work
app.post('/generate-report', async (req, res) => {
  const data = await queryMillionsOfRows();
  const report = await generatePDF(data);
  const uploaded = await uploadToS3(report);
  await sendEmail(req.user.email, uploaded.url);
  res.json({ url: uploaded.url });
});
The fix
Offload to a task queue. The endpoint should accept the work and respond immediately, using a webhook for result delivery:
app.post('/generate-report', async (req, res) => {
  const task = await asyncqueue.tasks.create({
    callbackUrl: 'https://your-app.com/api/internal/generate-report',
    payload: { userId: req.user.id, reportType: req.body.type },
    webhookUrl: 'https://your-app.com/api/on-report-ready',
    retries: 2,
  });
  res.json({ taskId: task.id, status: 'generating' });
});
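For completeness, a sketch of the receiving side: when the report is ready, the queue POSTs to the webhook URL. The payload shape (`{ taskId, result }`) and the `db.reports.markReady` helper are assumptions for illustration, not part of any real API:

```javascript
// Webhook receiver sketch. The handler is built as a factory so the
// data layer can be injected; payload fields are hypothetical.
function makeReportWebhook(db) {
  return async (req, res) => {
    const { taskId, result } = req.body;
    await db.reports.markReady(taskId, result.url); // persist the URL
    res.sendStatus(200); // acknowledge so the queue doesn't retry
  };
}

// app.post('/api/on-report-ready', express.json(), makeReportWebhook(db));
```

The client then learns about completion by polling the task status or over a WebSocket, rather than holding an HTTP request open for minutes.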
Diagnosing Slow APIs: A Checklist
When investigating a sluggish API, work through these in order:
- Check response times by endpoint — is it one route or all of them?
- Enable slow query logging — are database queries taking too long?
- Count queries per request — do you have an N+1 problem?
- Measure external call latency — are third-party APIs the bottleneck?
- Check resource utilization — CPU, memory, connections, disk I/O
- Measure response payload size — are you sending too much data?
- Profile the code — where is time actually being spent?
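The first checklist item can be covered with a few lines of middleware that log per-endpoint timings. A minimal sketch using the response's `finish` event:

```javascript
// Per-endpoint timing middleware: logs method, path, status, and
// duration when each response finishes. The log function is injectable
// so you can point it at your metrics pipeline instead of stdout.
function timingMiddleware(log = console.log) {
  return (req, res, next) => {
    const start = process.hrtime.bigint();
    res.on('finish', () => {
      const ms = Number(process.hrtime.bigint() - start) / 1e6;
      log(`${req.method} ${req.path} ${res.statusCode} ${ms.toFixed(1)}ms`);
    });
    next();
  };
}

// app.use(timingMiddleware());
```

Aggregating these lines by path immediately answers "is it one route or all of them?" before you reach for a full APM tool.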
Conclusion
Sluggish APIs rarely suffer from a single flaw. You’ll typically find a combination: an N+1 query masked by a missing index, amplified by synchronous external calls, on an endpoint that should have been cached or offloaded to a queue.
Start with measurement and fix the biggest bottleneck first. Keep going until your p95 latency hits an acceptable target. For anything inherently slow — report generation, file processing, external API orchestration — don’t fight it. Offload it to AsyncQueue and keep your API fast.