Your senior hands you a PR with a working endpoint that has p99 of 4 seconds. Add EXPLAIN ANALYZE, fix the query, add Redis caching, add rate limiting. Code review in 24 hours.
This challenge is the bar for a mid-level backend engineer: independently diagnose the N+1, add caching with a sensible TTL, and implement rate limiting. If you can do all three in 24 hours, with tests, you are operating at mid-level. Run EXPLAIN ANALYZE before applying any fix.
Senior engineers do all three AND explain why (not just how). In code review, a senior catches N+1 in the PR diff without running the code. They specify TTL based on data freshness requirements, not "5 minutes sounds right." They design rate limit tiers, not a single limit.
GET /api/feed working but p99=4.2s. k6: fails at 200 concurrent users. Senior says: fix it.
EXPLAIN ANALYZE run. N+1 found: 20 extra author queries per feed request. Fix: single JOIN query. p99: 180ms.
Redis cache-aside added with 5min TTL + jitter. Cache hit p99: 3ms. Load test: passes 500 concurrent users.
Rate limiting added: 60 requests/minute per user ID (token bucket in Redis). 429 returns Retry-After header.
Code review: 3 comments (naming, test coverage), 0 architecture comments. PR merged.
The question this raises
When a working endpoint has 4-second p99, N+1 queries, no caching, and no rate limiting — and you have 24 hours to fix it — where do you start?
Your ORM code loads 20 posts for a user's feed, then accesses post.author.name for each post in a loop. How many database queries does this make?
Lesson outline
How this concept changes your thinking
ORM query for related data
“Fetch 20 posts. For each post, ORM lazy-loads the author: 20 separate SELECT queries. N+1 = 21 total queries. Invisible in development at N=5, catastrophic in production at N=20 under load.”
“Eager load: single JOIN query fetches posts + authors together. 1 query instead of 21. ORM: include("author"). Raw SQL: LEFT JOIN users ON posts.author_id = users.id.”
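The query-count difference can be made concrete with a small sketch. This uses an in-memory SQLite database with hypothetical `posts`/`users` tables standing in for the feed schema, and counts queries by hand since ORMs hide them:

```python
import sqlite3

# Minimal sketch: hypothetical stand-in for the feed schema in the lesson.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
""")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(20)])
conn.executemany("INSERT INTO posts VALUES (?, ?, ?)",
                 [(i, i, f"post{i}") for i in range(20)])

def feed_lazy(conn):
    """Lazy-load style: 1 query for posts + 1 query per author = N+1."""
    queries = 1
    posts = conn.execute(
        "SELECT id, author_id, title FROM posts ORDER BY id LIMIT 20").fetchall()
    feed = []
    for _id, author_id, title in posts:
        row = conn.execute("SELECT name FROM users WHERE id = ?",
                           (author_id,)).fetchone()
        queries += 1
        feed.append((title, row[0]))
    return feed, queries

def feed_eager(conn):
    """Eager-load style: one JOIN fetches posts and authors together."""
    feed = conn.execute("""
        SELECT p.title, u.name
        FROM posts p LEFT JOIN users u ON p.author_id = u.id
        ORDER BY p.id LIMIT 20
    """).fetchall()
    return feed, 1

lazy_feed, lazy_queries = feed_lazy(conn)
eager_feed, eager_queries = feed_eager(conn)
print(lazy_queries, eager_queries)  # 21 vs 1, identical results
```

Both functions return the same 20 rows; the only difference is 21 round-trips versus 1, which is exactly what the JOIN fix buys.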
Repeated identical requests
“Every GET /feed/123 hits the database. 1000 users refreshing their feed per second = 1000 DB queries/second for the same data. Database CPU at 90%. Connection pool exhausted.”
“Cache-aside with TTL jitter: first request fetches from DB and caches. Next 999 requests served from Redis in 3ms. Database sees 1 query per cache TTL window, not 1000/second.”
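The cache-aside pattern is small enough to sketch directly. A dict stands in for Redis here so the effect is visible without a server; the names (`get_feed`, `db_fetch_feed`) are illustrative, not from the lesson's codebase:

```python
import random
import time

CACHE = {}      # dict stands in for Redis: key -> (value, expires_at)
DB_CALLS = 0    # counts real database hits

def db_fetch_feed(user_id):
    global DB_CALLS
    DB_CALLS += 1
    return f"feed-for-{user_id}"    # pretend this is the 180ms JOIN query

def get_feed(user_id, ttl_s=300, jitter_s=30):
    key = f"feed:{user_id}"
    now = time.monotonic()
    entry = CACHE.get(key)
    if entry is not None and entry[1] > now:
        return entry[0]                       # cache hit: no DB work
    value = db_fetch_feed(user_id)            # cache miss: go to the DB
    # Jitter spreads expirations so hot keys don't all expire together
    CACHE[key] = (value, now + ttl_s + random.uniform(-jitter_s, jitter_s))
    return value

for _ in range(1000):
    get_feed(123)
print(DB_CALLS)  # 1: the other 999 requests were cache hits
```

With real Redis the same shape applies: `GET` on the key, and on a miss, fetch from the DB and `SETEX` with the jittered TTL.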
No rate limiting on read endpoints
“One user with a script can call GET /feed/123 in a tight loop: 10,000 requests/minute. Connection pool saturates. All other users experience slow responses or errors.”
“Token bucket: 60 requests/minute per authenticated user ID. 429 response with Retry-After header. Legitimate users: unaffected. Scripts: blocked at gateway, not at database.”
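The token bucket algorithm itself fits in a few lines. This is an in-process sketch to show the mechanics (in production the bucket state lives in Redis so all API instances share it); capacity and refill rate match the 60 req/min limit above:

```python
import time

class TokenBucket:
    """Per-user token bucket: `capacity` tokens, refilled at `refill_per_sec`."""
    def __init__(self, capacity=60, refill_per_sec=1.0):
        self.capacity = capacity
        self.refill = refill_per_sec
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True    # request allowed
        return False       # reject: respond 429 with Retry-After

bucket = TokenBucket(capacity=60, refill_per_sec=1.0)
t0 = time.monotonic()
results = [bucket.allow(now=t0) for _ in range(100)]
print(results.count(True), results.count(False))  # 60 allowed, 40 rejected
```

A burst of 100 requests at the same instant gets exactly 60 through; the refill rate then admits one more request per second, which is what makes token bucket burst-tolerant where a fixed window is not.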
k6 load test results — 200 concurrent users — FAIL
```
scenarios: 200 VUs
http_req_duration: p99=4.218s (threshold: 500ms)  FAIL
http_req_failed:   rate=34.2%
db.connection_pool.active: 48/50 (96%)
db.query_count: 4,200/sec (21 queries per request x 200 concurrent)
redis.connected: false
rate_limiting: not configured

Verdict: not production-ready; fix before merge
```
EXPLAIN ANALYZE on the feed query reveals the N+1. 21 queries per request is visible in pg_stat_statements: the same SELECT users WHERE id = ? query running 20 times per feed load, with the same execution plan.
```sql
-- pg_stat_statements: find the N+1 pattern
-- (same query template running many times per second)
SELECT
    left(query, 80) AS query_snippet,
    calls,
    round(mean_exec_time::numeric, 2) AS mean_ms,
    round(total_exec_time::numeric, 0) AS total_ms
FROM pg_stat_statements
WHERE query LIKE '%SELECT%users%WHERE%id%'
ORDER BY calls DESC
LIMIT 5;

-- Output revealing the N+1:
-- query_snippet                                     calls   mean_ms  total_ms
-- SELECT id, name, avatar FROM users WHERE id = $1  84000   2.1      176400
-- 84000 calls = 4200 requests * 20 authors each
-- mean_ms=2.1ms looks innocent; total_ms=176,400ms is the damage
```

84,000 calls of the same single-author query = N+1. Total time: 176 seconds wasted on redundant queries.

```sql
-- EXPLAIN ANALYZE on the N+1 query (single author fetch):
EXPLAIN ANALYZE SELECT id, name, avatar FROM users WHERE id = 123;
-- Index Scan using users_pkey on users (cost=0.42..2.44 rows=1)
--   Actual time=0.043..0.045 rows=1  <- fast per query
-- But 20x per feed request = 40ms wasted per feed load on author fetches alone
```

The JOIN fetches all 20 posts and all 20 authors in a single round-trip to the database.

```sql
-- Fix: single JOIN query
EXPLAIN ANALYZE
SELECT p.*, u.id AS author_id, u.name AS author_name, u.avatar
FROM posts p
LEFT JOIN users u ON p.author_id = u.id
WHERE p.user_id = 123
ORDER BY p.created_at DESC
LIMIT 20;
-- Index Scan using idx_posts_user_created on posts
-- Hash Join on users
--   Actual time=0.8..1.2 rows=20  <- 1 query for all 20 posts + all 20 authors
```
| Fix | Tool | Before | After | Verification |
|---|---|---|---|---|
| N+1 query | EXPLAIN ANALYZE + JOIN | 21 queries, 4.2s p99 | 1 query, 180ms p99 | pg_stat_statements: author query calls drop from 84K/min to 0 |
| No caching | Redis cache-aside + TTL jitter | 180ms every request | 3ms cache hit, 180ms miss | redis INFO: keyspace_hits/misses ratio > 95% hit rate |
| No rate limiting | Token bucket (Redis SETNX) | Unlimited, pool saturates | 60 req/min/user, 429 on excess | Load test: 1000 req/user/min → 429 after 60, pool stays <20% |
The order matters: fix query first, then cache
Fix N+1 before adding Redis. Caching a 4.2s query means every cache miss is still 4.2s — and under load, cache misses are frequent. After the JOIN fix: cache misses are 180ms, cache hits are 3ms. Now caching is worth it.
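The arithmetic behind this ordering is worth writing out. Assuming a 95% cache hit rate for illustration, expected latency is a weighted average of hit and miss cost:

```python
def expected_latency_ms(hit_rate, hit_ms, miss_ms):
    """Average request latency under a given cache hit rate."""
    return hit_rate * hit_ms + (1 - hit_rate) * miss_ms

# Cache added BEFORE the JOIN fix: every miss still pays the full 4.2s,
# and tail latency (p99) stays near 4.2s because misses dominate the tail.
before = expected_latency_ms(0.95, 3, 4200)   # 212.85 ms average

# Cache added AFTER the JOIN fix: misses pay 180ms, so even the tail is fast.
after = expected_latency_ms(0.95, 3, 180)     # 11.85 ms average

print(round(before, 2), round(after, 2))
```

The average improves either way, but only the query fix repairs the tail: p99 is dominated by misses, and a cached-but-unfixed endpoint still shows multi-second p99.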
Strong vs weak interview answers for the "slow API" scenario
N+1 queries
📖 What the exam expects
N+1 occurs when you fetch N parent records and then issue one additional query per record to fetch related data, resulting in N+1 total database queries instead of 1.
The "fix this slow endpoint" scenario appears in backend interviews across FAANG and growth-stage companies. The evaluation is: do you diagnose before fixing, do you know the correct order of fixes (query first, cache second, rate limit third), and do you verify with metrics?
Common questions:
Strong answers include:
Red flags: