🎯 Key Takeaways

APIs are contracts — additive changes are safe, anything else requires versioning

Use correct HTTP methods and status codes — they carry semantic meaning

Use cursor-based pagination, not offset-based — offset pagination drifts on inserts

GraphQL requires DataLoader — without it, you create N+1 query catastrophes

Rate limiting is not optional — protect your API from abuse and accidental DDoS

IDOR (Insecure Direct Object Reference), which the OWASP API Security Top 10 lists first as Broken Object Level Authorization, is among the most common API vulnerabilities. Always verify ownership

API Design & Backend Architecture

REST vs GraphQL vs gRPC, versioning strategies, authentication patterns, rate limiting, and the backend architecture decisions that define how well your system scales.

Lesson outline

APIs are contracts — break them and you break trust

An API is a promise to every developer who calls it. Breaking that promise — removing a field, changing a status code, redefining the semantics of an endpoint — breaks their code in production. Good API design means thinking about evolution, not just the current state.

The cost of a bad API compounds: once mobile clients depend on it, you cannot change it without a forced app update. Once 50 microservices call it, a breaking change requires coordinated deployment across 50 teams. Design as if you can never change it — because in practice, you often cannot.

REST: the right way (not the common way)

Most APIs called "REST" are really just JSON over HTTP. REST as Fielding defined it follows a set of constraints: client-server separation, statelessness, cacheable responses, a uniform interface (including hypermedia links), and a layered system.

Resource naming: Nouns, not verbs. `/users/42/orders` not `/getUserOrders`. Use plural nouns for collections (`/orders`) and append an identifier to address a single resource (`/orders/42`).

HTTP method semantics: GET (read; safe, idempotent, cacheable), POST (create; not idempotent), PUT (replace the entire resource; idempotent), PATCH (partial update; not guaranteed to be idempotent), DELETE (remove; idempotent). Using POST for everything is a code smell.

Status codes mean something: 200 OK, 201 Created (with Location header), 204 No Content (successful DELETE), 400 Bad Request (client error, do not retry unchanged), 401 Unauthorized (missing/invalid auth), 403 Forbidden (valid auth, insufficient permissions), 404 Not Found, 409 Conflict (e.g. duplicate), 429 Too Many Requests (retry after backing off), 500 Internal Server Error (server error; retry with backoff only when the request is idempotent).
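The method and status semantics above can be combined into a client-side retry decision. The following is a minimal sketch, not a complete retry policy; `shouldRetry` and `IDEMPOTENT_METHODS` are hypothetical names introduced here for illustration.

```typescript
type HttpMethod = 'GET' | 'HEAD' | 'PUT' | 'DELETE' | 'POST' | 'PATCH';

// Methods where repeating the request has the same effect as sending it once.
const IDEMPOTENT_METHODS: ReadonlySet<HttpMethod> = new Set<HttpMethod>([
  'GET',
  'HEAD',
  'PUT',
  'DELETE',
]);

// Hypothetical helper: may a failed request be sent again (after backoff)?
function shouldRetry(method: HttpMethod, status: number): boolean {
  // 429: the server rejected the request before processing it.
  if (status === 429) return true;
  // Other 4xx: the request itself is wrong, so repeating it will fail again.
  if (status >= 400 && status < 500) return false;
  // 5xx: the server may or may not have applied the change,
  // so only repeat requests that are harmless to apply twice.
  if (status >= 500) return IDEMPOTENT_METHODS.has(method);
  // 2xx and 3xx are not failures.
  return false;
}
```

Retrying a POST after a 500 risks creating the resource twice, which is why the sketch refuses it unless the API offers some other deduplication mechanism.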

Versioning: URI versioning (`/v1/users`) is the most visible. Header versioning (`Accept: application/vnd.myapi.v2+json`) is more RESTful but harder to debug. Never retire a version without a deprecation timeline: communicate sunset dates well in advance.
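Both versioning styles can be supported by one small resolver. This sketch assumes the `/v1/...` path prefix and the `application/vnd.myapi.v2+json` media type from the text; `resolveApiVersion` and `deprecationHeaders` are hypothetical helpers. The `Sunset` response header (RFC 8594) is one standard way to communicate a retirement date.

```typescript
// Hypothetical helper: read the requested version from the URI first,
// then fall back to the Accept header. Returns null if neither names one.
function resolveApiVersion(path: string, acceptHeader?: string): number | null {
  const fromPath = /^\/v(\d+)(\/|$)/.exec(path);
  if (fromPath) return Number(fromPath[1]);

  const fromHeader = /application\/vnd\.myapi\.v(\d+)\+json/.exec(acceptHeader ?? '');
  if (fromHeader) return Number(fromHeader[1]);

  return null;
}

// Hypothetical helper: headers to attach to responses from a deprecated version.
// Sunset carries an HTTP-date, which Date.prototype.toUTCString() produces.
function deprecationHeaders(sunsetDate: Date): Record<string, string> {
  return { Sunset: sunsetDate.toUTCString() };
}
```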

Additive changes are non-breaking

You can safely add new fields to responses, add new optional request parameters, and add new endpoints. You cannot remove fields, rename them, change their type, or change endpoint behavior. Design with this constraint in mind from day one.
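The rule can be checked mechanically. Below is a minimal sketch that compares two response shapes and reports removed fields and type changes; the `ResponseSchema` representation and `findBreakingChanges` are assumptions made for illustration, not a real schema-diff tool.

```typescript
type FieldType = 'string' | 'number' | 'boolean' | 'object' | 'array';
type ResponseSchema = Record<string, FieldType>;

// Hypothetical helper: list the changes that would break existing clients.
// Fields that exist only in newSchema are additive and therefore ignored.
function findBreakingChanges(oldSchema: ResponseSchema, newSchema: ResponseSchema): string[] {
  const problems: string[] = [];
  for (const [field, oldType] of Object.entries(oldSchema)) {
    if (!(field in newSchema)) {
      problems.push(`removed field "${field}"`);
    } else if (newSchema[field] !== oldType) {
      problems.push(`changed type of "${field}" from ${oldType} to ${newSchema[field]}`);
    }
  }
  return problems;
}

const v1: ResponseSchema = { id: 'number', total: 'number' };
const additive: ResponseSchema = { ...v1, currency: 'string' }; // safe: only adds a field
const breaking: ResponseSchema = { id: 'string' };              // retypes id, drops total

const additiveProblems = findBreakingChanges(v1, additive); // empty list
const breakingProblems = findBreakingChanges(v1, breaking); // two entries
```

A rename shows up as a removal plus an addition, which matches how clients experience it.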

api-design-patterns.ts

    // ✅ GOOD REST API Design

    // Resource-based URLs (nouns, not verbs)
    GET    /v1/orders              // List orders (paginated)
    POST   /v1/orders              // Create order → 201 + Location: /v1/orders/42
    GET    /v1/orders/42           // Get order
    PATCH  /v1/orders/42           // Partial update (status, notes)
    DELETE /v1/orders/42           // Cancel order → 204 No Content

    // Nested resources for relationships
    GET    /v1/orders/42/items     // Line items in this order
    POST   /v1/orders/42/items     // Add item to order

    // ✅ GOOD Error Response (RFC 7807 Problem Details)
    // RFC 7807 (since obsoleted by RFC 9457, which keeps these core fields)
    // is the standard format for HTTP API error responses.
    {
      "type": "https://api.example.com/errors/insufficient-inventory",
      "title": "Insufficient Inventory",
      "status": 409,
      "detail": "Product SKU-123 has 0 units available; requested 5",
      "instance": "/v1/orders",
      "requestId": "req_abc123",          // extension field, for support tracing
      "retryAfter": null                   // extension field: no point retrying
    }

    // ✅ GOOD Pagination (cursor-based: does not drift on inserts)
    // A cursor lets the database seek straight to the next row through an index;
    // OFFSET n has to scan and discard n rows first, and shifts when rows are added.
    GET /v1/orders?cursor=eyJpZCI6NDJ9&limit=20

    // Response:
    {
      "data": [...],
      "pagination": {
        "nextCursor": "eyJpZCI6NjJ9",
        "hasMore": true
      }
    }

    // ❌ BAD: Offset pagination drifts when new items are inserted
    // GET /v1/orders?page=2&perPage=20
    // If 5 new orders land at the top of a newest-first list after page 1 was
    // fetched, page 2 repeats the last 5 orders the client already saw on page 1.
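The cursor in the example is just base64url-encoded JSON (`eyJpZCI6NDJ9` decodes to `{"id":42}`). The sketch below shows one way to implement that scheme over an in-memory, newest-first list. It assumes a Node.js runtime for `Buffer`; `listOrders`, `encodeCursor`, and `decodeCursor` are hypothetical names, and a real implementation would run the equivalent `WHERE id < ? ORDER BY id DESC LIMIT ?` query instead of filtering an array.

```typescript
interface Order { id: number }
interface Page {
  data: Order[];
  pagination: { nextCursor: string | null; hasMore: boolean };
}

// The cursor is opaque to clients; here it wraps the last id they received.
const encodeCursor = (id: number): string =>
  Buffer.from(JSON.stringify({ id })).toString('base64url');

const decodeCursor = (cursor: string): number => {
  const parsed = JSON.parse(Buffer.from(cursor, 'base64url').toString('utf8'));
  return Number(parsed.id);
};

// Newest first: return up to `limit` orders with an id below the cursor.
function listOrders(orders: Order[], limit: number, cursor?: string): Page {
  const beforeId = cursor ? decodeCursor(cursor) : Infinity;
  const remaining = orders.filter(o => o.id < beforeId).sort((a, b) => b.id - a.id);
  const data = remaining.slice(0, Math.max(1, limit));
  const hasMore = remaining.length > data.length;
  const last = data[data.length - 1];
  return {
    data,
    pagination: { nextCursor: hasMore && last ? encodeCursor(last.id) : null, hasMore },
  };
}

const orders: Order[] = [1, 2, 3, 4, 5].map(id => ({ id }));
const page1 = listOrders(orders, 2);              // ids 5, 4
orders.push({ id: 6 });                           // a new order arrives between requests
const page2 = listOrders(orders, 2, page1.pagination.nextCursor ?? undefined); // ids 3, 2
// With offset pagination, "page 2" of the updated list would be ids 4, 3: order 4 repeats.
```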

GraphQL: when it helps and when it hurts

GraphQL lets clients request exactly the data they need — no over-fetching (getting more than needed) or under-fetching (needing multiple requests). One endpoint, flexible queries. Beloved by frontend teams.

When GraphQL wins: You have many client types (mobile, web, partners) with different data needs. Your data is a graph (social network, e-commerce product catalog with relationships). You want to eliminate multiple API round-trips.

When GraphQL hurts: The N+1 query problem. If you have a list of 100 posts and each has an author, naive resolvers make 1 query for the posts plus 100 queries for the authors. Fix with DataLoader (batch + deduplicate). HTTP caching is harder, because queries are typically sent as POST requests to a single endpoint. Error handling is more complex, because servers commonly return HTTP 200 with an `errors` array in the body even when part of the query failed.

DataLoader is not optional: Any production GraphQL server needs DataLoader, or an equivalent batching layer, in front of its data sources. Without it, a single query can generate thousands of database queries.

graphql-dataloader.ts

    import DataLoader from 'dataloader';
    import { inArray } from 'drizzle-orm';   // assumed: the query below matches Drizzle ORM's builder
    import { db } from './db';
    import { usersTable } from './schema';   // assumed location of the table definition

    // Assumed shapes, for illustration only
    interface Post { id: string; title: string; authorId: string }
    type User = typeof usersTable.$inferSelect;

    // ❌ WITHOUT DataLoader: N+1 queries
    // Query: { posts { id title author { name } } }
    // Result: 1 DB query for posts + 100 DB queries for authors

    // ✅ WITH DataLoader: 2 total queries regardless of list size
    // DataLoader collects every load() call made within a single event loop tick.
    // In a real server, create the loader per request (e.g. in the GraphQL context)
    // so its cache is never shared between users.
    const userLoader = new DataLoader<string, User>(async (userIds) => {
      // Called ONCE with all user IDs batched together
      const users = await db
        .select()
        .from(usersTable)
        .where(inArray(usersTable.id, [...userIds]));

      // CRITICAL: results must come back in the same order as the input IDs.
      // The database gives no such guarantee, so re-map by ID (a common gotcha).
      const userMap = new Map(users.map(u => [u.id, u]));
      return userIds.map(id => userMap.get(id) ?? new Error(`User ${id} not found`));
    });

    // GraphQL resolver: called once per post, but batched by DataLoader
    const postResolvers = {
      Post: {
        author: (post: Post) => userLoader.load(post.authorId),
      },
    };

    // Now 100 posts = 1 SQL query for posts + 1 SQL query for all authors
    // SELECT * FROM users WHERE id IN (1, 2, 3, ..., 100)
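To see why batching works, it helps to look at the mechanism in isolation. The following is a simplified illustration of the idea, not the `dataloader` library's actual implementation (which also caches and deduplicates keys); `createBatchLoader` is a hypothetical name. Each `load` call queues its key, and the first call in a tick schedules a single flush that runs after the current synchronous work finishes.

```typescript
type BatchFn<K, V> = (keys: readonly K[]) => Promise<V[]>;

interface Pending<K, V> {
  key: K;
  resolve: (value: V) => void;
  reject: (reason: unknown) => void;
}

function createBatchLoader<K, V>(batchFn: BatchFn<K, V>): (key: K) => Promise<V> {
  let queue: Pending<K, V>[] = [];

  return function load(key: K): Promise<V> {
    return new Promise<V>((resolve, reject) => {
      queue.push({ key, resolve, reject });
      if (queue.length > 1) return; // a flush is already scheduled for this batch

      // First key of a new batch: flush once, after the caller's synchronous code.
      Promise.resolve().then(() => {
        const batch = queue;
        queue = [];
        batchFn(batch.map(item => item.key))
          // Values are matched to callers by position, hence the ordering rule.
          .then(values => batch.forEach((item, i) => item.resolve(values[i] as V)))
          .catch(err => batch.forEach(item => item.reject(err)));
      });
    });
  };
}
```

Three `load` calls issued in the same tick, for example through `Promise.all`, therefore reach the batch function as one call with three keys.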

gRPC: for internal service-to-service communication

gRPC uses Protocol Buffers (binary serialization) over HTTP/2. It is often cited as 5-10x faster than JSON REST for internal communication, though the real gain depends on payload shape and workload. It also supports streaming and provides strong typing through `.proto` files.

When to use gRPC: Internal microservices communication where latency matters, streaming (server → client, client → server, bidirectional), polyglot environments (a single `.proto` file generates clients for many languages).

When NOT to use: Public APIs (browser support requires gRPC-Web proxy), teams unfamiliar with Protobuf, when debugging ease is more important than performance.

| Protocol | Payload | Relative speed | Streaming | Browser support | Best for |
|---|---|---|---|---|---|
| REST/JSON | Text (verbose) | Baseline | ❌ (SSE only) | ✅ Native | Public APIs, external clients |
| GraphQL | Text (flexible) | Similar to REST | ✅ Subscriptions | ✅ Native | Frontend-heavy apps |
| gRPC | Binary (compact) | Often cited as 5-10x faster | ✅ Bidirectional | ⚠️ gRPC-Web only | Internal microservices |
| WebSocket | Binary or text | Very fast | ✅ Full-duplex | ✅ Native | Real-time features (chat, live) |

Rate limiting: protecting your API from abuse (and yourself)

Rate limiting enforces a maximum request rate per client (by IP, API key, or user). Without it, a single misconfigured client can overwhelm your service, in effect an accidental denial-of-service attack, or run up your database bill.

Token bucket algorithm: Each client has a bucket of tokens (capacity N). Each request consumes 1 token. Tokens replenish at a fixed rate. Allows bursts up to N, then smooths to the replenishment rate. Most common for API rate limiting.
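A minimal in-memory sketch of the token bucket described above. The `TokenBucket` class and its parameter names are illustrative assumptions; time is passed in explicitly so the behaviour is easy to test, and a production limiter would keep one bucket per client in shared storage.

```typescript
class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full, so an initial burst of `capacity` is allowed
    this.lastRefillMs = nowMs;
  }

  // Returns true and spends one token if the request is allowed.
  tryConsume(nowMs: number = Date.now()): boolean {
    // Refill lazily, based on the time elapsed since the last call.
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;

    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

With `new TokenBucket(3, 1, 0)`, three requests at time 0 pass and a fourth is rejected; one second later exactly one more request passes.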

Leaky bucket: Requests are queued and processed at a fixed rate. No bursts allowed — good for smoothing traffic to a downstream service.
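A sketch of the leaky bucket's admission logic, under the assumption that only queue depth needs tracking: requests are accepted while the queue has room, and the queue drains at a fixed rate. `LeakyBucket` is a hypothetical class; actually dispatching the queued work to the downstream service is left out.

```typescript
class LeakyBucket {
  private readonly capacity: number;
  private readonly leakPerSecond: number;
  private queued = 0;
  private lastLeakMs: number;

  constructor(capacity: number, leakPerSecond: number, nowMs: number = Date.now()) {
    this.capacity = capacity;
    this.leakPerSecond = leakPerSecond;
    this.lastLeakMs = nowMs;
  }

  // Account for the requests that have drained since the last call.
  private drain(nowMs: number): void {
    if (this.queued === 0) {
      this.lastLeakMs = nowMs; // an empty queue has nothing to leak; restart the clock
      return;
    }
    const leaked = Math.floor(((nowMs - this.lastLeakMs) * this.leakPerSecond) / 1000);
    if (leaked <= 0) return;
    this.queued = Math.max(0, this.queued - leaked);
    // Advance the clock only by the time the leaked requests consumed.
    this.lastLeakMs =
      this.queued === 0 ? nowMs : this.lastLeakMs + (leaked * 1000) / this.leakPerSecond;
  }

  // Returns true if the request was queued, false if the bucket is full.
  tryEnqueue(nowMs: number = Date.now()): boolean {
    this.drain(nowMs);
    if (this.queued >= this.capacity) return false;
    this.queued += 1;
    return true;
  }
}
```

Unlike the token bucket, the output rate never exceeds `leakPerSecond`, whatever the arrival pattern; the capacity only bounds how many requests may wait.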

Sliding window: Count requests in the last N seconds using a circular buffer or Redis sorted set. More accurate than fixed window (which allows 2x the limit at window boundaries).
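The boundary problem is easy to reproduce. This sketch compares a fixed-window counter with a sliding-window log, both in memory; the class names and the `countAllowed` helper are hypothetical. Ten requests arrive within two milliseconds, straddling a window boundary, against a limit of 5 per second.

```typescript
class FixedWindowCounter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly counts = new Map<number, number>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(nowMs: number): boolean {
    const windowId = Math.floor(nowMs / this.windowMs); // counter resets at each boundary
    const count = this.counts.get(windowId) ?? 0;
    if (count >= this.limit) return false;
    this.counts.set(windowId, count + 1);
    return true;
  }
}

class SlidingWindowLog {
  private readonly limit: number;
  private readonly windowMs: number;
  private timestamps: number[] = [];

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(nowMs: number): boolean {
    const windowStart = nowMs - this.windowMs;
    this.timestamps = this.timestamps.filter(t => t > windowStart); // drop expired entries
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(nowMs);
    return true;
  }
}

function countAllowed(limiter: { allow(nowMs: number): boolean }, times: number[]): number {
  return times.filter(t => limiter.allow(t)).length;
}

// Five requests just before the boundary at t = 1000 ms, five just after.
const burst = [999, 999, 999, 999, 999, 1000, 1000, 1000, 1000, 1000];
const fixedAllowed = countAllowed(new FixedWindowCounter(5, 1000), burst);   // 10
const slidingAllowed = countAllowed(new SlidingWindowLog(5, 1000), burst);   // 5
```

The fixed window admits all ten requests because its counter resets at the boundary; the sliding log still sees the first five and rejects the rest.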

Where to enforce: at the API gateway (early rejection, protects the entire backend), in middleware (per-service policy), and at the database query layer (protects the database from the application itself).

rate-limiter-redis.ts

    import { Redis } from 'ioredis';
    import type { Request, Response, NextFunction } from 'express';

    const redis = new Redis();

    // Sliding-window rate limiter using a Redis sorted set.
    // Allows `limit` requests per `windowMs` per key.
    async function isRateLimited(
      key: string,
      limit: number,
      windowMs: number
    ): Promise<{ limited: boolean; remaining: number; resetAt: number }> {
      const now = Date.now();
      const windowStart = now - windowMs;

      // Pipeline all Redis ops into one round-trip for performance.
      const pipeline = redis.pipeline();
      // Remove expired entries before counting: critical for accuracy.
      pipeline.zremrangebyscore(key, 0, windowStart);
      // Add the current request. Rejected requests are recorded too, so a client
      // that keeps hammering the API stays limited until it backs off.
      pipeline.zadd(key, now, `${now}-${Math.random()}`);
      pipeline.zcard(key);              // Count requests in window
      pipeline.pexpire(key, windowMs);  // Auto-cleanup of idle keys

      // exec() resolves to [error, result] pairs; index 2 is the ZCARD reply.
      const results = await pipeline.exec();
      const count = (results?.[2]?.[1] as number) ?? 0;

      return {
        limited: count > limit,
        remaining: Math.max(0, limit - count),
        resetAt: now + windowMs, // upper bound: when this request leaves the window
      };
    }

    // Express middleware
    export function rateLimitMiddleware(limit: number, windowMs: number) {
      return async (req: Request, res: Response, next: NextFunction) => {
        try {
          const key = `ratelimit:${req.ip}:${req.path}`;
          const result = await isRateLimited(key, limit, windowMs);

          // Always return rate limit headers so clients can back off gracefully.
          res.setHeader('X-RateLimit-Limit', limit);
          res.setHeader('X-RateLimit-Remaining', result.remaining);
          res.setHeader('X-RateLimit-Reset', result.resetAt); // epoch milliseconds

          if (result.limited) {
            const retryAfterSeconds = Math.ceil(windowMs / 1000);
            res.setHeader('Retry-After', retryAfterSeconds);
            return res.status(429).json({
              error: 'Too Many Requests',
              retryAfter: retryAfterSeconds,
            });
          }
          next();
        } catch (err) {
          // Express 4 does not forward rejections from async middleware by itself.
          next(err);
        }
      };
    }

API authentication: choosing the right pattern

API Keys: Simple, long-lived credentials for machine-to-machine. Store hashed (bcrypt), never log raw keys, support rotation. Good for external developer APIs.
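The sketch below shows the store-only-a-hash pattern for API keys. Note the swap: the text suggests bcrypt, while this example uses SHA-256 from Node's built-in `node:crypto` module to stay dependency-free, on the assumption that keys are long random strings rather than human-chosen passwords. The `sk_` prefix and all function names are hypothetical.

```typescript
import { createHash, randomBytes, timingSafeEqual } from 'node:crypto';

// Hash a raw key for storage. Only this value goes into the database.
function hashApiKey(rawKey: string): string {
  return createHash('sha256').update(rawKey).digest('hex');
}

// Create a new key. The raw key is shown to the developer once and never stored.
function generateApiKey(): { rawKey: string; keyHash: string } {
  const rawKey = `sk_${randomBytes(32).toString('hex')}`; // 256 bits of randomness
  return { rawKey, keyHash: hashApiKey(rawKey) };
}

// Compare a presented key against the stored hash in constant time.
function verifyApiKey(presentedKey: string, storedHash: string): boolean {
  const presented = Buffer.from(hashApiKey(presentedKey), 'hex');
  const stored = Buffer.from(storedHash, 'hex');
  return presented.length === stored.length && timingSafeEqual(presented, stored);
}
```

Rotation then amounts to issuing a second key, letting both hashes verify for a grace period, and deleting the old hash.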

JWT (JSON Web Tokens): Short-lived signed tokens issued after login. Self-contained (no database lookup needed to verify). Caveat: cannot be revoked until expiry unless you maintain a token blocklist. Use access tokens (15-60 min) + refresh tokens (long-lived, rotatable).
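To make "self-contained" concrete, here is a sketch of issuing and verifying a short-lived HS256 access token with nothing but Node's `node:crypto` module. It is for illustration only: in production, use a maintained JWT library. The function names are hypothetical, and the clock is passed in so expiry is testable.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

const encodeJson = (value: object): string =>
  Buffer.from(JSON.stringify(value)).toString('base64url');

// Fixed header: this sketch only ever issues and accepts HS256.
const HEADER = encodeJson({ alg: 'HS256', typ: 'JWT' });

const hmac = (data: string, secret: string): Buffer =>
  createHmac('sha256', secret).update(data).digest();

function signAccessToken(
  claims: Record<string, unknown>,
  secret: string,
  ttlSeconds: number,
  nowMs: number = Date.now(),
): string {
  const iat = Math.floor(nowMs / 1000);
  const payload = encodeJson({ ...claims, iat, exp: iat + ttlSeconds });
  const signature = hmac(`${HEADER}.${payload}`, secret).toString('base64url');
  return `${HEADER}.${payload}.${signature}`;
}

// Returns the claims if the signature is valid and the token has not expired.
function verifyAccessToken(
  token: string,
  secret: string,
  nowMs: number = Date.now(),
): Record<string, unknown> | null {
  const parts = token.split('.');
  if (parts.length !== 3) return null;
  const [header = '', payload = '', signature = ''] = parts;
  if (header !== HEADER) return null; // rejects "alg: none" and other algorithms

  const expected = hmac(`${header}.${payload}`, secret);
  const actual = Buffer.from(signature, 'base64url');
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;

  // Only parse the payload after the signature has been checked.
  const claims = JSON.parse(Buffer.from(payload, 'base64url').toString('utf8')) as Record<string, unknown>;
  const exp = claims['exp'];
  if (typeof exp !== 'number' || exp <= nowMs / 1000) return null;
  return claims;
}
```

Verification needs only the secret and the clock, with no database lookup, which is also why a stolen token stays valid until `exp` unless you add a blocklist.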

OAuth 2.0: Delegation protocol that lets users grant your app access to their resources on another service (Google, GitHub). Use it for any cross-service authorization. "Login with Google" builds on it through OpenID Connect, an identity layer on top of OAuth 2.0.

mTLS (Mutual TLS): Both client and server present certificates. Used for service-to-service auth in zero-trust networks. Highest security, highest operational overhead.

JWT pitfall: do not store sensitive data

JWTs are base64url-encoded, not encrypted. Anyone who holds the token can decode the payload without knowing the signing key. Never store passwords, PII, or secrets in a JWT. Signing only prevents tampering; consider encrypting (JWE) if the payload is sensitive.
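The point is easy to demonstrate: reading a JWT payload takes no key at all. This sketch assumes a Node.js runtime for `Buffer`; `decodeJwtPayload` is a hypothetical helper, and the token is assembled by hand with a placeholder signature purely for the demonstration.

```typescript
// Decode the middle segment of a JWT. No secret or public key is involved.
function decodeJwtPayload(token: string): Record<string, unknown> {
  const payloadSegment = token.split('.')[1];
  if (!payloadSegment) throw new Error('not a JWT: expected three dot-separated segments');
  return JSON.parse(Buffer.from(payloadSegment, 'base64url').toString('utf8'));
}

const toSegment = (value: object): string =>
  Buffer.from(JSON.stringify(value)).toString('base64url');

// Example token; the signature is a placeholder because decoding never checks it.
const exampleToken = [
  toSegment({ alg: 'HS256', typ: 'JWT' }),
  toSegment({ sub: 'user_42', role: 'admin' }),
  'placeholder-signature',
].join('.');

const leakedClaims = decodeJwtPayload(exampleToken); // readable by anyone holding the token
```

Decoding is not verification: never make an authorization decision from a payload whose signature has not been checked.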

How this might come up in interviews

API design interviews test both technical knowledge and product thinking — can you design an API another team would want to use?

Common questions:

Design a rate limiter for a public API
How would you version an API without breaking existing clients?
REST vs GraphQL — when would you choose each?
How do you handle authentication and authorization in a multi-tenant SaaS API?
What is IDOR and how do you prevent it?
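For the IDOR question, the core of a good answer is an ownership check on every object lookup: being authenticated is not enough, the caller must also own (or be allowed to see) the specific record. A minimal sketch, with a hypothetical `getOrderForUser` handler and an in-memory map standing in for the database:

```typescript
interface Order { id: number; ownerId: string; total: number }
type OrderResult = { status: 200; body: Order } | { status: 404 };

// authenticatedUserId must come from the verified session or token,
// never from a client-supplied field in the request.
function getOrderForUser(
  orders: Map<number, Order>,
  orderId: number,
  authenticatedUserId: string,
): OrderResult {
  const order = orders.get(orderId);
  // Same response for "does not exist" and "belongs to someone else",
  // so attackers cannot enumerate valid IDs.
  if (!order || order.ownerId !== authenticatedUserId) return { status: 404 };
  return { status: 200, body: order };
}

const orderStore = new Map<number, Order>([[42, { id: 42, ownerId: 'alice', total: 99 }]]);
const ownResult = getOrderForUser(orderStore, 42, 'alice');   // status 200
const otherResult = getOrderForUser(orderStore, 42, 'bob');   // status 404
```

Returning 403 instead of 404 for the other user's order is also defensible; the sketch prefers 404 so that the response does not confirm the order exists.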

Strong answers include:

Designs for backwards compatibility from the start
Chooses cursor-based pagination for real-time data
Knows the difference between authentication and authorization
Mentions DataLoader when discussing GraphQL at scale

Red flags:

Uses verbs in resource URLs
Returns 200 for all responses including errors
Cannot explain when to use PUT vs PATCH
Does not mention rate limiting or auth in API design

Quick check · API Design & Backend Architecture

A mobile client needs to display a user's profile with their posts and follower count in one screen. With REST, this requires 3 API calls. What is the BEST solution?

From the books

API Design Patterns — JJ Geewax (2021)

Chapter 3: Naming, Chapter 7: Partial Updates, Chapter 12: Pagination

Google's API design guidance (AIP, API Improvement Proposals) is publicly available and widely used as a reference. Follow it even if you are not using Google Cloud: the patterns have been exercised across a very large number of APIs.
