The battle-tested 45-minute framework every FAANG engineer uses: clarify requirements, estimate capacity, sketch the design, deep-dive bottlenecks. Master the process and you can design any system.
System design interviews have a brutal pass rate at top tech companies — roughly 30-40% even among experienced engineers. The failure mode is almost always the same: candidates skip straight to drawing boxes. They hear "Design Twitter" and immediately start talking about microservices, Kafka, and sharding — before understanding what they're actually building.
Consider what the interviewer sees from the other side of the table. When a candidate jumps straight to solution mode, it signals a critical engineering weakness: they can't gather requirements. In real engineering, requirements gathering is the most important skill. A system built for the wrong requirements is worthless, regardless of how technically sophisticated it is.
The #1 Failure Mode: Premature Architecture
Interviewer says "Design a URL shortener." Candidate immediately says: "I'll use a distributed hash table, Cassandra for storage, Redis for caching, and Kafka for async processing." The interviewer hasn't said anything about scale, consistency requirements, or features. This candidate designed a $2M infrastructure for what might need $200/month.
The second most common failure is poor time management. Candidates spend 30 minutes on requirements and run out of time before drawing a single diagram. Or they over-engineer one component — spending 20 minutes designing the perfect database schema — while never discussing the hard parts: caching, sharding, failure modes.
The third failure is forgetting that system design interviews are collaborative. They're not a test where you write an answer and hand it in. The interviewer wants to have a technical conversation. They'll drop hints. They'll push back. Candidates who ignore this and monologue for 45 minutes miss the collaborative signal entirely.
The 5 Most Common System Design Interview Failure Modes
The good news: system design interviews are learnable. Unlike coding interviews where you either know the algorithm or you don't, system design follows a repeatable process. Learn the 4-step framework and the time budget, and you can design any system competently — even systems you've never built.
The most valuable thing you can do in the first 5-10 minutes of a system design interview is ask questions. Not design questions — requirements questions. You need to understand what you're building before you can design it.
There are two types of requirements you need to uncover: functional requirements (what the system does) and non-functional requirements (how well it does it). Most candidates only ask about functional requirements. Senior engineers ask both.
The Requirements Question Bank
Have these questions ready for every interview:

- Scale: How many DAU? What's peak vs. average traffic? Read-heavy or write-heavy?
- Consistency: Do we need strong consistency, or is eventual consistency okay? Can users see stale data?
- Latency: What's the acceptable p99 latency? Real-time or near-real-time?
- Availability: What's the SLA? Can we afford 1 hour of downtime per month?
- Feature scope: What are must-have vs. nice-to-have features? Any features we explicitly exclude?
For a concrete example, let's say the interviewer asks "Design a news feed system like Facebook." Before drawing anything, you should ask:
Requirements Questions for a News Feed System
1. Scale: How many DAU? 1M or 1B makes a 1000x architectural difference. At 1M DAU you can afford a monolith. At 1B you need distributed everything.
2. Feed generation: Fan-out on write (push to followers at write time) or fan-out on read (pull at read time)? This is the central trade-off in feed systems.
3. Content types: Text only? Photos? Videos? Video changes storage and bandwidth requirements by 100-1000x.
4. Freshness: How stale can the feed be? 5 seconds? 5 minutes? This determines caching strategy entirely.
5. Ranking: Chronological or algorithmic? An algorithmic feed requires ML infrastructure that is a completely separate system.
6. Interactions: Do we need likes, comments, shares? Each adds write amplification.
A good requirements conversation might take 8 minutes. The interviewer is watching you closely during this time. They want to see: Do you think systematically? Do you cover both functional and non-functional? Do you set scope before designing?
The Scope Statement — Close the Requirements Phase Properly
Before moving to estimation, summarize what you're building: "So based on what we've discussed: we're designing a news feed for 100M DAU, fan-out on write for users with < 5K followers, read-heavy (100:1 read:write ratio), eventual consistency is fine, and latency target is < 300ms for feed load. I'm going to exclude the ranking algorithm and focus on the feed delivery infrastructure. Does that sound right?" This shows systems thinking and gives the interviewer a chance to correct scope before you design the wrong thing.
The number you absolutely must establish before moving on: DAU (daily active users) or MAU (monthly active users). This single number drives every capacity estimate. If you don't get this number, make an assumption and state it clearly: "I'll assume 10M DAU — is that in the right ballpark?"
| DAU | Avg Requests/Day | QPS (avg) | Peak QPS (3x) | Architecture Implication |
|---|---|---|---|---|
| 100K | 500K | 6 | 18 | Single server, SQLite or Postgres |
| 1M | 5M | 58 | 174 | Vertical scaling, single DB with cache |
| 10M | 50M | 578 | 1,734 | Horizontal scaling, read replicas needed |
| 100M | 500M | 5,787 | 17,361 | Sharded DB, CDN, full caching layer |
| 1B | 5B | 57,870 | 173,611 | Full distributed system, custom infra |
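The arithmetic behind this table can be sketched in a few lines of Python. The 5 requests/user/day default and the 3x peak multiplier are assumptions for illustration, and rounding may differ from the table by ±1:

```python
# Back-of-envelope QPS estimator: DAU -> average and peak QPS.
SECONDS_PER_DAY = 86_400

def estimate_qps(dau: int, requests_per_user_per_day: int = 5,
                 peak_multiplier: int = 3) -> tuple[int, int]:
    """Return (average QPS, peak QPS), rounded to whole requests."""
    daily_requests = dau * requests_per_user_per_day
    avg_qps = round(daily_requests / SECONDS_PER_DAY)
    return avg_qps, avg_qps * peak_multiplier

for dau in (100_000, 1_000_000, 10_000_000, 100_000_000, 1_000_000_000):
    avg, peak = estimate_qps(dau)
    print(f"{dau:>13,} DAU -> {avg:>7,} avg QPS, {peak:>9,} peak QPS")
```

Having this one formula internalized lets you regenerate any row of the table on a whiteboard in seconds.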
One of the clearest signals that separates L5+ candidates from L4 candidates at FAANG is understanding the difference between functional and non-functional requirements — and spending as much time on non-functional as functional.
Functional requirements answer: What does the system do? Non-functional requirements answer: How well does it do it, and under what constraints? Non-functional requirements are often more important for architecture because they determine which trade-offs to make.
| Requirement Type | Example Question | Example Answer | Architectural Impact |
|---|---|---|---|
| Functional | Can users upload photos? | Yes, up to 10MB, JPG/PNG | Object storage (S3), CDN, image processing pipeline |
| Availability | What's the SLA? | 99.99% (52 min downtime/year) | Multi-AZ deployment, automatic failover, no single points of failure |
| Consistency | Can users see stale data? | Eventually consistent OK for feed | Cache-aside, async replication, no distributed transactions needed |
| Latency | What's acceptable p99? | < 500ms for feed load | CDN for static, Redis for hot data, connection pooling |
| Durability | Can we lose data? | Never lose a message | Synchronous writes to 3 replicas, WAL, backup strategy |
| Scalability | Traffic growth rate? | 3x per year for 2 years | Horizontal scaling, auto-scaling groups, stateless services |
The CALS Framework for Non-Functional Requirements
Remember CALS: Consistency (strong vs. eventual), Availability (what's the SLA?), Latency (p50/p99 targets), Scale (current + projected). Every system design interview answer should address all four. If you skip any, you're leaving architectural decisions to chance.
Availability is frequently misunderstood. "99% availability" sounds great until you realize that's 3.65 days of downtime per year. "99.9%" is 8.76 hours. "99.99%" is 52 minutes. "99.999%" (five nines) is 5 minutes. The difference between 99.9% and 99.99% is an order of magnitude more complexity and cost.
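The downtime figures above follow directly from the SLA percentage, and it's worth being able to reproduce them on demand. A minimal sanity-check sketch:

```python
# Downtime allowed per year at a given availability SLA.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes

def downtime_minutes_per_year(availability_pct: float) -> float:
    """Minutes of downtime per year permitted by the SLA."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR

for sla in (99.0, 99.9, 99.99, 99.999):
    mins = downtime_minutes_per_year(sla)
    print(f"{sla}% -> {mins:,.1f} min/year (~{mins / 60:,.2f} hours)")
```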
Availability SLA → Architecture Requirements
Consistency requirements determine your database choice more than anything else. If the interviewer says "users can see slightly stale data" — you have enormous flexibility: Redis, Cassandra, DynamoDB with eventual consistency. If they say "never show stale data, every user must see the same thing" — you need strong consistency, which means distributed coordination, which is expensive and slow.
```mermaid
graph TD
    A[Non-Functional Requirements] --> B[Availability SLA]
    A --> C[Consistency Model]
    A --> D[Latency Targets]
    A --> E[Scale Requirements]
    B --> F{99.99%?}
    F -->|Yes| G[Multi-AZ, Auto-failover]
    F -->|No| H[Single region OK]
    C --> I{Strong consistency?}
    I -->|Yes| J[PostgreSQL, MySQL with sync replication]
    I -->|No| K[DynamoDB, Cassandra, Redis]
    D --> L{Sub-100ms p99?}
    L -->|Yes| M[Cache everything, CDN at edge]
    L -->|No| N[Standard caching OK]
    E --> O{> 100K QPS?}
    O -->|Yes| P[Horizontal scaling, sharding]
    O -->|No| Q[Vertical scaling, read replicas]
```

Non-functional requirements drive every major architectural decision.
Capacity estimation is where most candidates either impress or fall flat. The goal isn't to get exact numbers — it's to get numbers in the right order of magnitude, fast, while showing your reasoning. A ±2x error is fine. A ±10x error will lead you to design the wrong system.
The formula is always the same: DAU × Requests/User/Day ÷ 86,400 seconds = average QPS. Then multiply by 2-3x for peak. Let's walk through it for a Twitter-like system.
The Numbers You Must Memorize
- 86,400 = seconds in a day.
- 1KB = 1,024 bytes ≈ 1,000 for estimation. 1MB ≈ 1M bytes. 1GB ≈ 1B bytes. 1TB ≈ 1T bytes.
- 1M users × 1 request/day ≈ 12 QPS.
- 100M users × 10 requests/day ≈ 11,574 QPS ≈ 12K QPS.
- When in doubt, round aggressively.
For a Twitter-like system at 100M DAU:
Twitter Capacity Estimation Walkthrough
1. Write QPS: 100M DAU × 2 tweets/day = 200M tweets/day. 200M / 86,400 = ~2,315 writes/sec avg. Peak = 2,315 × 3 = ~7,000 tweets/sec.
2. Read QPS: Feeds are read-heavy. Assume a 100:1 read:write ratio. 2,315 × 100 = ~230K reads/sec avg. Peak ~700K reads/sec.
3. Storage per tweet: 140 chars ≈ 300 bytes of text. Plus metadata (user_id, timestamp, reply_to) ≈ 200 bytes. Total ≈ 500 bytes/tweet.
4. Daily storage: 200M tweets × 500 bytes = 100 GB/day.
5. 5-year storage: 100 GB × 365 × 5 = 182.5 TB. Round to ~200 TB. This tells you: you need a distributed storage system, not a single Postgres instance.
6. Media storage: If 10% of tweets have images at 300KB avg: 20M images × 300KB = 6 TB/day. Over 5 years: ~11 PB. This drives CDN and object storage decisions.
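The whole walkthrough fits in a few lines of code, which is a good way to drill the arithmetic. Inputs (100M DAU, 2 tweets/user/day, 500 bytes/tweet, 10% of tweets with a 300KB image) are the assumptions stated above; decimal units (1 KB = 1,000 bytes) are used for estimation speed:

```python
# Twitter-style capacity estimation, following the walkthrough above.
SECONDS_PER_DAY = 86_400

dau = 100_000_000
tweets_per_day = dau * 2                       # 200M tweets/day
write_qps = tweets_per_day / SECONDS_PER_DAY   # ~2,315 writes/sec avg
read_qps = write_qps * 100                     # assumed 100:1 read:write
text_bytes_per_day = tweets_per_day * 500      # 500 bytes/tweet -> 100 GB/day
media_bytes_per_day = int(tweets_per_day * 0.10) * 300_000  # 6 TB/day
five_year_text_tb = text_bytes_per_day * 365 * 5 / 1e12     # ~182.5 TB

print(f"write QPS ~{write_qps:,.0f} avg, ~{write_qps * 3:,.0f} peak")
print(f"read QPS ~{read_qps:,.0f} avg, ~{read_qps * 3:,.0f} peak")
print(f"text: {text_bytes_per_day / 1e9:,.0f} GB/day, "
      f"{five_year_text_tb:,.1f} TB over 5 years")
print(f"media: {media_bytes_per_day / 1e12:,.1f} TB/day")
```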
These numbers immediately tell you architectural things: 700K reads/sec is far beyond what a single database can handle (a well-tuned Postgres tops out around 50K QPS). You need caching, read replicas, or a NoSQL database. 6TB/day of images means you need CDN and object storage (S3), not local disk.
| Metric | Calculation | Result | Architectural Implication |
|---|---|---|---|
| Write QPS (avg) | 100M × 2 / 86400 | ~2.3K/sec | Single primary DB can handle this |
| Read QPS (avg) | 2,300 × 100 read/write ratio | ~230K/sec | Cannot hit DB — need Redis cache |
| Read QPS (peak) | 230K × 3 | ~700K/sec | Multiple cache clusters needed |
| Text storage/day | 200M × 500 bytes | 100 GB/day | NoSQL or sharded SQL required |
| Image storage/day | 20M × 300KB | 6 TB/day | S3 + CDN mandatory |
| Cache memory | 10M hot objects × 1KB each | ~10 GB working set | Fits in a modest Redis cluster |
Show Your Work — Even When You're Unsure
Never just say "we'll need a distributed cache." Always say "700K reads/sec exceeds a single DB's capacity of ~50K QPS, so we need Redis in front of the database. With 1KB per cached object, a 10GB Redis instance holds 10M hot objects — enough for 10% of our 100M users' most recent feed items." The numbers tell the story. No numbers = no credibility.
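The cache-sizing arithmetic in that callout is worth making mechanical. A sketch, using the same assumptions (1 KB per cached object, a 10 GB Redis instance, decimal units):

```python
# How many hot objects fit in the cache, and what share of users that covers.
cache_size_bytes = 10 * 10**9   # assumed: one 10 GB Redis instance
object_size_bytes = 1_000       # assumed: ~1 KB per cached feed item
total_users = 100_000_000

hot_objects = cache_size_bytes // object_size_bytes
print(f"A 10 GB cache holds ~{hot_objects:,} hot objects "
      f"({hot_objects / total_users:.0%} of users' most recent feed item)")
```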
After requirements and estimation, you draw the boxes. The goal of the high-level design isn't to draw every component — it's to draw the critical data flows clearly and show that you understand which components need to exist and why.
Start with the client. Work your way to the data store. Draw every component your system absolutely needs: the client, the load balancer, the application servers, the cache layer, and the database. Label the connections. Show the direction of data flow with arrows.
```mermaid
graph LR
    C[Client<br/>Mobile/Web] --> LB[Load Balancer]
    LB --> A1[App Server 1]
    LB --> A2[App Server 2]
    LB --> A3[App Server N]
    A1 --> CH[Cache Layer<br/>Redis Cluster]
    A2 --> CH
    A3 --> CH
    CH --> DB[(Primary DB<br/>Postgres)]
    DB --> R1[(Read Replica 1)]
    DB --> R2[(Read Replica 2)]
    A1 --> Q[Message Queue<br/>Kafka]
    Q --> W[Worker Services<br/>Async Processing]
    W --> OS[Object Storage<br/>S3]
    OS --> CDN[CDN<br/>CloudFront]
    CDN --> C
```

Standard high-level architecture: client → LB → app servers → cache → DB, plus async workers and CDN.
There are five components that should appear in almost every system design:
The 5 Universal Components
Once you have the basic components drawn, add the component that's specific to the problem. For a news feed: fan-out service. For a URL shortener: hash generation service. For a payment system: idempotency layer. This is where you demonstrate domain knowledge.
How to Draw Quickly and Clearly
In a whiteboard interview: use boxes for services, cylinders for databases, triangles for caches (convention). In a virtual interview (CoderPad, Excalidraw): still use consistent shapes. Always label components with their purpose AND their technology: "Feed Service (Node.js)" not just "Service". Label connections with protocol and data: "HTTP REST" or "gRPC streaming". Draw data flow arrows. Interviewers love seeing arrows — it shows you understand causality.
After drawing the diagram, narrate the main user request flow. Walk the interviewer through a typical request: "A user opens the app. The request hits the CDN first — if the static assets are cached, they're served from the edge. The feed request goes to the load balancer, which routes it to any available app server. The app server checks Redis for the user's pre-computed feed. Cache hit: return the feed in ~5ms. Cache miss: query the feed service, which aggregates posts from the user's follows, writes back to Redis, and returns the result in ~100ms."
```mermaid
sequenceDiagram
    participant U as User
    participant CDN as CDN
    participant LB as Load Balancer
    participant App as App Server
    participant Redis as Redis Cache
    participant DB as Database
    U->>CDN: GET /feed (static assets cached)
    CDN-->>U: Static assets (5ms)
    U->>LB: GET /api/v1/feed
    LB->>App: Route to available server
    App->>Redis: GET feed:user_123
    alt Cache Hit
        Redis-->>App: Return cached feed (1ms)
        App-->>U: 200 OK + feed (~10ms total)
    else Cache Miss
        App->>DB: SELECT posts FROM follows (complex JOIN)
        DB-->>App: Raw posts (50ms)
        App->>Redis: SET feed:user_123 TTL=5min
        App-->>U: 200 OK + feed (~80ms total)
    end
```

Request flow: the CDN handles static assets, Redis handles hot feed data, and the DB is hit only on cache misses.
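The cache-hit/cache-miss branch in that flow is the classic cache-aside pattern. A minimal sketch, with a plain dict and a stub function standing in for Redis and the feed query (all names here are illustrative):

```python
# Cache-aside feed read: check the cache, fall back to the DB on a miss,
# then write the result back with a TTL.
import time

cache: dict[str, tuple[list[str], float]] = {}  # stand-in for Redis
FEED_TTL_SECONDS = 300  # 5-minute TTL, matching the diagram

def db_query(user_id: str) -> list[str]:
    """Placeholder for the expensive JOIN across the user's follows."""
    return [f"post-{i}-for-{user_id}" for i in range(3)]

def get_feed(user_id: str) -> list[str]:
    key = f"feed:{user_id}"
    entry = cache.get(key)
    if entry and time.monotonic() - entry[1] < FEED_TTL_SECONDS:
        return entry[0]                     # cache hit: fast path
    feed = db_query(user_id)                # cache miss: slow path
    cache[key] = (feed, time.monotonic())   # write back with a timestamp
    return feed

first = get_feed("user_123")    # miss -> DB
second = get_feed("user_123")   # hit -> cache
assert first == second
```

In production the TTL and eviction are handled by Redis itself (`SET key value EX 300`); the dict-plus-timestamp here just makes the flow visible.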
The deep dive is where you earn the offer at senior levels. This is where you pick the hardest part of your design and explain exactly how it works, including the failure modes and trade-offs. Most candidates skip this or do it superficially.
Before diving in, ask the interviewer: "What would you like me to deep-dive on?" They usually have something specific in mind. If they say "it's up to you," pick the bottleneck that will cause the most pain at scale — usually the database, the cache invalidation strategy, or the most write-heavy component.
What to Deep-Dive: The Priority Order
1. The component you called out as "I'll come back to this" during requirements. 2. The hottest read or write path — what gets hammered most? 3. The component most likely to fail — single DB, network partition points. 4. The hardest consistency requirement — if there's any distributed transaction, that's your deep dive target.
For a news feed, the deep dive target is usually fan-out: how does a tweet from a celebrity with 10M followers get delivered to all their followers' feeds without killing your system?
| Strategy | How it Works | Write Cost | Read Cost | Best For |
|---|---|---|---|---|
| Fan-out on Write (Push) | When user tweets, immediately write to all followers' feed tables in Redis | O(followers) — expensive for celebrities | O(1) — just read pre-computed feed | Most users (< 10K followers) |
| Fan-out on Read (Pull) | When user opens feed, query all follows' recent posts and merge | O(1) — just write the tweet | O(follows × posts) — expensive for users following many | Celebrity accounts with millions of followers |
| Hybrid | Push to regular users, pull-on-read for celebrities. Merge at read time. | O(non-celebrity followers) | O(1) for regular follows + O(celebrity posts) | Twitter, Instagram, Facebook — production approach |
A strong deep dive doesn't just describe what you'd do — it describes the failure modes and how you handle them. For fan-out on write: what happens when Elon Musk tweets? 130M followers × write operation = massive write spike. The hybrid approach handles this, but you need to define the threshold for "celebrity" and handle the edge case where a user becomes a celebrity mid-session.
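The hybrid strategy can be sketched in-memory to make the write-path/read-path split concrete. The 10K threshold, the data structures, and the 50-post cap are illustrative assumptions; a real system keeps these in Redis and a database:

```python
# Hybrid fan-out: push for regular users, pull-and-merge for celebrities.
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000  # assumed cutoff; tune per product

followers = defaultdict(set)      # author -> set of follower ids
follows = defaultdict(set)        # user -> set of authors they follow
pushed_feeds = defaultdict(list)  # user -> pre-computed (pushed) feed
recent_posts = defaultdict(list)  # author -> their recent posts

def post(author, text):
    """Fan-out on write for regular users; skip the push for celebrities."""
    recent_posts[author].append(text)
    if len(followers[author]) < CELEBRITY_THRESHOLD:
        for follower in followers[author]:  # O(followers) writes
            pushed_feeds[follower].append(text)

def read_feed(user, celebrity_cap=50):
    """O(1) read of the pushed feed, plus a bounded celebrity merge."""
    feed = list(pushed_feeds[user])
    for author in follows[user]:
        if len(followers[author]) >= CELEBRITY_THRESHOLD:
            feed.extend(recent_posts[author][-celebrity_cap:])
    return feed

# Tiny demo: one regular author, one celebrity.
followers["alice"] = {"bob"}
follows["bob"] = {"alice"}
followers["celeb"] = {f"u{i}" for i in range(20_000)}
follows["bob"].add("celeb")
post("alice", "hello")     # pushed to bob at write time
post("celeb", "big news")  # not pushed; merged when bob reads
print(read_feed("bob"))    # contains both posts
```

Note the `celebrity_cap` parameter: it is exactly the read-latency bound discussed below in the deep-dive template.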
Deep Dive Structure Template
1. State the problem clearly: "The challenge is that celebrity accounts cause write amplification — one tweet triggers 10M fan-out writes, which spikes our write QPS from 7K/sec to millions."
2. State the naive approach and why it breaks: "Naive fan-out on write would queue 10M Redis writes per celebrity tweet. At 130M followers, that's 130M operations in seconds. Redis can do ~1M ops/sec, so this takes 130 seconds — far too slow."
3. Present your solution with the mechanism: "Hybrid fan-out: for users with < 10K followers, push to Redis at write time. For celebrities, skip Redis entirely at write time. At read time, merge the pre-computed feed (from Redis) with the last N tweets from each celebrity the user follows."
4. Address the failure mode: "The merge at read time adds latency. To bound this: only check celebrities the user follows (typically < 10). Cap celebrity lookups to the last 50 tweets. This keeps the extra read latency under 50ms."
5. Mention monitoring: "Track fan-out queue depth, celebrity threshold violations, and read-time merge duration. Alert if merge latency exceeds 100ms — that signals too many celebrity follows for a user."
The Deep Dive Quality Signal
L4 engineers describe the happy path. L5 engineers describe the happy path and the failure mode. L6 engineers describe the happy path, the failure mode, how they'd monitor it, how they'd recover, and what trade-offs they chose and why. When you're practicing deep dives, force yourself to ask: "What breaks? How do I know it broke? How do I fix it?" for every component.
Time management is the most underrated skill in system design interviews. A 45-minute interview feels spacious at the start and is over before you know it. Without an explicit time budget, most candidates either spend 20 minutes on requirements and never finish the design, or race through requirements in 2 minutes and design the wrong system.
Here is the exact time allocation used by candidates who consistently pass FAANG system design interviews:
| Phase | Time Budget | What You're Doing | What the Interviewer Is Evaluating |
|---|---|---|---|
| Requirements | 0-8 min | Functional + non-functional questions. Get DAU, consistency, latency targets. | Can you scope? Do you think about SLA, not just features? |
| Scope Summary | 8-10 min | State what you're building and explicitly what you're not building. | Do you set boundaries? Can you simplify without being naive? |
| Estimation | 10-15 min | DAU → QPS → storage → bandwidth. Show math on the board. | Do you quantify? Can you derive architectural implications from numbers? |
| High-Level Design | 15-25 min | Draw all major components. Narrate a main request flow. | Can you design a complete system? Do you know the standard components? |
| Deep Dive | 25-42 min | Pick 1-2 hard problems. Explain mechanism, failure mode, trade-offs. | Can you go deep? Do you know the hard parts of distributed systems? |
| Wrap-Up | 42-45 min | Recap key decisions. What would you change at 10x scale? | Do you think about evolution? Do you acknowledge trade-offs? |
The Time Check at Minute 15
At the 15-minute mark, look at your diagram. If you haven't started drawing yet, you're in trouble. Cut requirements short, state "I'll assume X" for remaining questions, and start drawing immediately. A partial design that you narrate clearly is far better than perfect requirements with no design.
Interviewer signals to watch for during the interview: If the interviewer says "interesting, tell me more about that" — they want you to deep-dive into that component. If they say "let's move on" or "I think we have enough there" — they want to cover more ground. If they say "what would happen if..." — that's a failure mode question, answer it directly.
Time Recovery Strategies When You're Running Behind
One more time tip: narrate your thinking out loud during transitions. When you switch from requirements to estimation, say "Okay, I have enough context — let me now estimate the scale." This signals to the interviewer that you're intentionally moving between phases, not wandering.
FAANG system design interviews aren't graded on a binary pass/fail — they're scored on multiple dimensions, and the hiring committee calibrates them against level expectations. Understanding the rubric changes how you should approach the interview.
Amazon, Meta, Google, and Apple all use roughly similar evaluation criteria, though the exact language differs. Here are the five core dimensions and what each looks like at different levels:
| Dimension | L4 (SDE II / E4) | L5 (Sr. SDE / E5) | L6 (Staff SDE / E6) |
|---|---|---|---|
| Problem Scoping | Asks 3-5 basic functional questions. Covers scale vaguely. | Covers both functional and non-functional. Identifies 1-2 key constraints. | Immediately identifies the hardest constraint. Pushes back on ambiguous requirements. |
| Estimation | Rough numbers, shows formula but may err by 5x. | Accurate to 2x, derives architectural implications from numbers. | Derives non-obvious implications: "This QPS means our cache miss rate needs to be < 0.1%." |
| Design Quality | Draws standard components. Mentions trade-offs if prompted. | Proactively mentions trade-offs. Discusses 2+ alternatives. | Designs for the 10x scale case. Anticipates second-order effects. |
| Depth of Knowledge | Knows standard solutions. Struggles with edge cases. | Knows failure modes. Can discuss consensus algorithms, CAP theorem. | Can derive solutions from first principles. Has designed this type of system before. |
| Communication | Explains what they're doing. Somewhat structured. | Structured, narrates reasoning, responds well to interviewer hints. | Drives the interview, involves the interviewer as a partner, summarizes clearly. |
The Hidden Grading Dimension: Intellectual Honesty
Senior engineers at FAANG are also evaluating whether you'll be intellectually honest on the job. When you don't know something, say "I'm less familiar with the internals of Kafka's log compaction — let me reason through what I'd expect it to do." This scores far higher than confidently saying something wrong. Every engineer on the panel has seen candidates bluff their way through a deep-dive — they can tell, and it's a hard no.
The most important signal at L6 is proactive trade-off discussion. L4 candidates wait to be asked "what are the trade-offs?" L6 candidates say "I'm choosing fan-out on write for normal users because the 5ms read latency benefit outweighs the write amplification for the 99th percentile user — but I'm consciously accepting that celebrity accounts will need special handling, and here's how I'd do it."
What Makes a Strong System Design Interview at Each Level
One more thing: FAANG interviewers write a detailed write-up after your interview that goes to the hiring committee. They're evaluating not just whether your design was correct, but how you'd behave as an engineering partner. Were you collaborative? Did you listen? Did you handle pushback gracefully? Would they want to work with you on a design review?
After analyzing hundreds of system design interviews, the same anti-patterns appear over and over in candidates who get rejected. These aren't obscure gotchas — they're patterns that experienced engineers recognize immediately as signals of junior thinking.
Anti-Pattern #1: The Kitchen Sink Architecture
Candidate hears "Design a URL shortener" and immediately draws: microservices, Kafka, Redis, Cassandra, Elasticsearch, Kubernetes, and a machine learning model for spam detection. For a URL shortener serving 1M users. The kitchen sink architect uses every technology they know regardless of whether it's needed. This signals: no judgment about cost/complexity trade-offs, no ability to scope a problem, potentially will over-engineer everything on the job.
The counter-signal FAANG interviewers love: "For this scale, a single PostgreSQL instance with proper indexing will handle the QPS with headroom. I'll add Redis for the hot-path reads. I'd move to a distributed solution if we hit 10x this scale, but I'd design for what we need today and make migration straightforward."
The Top 8 System Design Anti-Patterns
The Final Minute Technique
In the last 2-3 minutes of the interview, say: "Let me quickly summarize the key decisions I made and the trade-offs: I chose fan-out on write for performance at the cost of write amplification. I'm accepting eventual consistency for feed freshness to avoid distributed transactions. The biggest risk is celebrity account fan-out — I'd monitor that write queue depth closely in production." This wrap-up shows L5+ thinking: you know what you decided, why, and what to watch out for. It leaves a strong final impression on the interviewer.
The most important meta-lesson: system design interviews test engineering judgment, not trivia. Interviewers aren't looking for the one perfect answer. They're looking for an engineer who thinks systematically, quantifies their decisions, acknowledges trade-offs honestly, and would be a good partner on a design review. That's a learnable set of behaviors — not innate talent.
```mermaid
graph TD
    Start[Hear the Problem] --> R[Clarify Requirements<br/>5-10 min]
    R --> E[Capacity Estimation<br/>5-10 min]
    E --> H[High-Level Design<br/>10-15 min]
    H --> D[Deep Dive<br/>15-20 min]
    D --> W[Wrap-Up<br/>3 min]
    R --> R1[Functional Requirements]
    R --> R2[Non-Functional: CALS]
    E --> E1[DAU → QPS formula]
    E --> E2[Storage calculation]
    H --> H1[Draw 5 universal components]
    H --> H2[Narrate request flow]
    D --> D1[Pick the bottleneck]
    D --> D2[Mechanism + failure mode]
    W --> W1[Key decisions made]
    W --> W2[Trade-offs acknowledged]
```

The complete 45-minute system design interview framework: every phase, every deliverable.
System design is a full interview round at every FAANG company for SDE II and above. At Meta, it's 2 rounds. At Google, it's 1-2 depending on level. The framework is not just for interviews — it's how senior engineers actually run design reviews. Engineers who can't structure a requirements conversation never make Staff level.
Common questions:
You're 15 minutes into a system design interview and you haven't asked any requirements questions — you went straight to designing. The interviewer asks "What consistency model are you designing for?" What do you do?
Don't panic — course-correct immediately and cleanly. Say: "Good point — I should have established this upfront. Let me ask: do we need strong consistency, or is eventual consistency acceptable? And is there a specific latency SLA?" Get the answer, then explicitly update your design: "Okay, since eventual consistency is fine, I'll change from synchronous writes to all replicas to asynchronous replication with a write-ahead log, which will reduce write latency significantly." The ability to gracefully recover mid-design is itself a signal the interviewer is evaluating.
An interviewer says "Assume you're designing for 1 billion daily active users." You've never designed at that scale. How do you handle the estimation and design?
Break it down systematically and reason from first principles. 1B DAU × 10 requests/user/day = 10B requests/day. 10B / 86,400 = ~115,000 QPS average. Peak at 3x: ~350,000 QPS. No single machine handles this — you need horizontal scaling at every layer. For the database: no single DB handles 350K QPS — need sharding by user_id or consistent hashing, and Redis is mandatory for hot-path reads. For storage: at 1KB/request average, 10B requests = 10TB/day just for request logs. Use numbers to drive each architectural decision. Saying "I've never built at this scale, but let me reason through it" is perfectly fine — the math still works the same way.
What's the difference between how an L4 and L6 candidate would approach designing a distributed rate limiter?
An L4 candidate would describe the standard token bucket algorithm, mention Redis for distributed state, and call it done. An L6 candidate would: (1) First ask requirements — per-user, per-IP, or per-service? Sliding window, fixed window, or token bucket? What's the acceptable false-positive rate? (2) Explain why a centralized Redis approach has a single point of failure and latency problems at 100K+ QPS — and propose a hierarchical approach: local token bucket in each app server for fast rejection, synchronized with a central Redis counter every 100ms. (3) Discuss the consistency trade-off: with 100ms sync intervals, a user can briefly exceed the limit by up to 100ms×rate. (4) Mention that at Stripe scale, they use a combination of local rate limiting plus circuit breakers, not a single Redis key. The L6 candidate doesn't know more algorithms — they know more failure modes and have stronger opinions about production trade-offs.
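The "local token bucket for fast rejection" idea from the L6 answer can be sketched concisely. This is the per-server layer only; the periodic reconciliation with a central Redis counter is deliberately left out, and all names here are illustrative:

```python
# A local token bucket: refill at a fixed rate, reject when empty.
import time

class TokenBucket:
    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec      # tokens added per second
        self.capacity = burst         # maximum stored tokens
        self.tokens = float(burst)    # start full
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; otherwise reject locally."""
        now = time.monotonic()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # no Redis round-trip needed for rejections

bucket = TokenBucket(rate_per_sec=5, burst=3)
results = [bucket.allow() for _ in range(5)]  # first `burst` calls pass
```

Because each server decides locally, a cluster of N servers can briefly admit up to N × burst requests; that over-admission window is precisely the consistency trade-off the L6 answer calls out.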
💡 Analogy
System design interviews are exactly like being a doctor. A good doctor doesn't prescribe medicine before examining the patient. They ask about symptoms (requirements), check vital signs and run tests (estimation), form a diagnosis (high-level design), then drill into the specific condition (deep dive). A doctor who prescribes chemo to every patient with a headache is dangerous. An engineer who proposes sharding to every system with "lots of traffic" is expensive.
⚡ Core Idea
The 4-step framework works for any system: clarify → estimate → sketch → deep-dive. The framework's power is that it forces you to gather information before deciding on solutions. Most engineering failures come from solving the wrong problem. The framework prevents that by making requirements the first step, not an afterthought.
🎯 Why It Matters
FAANG spends $50K+ to hire each engineer and $500K+ per year to employ them. They're betting your judgment on billion-dollar systems. The interview is designed to answer one question: can this person design systems that scale, fail gracefully, and evolve? The framework gives you a repeatable way to demonstrate exactly those skills in 45 minutes.