Interactive Explainer

The System Design Interview Playbook

The battle-tested 45-minute framework every FAANG engineer uses: clarify requirements, estimate capacity, sketch the design, deep-dive bottlenecks. Master the process and you can design any system.

🎯Key Takeaways
  • The 4-step framework (clarify → estimate → sketch → deep-dive) works for every system design problem and is the single most important skill to internalize.
  • Spend 8-10 minutes on requirements, covering both functional and non-functional (CALS: Consistency, Availability, Latency, Scale). The non-functional requirements drive 80% of architectural decisions.
  • Always do capacity estimation with numbers on the board: DAU → QPS → storage. Derive at least one architectural implication from the numbers.
  • The deep dive separates L4 from L5+: describe the failure mode, not just the happy path. For every component, ask "what happens when this breaks?"
  • FAANG interviewers grade on process, trade-off discussion, and collaboration — not on whether your design matches the "right answer." There is no right answer; there are only well-reasoned trade-offs.


Why Most Candidates Fail System Design Interviews

System design interviews have a brutal pass rate at top tech companies — roughly 30-40% even among experienced engineers. The failure mode is almost always the same: candidates skip straight to drawing boxes. They hear "Design Twitter" and immediately start talking about microservices, Kafka, and sharding — before understanding what they're actually building.

Consider what the interviewer sees from the other side of the table. When a candidate jumps straight to solution mode, it signals a critical engineering weakness: they can't gather requirements. In real engineering, requirements gathering is the most important skill. A system built for the wrong requirements is worthless, regardless of how technically sophisticated it is.

The #1 Failure Mode: Premature Architecture

Interviewer says "Design a URL shortener." Candidate immediately says: "I'll use a distributed hash table, Cassandra for storage, Redis for caching, and Kafka for async processing." The interviewer hasn't said anything about scale, consistency requirements, or features. This candidate designed a $2M infrastructure for what might need $200/month.

The second most common failure is poor time management. Candidates spend 30 minutes on requirements and run out of time before drawing a single diagram. Or they over-engineer one component — spending 20 minutes designing the perfect database schema — while never discussing the hard parts: caching, sharding, failure modes.

The third failure is forgetting that system design interviews are collaborative. They're not a test where you write an answer and hand it in. The interviewer wants to have a technical conversation. They'll drop hints. They'll push back. Candidates who ignore this and monologue for 45 minutes miss the collaborative signal entirely.

The 5 Most Common System Design Interview Failure Modes

  • Premature Architecture — Jumping to solution without understanding requirements. Designing Twitter at Netflix scale when the interviewer wanted a simple prototype.
  • No Numbers — Saying "lots of traffic" instead of "500K QPS at peak." Numbers drive architecture decisions. Without them, design is guesswork.
  • Ignoring Trade-offs — Presenting a design as "the right answer" instead of discussing what you're optimizing for and what you're giving up.
  • Poor Time Management — Spending 30 minutes on requirements, never finishing the diagram. Or spending 25 minutes on one component and skipping the rest.
  • No Monitoring or Operations — Describing a beautiful architecture with zero mention of how you'd know when it breaks. Senior engineers always think about observability.

The good news: system design interviews are learnable. Unlike coding interviews where you either know the algorithm or you don't, system design follows a repeatable process. Learn the 4-step framework and the time budget, and you can design any system competently — even systems you've never built.

Step 1 — Clarify Requirements (5–10 minutes)

The most valuable thing you can do in the first 5-10 minutes of a system design interview is ask questions. Not design questions — requirements questions. You need to understand what you're building before you can design it.

There are two types of requirements you need to uncover: functional requirements (what the system does) and non-functional requirements (how well it does it). Most candidates only ask about functional requirements. Senior engineers ask both.

The Requirements Question Bank

Have these questions ready for every interview:
  • Scale: How many DAU? What's peak vs. average traffic? Read-heavy or write-heavy?
  • Consistency: Do we need strong consistency or is eventual okay? Can users see stale data?
  • Latency: What's the acceptable p99 latency? Real-time or near-real-time?
  • Availability: What's the SLA? Can we afford 1 hour of downtime/month?
  • Feature scope: What are must-have vs. nice-to-have features? Any features we explicitly exclude?

For a concrete example, let's say the interviewer asks "Design a news feed system like Facebook." Before drawing anything, you should ask:

Requirements Questions for a News Feed System

1. Scale: How many DAU? 1M or 1B makes a 1000x architectural difference. At 1M DAU you can afford a monolith. At 1B you need distributed everything.

2. Feed generation: Fan-out on write (push to followers at write time) or fan-out on read (pull at read time)? This is the central trade-off in feed systems.

3. Content types: Text only? Photos? Videos? Video changes storage and bandwidth requirements by 100-1000x.

4. Freshness: How stale can the feed be? 5 seconds? 5 minutes? This determines caching strategy entirely.

5. Ranking: Chronological or algorithmic? An algorithmic feed requires ML infrastructure that is a completely separate system.

6. Interactions: Do we need likes, comments, shares? Each adds write amplification.

A good requirements conversation might take 8 minutes. The interviewer is watching you closely during this time. They want to see: Do you think systematically? Do you cover both functional and non-functional? Do you set scope before designing?

The Scope Statement — Close the Requirements Phase Properly

Before moving to estimation, summarize what you're building: "So based on what we've discussed: we're designing a news feed for 100M DAU, fan-out on write for users with < 5K followers, read-heavy (100:1 read:write ratio), eventual consistency is fine, and latency target is < 300ms for feed load. I'm going to exclude the ranking algorithm and focus on the feed delivery infrastructure. Does that sound right?" This shows systems thinking and gives the interviewer a chance to correct scope before you design the wrong thing.

The number you absolutely must establish before moving on: DAU (daily active users) or MAU (monthly active users). This single number drives every capacity estimate. If you don't get this number, make an assumption and state it clearly: "I'll assume 10M DAU — is that in the right ballpark?"

| DAU | Avg Requests/Day | QPS (avg) | Peak QPS (3x) | Architecture Implication |
| --- | --- | --- | --- | --- |
| 100K | 500K | 6 | 18 | Single server, SQLite or Postgres |
| 1M | 5M | 58 | 174 | Vertical scaling, single DB with cache |
| 10M | 50M | 579 | 1,737 | Horizontal scaling, read replicas needed |
| 100M | 500M | 5,787 | 17,361 | Sharded DB, CDN, full caching layer |
| 1B | 5B | 57,870 | 173,610 | Full distributed system, custom infra |
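The arithmetic behind this table can be sketched in a few lines of Python. This is an illustrative back-of-envelope helper, not from the original article: it assumes 5 requests per user per day, rounds the average to the nearest integer, and takes peak as 3× the rounded average.

```python
def qps(dau: int, requests_per_user: int = 5, peak_factor: int = 3) -> tuple[int, int]:
    """Back-of-envelope: DAU -> (average QPS, peak QPS).

    average = DAU x requests/user/day / 86,400 seconds per day.
    """
    avg = round(dau * requests_per_user / 86_400)
    return avg, avg * peak_factor

# Reproduces the table rows:
print(qps(100_000))        # (6, 18)
print(qps(100_000_000))    # (5787, 17361)
```

The same helper works for any interview problem: plug in the DAU you established during requirements and read off whether you are in single-server, read-replica, or sharded territory.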

Functional vs. Non-Functional Requirements

One of the clearest signals that separates L5+ candidates from L4 candidates at FAANG is understanding the difference between functional and non-functional requirements — and spending as much time on non-functional as functional.

Functional requirements answer: What does the system do? Non-functional requirements answer: How well does it do it, and under what constraints? Non-functional requirements are often more important for architecture because they determine which trade-offs to make.

| Requirement Type | Example Question | Example Answer | Architectural Impact |
| --- | --- | --- | --- |
| Functional | Can users upload photos? | Yes, up to 10MB, JPG/PNG | Object storage (S3), CDN, image processing pipeline |
| Availability | What's the SLA? | 99.99% (52 min downtime/year) | Multi-AZ deployment, automatic failover, no single points of failure |
| Consistency | Can users see stale data? | Eventually consistent OK for feed | Cache-aside, async replication, no distributed transactions needed |
| Latency | What's acceptable p99? | < 500ms for feed load | CDN for static, Redis for hot data, connection pooling |
| Durability | Can we lose data? | Never lose a message | Synchronous writes to 3 replicas, WAL, backup strategy |
| Scalability | Traffic growth rate? | 3x per year for 2 years | Horizontal scaling, auto-scaling groups, stateless services |

The CALS Framework for Non-Functional Requirements

Remember CALS: Consistency (strong vs. eventual), Availability (what's the SLA?), Latency (p50/p99 targets), Scale (current + projected). Every system design interview answer should address all four. If you skip any, you're leaving architectural decisions to chance.

Availability is frequently misunderstood. "99% availability" sounds great until you realize that's 3.65 days of downtime per year. "99.9%" is 8.76 hours. "99.99%" is 52 minutes. "99.999%" (five nines) is 5 minutes. The difference between 99.9% and 99.99% is an order of magnitude more complexity and cost.

Availability SLA → Architecture Requirements

  • 99% (3.65 days/year) — Single server acceptable. Basic monitoring. No redundancy required.
  • 99.9% (8.76 hours/year) — Redundant servers. Health checks. Auto-restart on failure. Basic load balancing.
  • 99.99% (52 min/year) — Multi-AZ deployment. Auto-failover. Zero-downtime deployments. Circuit breakers.
  • 99.999% (5 min/year) — Multi-region active-active. Chaos engineering. Sub-second failover. Extremely expensive — only for critical payment/auth systems.
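The SLA-to-downtime conversion behind these tiers is worth being able to do on the spot. A minimal illustrative helper (not from the original; it uses a 365-day year):

```python
def downtime_minutes_per_year(availability_pct: float) -> float:
    """Allowed downtime per year (in minutes) for a given availability SLA."""
    minutes_per_year = 365 * 24 * 60  # 525,600 minutes
    return (1 - availability_pct / 100) * minutes_per_year

print(downtime_minutes_per_year(99.99))  # ~52.56 minutes/year
```

Each extra nine cuts the budget by 10x, which is why the jump from 99.9% to 99.99% forces multi-AZ deployment and automatic failover rather than a faster on-call rotation.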

Consistency requirements determine your database choice more than anything else. If the interviewer says "users can see slightly stale data" — you have enormous flexibility: Redis, Cassandra, DynamoDB with eventual consistency. If they say "never show stale data, every user must see the same thing" — you need strong consistency, which means distributed coordination, which is expensive and slow.

graph TD
    A[Non-Functional Requirements] --> B[Availability SLA]
    A --> C[Consistency Model]
    A --> D[Latency Targets]
    A --> E[Scale Requirements]
    B --> F{99.99%?}
    F -->|Yes| G[Multi-AZ, Auto-failover]
    F -->|No| H[Single region OK]
    C --> I{Strong consistency?}
    I -->|Yes| J[PostgreSQL, MySQL with sync replication]
    I -->|No| K[DynamoDB, Cassandra, Redis]
    D --> L{Sub-100ms p99?}
    L -->|Yes| M[Cache everything, CDN at edge]
    L -->|No| N[Standard caching OK]
    E --> O{> 100K QPS?}
    O -->|Yes| P[Horizontal scaling, sharding]
    O -->|No| Q[Vertical scaling, read replicas]

Non-functional requirements drive every major architectural decision

Step 2 — Capacity Estimation (5–10 minutes)

Capacity estimation is where most candidates either impress or fall flat. The goal isn't to get exact numbers — it's to get numbers in the right order of magnitude, fast, while showing your reasoning. A ±2x error is fine. A ±10x error will lead you to design the wrong system.

The formula is always the same: DAU × Requests/User/Day ÷ 86,400 seconds = average QPS. Then multiply by 2-3x for peak. Let's walk through it for a Twitter-like system.

The Numbers You Must Memorize

  • 86,400 = seconds in a day.
  • 1KB = 1,024 bytes ≈ 1,000 for estimation. 1MB = 1,048,576 bytes ≈ 1M. 1GB = 1,073,741,824 bytes ≈ 1B. 1TB = 1,099,511,627,776 bytes ≈ 1T.
  • 1M users × 1 request/day ≈ 12 QPS.
  • 100M users × 10 requests/day ≈ 11,574 QPS ≈ 12K QPS.
  • When in doubt, round aggressively.

For a Twitter-like system at 100M DAU:

Twitter Capacity Estimation Walkthrough

1. Write QPS: 100M DAU × 2 tweets/day = 200M tweets/day. 200M / 86,400 = ~2,315 writes/sec avg. Peak = 2,315 × 3 = ~7,000 tweets/sec.

2. Read QPS: Feeds are read-heavy. Assume 100:1 read:write ratio. 2,315 × 100 = ~230K reads/sec avg. Peak ~700K reads/sec.

3. Storage per tweet: 140 chars ≈ 300 bytes text. Plus metadata (user_id, timestamp, reply_to) ≈ 200 bytes. Total ≈ 500 bytes/tweet.

4. Daily storage: 200M tweets × 500 bytes = 100 GB/day.

5. 5-year storage: 100 GB × 365 × 5 = 182.5 TB. Round to ~200 TB. This tells you: you need a distributed storage system, not a single Postgres instance.

6. Media storage: If 10% of tweets have images at 300KB avg: 20M images × 300KB = 6 TB/day. Over 5 years: 10.95 PB. This drives CDN and object storage decisions.
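The whole walkthrough can be checked with a short Python script. This is a sketch reproducing the article's back-of-envelope math; the variable names are my own.

```python
SECONDS_PER_DAY = 86_400

dau = 100_000_000
tweets_per_user = 2
writes_per_day = dau * tweets_per_user              # 200M tweets/day

write_qps = writes_per_day / SECONDS_PER_DAY        # ~2,315 writes/sec avg
read_qps = write_qps * 100                          # 100:1 read:write ratio
peak_read_qps = read_qps * 3                        # ~700K reads/sec

bytes_per_tweet = 500                               # 300B text + 200B metadata
text_gb_per_day = writes_per_day * bytes_per_tweet / 1e9    # 100 GB/day
five_year_text_tb = text_gb_per_day * 365 * 5 / 1e3         # 182.5 TB

image_share, image_bytes = 0.10, 300_000
media_tb_per_day = writes_per_day * image_share * image_bytes / 1e12  # 6 TB/day
five_year_media_pb = media_tb_per_day * 365 * 5 / 1e3                 # 10.95 PB
```

Running the numbers this way in the interview also catches the arithmetic slips that creep in when you multiply on the whiteboard under time pressure.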

These numbers immediately tell you architectural things: 700K reads/sec is far beyond what a single database can handle (a well-tuned Postgres tops out around 50K QPS). You need caching, read replicas, or a NoSQL database. 6TB/day of images means you need CDN and object storage (S3), not local disk.

| Metric | Calculation | Result | Architectural Implication |
| --- | --- | --- | --- |
| Write QPS (avg) | 100M × 2 / 86,400 | ~2.3K/sec | Single primary DB can handle this |
| Read QPS (avg) | 2,300 × 100 read:write ratio | ~230K/sec | Cannot hit DB — need Redis cache |
| Read QPS (peak) | 230K × 3 | ~700K/sec | Multiple cache clusters needed |
| Text storage/day | 200M × 500 bytes | 100 GB/day | NoSQL or sharded SQL required |
| Image storage/day | 20M × 300KB | 6 TB/day | S3 + CDN mandatory |
| Cache throughput | 700K QPS × ~1KB per object | ~700 MB/s egress | Near single-node limits — a small Redis cluster; the hot working set itself fits in ~10GB |

Show Your Work — Even When You're Unsure

Never just say "we'll need a distributed cache." Always say "700K reads/sec exceeds a single DB's capacity of ~50K QPS, so we need Redis in front of the database. With 1KB per cached object, a 10GB Redis instance holds 10M hot objects — enough for 10% of our 100M users' most recent feed items." The numbers tell the story. No numbers = no credibility.
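The cache-sizing arithmetic in that callout can be made concrete. A sketch of the same numbers; the ~50K QPS figure for a well-tuned Postgres is the article's own estimate:

```python
# How far over a single DB's capacity are we?
peak_read_qps = 700_000
single_db_qps = 50_000                      # article's estimate for tuned Postgres
db_headroom_needed = peak_read_qps / single_db_qps   # 14x -> cache is mandatory

# How many hot objects does a 10GB Redis hold at ~1KB each?
redis_bytes = 10 * 1_000_000_000
object_bytes = 1_000
hot_objects = redis_bytes // object_bytes   # 10M objects
users = 100_000_000
coverage = hot_objects / users              # 0.10 -> most-recent feed items for 10% of users
```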

Step 3 — High-Level Design (10–15 minutes)

After requirements and estimation, you draw the boxes. The goal of the high-level design isn't to draw every component — it's to draw the critical data flows clearly and show that you understand which components need to exist and why.

Start with the client. Work your way to the data store. Draw every component your system absolutely needs: the client, the load balancer, the application servers, the cache layer, and the database. Label the connections. Show the direction of data flow with arrows.

graph LR
    C[Client<br/>Mobile/Web] --> LB[Load Balancer]
    LB --> A1[App Server 1]
    LB --> A2[App Server 2]
    LB --> A3[App Server N]
    A1 --> CH[Cache Layer<br/>Redis Cluster]
    A2 --> CH
    A3 --> CH
    CH --> DB[(Primary DB<br/>Postgres)]
    DB --> R1[(Read Replica 1)]
    DB --> R2[(Read Replica 2)]
    A1 --> Q[Message Queue<br/>Kafka]
    Q --> W[Worker Services<br/>Async Processing]
    W --> OS[Object Storage<br/>S3]
    OS --> CDN[CDN<br/>CloudFront]
    CDN --> C

Standard high-level architecture: client → LB → app servers → cache → DB + async workers + CDN

There are five components that should appear in almost every system design:

The 5 Universal Components

  • Load Balancer — Distributes traffic across servers. Provides horizontal scalability and eliminates single point of failure on the application tier. Use L7 (HTTP) for web apps, L4 (TCP) for lower overhead.
  • Application Servers — Stateless business logic layer. Stateless is key — no session state stored in memory. Allows any server to handle any request, making horizontal scaling trivial.
  • Cache Layer (Redis) — In-memory store for hot data. Reduces DB load by 10-100x. Always ask: what's the cache key? What's the TTL? What's the eviction policy?
  • Primary Database — Source of truth. SQL for relational/transactional data. NoSQL for high write throughput or flexible schema. State what specific database and why.
  • CDN — Edge caching for static assets and user-uploaded media. Mandatory if you have images, video, or global users. Reduces latency from hundreds of ms to single-digit ms.
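The cache key / TTL / eviction questions from the cache-layer bullet map directly onto the cache-aside pattern. Here is a minimal illustrative Python sketch: `TTLCache` is a dict-based stand-in for Redis, and `get_feed` is a hypothetical helper, not a real client API.

```python
import time
from typing import Any, Callable, Optional

class TTLCache:
    """Dict-based stand-in for Redis, just to illustrate cache-aside."""
    def __init__(self) -> None:
        self._store: dict[str, tuple[Any, float]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() > expires_at:   # lazy expiry, like a Redis TTL
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any, ttl_s: float) -> None:
        self._store[key] = (value, time.monotonic() + ttl_s)

def get_feed(user_id: int, cache: TTLCache, load_from_db: Callable[[int], list]) -> list:
    """Cache-aside: try the cache, fall back to the DB, write back with a TTL."""
    key = f"feed:{user_id}"                 # the cache key
    feed = cache.get(key)
    if feed is None:                        # cache miss -> hit the database
        feed = load_from_db(user_id)
        cache.set(key, feed, ttl_s=300)     # 5-minute TTL caps staleness
    return feed
```

The TTL answers the freshness question from requirements: a 5-minute TTL means a feed can be up to 5 minutes stale, which is exactly the trade-off you stated in your scope summary.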

Once you have the basic components drawn, add the component that's specific to the problem. For a news feed: fan-out service. For a URL shortener: hash generation service. For a payment system: idempotency layer. This is where you demonstrate domain knowledge.

How to Draw Quickly and Clearly

In a whiteboard interview: use boxes for services, cylinders for databases, triangles for caches (convention). In a virtual interview (CoderPad, Excalidraw): still use consistent shapes. Always label components with their purpose AND their technology: "Feed Service (Node.js)" not just "Service". Label connections with protocol and data: "HTTP REST" or "gRPC streaming". Draw data flow arrows. Interviewers love seeing arrows — it shows you understand causality.

After drawing the diagram, narrate the main user request flow. Walk the interviewer through a typical request: "A user opens the app. The request hits the CDN first — if the static assets are cached, they're served from the edge. The feed request goes to the load balancer, which routes it to any available app server. The app server checks Redis for the user's pre-computed feed. Cache hit: return the feed in ~5ms. Cache miss: query the feed service, which aggregates posts from the user's follows, writes back to Redis, and returns the result in ~100ms."

sequenceDiagram
    participant U as User
    participant CDN as CDN
    participant LB as Load Balancer
    participant App as App Server
    participant Redis as Redis Cache
    participant DB as Database

    U->>CDN: GET /feed (static assets cached)
    CDN-->>U: Static assets (5ms)
    U->>LB: GET /api/v1/feed
    LB->>App: Route to available server
    App->>Redis: GET feed:user_123
    alt Cache Hit
        Redis-->>App: Return cached feed (1ms)
        App-->>U: 200 OK + feed (~10ms total)
    else Cache Miss
        App->>DB: SELECT posts FROM follows (complex JOIN)
        DB-->>App: Raw posts (50ms)
        App->>Redis: SET feed:user_123 TTL=5min
        App-->>U: 200 OK + feed (~80ms total)
    end

Request flow: CDN handles static assets, Redis handles hot feed data, DB only for cache misses

Step 4 — Deep Dive & Bottleneck Analysis (15–20 minutes)

The deep dive is where you earn the offer at senior levels. This is where you pick the hardest part of your design and explain exactly how it works, including the failure modes and trade-offs. Most candidates skip this or do it superficially.

Before diving in, ask the interviewer: "What would you like me to deep-dive on?" They usually have something specific in mind. If they say "it's up to you," pick the bottleneck that will cause the most pain at scale — usually the database, the cache invalidation strategy, or the most write-heavy component.

What to Deep-Dive: The Priority Order

1. The component you called out as "I'll come back to this" during requirements.
2. The hottest read or write path — what gets hammered most?
3. The component most likely to fail — single DB, network partition points.
4. The hardest consistency requirement — if there's any distributed transaction, that's your deep dive target.

For a news feed, the deep dive target is usually fan-out: how does a tweet from a celebrity with 10M followers get delivered to all their followers' feeds without killing your system?

| Strategy | How It Works | Write Cost | Read Cost | Best For |
| --- | --- | --- | --- | --- |
| Fan-out on Write (Push) | When user tweets, immediately write to all followers' feed tables in Redis | O(followers) — expensive for celebrities | O(1) — just read pre-computed feed | Most users (< 10K followers) |
| Fan-out on Read (Pull) | When user opens feed, query all follows' recent posts and merge | O(1) — just write the tweet | O(follows × posts) — expensive for users following many | Celebrity accounts with millions of followers |
| Hybrid | Push to regular users, pull-on-read for celebrities. Merge at read time. | O(non-celebrity followers) | O(1) for regular follows + O(celebrity posts) | Twitter, Instagram, Facebook — the production approach |

A strong deep dive doesn't just describe what you'd do — it describes the failure modes and how you handle them. For fan-out on write: what happens when Elon Musk tweets? 130M followers × write operation = massive write spike. The hybrid approach handles this, but you need to define the threshold for "celebrity" and handle the edge case where a user becomes a celebrity mid-session.
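The hybrid strategy can be sketched as a toy in-memory model. This is illustrative only: the class and method names are my own, the 10K-follower celebrity cutoff is the article's example threshold, and the `feeds` dict stands in for the Redis side.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000  # followers; the article's example cutoff

class FeedService:
    def __init__(self) -> None:
        self.followers = defaultdict(set)   # author -> follower ids
        self.following = defaultdict(set)   # user -> authors they follow
        self.feeds = defaultdict(list)      # pre-computed feeds (the "Redis" side)
        self.recent = defaultdict(list)     # author -> that author's recent posts

    def follow(self, user, author) -> None:
        self.followers[author].add(user)
        self.following[user].add(author)

    def post(self, author, text: str) -> None:
        self.recent[author].append(text)
        if len(self.followers[author]) < CELEBRITY_THRESHOLD:
            # fan-out on write: push into every follower's pre-computed feed
            for follower in self.followers[author]:
                self.feeds[follower].append(text)
        # celebrities: skip fan-out entirely; merged at read time instead

    def read_feed(self, user, n: int = 50) -> list:
        merged = list(self.feeds[user])
        for author in self.following[user]:
            if len(self.followers[author]) >= CELEBRITY_THRESHOLD:
                merged.extend(self.recent[author][-n:])  # pull celebrity posts
        return merged
```

Note how the write amplification problem disappears for celebrities (one append to `recent`, zero fan-out writes), at the cost of a bounded merge on every read.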

Deep Dive Structure Template

1. State the problem clearly: "The challenge is that celebrity accounts cause write amplification — one tweet triggers 10M fan-out writes, which spikes our write QPS from 7K/sec to millions."

2. State the naive approach and why it breaks: "Naive fan-out on write would queue 10M Redis writes per celebrity tweet. At 130M followers, that's 130M operations in seconds. Redis can do ~1M ops/sec, so this takes 130 seconds — far too slow."

3. Present your solution with the mechanism: "Hybrid fan-out: for users with < 10K followers, push to Redis at write time. For celebrities, skip Redis entirely at write time. At read time, merge the pre-computed feed (from Redis) with the last N tweets from each celebrity the user follows."

4. Address the failure mode: "The merge at read time adds latency. To bound this: only check celebrities the user follows (typically < 10). Cap celebrity lookups to last 50 tweets. This keeps read latency under 50ms extra."

5. Mention monitoring: "Track fan-out queue depth, celebrity threshold violations, and read-time merge duration. Alert if merge latency exceeds 100ms — that signals too many celebrity follows for a user."

The Deep Dive Quality Signal

L4 engineers describe the happy path. L5 engineers describe the happy path and the failure mode. L6 engineers describe the happy path, the failure mode, how they'd monitor it, how they'd recover, and what trade-offs they chose and why. When you're practicing deep dives, force yourself to ask: "What breaks? How do I know it broke? How do I fix it?" for every component.

Managing Time in the Interview — Minute by Minute

Time management is the most underrated skill in system design interviews. A 45-minute interview feels spacious at the start and is gone before you know it. Without an explicit time budget, most candidates either spend 20 minutes on requirements and never finish the design, or race through requirements in 2 minutes and design the wrong system.

Here is the exact time allocation used by candidates who consistently pass FAANG system design interviews:

| Phase | Time Budget | What You're Doing | What the Interviewer Is Evaluating |
| --- | --- | --- | --- |
| Requirements | 0–8 min | Functional + non-functional questions. Get DAU, consistency, latency targets. | Can you scope? Do you think about SLA, not just features? |
| Scope Summary | 8–10 min | State what you're building and explicitly what you're not building. | Do you set boundaries? Can you simplify without being naive? |
| Estimation | 10–15 min | DAU → QPS → storage → bandwidth. Show math on the board. | Do you quantify? Can you derive architectural implications from numbers? |
| High-Level Design | 15–25 min | Draw all major components. Narrate a main request flow. | Can you design a complete system? Do you know the standard components? |
| Deep Dive | 25–42 min | Pick 1-2 hard problems. Explain mechanism, failure mode, trade-offs. | Can you go deep? Do you know the hard parts of distributed systems? |
| Wrap-Up | 42–45 min | Recap key decisions. What would you change at 10x scale? | Do you think about evolution? Do you acknowledge trade-offs? |

The Time Check at Minute 15

At the 15-minute mark, look at your diagram. If you haven't started drawing yet, you're in trouble. Cut requirements short, state "I'll assume X" for remaining questions, and start drawing immediately. A partial design that you narrate clearly is far better than perfect requirements with no design.

Interviewer signals to watch for during the interview: If the interviewer says "interesting, tell me more about that" — they want you to deep-dive into that component. If they say "let's move on" or "I think we have enough there" — they want to cover more ground. If they say "what would happen if..." — that's a failure mode question, answer it directly.

Time Recovery Strategies When You're Running Behind

  • Behind at minute 10 — State your remaining assumptions in one sentence: "I'll assume 100M DAU, 99.9% SLA, eventual consistency acceptable." Skip the discussion, start estimating.
  • Behind at minute 20 — Draw a minimal diagram: load balancer, app servers, cache, database. Skip the CDN and worker queues for now. Narrate in 2 minutes. Move to deep dive.
  • Behind at minute 35 — Skip the second deep dive topic. Do one thing deeply. State what you'd cover next if you had more time: "I'd also discuss the database sharding strategy if we had more time."
  • Behind at minute 42 — Stop wherever you are. Quickly state 2-3 key trade-offs you made and 1 thing you'd change at 10x scale. End strong with a summary even if the design is incomplete.

One more time tip: narrate your thinking out loud during transitions. When you switch from requirements to estimation, say "Okay, I have enough context — let me now estimate the scale." This signals to the interviewer that you're intentionally moving between phases, not wandering.

How FAANG Actually Grades System Design

FAANG system design interviews aren't graded on a binary pass/fail — they're scored on multiple dimensions, and the hiring committee calibrates them against level expectations. Understanding the rubric changes how you should approach the interview.

Amazon, Meta, Google, and Apple all use roughly similar evaluation criteria, though the exact language differs. Here are the four core dimensions and what each looks like at different levels:

| Dimension | L4 (SDE II / E4) | L5 (Sr. SDE / E5) | L6 (Staff SDE / E6) |
| --- | --- | --- | --- |
| Problem Scoping | Asks 3-5 basic functional questions. Covers scale vaguely. | Covers both functional and non-functional. Identifies 1-2 key constraints. | Immediately identifies the hardest constraint. Pushes back on ambiguous requirements. |
| Estimation | Rough numbers, shows formula but may err by 5x. | Accurate to 2x, derives architectural implications from numbers. | Derives non-obvious implications: "This QPS means our cache miss rate needs to be < 0.1%." |
| Design Quality | Draws standard components. Mentions trade-offs if prompted. | Proactively mentions trade-offs. Discusses 2+ alternatives. | Designs for the 10x scale case. Anticipates second-order effects. |
| Depth of Knowledge | Knows standard solutions. Struggles with edge cases. | Knows failure modes. Can discuss consensus algorithms, CAP theorem. | Can derive solutions from first principles. Has designed this type of system before. |
| Communication | Explains what they're doing. Somewhat structured. | Structured, narrates reasoning, responds well to interviewer hints. | Drives the interview, involves the interviewer as a partner, summarizes clearly. |

The Hidden Grading Dimension: Intellectual Honesty

Senior engineers at FAANG are also evaluating whether you'll be intellectually honest on the job. When you don't know something, say "I'm less familiar with the internals of Kafka's log compaction — let me reason through what I'd expect it to do." This scores far higher than confidently saying something wrong. Every engineer on the panel has seen candidates bluff their way through a deep-dive — they can tell, and it's a hard no.

The most important signal at L6 is proactive trade-off discussion. L4 candidates wait to be asked "what are the trade-offs?" L6 candidates say "I'm choosing fan-out on write for normal users because the 5ms read latency benefit outweighs the write amplification for the 99th percentile user — but I'm consciously accepting that celebrity accounts will need special handling, and here's how I'd do it."

What Makes a Strong System Design Interview at Each Level

  • L4 Strong Pass — Covers all requirements, draws a complete diagram with all major components, derives capacity estimates with correct formulas, discusses 1-2 trade-offs, handles follow-up questions reasonably.
  • L5 Strong Pass — Everything at L4, plus: proactively mentions trade-offs, drives to the hard problem, knows failure modes, can discuss distributed systems concepts (CAP, consensus), structures the interview well without prompting.
  • L6 Strong Pass — Everything at L5, plus: identifies the non-obvious hard constraint, proposes multiple approaches with pros/cons, thinks about 10x scale, discusses operational concerns (monitoring, on-call, deployment), and makes the interviewer feel like they're talking to a peer who's built systems like this before.

One more thing: FAANG interviewers write a detailed write-up after your interview that goes to the hiring committee. They're evaluating not just whether your design was correct, but how you'd behave as an engineering partner. Were you collaborative? Did you listen? Did you handle pushback gracefully? Would they want to work with you on a design review?

The Anti-Patterns That Get You Rejected

After analyzing hundreds of system design interviews, the same anti-patterns appear over and over in candidates who get rejected. These aren't obscure gotchas — they're patterns that experienced engineers recognize immediately as signals of junior thinking.

Anti-Pattern #1: The Kitchen Sink Architecture

Candidate hears "Design a URL shortener" and immediately draws: microservices, Kafka, Redis, Cassandra, Elasticsearch, Kubernetes, and a machine learning model for spam detection. For a URL shortener serving 1M users. The kitchen sink architect uses every technology they know regardless of whether it's needed. This signals: no judgment about cost/complexity trade-offs, no ability to scope a problem, potentially will over-engineer everything on the job.

The counter-signal FAANG interviewers love: "For this scale, a single PostgreSQL instance with proper indexing will handle the QPS with headroom. I'll add Redis for the hot-path reads. I'd move to a distributed solution if we hit 10x this scale, but I'd design for what we need today and make migration straightforward."
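The "Redis for the hot-path reads" part of that answer is a standard cache-aside pattern, and it's worth being able to sketch it concretely. This is a minimal illustration, not a production implementation: `FakeCache` is an in-memory stand-in for a real Redis client, and all names (`resolve`, `db_lookup`, `CACHE_TTL_SECONDS`) are hypothetical.

```python
class FakeCache:
    """In-memory stand-in for a Redis client; TTL is accepted but ignored."""

    def __init__(self):
        self.store = {}

    def get(self, key):
        return self.store.get(key)

    def set(self, key, value, ttl):
        self.store[key] = value


CACHE_TTL_SECONDS = 3600  # illustrative TTL for cached redirects


def resolve(short_code, cache, db_lookup):
    """Cache-aside read: try the cache first, fall back to the database."""
    url = cache.get(short_code)
    if url is not None:
        return url                                     # hot path: cache hit
    url = db_lookup(short_code)                        # miss: go to Postgres
    if url is not None:
        cache.set(short_code, url, CACHE_TTL_SECONDS)  # populate for next read
    return url


db = {"abc123": "https://example.com/some/long/path"}  # stand-in for Postgres
cache = FakeCache()
first = resolve("abc123", cache, db.get)   # database read, populates cache
second = resolve("abc123", cache, db.get)  # served from cache
```

The design point to narrate in the interview: cache-aside keeps the database as the source of truth, so a cold or flushed cache degrades latency but never correctness.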

The Top 8 System Design Anti-Patterns

  • No Numbers — Describing scale as "lots of traffic" or "millions of users" without doing the QPS math. Interviewers hear this and write "lacks quantitative thinking." Always put numbers on the board.
  • Over-Engineering — Using microservices and Kafka for a system that processes 100 writes/sec. Proportionality of solution to problem is a core engineering judgment skill.
  • No Trade-offs — Describing your design as "the best approach" without acknowledging what you're optimizing for and what you're giving up. There are no universally correct architectures.
  • Forgetting Failure Modes — Never mentioning what happens when a server dies, the network partitions, or a third-party API returns errors. Senior engineers think about failure cases first.
  • No Monitoring or Observability — Designing a system with no mention of metrics, logging, alerting, or how you'd know the system is degraded. This is a huge red flag for on-call reliability.
  • Jumping Past Requirements — Starting to design before establishing DAU, consistency model, and latency targets. This is the fastest way to design a beautifully wrong system.
  • Ignoring the Interviewer — Monologuing for 45 minutes without checking in, not responding to hints, not pausing for questions. System design is a conversation, not a lecture.
  • Vague Components — Drawing a box labeled "Database" without specifying what database, what schema, how it's replicated, or why you chose it over alternatives. Specificity signals experience.

The Final Minute Technique

In the last 2-3 minutes of the interview, say: "Let me quickly summarize the key decisions I made and the trade-offs: I chose fan-out on write for performance at the cost of write amplification. I'm accepting eventual consistency for feed freshness to avoid distributed transactions. The biggest risk is celebrity account fan-out — I'd monitor that write queue depth closely in production." This wrap-up shows L5+ thinking: you know what you decided, why, and what to watch out for. It leaves a strong final impression on the interviewer.

The most important meta-lesson: system design interviews test engineering judgment, not trivia. Interviewers aren't looking for the one perfect answer. They're looking for an engineer who thinks systematically, quantifies their decisions, acknowledges trade-offs honestly, and would be a good partner on a design review. That's a learnable set of behaviors — not innate talent.

```mermaid
graph TD
    Start[Hear the Problem] --> R[Clarify Requirements<br/>5-10 min]
    R --> E[Capacity Estimation<br/>5-10 min]
    E --> H[High-Level Design<br/>10-15 min]
    H --> D[Deep Dive<br/>15-20 min]
    D --> W[Wrap-Up<br/>3 min]

    R --> R1[Functional Requirements]
    R --> R2[Non-Functional: CALS]
    E --> E1[DAU → QPS formula]
    E --> E2[Storage calculation]
    H --> H1[Draw 5 universal components]
    H --> H2[Narrate request flow]
    D --> D1[Pick the bottleneck]
    D --> D2[Mechanism + failure mode]
    W --> W1[Key decisions made]
    W --> W2[Trade-offs acknowledged]
```

The complete 45-minute system design interview framework — every phase, every deliverable

How this might come up in interviews

System design is a full interview round at every FAANG company for SDE II and above. At Meta, it's 2 rounds. At Google, it's 1-2 depending on level. The framework is not just for interviews — it's how senior engineers actually run design reviews. Engineers who can't structure a requirements conversation never make Staff level.

Common questions:

  • L4: "Design a URL shortener like bit.ly." [Tests: basic estimation, standard components, simple database design]
  • L4-L5: "Design a notification system that sends 10M push notifications per day." [Tests: async processing, fan-out, reliability, retry logic]
  • L5: "Design a distributed rate limiter." [Tests: distributed state management, consistency trade-offs, failure modes]
  • L5-L6: "Design Twitter's home timeline." [Tests: fan-out architecture, caching strategy, CAP theorem application, celebrity problem]
  • L6: "Design a globally distributed messaging system like WhatsApp." [Tests: consistency models, multi-region replication, exactly-once delivery, partition tolerance]
  • L6-L7: "Design Google's Bigtable." [Tests: first-principles distributed systems design, SSTable/LSM-tree knowledge, compaction strategies, data model design]

Key takeaways

  • The 4-step framework (clarify → estimate → sketch → deep-dive) works for every system design problem and is the single most important skill to internalize.
  • Spend 8-10 minutes on requirements, covering both functional and non-functional (CALS: Consistency, Availability, Latency, Scale). The non-functional requirements drive 80% of architectural decisions.
  • Always do capacity estimation with numbers on the board. DAU → QPS → storage. Derive at least one architectural implication from the numbers.
  • The deep dive separates L4 from L5+: describe the failure mode, not just the happy path. For every component, ask "what happens when this breaks?"
  • FAANG interviewers grade on process, trade-off discussion, and collaboration — not on whether your design matches the "right answer." There is no right answer; there are only well-reasoned trade-offs.

Before you move on: can you answer these?

You're 15 minutes into a system design interview and you haven't asked any requirements questions — you went straight to designing. The interviewer asks "What consistency model are you designing for?" What do you do?

Don't panic — course-correct immediately and cleanly. Say: "Good point — I should have established this upfront. Let me ask: do we need strong consistency, or is eventual consistency acceptable? And is there a specific latency SLA?" Get the answer, then explicitly update your design: "Okay, since eventual consistency is fine, I'll change from synchronous writes to all replicas to asynchronous replication with a write-ahead log, which will reduce write latency significantly." The ability to gracefully recover mid-design is itself a signal the interviewer is evaluating.

An interviewer says "Assume you're designing for 1 billion daily active users." You've never designed at that scale. How do you handle the estimation and design?

Break it down systematically and reason from first principles. 1B DAU × 10 requests/user/day = 10B requests/day. 10B / 86,400 = ~115,000 QPS average. Peak at 3x: ~350,000 QPS. No single machine handles this — you need horizontal scaling at every layer. For the database: no single DB handles 350K QPS — need sharding by user_id or consistent hashing, and Redis is mandatory for hot-path reads. For storage: at 1KB/request average, 10B requests = 10TB/day just for request logs. Use numbers to drive each architectural decision. Saying "I've never built at this scale, but let me reason through it" is perfectly fine — the math still works the same way.
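The arithmetic above is worth practicing until it's automatic. Here is the same back-of-envelope calculation as a sketch; the inputs (10 requests/user/day, a 3x peak factor, 1 KB per request) are the illustrative assumptions from the answer, not fixed rules.

```python
# Back-of-envelope capacity estimation: DAU → QPS → storage.
SECONDS_PER_DAY = 86_400


def estimate(dau, requests_per_user=10, peak_factor=3, bytes_per_request=1_000):
    """Return (average QPS, peak QPS, daily log storage in TB)."""
    requests_per_day = dau * requests_per_user
    avg_qps = requests_per_day / SECONDS_PER_DAY
    peak_qps = avg_qps * peak_factor
    storage_tb_per_day = requests_per_day * bytes_per_request / 1e12
    return avg_qps, peak_qps, storage_tb_per_day


avg, peak, tb = estimate(dau=1_000_000_000)
print(f"avg ~{avg:,.0f} QPS, peak ~{peak:,.0f} QPS, ~{tb:.0f} TB/day of logs")
# avg ~115,741 QPS, peak ~347,222 QPS, ~10 TB/day of logs
```

In the interview you'd round aggressively (~115K average, ~350K peak) and immediately state the implication: no single machine or single database handles this, so sharding and caching are mandatory.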

What's the difference between how an L4 and L6 candidate would approach designing a distributed rate limiter?

An L4 candidate would describe the standard token bucket algorithm, mention Redis for distributed state, and call it done. An L6 candidate would: (1) First ask requirements — per-user, per-IP, or per-service? Sliding window, fixed window, or token bucket? What's the acceptable false-positive rate? (2) Explain why a centralized Redis approach has a single point of failure and latency problems at 100K+ QPS — and propose a hierarchical approach: local token bucket in each app server for fast rejection, synchronized with a central Redis counter every 100ms. (3) Discuss the consistency trade-off: with 100ms sync intervals, a user can briefly exceed the limit by up to 100ms×rate. (4) Mention that at Stripe scale, they use a combination of local rate limiting plus circuit breakers, not a single Redis key. The L6 candidate doesn't know more algorithms — they know more failure modes and have stronger opinions about production trade-offs.
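The "local token bucket in each app server" half of that hierarchical design can be sketched in a few lines. This is a minimal single-node version under stated assumptions: the periodic reconciliation with a shared Redis counter that the answer describes is deliberately omitted, and the class name and parameters are illustrative.

```python
import time


class TokenBucket:
    """Per-node token bucket for fast local rejection.

    In the hierarchical design, each app server runs one of these and
    periodically syncs with a central counter (e.g. in Redis); that
    sync step is out of scope here.
    """

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=5, capacity=10)
results = [bucket.allow() for _ in range(15)]  # a burst of 15 immediate calls
```

A tight burst drains the bucket after roughly `capacity` requests, and subsequent requests are rejected until the refill catches up, which is exactly the "fast local rejection" property the L6 answer relies on.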

🧠Mental Model

💡 Analogy

System design interviews work a lot like a doctor's visit. A good doctor doesn't prescribe medicine before examining the patient. They ask about symptoms (requirements), check vital signs and run tests (estimation), form a diagnosis (high-level design), then drill into the specific condition (deep dive). A doctor who prescribes chemo to every patient with a headache is dangerous. An engineer who proposes sharding to every system with "lots of traffic" is expensive.

⚡ Core Idea

The 4-step framework works for any system: clarify → estimate → sketch → deep-dive. The framework's power is that it forces you to gather information before deciding on solutions. Most engineering failures come from solving the wrong problem. The framework prevents that by making requirements the first step, not an afterthought.

🎯 Why It Matters

FAANG spends $50K+ to hire each engineer and $500K+ per year to employ them. They're betting your judgment on billion-dollar systems. The interview is designed to answer one question: can this person design systems that scale, fail gracefully, and evolve? The framework gives you a repeatable way to demonstrate exactly those skills in 45 minutes.

Interview prep: 1 resource

Use these to reinforce this concept for interviews.

  • System Design Primer (GitHub)
