The math every FAANG engineer does in their head: QPS, storage, bandwidth, memory. Real numbers from Twitter, YouTube, Instagram, WhatsApp. Learn to be accurate within 2x, fast.
Back-of-the-envelope estimation is the most underrated skill in system design. Most engineers treat it as a formality before drawing boxes — something you rush through to get to the "real" design. This is backwards. The estimates *are* the design. They tell you which architecture you need before you draw a single box.
Consider two engineers asked to design a photo sharing service. Engineer A skips estimation and draws: a single Postgres database, a Python API server, and an S3 bucket. Engineer B does the math first: 50M DAU × 3 photo views/day = 150M photo reads/day = 1,736 photo reads/sec average, 5,208/sec peak. At 500KB per photo, that's 2.6 GB/sec of bandwidth at peak. A single server cannot serve 2.6 GB/sec. Engineer A's design is wrong before it's even drawn.
Estimation Is Architecture Triage
Every capacity estimate answers a binary question: does the naive solution (single server, single DB) work, or do we need distributed infrastructure? 10K QPS → single DB fine. 100K QPS → read replicas needed. 1M QPS → sharding or NoSQL required. You can't make this decision without the numbers.
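These triage thresholds can be captured in a few lines. A minimal sketch in Python — the cutoffs follow the rule of thumb above and should be read as order-of-magnitude boundaries, not hard limits:

```python
def architecture_tier(peak_qps: float) -> str:
    """Map peak QPS to the simplest architecture that can serve it.

    Cutoffs are the lesson's rules of thumb; real boundaries depend
    on query cost, data size, and hardware.
    """
    if peak_qps < 10_000:
        return "single DB"
    if peak_qps < 100_000:
        return "read replicas + cache"
    if peak_qps < 1_000_000:
        return "sharding + Redis cluster"
    return "NoSQL + CDN + distributed everything"

print(architecture_tier(5_000))    # single DB
print(architecture_tier(250_000))  # sharding + Redis cluster
```

The point is not the exact numbers but the decision structure: the estimate selects the tier, and the tier dictates the boxes you draw.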
In a FAANG interview, estimation signals three things to the interviewer: (1) You quantify before you decide — the most critical engineering judgment skill. (2) You know the rough performance limits of common components. (3) You can derive architectural implications from numbers, not just recite patterns.
The good news: estimation is a skill with a formula, and the formula always works. There are only four estimation types you'll ever need: QPS, storage, bandwidth, and memory. Master these four and you can estimate any system.
The 4 Estimation Types and When Each Drives Architecture
Before you can estimate, you need a reference table of numbers to work from. These are the numbers experienced engineers have internalized. In an interview, you're expected to know these approximately — not perfectly, but within 2-3x.
| Storage Unit | Bytes | Real-World Reference |
|---|---|---|
| 1 KB | 1,000 bytes | A short tweet with metadata (~280 chars + fields) |
| 100 KB | 100,000 bytes | A compressed profile photo thumbnail |
| 1 MB | 1,000,000 bytes | A high-quality profile photo or 1 minute of audio |
| 10 MB | 10,000,000 bytes | A 1080p video clip (10 seconds) |
| 100 MB | 100,000,000 bytes | A compressed feature-length movie (low quality) |
| 1 GB | 1,000,000,000 bytes | A full HD movie (standard quality) |
| 1 TB | 1,000,000,000,000 bytes | 1,000 movies, or 1 year of detailed server logs |
| 1 PB | 1,000,000,000,000,000 bytes | Netflix's content library; YouTube uploads per month |
| Time Unit | Seconds | Use For |
|---|---|---|
| 1 minute | 60 sec | Rate limiting windows |
| 1 hour | 3,600 sec | Cache TTL calculation |
| 1 day | 86,400 sec | The most important number: DAU → QPS conversion |
| 1 month | 2,592,000 sec (~2.6M) | Monthly storage growth |
| 1 year | 31,536,000 sec (~31.5M) | Annual storage capacity planning |
The Numbers That Drive Every Estimation
86,400 = seconds per day (memorize this). 1M DAU × 1 req/day ÷ 86,400 = ~12 QPS. 1M DAU × 10 req/day = ~116 QPS. 1M DAU × 100 req/day = ~1,160 QPS. The pattern: multiply DAU × requests/user/day, then divide by 86,400. That gives average QPS. Multiply by 2-3x for peak.
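The same arithmetic as a reusable helper — a sketch; the default 3x peak multiplier is an assumption you should state explicitly in an interview:

```python
SECONDS_PER_DAY = 86_400

def avg_qps(dau: float, requests_per_user_per_day: float) -> float:
    """Average QPS = DAU × requests/user/day ÷ 86,400."""
    return dau * requests_per_user_per_day / SECONDS_PER_DAY

def peak_qps(dau: float, requests_per_user_per_day: float,
             peak_multiplier: float = 3.0) -> float:
    """Peak QPS = average QPS × peak multiplier (2-5x is typical)."""
    return avg_qps(dau, requests_per_user_per_day) * peak_multiplier

print(round(avg_qps(1_000_000, 10)))   # ~116 QPS average
print(round(peak_qps(1_000_000, 10)))  # ~347 QPS peak
```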
Latency numbers are equally important — they tell you when a solution is "fast enough" and when you need a cache:
| Operation | Latency | What This Means in Practice |
|---|---|---|
| L1 cache hit | 1 ns | Fastest possible — on-die CPU cache |
| L2 cache hit | 4 ns | Still extremely fast — on-die CPU cache |
| RAM access | 100 ns | In-process memory (a random pointer dereference) |
| SSD random read | 100 µs (0.1ms) | 1,000x slower than RAM |
| Network round trip (same datacenter) | 500 µs (0.5ms) | Redis, inter-service call |
| HDD random read | 10 ms | 100x slower than SSD |
| Cross-datacenter network round trip | 50-150 ms | US East to US West ~60ms; US to Europe ~150ms |
| Database query (no index) | 100-1,000 ms | Full table scan — never acceptable in production |
The latency table reveals why caching is so powerful: the difference between an in-memory Redis lookup (0.5ms) and a database query without index (100ms+) is 200x. If your system does 100K reads/sec and 99% hit Redis, only 1,000/sec hit the DB — a completely manageable load.
QPS (Queries Per Second) is the single most important number in system design estimation. It determines whether you need a monolith or microservices, a single database or sharding, one cache node or a cluster.
The formula is always: QPS = DAU × requests_per_user_per_day ÷ 86,400
Then for peak: Peak QPS = Average QPS × peak_multiplier (typically 2-5x depending on traffic pattern)
QPS Calculation Walkthrough: Instagram-like App
1. Get the DAU: Assume 500M DAU (Instagram's actual figure). If you don't know, ask in the interview or make an explicit assumption.
2. Estimate requests per user per day: A mix of reads and writes. Assume 20 feed scroll events/day (each loading 20 photos) = 400 photo reads, plus 2 photo uploads = 2 writes. Total: ~402 requests/user/day. Round to 400 for simplicity.
3. Calculate average QPS: 500M × 400 / 86,400 = 200,000,000,000 / 86,400 ≈ 2,314,815 QPS total. Round to ~2.3M QPS.
4. Split read vs. write: 500M × 400 reads / 86,400 ≈ 2.3M read QPS. 500M × 2 uploads / 86,400 ≈ 11,574 write QPS. Read:write ratio ≈ 200:1.
5. Calculate peak QPS: Instagram's peak is during lunch and evening hours — roughly 3x average. Read peak: 2.3M × 3 = ~7M QPS. Write peak: 11,574 × 3 = ~35K QPS.
6. Derive architecture: 7M read QPS cannot be served by a single database (max ~50K-100K QPS for a very well-tuned Postgres). You need a CDN for static assets, a Redis cluster for the feed cache, and multiple DB shards for user data. 35K write QPS is possible on a single primary DB with write buffering, but you'd want replicas.
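The walkthrough above, reproduced as a script. The inputs are the same assumptions stated in the steps (500M DAU, 400 reads and 2 uploads per user per day, 3x peak):

```python
SECONDS_PER_DAY = 86_400

dau = 500_000_000
reads_per_user = 20 * 20   # 20 feed scrolls × 20 photos each
writes_per_user = 2        # photo uploads

read_qps = dau * reads_per_user / SECONDS_PER_DAY     # ~2.3M average
write_qps = dau * writes_per_user / SECONDS_PER_DAY   # ~11.6K average

print(f"read QPS:  {read_qps:,.0f} avg, {read_qps * 3:,.0f} peak")
print(f"write QPS: {write_qps:,.0f} avg, {write_qps * 3:,.0f} peak")
print(f"read:write ratio ≈ {read_qps / write_qps:.0f}:1")
```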
Common Mistake: Forgetting the Peak Multiplier
Average QPS is not what you design for. You design for peak QPS, because your system must handle the worst moment — Super Bowl halftime, a viral tweet, midnight on New Year's. Typical peak multipliers: Consumer apps (lunch/evening spikes): 3-5x. B2B apps (9-5 business hours): 4-8x during business hours vs. overnight. Event-driven apps (live sports, concerts): can be 10-50x during the event. Always ask: "What's the traffic pattern? Does it spike around events?"
graph LR
A[DAU
e.g. 100M] --> B[× Requests/User/Day
e.g. 10]
B --> C[= Daily Requests
1 Billion]
C --> D[÷ 86,400 sec]
D --> E[= Average QPS
~11,574]
E --> F[× Peak Multiplier
3x]
F --> G[= Peak QPS
~35,000]
G --> H{Architectural Decision}
H -->|< 10K QPS| I[Single DB
no cache needed]
H -->|10K-100K QPS| J[Read replicas
+ Redis cache]
H -->|100K-1M QPS| K[DB sharding
+ Redis cluster]
H -->|> 1M QPS| L[NoSQL + CDN
+ distributed everything]

The QPS formula and how it drives architectural decisions
One important nuance: not all requests are equal. A "request" for a tweet feed requires 1 DB query. A "request" for a search might require 100 operations. When estimating, break mixed workloads into categories: read requests, write requests, search requests, media requests. Each has a different backend cost.
Storage estimation determines your database type, sharding strategy, and data retention policy. The key insight: you don't just estimate current storage — you estimate 5-year growth, because migrating a database at 10TB is much harder than planning for it at 10GB.
The formula: Daily storage = writes_per_day × avg_object_size. Then multiply by 365 × 5 for 5-year capacity. Add metadata overhead (typically 20-30% extra).
| Content Type | Avg Size | Compression | Effective Size |
|---|---|---|---|
| Tweet / Short text post | 280 chars + metadata | 2x compression | ~250 bytes effective |
| User profile record | Name, bio, settings, etc. | Low compressibility | ~2 KB |
| Instagram photo (display) | 1080p compressed JPG | Already compressed | ~300 KB |
| Instagram photo (original) | 12MP HEIC from iPhone | Minimal | ~8 MB |
| WhatsApp text message | ~200 bytes + metadata | 2x | ~100 bytes effective |
| YouTube video (1080p, 10 min) | H.264 encoded | Already compressed | ~500 MB |
| Spotify audio track (3 min) | 320kbps MP3 | Already compressed | ~7 MB |
| Database row (typical) | ID + 10 fields + indexes | N/A | ~1-10 KB with index overhead |
Twitter Storage Estimation: Full Walkthrough
1. Tweet volume: 300M DAU × 5 tweets/day = 1.5B tweets/day. Wait — daily active users don't all tweet. Realistically ~10% of users tweet daily. So 300M × 10% × 5 = 150M tweets/day.
2. Bytes per tweet: tweet_id (8B) + user_id (8B) + text (280 bytes avg ≈ 200 compressed) + timestamps (8B) + metadata (100B) ≈ 324 bytes. Round to 350 bytes.
3. Daily text storage: 150M × 350B = 52.5 GB/day.
4. 5-year text storage: 52.5 GB × 365 × 5 = 95.8 TB ≈ 100 TB. This means you cannot fit it in a single Postgres instance (max practical size ~10-20TB before it gets painful). You need partitioning or a NoSQL database like Cassandra.
5. Media storage: Assume 30% of tweets have images at 150KB average (compressed for display). 45M images/day × 150KB = 6.75 TB/day. 5-year media: 6.75 TB × 365 × 5 = 12.3 PB. This requires object storage (S3) with lifecycle policies to archive cold media to Glacier.
6. Cache storage: You're not caching everything — just hot recent data. Latest 100 tweets per user in the feed cache: 300M users × 100 tweets × 350 bytes = 10.5 TB. Even 10% of users active at once = 1.05 TB of cache needed. This requires a Redis cluster — definitely not a single node.
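The tweet-storage math above as a script, using the same assumptions (10% of 300M DAU tweet 5 times a day, ~350 bytes per stored tweet):

```python
dau = 300_000_000
tweeting_fraction = 0.10          # only ~10% of DAU tweet on a given day
tweets_per_tweeting_user = 5
bytes_per_tweet = 350

tweets_per_day = dau * tweeting_fraction * tweets_per_tweeting_user
daily_gb = tweets_per_day * bytes_per_tweet / 1e9    # bytes → GB
five_year_tb = daily_gb * 365 * 5 / 1e3              # GB → TB

print(f"{tweets_per_day:,.0f} tweets/day")           # ~150M
print(f"{daily_gb:.1f} GB/day of tweet text")        # ~52.5 GB
print(f"{five_year_tb:.0f} TB over 5 years")         # ~96 TB
```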
The Storage → Architecture Decision Map
- Under 100 GB: single Postgres instance. Use RDS Multi-AZ for availability.
- 100 GB to 10 TB: Postgres with partitioning by date/user_id, or DynamoDB for write-heavy workloads.
- Over 10 TB: must shard — either PostgreSQL with horizontal sharding (Citus) or NoSQL (Cassandra, DynamoDB).
- Over 1 PB: object storage (S3/GCS) is mandatory. Use tiered storage (hot → warm → cold → archive).
Bandwidth estimation is often skipped in interviews, but it's critical for two reasons: it determines whether you need a CDN (which changes your architecture significantly) and it drives cloud infrastructure costs (AWS charges $0.09/GB for data egress — this adds up fast).
The formula: Bandwidth = QPS × avg_response_size. For ingress (upload): Ingress = write_QPS × avg_object_size.
YouTube Bandwidth Estimation
1. Context: YouTube has 2.7B logged-in users. Assume 500M DAU. ~1B videos are watched per day.
2. Watch QPS: 1B views/day ÷ 86,400 = ~11,574 stream starts/sec average. Peak (2-3x): ~30,000 starts/sec. Note that concurrent streams are far higher, because each view lasts minutes, not a second.
3. Video bitrate: A 1080p YouTube stream ≈ 5 Mbps (megabits per second). A 720p stream ≈ 2.5 Mbps. Assume 3 Mbps average.
4. Egress bandwidth: even the floor of 30,000 simultaneous streams × 3 Mbps = 90,000 Mbps = 90 Gbps. At commodity CDN rates ($0.09/GB), 90 Gbps = 11.25 GB/sec ≈ $3,600/hour — and true concurrency runs into the tens of millions of viewers, pushing egress into the tens of Tbps (see the deep dive later in this lesson). This is why YouTube serves video from its own edge network (Google Global Cache) and peers directly with ISPs instead of paying per byte.
5. Upload bandwidth: 500 hours of video are uploaded per minute (YouTube's real stat). At ~1 GB per hour of video, that's 500 GB/minute ≈ 8.3 GB/sec of ingress — which is why uploads go through specialized upload servers with aggressive compression pipelines.
6. Storage cost implication: 500 GB/minute ≈ 720 TB of new raw video per day. At S3's $0.023/GB-month, each day's uploads would add roughly $16,500/month of recurring storage cost, compounding every day. In reality, YouTube transcodes every upload to multiple resolutions (360p, 720p, 1080p, 4K) and keeps raw originals off hot storage.
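A quick sanity check on the upload math. The ~1 GB per hour of video figure is an assumption, chosen to be consistent with the raw-upload rate used in the deep dive later in this lesson:

```python
hours_uploaded_per_minute = 500   # YouTube's published stat
gb_per_hour_of_video = 1.0        # assumed average for raw 1080p

ingress_gb_per_sec = hours_uploaded_per_minute * gb_per_hour_of_video / 60
daily_upload_tb = ingress_gb_per_sec * 86_400 / 1e3   # GB/sec → TB/day

print(f"ingress: {ingress_gb_per_sec:.1f} GB/sec")      # ~8.3 GB/sec
print(f"daily raw uploads: {daily_upload_tb:.0f} TB")   # ~720 TB/day
```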
Bandwidth Threshold → CDN Decision
Rule of thumb: if your static asset or media bandwidth exceeds 1 GB/sec, you need a CDN. Without CDN, your origin servers spend all their time serving bytes instead of processing requests. With a CDN, 95%+ of asset requests are served from edge nodes close to the user, cutting origin load by 20x and reducing latency from 100ms to < 10ms.
graph LR
U[User Devices
1B requests/day] -->|No CDN| OS[Origin Servers
11,574 req/sec]
OS --> DB[(Database)]
U2[User Devices
1B requests/day] -->|95% cache hit| CDN[CDN Edge
~11,000 req/sec cached]
CDN -->|5% miss| OS2[Origin Servers
~578 req/sec]
OS2 --> DB2[(Database)]
note1[Without CDN: 11,574 req/sec to origin]
note2[With CDN: 578 req/sec to origin
20x reduction]

CDN reduces origin server load by 20x for read-heavy media workloads
| Service | Daily Bandwidth | Peak Bandwidth | CDN Strategy |
|---|---|---|---|
| Twitter (text only) | ~50 GB/day | ~10 MB/sec | CDN for media only; API is lightweight |
| Instagram | ~500 TB/day | ~10 GB/sec | Multi-CDN (Akamai + FB PoPs), image CDN |
| YouTube | ~720 TB/day uploads, ~500 PB/day views | tens of Tbps streaming | Own CDN (Google Global Cache), ISP peering |
| Netflix | ~400 PB/day streaming | tens of Tbps | Open Connect CDN, placed inside ISPs |
| Facebook | ~100 PB/day media | multi-Tbps | Multiple CDNs + direct ISP peering |
Cache sizing is the estimation that most engineers get wrong, because they either over-provision (wasting money) or under-provision (causing cache thrashing that kills database performance). The right approach is to estimate the working set — the "hot" data that gets accessed repeatedly.
The 80-20 rule applies to caching: 20% of data accounts for 80% of reads. This is the data that should be cached. If you size your cache to hold 20% of your total data, you'll absorb 80% of your read load.
Redis Cache Sizing for a Twitter-like Feed
1. Total data: 300M users × 100 recent tweets in feed × 350 bytes/tweet = 10.5 TB of total feed data.
2. Working set (hot 20%): 10.5 TB × 20% = 2.1 TB. But not all hot users are active simultaneously.
3. Concurrently active users: At peak, Twitter sees ~50M concurrent sessions. 50M ÷ 300M = 16.7% of users active.
4. Cache target: Size the cache to hold feed data for concurrently active users. 50M × 100 tweets × 350 bytes = 1.75 TB. Round to 2 TB for headroom.
5. Redis nodes needed: A single Redis node maxes out at ~25-50 GB of memory before management overhead degrades performance. 2 TB ÷ 25 GB = 80 Redis nodes. Use Redis Cluster with consistent hashing. Add 50% overhead for replication: 120 nodes total.
6. Cost implication: 120 r6g.xlarge-class Redis nodes on AWS ≈ 120 × $0.226/hour = $27.12/hour ≈ $238,000/year just for the feed cache. This is why Twitter invested in its own cache systems such as Nighthawk.
Cache Sizing Rules of Thumb
For any cache sizing question: (1) Calculate total data size for the domain. (2) Take 20% for working set. (3) Account for serialization overhead (Redis stores objects with key + value + metadata — typically 2-3x raw data size). (4) Single Redis node: 25-50 GB usable. (5) Add 50% overhead for replicas. The formula: cache_nodes = (working_set_GB × 2.5 overhead × 1.5 replication) ÷ 30 GB per node.
| Cache Type | Typical Capacity | Latency | Use Case |
|---|---|---|---|
| L1 (in-process, Guava/Caffeine) | 1-10 GB per JVM | < 1 µs | Frequently accessed config, auth tokens, computed values |
| L2 (Redis single node) | 25-50 GB | 0.5-2 ms | Session data, user profiles, small working sets |
| L3 (Redis Cluster) | 100 GB - 100 TB | 1-5 ms | Feed data, product catalog, large working sets |
| CDN Edge Cache | Unlimited (edge-distributed) | 1-50 ms | Static assets, media files, API responses with long TTL |
Let's put it all together with a real estimation exercise. We'll estimate Twitter at its actual scale (pre-Elon acquisition, ~2022), then derive the architecture directly from the numbers.
Twitter's Real Numbers (2022)
Twitter had ~238M mDAU (monetizable daily active users). ~500M tweets per day. ~350K QPS at peak (verified by Twitter Engineering blog). ~1PB of new data per day including media. Their infrastructure ran on ~100,000 servers.
Full Twitter Estimation: Matching Reality
1. Write QPS (tweets): 500M tweets/day ÷ 86,400 = 5,787 tweets/sec average. Peak (3x): ~17,361 tweets/sec. Validation: Twitter Engineering has cited ~350K total QPS including reads — this tweet-write rate is plausible.
2. Read QPS (timeline reads): Twitter has a ~100:1 read:write ratio for feeds. 5,787 × 100 = ~578,700 read QPS average. Peak: ~1.7M read QPS. Add search queries, notifications, and DMs — totals peak around 350K-500K QPS in practice (lower than the pure feed math because not everyone loads their feed constantly).
3. Tweet storage: 500M tweets/day × 500 bytes = 250 GB/day of tweet text. 5-year tweet storage: 250 GB × 365 × 5 = 456 TB ≈ 0.5 PB. This drove Twitter to Cassandra (NoSQL) — too large for a single Postgres.
4. Media storage: ~30% of tweets have media. 150M media tweets/day × 500KB average compressed = 75 TB/day of new media. 5-year: 75 TB × 365 × 5 ≈ 137 PB. Twitter uses its own Blobstore plus a CDN for this.
5. Fan-out cost: Average follower count: ~200. Celebrity accounts: up to 130M followers. When Taylor Swift tweets: 130M × 1 Redis write = 130M operations. At 1M Redis ops/sec, that's 130 seconds to fan out — unacceptable. This forces a hybrid fan-out: push (precompute timelines) for typical users, pull at read time for celebrity accounts.
6. Cache size: 238M DAU. Feed cache: top 100 tweets per user × 500 bytes × 238M users = 11.9 TB working set. At peak concurrency (50M users at once): 50M × 100 × 500B = 2.5 TB needed in Redis. Twitter runs ~2,000+ Redis nodes across its various cache layers.
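The fan-out bottleneck in step 5 is easy to quantify, which is exactly the argument for the hybrid push/pull split. The 100K-follower cutoff below is an illustrative assumption, not Twitter's published threshold:

```python
def fanout_seconds(followers: int, redis_ops_per_sec: int = 1_000_000) -> float:
    """Time to push one tweet into every follower's cached timeline."""
    return followers / redis_ops_per_sec

def fanout_strategy(followers: int, push_limit: int = 100_000) -> str:
    """Push (precompute timelines) for normal users, pull for celebrities."""
    return "push" if followers < push_limit else "pull"

print(fanout_seconds(200))           # 0.0002 s for an average user
print(fanout_seconds(130_000_000))   # 130 s for a celebrity — too slow
print(fanout_strategy(130_000_000))  # pull
```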
graph TB
Numbers["Twitter Numbers
238M DAU, 500M tweets/day"] --> QPS["Write QPS: 5,787/sec
Read QPS: 578K/sec avg"]
Numbers --> Storage["Tweet Storage: 250GB/day
Media: 75TB/day
5-year: 137PB media"]
Numbers --> Cache["Feed Cache: 2.5TB
Requires Redis Cluster
~2000+ nodes at Twitter"]
QPS --> Arch1["→ Cassandra for tweet storage
(too much for Postgres)"]
QPS --> Arch2["→ Redis mandatory
(100:1 read:write)"]
Storage --> Arch3["→ Object storage + CDN
(137PB needs S3-scale)"]
Cache --> Arch4["→ Hybrid fan-out
(celebrity accounts overflow Redis)"]

Twitter estimation → architecture decisions: every box in the architecture is justified by a number
YouTube is the canonical example for bandwidth and storage estimation because the numbers are staggeringly large and well-documented. This case study shows how media-heavy services are fundamentally different from text-heavy services.
YouTube's actual stats (2023): 500 hours of video uploaded per minute. 2.7B logged-in users. 1B+ hours of video watched per day.
YouTube Storage and Bandwidth Deep Dive
1. Upload rate: 500 hours/min × 60 min × 24 hours = 720,000 hours of video per day. At 1 GB per hour of raw 1080p video: 720,000 GB = 720 TB of raw uploads per day.
2. Transcoded storage: YouTube stores videos at 7 quality levels (144p, 240p, 360p, 480p, 720p, 1080p, 4K). Combined, the transcoded versions ≈ 3x the size of the original. 720 TB × 3 = 2.16 PB added per day.
3. 5-year total video storage: 2.16 PB/day × 365 × 5 = 3,942 PB ≈ 4 exabytes. Only Google-scale infrastructure can handle this. YouTube uses Google's Colossus distributed file system.
4. Viewing bandwidth: 1B hours of video watched per day = 3.6 × 10^12 viewer-seconds/day. At 2 Mbps average bitrate (a mix of 720p, 1080p, mobile): 7.2 × 10^12 megabits/day. Spread over 86,400 seconds: ~8.3 × 10^7 Mbps ≈ 83 Tbps average egress. Peak (~1.5x, since a global audience smooths the daily curve): ~125 Tbps.
5. CDN implication: 125 Tbps ≈ 15,600 GB/sec. At $0.09/GB that's ~$1,400/second ≈ $5M/hour at standard CDN pricing — economically impossible to buy commercially. This is why YouTube/Google runs its own CDN (Google Global Cache), peering directly with major ISPs at no per-byte transfer cost.
6. Video processing load: When a video is uploaded, YouTube runs it through a transcoding pipeline — one job per resolution × bitrate variant. 720,000 hours uploaded/day ÷ 86,400 sec = 8.33 hours of video arriving per second. At 7 renditions each, ~58 rendition-hours of transcoding work arrive every second. This requires a dedicated async job queue (Google runs custom transcoding workers on Borg, its cluster management system).
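The viewing-bandwidth step, checked in code with the stated inputs (1B hours watched per day at 2 Mbps average):

```python
hours_watched_per_day = 1_000_000_000
avg_bitrate_mbps = 2

viewer_seconds_per_day = hours_watched_per_day * 3_600
megabits_per_day = viewer_seconds_per_day * avg_bitrate_mbps
avg_egress_tbps = megabits_per_day / 86_400 / 1e6   # Mbps → Tbps

print(f"average egress: {avg_egress_tbps:.0f} Tbps")   # ~83 Tbps
```

Equivalently: 1B hours spread over 24 hours means ~42M concurrent viewers on average, and 42M × 2 Mbps lands at the same ~83 Tbps.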
The Estimation → Architecture Reveal for YouTube
~125 Tbps peak egress → Google runs its own CDN (Google Global Cache boxes inside ISP networks). 4 EB of video storage → Google Colossus distributed file system (custom, not S3). ~58 rendition-hours/sec of transcode work → dedicated async job queue, not in-band processing. 1B hours watched/day → the recommendation system must serve personalized suggestions in < 100ms. These aren't arbitrary architecture choices — every one is forced by the numbers.
After seeing hundreds of estimation exercises in system design interviews, there are consistent mistakes that engineers make. These mistakes either make the interviewer lose confidence in the estimates, or worse, lead to designing the wrong architecture.
The 8 Most Common Estimation Mistakes
The Estimation Sanity Check
After any estimation, run a quick sanity check against known real-world systems. Is your QPS estimate higher than Google Search (~8.5 billion queries/day ≈ 100K QPS average)? Then something is probably wrong. Is your storage estimate larger than Amazon S3 (hundreds of exabytes)? Recheck. Useful reference points: Google Search: ~100K QPS average. Visa: ~24,000 TPS peak. Netflix: peak streaming in the tens of Tbps via Open Connect. AWS S3: 100+ trillion objects.
graph TD
Start[Start Estimation] --> DAU[Get DAU from requirements]
DAU --> ReqPerUser[Estimate requests per user per day
by use case]
ReqPerUser --> AvgQPS[Average QPS = DAU × req ÷ 86400]
AvgQPS --> PeakQPS[Peak QPS = Average × 2-5x]
PeakQPS --> Decision{What does Peak QPS tell us?}
Decision -->|< 1K| Mono[Monolith + single DB]
Decision -->|1K - 10K| SingleDB[Single DB + cache]
Decision -->|10K - 100K| Replicas[Read replicas + Redis]
Decision -->|100K - 1M| Sharding[DB sharding + Redis cluster]
Decision -->|> 1M| NoSQL[NoSQL + CDN + custom infra]
Mono --> Storage[Calculate storage]
SingleDB --> Storage
Replicas --> Storage
Sharding --> Storage
NoSQL --> Storage
Storage --> Bandwidth[Calculate bandwidth]
Bandwidth --> CDN{Bandwidth > 1 GB/s?}
CDN -->|Yes| AddCDN[Add CDN to architecture]
CDN -->|No| NoCDN[CDN optional]

Complete estimation decision flowchart: from DAU to architecture decisions
Every FAANG system design interview includes an estimation section, usually after requirements. The interviewer is explicitly watching for: (1) Does the candidate write numbers on the board? (2) Do they show the formula? (3) Do they derive architecture from numbers rather than recite patterns? Some interviewers ask standalone estimation questions: "Without designing a system, just estimate: how many storage servers does Instagram need?"
Common questions:
Key takeaways
An interviewer says: "Design a Dropbox-like file storage system for 100M users, where each user stores an average of 200 files at 5MB each." Estimate the storage requirements for 5 years.
Current storage: 100M users × 200 files × 5MB = 100 PB of raw data (each user stores ~1 GB). With 3x replication for durability: 300 PB. With 20% metadata overhead: 360 PB today. Growth: assume ~20% year-over-year growth in total data. Year 1: 360 PB × 1.2 = 432 PB. Year 2: 518 PB. Year 3: 622 PB. Year 4: 746 PB. Year 5: 896 PB ≈ ~0.9 EB. Architectural implication: hundreds of petabytes mandate object storage (S3 or equivalent) with tiered storage (frequently accessed files on hot storage, rarely accessed on Glacier/cold tiers). A relational DB cannot hold the file bytes — only the file metadata (owner, path, version history). This also motivates chunking: Dropbox splits files into 4MB chunks and syncs only changed chunks, cutting upload bandwidth substantially.
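Recomputing the answer's numbers from the stated inputs (100M users × 200 files × 5MB, with an assumed ~20% yearly growth rate):

```python
users = 100_000_000
files_per_user = 200
mb_per_file = 5

raw_pb = users * files_per_user * mb_per_file / 1e9   # MB → PB
replicated_pb = raw_pb * 3                            # 3x durability copies
with_metadata_pb = replicated_pb * 1.2                # +20% metadata overhead

year5_pb = with_metadata_pb * 1.2 ** 5                # 20% YoY compounding

print(f"raw data today: {raw_pb:.0f} PB")              # 100 PB
print(f"provisioned today: {with_metadata_pb:.0f} PB") # 360 PB
print(f"year 5: {year5_pb:.0f} PB")                    # ~896 PB ≈ 0.9 EB
```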
You're estimating a real-time chat system like WhatsApp. 2B users, 60B messages/day. How do you estimate storage and what database would you use?
Storage per message: sender_id (8B) + recipient_id (8B) + message body (avg 100 bytes compressed) + timestamps (16B) + metadata (30B) ≈ 162 bytes. Round to 200 bytes. Daily storage: 60B messages × 200B = 12 TB/day. 5-year storage: 12 TB × 365 × 5 = 21.9 PB ≈ 22 PB. With 3x replication: 66 PB. This immediately rules out a single relational database — no SQL cluster handles 66 PB at reasonable cost. WhatsApp actually runs on Erlang with an in-house distributed datastore. For an interview answer, Cassandra is ideal: messages are written once and read a few times, and the access pattern is always "get messages for conversation X after timestamp Y" — a perfect fit for a Cassandra partition key (conversation_id) plus clustering column (timestamp). Note: WhatsApp chose NOT to store messages long-term on the server — messages live on devices, and servers act only as a relay. This cuts server storage dramatically but makes clients the source of truth.
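The same storage math as a script, using the inputs stated above (60B messages/day at ~200 bytes each, 3x replication):

```python
messages_per_day = 60_000_000_000
bytes_per_message = 200

daily_tb = messages_per_day * bytes_per_message / 1e12   # bytes → TB
five_year_pb = daily_tb * 365 * 5 / 1e3                  # TB → PB
replicated_pb = five_year_pb * 3

print(f"{daily_tb:.0f} TB/day")                        # 12 TB/day
print(f"{five_year_pb:.1f} PB over 5 years")           # ~21.9 PB
print(f"{replicated_pb:.0f} PB with 3x replication")   # ~66 PB
```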
Your interview question is "Design YouTube." Your estimation shows on the order of 100 Tbps of peak egress bandwidth. The interviewer asks: "What does that number tell you about the architecture?" How do you answer?
Egress at that scale has massive cost and architectural implications. First, cost: 100 Tbps ≈ 12,500 GB/sec. At standard CDN rates of $0.09/GB, that's ~$1,125/second ≈ $4M/hour ≈ $35B/year at list prices — economically impossible to route through a commercial CDN. Second, architectural implication: YouTube must own its CDN infrastructure. This is exactly what they built — Google Global Cache (GGC), a network of edge servers placed physically inside ISP data centers, serving video directly to users without per-byte transfer costs. YouTube negotiates peering agreements with ISPs rather than paying per byte. Third, content placement: at this scale you must pre-place popular videos at edge nodes close to users. 20% of videos account for 80% of views (a Zipf-like distribution), so caching the popular 20% at the edge absorbs ~80% of the bandwidth. This drives a cache-warming pipeline: predicting which videos will trend and pre-seeding them to edge nodes before the traffic spike hits.
💡 Analogy
Back-of-the-envelope estimation is exactly like weather forecasting. A meteorologist doesn't need to know the exact temperature at every point in the atmosphere to accurately predict rain. They use known patterns, simplified models, and error bounds to get a forecast that's "right enough" to be useful. Similarly, you don't need exact QPS — you need to be within 2x to make the right architectural decision. ±2x is fine. ±10x (confusing millions for thousands) means you design the wrong system entirely.
⚡ Core Idea
Every estimation follows the same formula: DAU × action_rate → QPS → architectural boundary. QPS < 10K: single server. QPS < 100K: read replicas + cache. QPS < 1M: sharding + Redis cluster. QPS > 1M: NoSQL + CDN + distributed everything. The numbers aren't just academic — they are the architecture. If you skip estimation, you're guessing at which architecture to use.
🎯 Why It Matters
The difference between a 10K QPS system and a 1M QPS system isn't just 100x more servers. It's an entirely different architecture: different database type, different caching strategy, different deployment model, 100x more operational complexity. Wrong-order-of-magnitude estimates cause engineers to over-engineer (wasting millions in infrastructure) or under-engineer (causing production outages). Estimation is the skill that prevents both.