© 2026 TheSimplifiedTech. All rights reserved.

Interactive Explainer

System Design Thinking

How senior engineers approach architecture from scratch: requirements → scale estimation → components → bottlenecks → trade-offs.

🎯Key Takeaways
System design thinking is a process: Requirements → Estimation → Components → Evaluation (RECE).
Start with requirements: clarify functional, non-functional, scale, and out-of-scope before drawing any boxes.
Back-of-envelope math: estimate reads/second, writes/second, storage, and bandwidth to choose the right component class.
Standard components and when to use each: cache (high read:write ratio), message queue (async decoupling), sharding (write throughput exhausted), read replicas (read-heavy, eventual consistency acceptable).
Identify bottlenecks explicitly: "Where does this design break first if traffic doubles?"

~6 min read

The difference between a junior and senior engineer in a design review

A junior engineer asked to "Design a URL shortener" immediately starts drawing boxes: "We need a database, an API server, and a cache."

A senior engineer asks questions first: "How many URLs per day? Read-heavy or write-heavy? Do short links expire? Analytics required? Global or regional?" Only after understanding the constraints do they propose components.

System design thinking is not a list of components. It is a process of reasoning under uncertainty. The output is not the "right answer" — it is a defensible design whose trade-offs you can explain.

The RECE framework

Requirements → Estimation → Components → Evaluation. A structured approach to any system design problem: clarify requirements, estimate scale, design components to meet that scale, then evaluate bottlenecks and trade-offs.

Step 1: Requirements clarification (spend 5–10 minutes here)

Most design failures come from building the wrong thing, not from building the thing wrong. Requirements clarification is not a formality — it is where you discover constraints that change the entire design.

The four categories of requirements to clarify

  • Functional requirements — What does the system do? What are the core user actions? For a Twitter clone: post tweets, follow users, view timeline. Be specific: "view timeline" means what? Chronological? Ranked? Limited to who they follow or global?
  • Non-functional requirements — Availability, latency, consistency, durability. "How important is it that users always see the latest data?" (consistency vs availability trade-off). "What is the acceptable latency for a tweet appearing in a follower's feed?" These constraints drive architecture choices.
  • Scale requirements — 100 users or 100 million? 10 requests/second or 100,000? How many writes vs reads (write:read ratio)? How much data (storage estimates)? These numbers determine whether you need caching, sharding, global distribution, or a CDN.
  • Constraints and out-of-scope — What are you NOT building? Authentication? Payment? Content moderation? Clarifying scope prevents you from designing a system that is too complex to discuss in the time available.
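To make the four categories concrete, they can be captured as a simple checklist before any boxes are drawn. The sketch below fills one in for a hypothetical URL shortener; every value is an illustrative assumption, not a prescription from the lesson.

```python
# A minimal requirements checklist, filled in for a hypothetical URL shortener.
# All values here are illustrative assumptions.
requirements = {
    "functional": ["shorten URL", "redirect short link", "optional expiry"],
    "non_functional": {"availability": "99.9%", "redirect_latency_ms": 100},
    "scale": {"new_urls_per_day": 10_000_000, "read_write_ratio": "100:1"},
    "out_of_scope": ["authentication", "analytics dashboard"],
}

# Sanity check: no category may be left empty before moving on to estimation.
assert all(requirements[k] for k in ("functional", "non_functional", "scale", "out_of_scope"))
```

The point is not the data structure but the discipline: if any category is empty, you are not ready to design.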

Step 2: Scale estimation (back-of-envelope math)

Scale estimation tells you what class of problem you are solving. It determines whether you need a single database or a globally distributed cluster.

The numbers every engineer should know

Latency (approximate):

  • L1 cache: 0.5ns | L2 cache: 7ns | RAM: 100ns
  • SSD random read: 100µs | HDD seek: 10ms
  • Network: same DC = 0.5ms | cross-region = 30–150ms

Bandwidth:

  • SSD throughput: 500MB/s | HDD: 100MB/s
  • 1Gbps network: 125MB/s | 10Gbps: 1.25GB/s

Storage and time:

  • 1M users × 1KB profile = 1GB
  • 1B images × 300KB average = 300TB
  • 86,400 seconds/day | ~31M seconds/year
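These reference numbers turn into useful intuition with trivial arithmetic. The sketch below (units and constants restated from the table above) computes how long moving 1 GB takes at each sustained throughput:

```python
# Approximate sustained throughput figures, in MB/s, from the reference table.
SSD_MB_S = 500
HDD_MB_S = 100
GBPS_1_MB_S = 125      # 1 Gbps network
GBPS_10_MB_S = 1250    # 10 Gbps network

def seconds_to_move(size_mb: float, throughput_mb_s: float) -> float:
    """Time to transfer size_mb at the given sustained throughput."""
    return size_mb / throughput_mb_s

one_gb = 1000  # using 1 GB = 1000 MB for back-of-envelope math
print(seconds_to_move(one_gb, SSD_MB_S))      # 2.0 s from SSD
print(seconds_to_move(one_gb, GBPS_1_MB_S))   # 8.0 s over 1 Gbps
print(seconds_to_move(one_gb, GBPS_10_MB_S))  # 0.8 s over 10 Gbps
```

The takeaway: on a 1 Gbps link, the network, not the SSD, is the slow path for bulk transfer.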

Back-of-envelope estimation for a social media app (100M DAU)

1. Reads: 100M DAU × 20 timeline refreshes/day = 2B read requests/day = ~23,000 reads/second
2. Writes: 100M DAU × 2 posts/day = 200M writes/day = ~2,300 writes/second
3. Write:read ratio = 1:10 → read-heavy, cache is critical
4. Storage: 200M posts/day × 280 chars × 2 bytes = ~112GB/day → 40TB/year of text alone
5. Bandwidth: 23,000 reads/second × 10KB average timeline payload = 230MB/s read bandwidth needed
6. Conclusion: multiple database replicas needed, read cache mandatory, CDN for media, write-through or write-behind cache for hot timelines
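The estimation steps above can be reproduced in a few lines, which is a good habit in practice: encode your assumptions as named constants so they can be challenged one at a time.

```python
# Back-of-envelope estimation for the 100M-DAU social app, matching the
# worked numbers above. Inputs are the stated assumptions.
DAU = 100_000_000
REFRESHES_PER_DAY = 20
POSTS_PER_USER_PER_DAY = 2
SECONDS_PER_DAY = 86_400
BYTES_PER_POST = 280 * 2          # 280 chars at 2 bytes/char
TIMELINE_PAYLOAD_KB = 10

reads_per_sec = DAU * REFRESHES_PER_DAY / SECONDS_PER_DAY          # ~23,000
writes_per_sec = DAU * POSTS_PER_USER_PER_DAY / SECONDS_PER_DAY    # ~2,300
write_read_ratio = writes_per_sec / reads_per_sec                  # 0.1 → 1:10
storage_gb_per_day = DAU * POSTS_PER_USER_PER_DAY * BYTES_PER_POST / 1e9  # ~112
read_bandwidth_mb_s = reads_per_sec * TIMELINE_PAYLOAD_KB / 1000   # ~230

print(f"{reads_per_sec:,.0f} reads/s, {writes_per_sec:,.0f} writes/s, "
      f"{storage_gb_per_day:.0f} GB/day, {read_bandwidth_mb_s:.0f} MB/s")
```

Orders of magnitude are what matter here: 23,148 vs 23,000 reads/second leads to the same architecture.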

Step 3: Component design

After requirements and estimation, design the components to meet those constraints. Start simple, then layer on complexity where the numbers demand it.

Standard architectural components and when you need each

  • Load balancer — When: more than one app server. Function: distribute traffic, health checks, TLS termination. AWS: ALB (L7, HTTP routing), NLB (L4, TCP, low latency).
  • Cache (Redis/Memcached) — When: read:write ratio > 5:1, latency < 10ms required for hot data, repeated queries for the same data. Cache user sessions, timeline feeds, popular product pages. Never cache financial balances without strict invalidation.
  • Message queue (SQS, Kafka) — When: decoupling producers from consumers, async processing, absorbing traffic spikes. Email sends, image processing, order fulfillment — anything that can be async should be.
  • CDN (CloudFront) — When: static assets (JS, CSS, images), edge caching for API responses, DDoS protection. Reduces origin server load dramatically — 90% of Netflix traffic served from CDN edge.
  • Database sharding — When: single database cannot handle write load even with optimized queries and connection pooling. Typically needed at >100k writes/second. Shard by user ID, geographic region, or consistent hash.
  • Read replicas — When: read:write ratio > 10:1 and cache cannot fully absorb reads. Route read queries to replicas, writes to primary. Introduces replication lag — acceptable for eventually consistent data (comments), not for financial data.
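The cache entry above describes what is commonly called the cache-aside pattern with "TTL + explicit invalidation". A minimal in-memory sketch of that pattern, where the hypothetical `fetch_from_db` stands in for a real database call:

```python
import time

cache: dict[str, tuple[float, str]] = {}  # key -> (expiry_timestamp, value)
TTL_SECONDS = 60

def fetch_from_db(key: str) -> str:
    """Placeholder for a real database read (hypothetical)."""
    return f"row-for-{key}"

def get(key: str) -> str:
    """Cache-aside read: serve from cache if fresh, else load and populate."""
    now = time.monotonic()
    entry = cache.get(key)
    if entry and entry[0] > now:
        return entry[1]                      # cache hit
    value = fetch_from_db(key)               # cache miss: go to the database
    cache[key] = (now + TTL_SECONDS, value)  # populate with a TTL
    return value

def invalidate(key: str) -> None:
    """Explicit invalidation on write, so readers never see stale data
    longer than one round trip after an update."""
    cache.pop(key, None)
```

In production this dict would be Redis or Memcached, but the read path and invalidation logic are the same.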
Quick check

A system has 5,000 reads/second and 500 writes/second. Which component should you prioritize adding first?

Step 4: Identify bottlenecks and trade-offs

Every design has bottlenecks and trade-offs. The ability to identify and articulate them is what separates a senior engineer from a junior engineer in a design review.

For each component: the common bottleneck, the standard solution, and the trade-off that solution creates.

  • Database — Bottleneck: write throughput exceeds single-node capacity. Solution: sharding by user ID. Trade-off: cross-shard queries become expensive; joins across shards are impossible.
  • Cache — Bottleneck: cache invalidation bugs (stale data). Solution: TTL + explicit invalidation on writes. Trade-off: adds complexity; eventual consistency window (brief stale reads).
  • Fan-out (Twitter model) — Bottleneck: a celebrity with 50M followers means 50M timeline writes per tweet. Solution: pull model for celebrities (compute timeline on read). Trade-off: higher read latency for users who follow celebrities.
  • Consistency vs availability — Bottleneck: a network partition splits primary and replica. Solution: choose to reject writes (CP) or allow divergence (AP). Trade-off: CAP theorem — during a partition you must sacrifice either consistency or availability; partition tolerance is not optional in a distributed system.
  • Session management — Bottleneck: sticky sessions prevent autoscaling. Solution: externalize sessions to Redis. Trade-off: adds Redis as a dependency; a single point of failure unless Redis is highly available.
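Sharding by user ID, the solution in the first row, can be as simple as a stable hash modulo the shard count. This is a naive sketch: real systems typically use consistent hashing so the shard count can change without reshuffling most keys.

```python
import hashlib

NUM_SHARDS = 16

def shard_for(user_id: str) -> int:
    """Stable shard routing: the same user always lands on the same shard.
    A cryptographic hash keeps the distribution independent of ID format."""
    digest = hashlib.sha256(user_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

def shards_touched(user_ids: list[str]) -> set[int]:
    """Makes the cross-shard cost explicit: a query joining these users
    must fan out to every shard in the returned set."""
    return {shard_for(u) for u in user_ids}
```

The second function is the trade-off from the table made visible: any query spanning users on different shards becomes a multi-shard fan-out instead of a single lookup.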

The bottleneck-first mindset

After designing a system, ask: "Where is the first thing that breaks if traffic doubles?" Then double it again: "What breaks next?" This iterative bottleneck identification is how real systems are designed — not by predicting all future problems, but by solving the current limiting factor.
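The "what breaks first" question can even be mechanized: given each component's current load and capacity, keep doubling traffic and report which component saturates first. The capacities below are invented for illustration.

```python
# Each component maps to (current requests/sec, max requests/sec).
# Numbers are illustrative, loosely based on the estimation example above.
components = {
    "load_balancer": (23_000, 500_000),
    "cache":         (20_000, 200_000),
    "database":      (3_000,  10_000),
}

def first_bottleneck(components: dict[str, tuple[int, int]]) -> tuple[str, int]:
    """Double traffic until some component exceeds its capacity; return that
    component and the traffic multiplier at which it breaks."""
    multiplier = 1
    while True:
        multiplier *= 2
        for name, (load, capacity) in components.items():
            if load * multiplier > capacity:
                return name, multiplier

print(first_bottleneck(components))  # the database saturates first, at 4x traffic
```

Once the database is fixed (say, by sharding), rerun the question with the new capacities; the next limiting component becomes the next design conversation.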

How this might come up in interviews

System design interviews appear at every level above junior. They are used at FAANG and other major tech companies to assess architectural reasoning, and the process matters as much as the solution.

Common questions:

  • Walk me through how you would approach designing a URL shortener.
  • What questions do you ask before starting a system design?
  • How do you estimate scale for a system design problem?
  • What is the difference between SQL and NoSQL, and when would you choose each?
  • How would you design a feed (Twitter, Instagram) for 100M users?

Before you move on: can you answer these?

What is the first question to ask when approaching any system design problem?

Clarify requirements — both functional (what it does) and non-functional (availability, latency, consistency). The constraints define the design; designing without them is guessing.

A system has a 1:100 write:read ratio. What is the first architectural component to consider?

A caching layer (Redis/Memcached) to absorb the overwhelming read traffic. With 1:100 write:read, most reads can be served from cache, dramatically reducing database load without adding complexity.
