The architecture that works at 1k users will break at 100k. Plan ahead, but not too far.
Instagram had 13 employees and 30 million users at acquisition by Facebook in 2012. They ran PostgreSQL, Redis, and Gearman on a handful of machines. They scaled by being pragmatic, not by over-engineering early.
The Scaling Rule of Thumb
Build for 10× your current scale. If you have 10k users, build for 100k. Don't build for 1 billion users on day one — the architecture for 1 billion is completely different from 100k, and over-engineering kills startups.
| Phase | Users | Architecture | Key Additions |
|---|---|---|---|
| Phase 1 | 0–10k | Monolith + single DB | Focus on product, not infrastructure |
| Phase 2 | 10k–100k | Monolith + read replicas + CDN + Redis cache | Add Redis, deploy to multiple AZs |
| Phase 3 | 100k–1M | Modular monolith or 3–5 services + message queue | Queue async work, separate auth/search |
| Phase 4 | 1M–10M | 5–20 services + multi-region + sharding | Geographic distribution, data sharding |
| Phase 5 | 10M–1B+ | 50–500+ services + custom infra | Custom storage, global load balancing, edge compute |
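Phase 2's key addition is a Redis cache in front of the database. The standard pattern is cache-aside: check the cache, fall back to the database on a miss, then populate the cache with a TTL. A minimal sketch — the `cache` and `db` maps here are in-memory stand-ins for Redis and PostgreSQL so the example is self-contained; names and the TTL are illustrative:

```typescript
type User = { id: string; name: string };

// Stand-ins for PostgreSQL (source of truth) and Redis (cache)
const db = new Map<string, User>([["u1", { id: "u1", name: "Ada" }]]);
const cache = new Map<string, { value: User; expiresAt: number }>();
const TTL_MS = 60_000;

let dbReads = 0; // counts cache misses that reach the database

async function getUser(id: string): Promise<User | null> {
  const hit = cache.get(id);
  if (hit && hit.expiresAt > Date.now()) return hit.value; // cache hit

  dbReads++; // cache miss: read the source of truth
  const user = db.get(id) ?? null;
  if (user) cache.set(id, { value: user, expiresAt: Date.now() + TTL_MS });
  return user;
}
```

The TTL matters: without expiry, stale entries live forever after an update; with a short TTL, a hot key simply refreshes once a minute instead of hammering the database on every request.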
The Four Ways to Scale
| Type | How | When to Use | Limitation |
|---|---|---|---|
| Vertical (scale up) | Bigger machine: more RAM/CPU | Stateful services (databases), before adding complexity | Hardware limits; single point of failure; expensive |
| Horizontal (scale out) | More instances behind load balancer | Stateless services (API servers, workers) | Requires stateless design; more ops complexity |
| Read replicas | Route reads to replicas | Read-heavy workloads (>70% reads) | Replication lag; eventual consistency |
| Sharding | Split data across multiple DB instances | When single DB can't handle write throughput | No cross-shard JOINs; complex ops — last resort |
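The read-replica row above hinges on routing: writes must hit the primary, while reads can fan out across replicas. A minimal sketch of that routing logic — the pool objects and `query` shape are illustrative stand-ins, not a specific driver's API:

```typescript
type Pool = { name: string; query: (sql: string) => string };

const makePool = (name: string): Pool => ({
  name,
  query: (sql) => `${name} ran: ${sql}`,
});

const primary = makePool("primary");
const replicas = [makePool("replica-1"), makePool("replica-2")];
let next = 0;

function routeQuery(sql: string): string {
  // Naive rule: anything that isn't a SELECT mutates data, so it must go
  // to the primary. Reads round-robin across replicas.
  const isRead = /^\s*select/i.test(sql);
  if (!isRead) return primary.query(sql);
  const pool = replicas[next++ % replicas.length];
  return pool.query(sql);
}
```

Real routers also pin a user's reads to the primary for a short window after their own writes, so replication lag never makes a user's update "disappear" from their next page load.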
The Stateless Service Rule
Horizontal scaling only works if services are stateless. No in-memory sessions, no local file storage. All state goes to Redis (sessions), PostgreSQL (data), S3 (files). If your service can be killed and restarted without losing user data, it scales horizontally.
```typescript
// Making services stateless for horizontal scaling

// ❌ Stateful — breaks with multiple instances
class StatefulService {
  private processingOrders = new Map<string, Order>(); // in-memory

  async getStatus(orderId: string) {
    // In-memory state breaks with multiple instances — the load balancer
    // routes each request to a different instance
    return this.processingOrders.get(orderId); // ❌ only works on THIS instance
  }
}

// ✅ Stateless — works with any number of instances
class StatelessService {
  async startProcessing(orderId: string) {
    const order = await db.orders.findById(orderId);

    // Store state in shared Redis — the source of truth, visible to ALL instances
    await redis.setex(
      `processing:${orderId}`,
      3600,
      JSON.stringify({ status: 'processing', startedAt: Date.now() })
    );
  }

  async getStatus(orderId: string) {
    const data = await redis.get(`processing:${orderId}`);
    return data ? JSON.parse(data) : null;
    // ✅ Any instance can answer — they all read the same Redis cluster
  }
}

// Kubernetes HorizontalPodAutoscaler — scales pod count automatically
// based on CPU/memory/custom metrics:
//
// apiVersion: autoscaling/v2
// kind: HorizontalPodAutoscaler
// spec:
//   scaleTargetRef:
//     name: order-service
//   minReplicas: 2
//   maxReplicas: 50
//   metrics:
//   - type: Resource
//     resource:
//       name: cpu
//       target:
//         averageUtilization: 70  # scale when avg CPU > 70%
```
The Scalability Pattern Toolkit
The Four Golden Signals (Google SRE)
Monitor these for every service: (1) Latency — how long requests take. (2) Traffic — requests per second. (3) Errors — the rate of failed requests. (4) Saturation — how "full" the service is (CPU %, queue depth, connection pool %). Alert on the Golden Signals — between them they cover everything users actually experience.
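The four signals can be collected with a small per-service recorder. A sketch under stated assumptions — the class, metric names, and the in-flight/capacity numbers are illustrative; in production these counters would be exported to a metrics system such as Prometheus rather than held in memory:

```typescript
class GoldenSignals {
  private latenciesMs: number[] = [];
  private requests = 0;
  private errors = 0;

  constructor(private inFlight = 0, private maxInFlight = 100) {}

  // Call once per completed request
  record(latencyMs: number, ok: boolean) {
    this.requests++;
    this.latenciesMs.push(latencyMs);
    if (!ok) this.errors++;
  }

  snapshot() {
    const sorted = [...this.latenciesMs].sort((a, b) => a - b);
    const p95 = sorted[Math.floor(sorted.length * 0.95)] ?? 0;
    return {
      latencyP95Ms: p95,                                     // (1) latency
      requests: this.requests,                               // (2) traffic
      errorRate: this.errors / Math.max(this.requests, 1),   // (3) errors
      saturation: this.inFlight / this.maxInFlight,          // (4) saturation
    };
  }
}
```

Note that latency is reported as a percentile, not an average: a mean hides the slow tail, and the slow tail is what users complain about.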
Scalability is the most common senior/staff interview topic. Show you understand the progression from simple to complex and don't jump to "use Kafka and microservices" for 1000 users.
From the books
Designing Data-Intensive Applications — Martin Kleppmann (2017)
Part II: Distributed Data
The most comprehensive treatment of distributed databases, replication, partitioning, and transactions. Read before making any database architecture decision.
The Architecture of Open Source Applications (Volume 2) — Amy Brown, Greg Wilson (2012)
Instagram's architecture
How Instagram scaled to 30M users with 13 engineers and simple technology choices. The lesson: simplicity and operational clarity beat cutting-edge complexity.