Compute Models: VMs vs Containers vs Serverless

A practical guide to choosing between virtual machines, containers, and serverless functions — covering cost, isolation, cold starts, density, and the real-world trade-offs that determine the right model for each workload.

🎯 Key Takeaways
VMs give full isolation and unlimited execution time at the cost of idle billing and high operational overhead
Containers deliver 30–40% better cost efficiency than VMs through bin-packing, with moderate orchestration overhead
Serverless has zero idle cost but hard execution time limits, cold start latency, and surprise billing at high throughput
VPC-attached Lambda functions have historically suffered 1–3 second cold starts; use RDS Proxy for database connections
Use all three models together: serverless for event-driven glue, containers for sustained APIs, VMs for stateful services


The Three Compute Primitives

Every workload you deploy sits on one of three compute abstractions: a virtual machine, a container, or a serverless function. Each trades off isolation, density, operational burden, and cost differently. Choosing wrong costs real money and causes real pain — either over-paying for idle capacity or hitting cold-start latency walls at 3 AM.

| Primitive | AWS | GCP | Azure | Unit of billing |
| --- | --- | --- | --- | --- |
| Virtual Machine | EC2 | Compute Engine (GCE) | Azure VMs | Per hour/second (instance running) |
| Container (managed) | ECS / EKS | GKE / Cloud Run | AKS / Container Apps | Per vCPU-second or instance-hour |
| Serverless (FaaS) | Lambda | Cloud Functions / Cloud Run | Azure Functions | Per invocation + GB-second |

The hotel analogy

VMs are like leasing a full house — you pay 24/7 whether you are home or not, but nothing is shared and you can hang pictures wherever you like. Containers are like renting an apartment — you share the building structure (kernel) but have your own space. Serverless is like booking a hotel room — you pay only for the nights you stay, the hotel manages everything, but you cannot leave heavy furniture there.

When each model wins

  • VMs: long-running, stateful, or legacy workloads — Databases, ML training jobs, Windows workloads, applications that need a specific kernel version or hardware access (GPUs, network interfaces). The per-hour cost is higher but you pay nothing extra for high sustained CPU.
  • Containers: microservices with consistent, predictable load — HTTP APIs, background workers, batch processing at scale. Better density than VMs (run 50 containers per node vs 5 VMs). You manage the container runtime and orchestrator (K8s), but not the host OS kernel.
  • Serverless: event-driven, spiky, or unpredictable traffic — Webhooks, image processing, scheduled cron jobs, API backends with millions of requests/day but zero requests at midnight. You pay per invocation. Zero idle cost. But cold starts add 100ms–3s of latency on first invocation.

The Trade-offs Nobody Tells You

Every compute model has failure modes that only appear at scale or under specific traffic patterns. Understanding these ahead of time prevents expensive migrations.

| Trade-off | VMs | Containers | Serverless |
| --- | --- | --- | --- |
| Cold start latency | Minutes (AMI boot) | Seconds (image pull + start) | 100 ms–3 s (Lambda), 1–10 s (Cloud Run) |
| Max execution time | Unlimited | Unlimited | 15 min (Lambda), 60 min (Cloud Run) |
| State persistence | Full local disk | Ephemeral (must use volumes) | None — stateless by design |
| Networking model | Full VPC control | Kubernetes networking layer | VPC optional; cold starts worse in VPC |
| Cost at low traffic | Expensive (idle billing) | Moderate (pod idle cost) | Near zero (pay per call) |
| Cost at high traffic | Predictable, flat | Linear, predictable | Can spike to 10x expected (surprise bills) |
| Operational burden | High (OS, patches, scaling) | Medium (K8s, image builds) | Low (runtime managed by cloud) |
| Observability | Full access (OS metrics) | Container-level + K8s metrics | Limited (function logs + X-Ray/Trace) |

Serverless concurrency limits are per-account, not per-function

AWS Lambda has a default concurrent execution limit of 1,000 per region per account. If one function has a traffic spike and consumes all 1,000 slots, every other Lambda in that account gets throttled. This is the "noisy neighbour" problem inside your own account. Use reserved concurrency to protect critical functions and request limit increases before you need them.
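Both halves of that advice are scriptable. As a quick sanity check (this needs valid AWS credentials, and the returned value varies per account), the account-wide ceiling can be read with `get-account-settings`:

```shell
# Read the account-wide concurrent-execution ceiling for the current
# region (the default quota is 1,000). Requires valid AWS credentials.
aws lambda get-account-settings \
  --query 'AccountLimit.ConcurrentExecutions' \
  --output text
```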

Container density math

A c5.2xlarge EC2 instance (8 vCPU, 16 GB RAM, ~$0.34/hr) can run approximately 20–30 Node.js containers each allocated 256 MB and 0.25 vCPU. The same workload as 20 individual t3.micro VMs would cost ~$0.46/hr. Containers deliver 30–40% cost savings at this scale purely through bin-packing efficiency.
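The arithmetic is easy to sanity-check. The sketch below uses the figures quoted above (≈$0.023/hr per small VM, derived from ~$0.46/hr for 20 instances; illustrative prices, not live AWS rates) and shows how the saving scales with how many containers you can pack onto one host:

```shell
# Bin-packing savings at different container densities, using the
# illustrative prices quoted above (not live AWS pricing).
HOST_HOURLY=0.34        # one c5.2xlarge (8 vCPU, 16 GB)
PER_SERVICE_VM=0.023    # one small VM per service (~$0.46 / 20)

for density in 20 25 30; do
  awk -v host="$HOST_HOURLY" -v vm="$PER_SERVICE_VM" -v n="$density" 'BEGIN {
    fleet = n * vm
    printf "%d services: VM fleet $%.2f/hr vs one host $%.2f/hr -> %.0f%% saved\n",
           n, fleet, host, (fleet - host) / fleet * 100
  }'
done
```

At 20 containers per host the saving is about 26%; pushing density toward the upper end of the 20–30 range is what moves savings into the 30–40%+ band.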

lambda-concurrency.sh

```shell
# Check Lambda concurrency usage across all functions in a region:
# list every function and query its reserved concurrency setting.
aws lambda list-functions --query 'Functions[].FunctionName' --output text | \
  tr '\t' '\n' | \
  xargs -I{} aws lambda get-function-concurrency --function-name {}

# Set reserved concurrency for a critical payment function
aws lambda put-function-concurrency \
  --function-name payment-processor \
  --reserved-concurrent-executions 200

# View cold start durations in CloudWatch Logs Insights
# (@initDuration > 0 means the invocation was a cold start).
# Run in the CloudWatch console:
#   fields @timestamp, @duration, @initDuration
#   | filter @initDuration > 0
#   | sort @timestamp desc
#   | limit 50
```

Decision Framework: Picking the Right Model

The right compute model depends on four questions: How predictable is your traffic? How long does each unit of work run? Do you need persistent local state? What is your tolerance for operational complexity?

Compute model selection decision tree

Does the workload run longer than 15 minutes continuously?
  • Yes → Use VMs or containers. Serverless has hard execution time limits.
  • No → Continue to the next question.

Is traffic highly unpredictable or bursty (zero to peak in seconds)?
  • Yes → Serverless if execution is short and stateless; otherwise containers with auto-scaling.
  • No → Continue to the next question.

Do you need to optimise for lowest cost at near-zero traffic?
  • Yes → Serverless (pay per invocation, zero idle cost).
  • No → Containers on a managed cluster offer the best density at sustained load.
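To make the branching concrete, the tree can be sketched as a small shell function. This is a toy model: the argument names and the yes/no inputs are simplifications of the questions above, not a real tool.

```shell
# Toy encoding of the decision tree: runtime in minutes, traffic
# pattern ("steady" or "bursty"), and whether near-zero idle cost
# is the priority ("yes"/"no"). For illustration only.
pick_compute() {
  runtime_min=$1; traffic=$2; wants_zero_idle=$3
  if [ "$runtime_min" -gt 15 ]; then
    echo "vm-or-container"   # serverless hits hard execution limits
  elif [ "$traffic" = "bursty" ]; then
    echo "serverless"        # short, stateless bursts scale to zero
  elif [ "$wants_zero_idle" = "yes" ]; then
    echo "serverless"
  else
    echo "containers"        # best density at sustained load
  fi
}

pick_compute 45 steady no    # -> vm-or-container
pick_compute 2 bursty no     # -> serverless
pick_compute 5 steady no     # -> containers
```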

Use serverless for the edges, containers for the core

A mature cloud architecture often uses all three models together: Lambda for event-driven glue (S3 trigger → resize image → store result), containers (ECS/GKE) for stateless HTTP APIs serving sustained load, and VMs for stateful services (PostgreSQL, Redis, ML inference with GPU). Each model handles the workload it is best suited for. Fighting the model — such as running a long-running job on Lambda — creates fragile hacks.
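The serverless "glue" edge of that architecture is mostly event wiring. Here is a hedged sketch of the S3-trigger half: the bucket name, account ID, region, function name, and prefix are all placeholders, and the Lambda function must already grant S3 permission to invoke it.

```shell
# Wire S3 object-created events to a hypothetical resize function.
# Bucket, account ID, region, and prefix are placeholder values.
aws s3api put-bucket-notification-configuration \
  --bucket my-uploads-bucket \
  --notification-configuration '{
    "LambdaFunctionConfigurations": [{
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:resize-image",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {"Key": {"FilterRules": [{"Name": "prefix", "Value": "raw/"}]}}
    }]
  }'
```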

How this might come up in interviews

This topic shows up in system design rounds for cloud engineer, platform engineer, and senior software engineer roles. Expect questions like "design a scalable image processing pipeline" or "migrate this monolith to microservices — what compute would you use?", which require you to justify the model choice with cost and trade-off reasoning.

Common questions:

  • When would you choose containers over serverless for a new microservice?
  • What is a Lambda cold start and how do you mitigate it?
  • A team reports their Lambda costs tripled this month. What are the likely causes?
  • Explain the difference between Lambda reserved concurrency and provisioned concurrency.
  • You need to run a 45-minute data transformation job. What compute model do you choose and why?

Clarifying questions to ask first: What is the expected traffic pattern — steady, bursty, or batch? What are the latency SLOs? Is there a hard budget ceiling? Are there existing VPC dependencies that affect serverless cold-start behaviour?

Strong answer: Immediately asks about traffic pattern and execution duration before recommending a model. Mentions RDS Proxy unprompted when discussing Lambda + database architectures. Distinguishes between reserved and provisioned concurrency.

Red flags: Recommends serverless for everything because "it scales automatically" without discussing cold starts, execution limits, or cost at high throughput. Cannot explain why connection pooling is different in Lambda vs a long-running service.

🧠 Mental Model

💡 Analogy

Think of compute models as accommodation options. A VM is a full house you lease: you control every room, but you pay the mortgage whether or not you are home. A container is an apartment: you share the building infrastructure with neighbours (the host kernel) but have your own front door and keys. Serverless is a hotel room: you pay only for the nights you stay, housekeeping handles everything, but you cannot install a new bathtub and your luggage must fit in a carry-on (stateless, short-lived).

⚡ Core Idea

Each compute primitive trades off control and cost differently: VMs give full control at high idle cost; containers give good density at medium operational overhead; serverless gives zero idle cost at the expense of execution time limits, cold starts, and reduced observability.

🎯 Why It Matters

Choosing the wrong compute model is one of the most expensive architectural mistakes a cloud team can make. Going all-in on serverless for a latency-sensitive API causes cold-start incidents. Going all-in on VMs for a bursty ETL pipeline means paying for 100 instances 24/7 when you only need them for 30 minutes at midnight. Getting this decision right at design time avoids painful — and costly — migrations later.
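The second mistake is cheap to quantify. Assuming a hypothetical $0.10/hr instance price (a made-up figure for illustration), the gap between an always-on ETL fleet and the same fleet run for 30 minutes a day:

```shell
# Monthly cost of 100 always-on VMs vs the same fleet run 30 min/day.
# $0.10/hr is an illustrative price, not a real AWS rate.
awk 'BEGIN {
  always_on = 100 * 0.10 * 24 * 30    # 100 VMs, 24 h/day, 30 days
  burst     = 100 * 0.10 * 0.5 * 30   # 100 VMs, 0.5 h/day, 30 days
  printf "always-on: $%.0f/mo, burst-only: $%.0f/mo (%.0fx difference)\n",
         always_on, burst, always_on / burst
}'
# -> always-on: $7200/mo, burst-only: $150/mo (48x difference)
```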
