Load Balancer Types: ALB, NLB & GLB

The practical differences between Application Load Balancer (Layer 7), Network Load Balancer (Layer 4), and Gateway Load Balancer (Layer 3) — when to use each, their routing capabilities, performance characteristics, and the mistakes teams make when they choose wrong.

🎯 Key Takeaways

  • ALB operates at Layer 7 and can route based on path, host, headers, and query parameters; NLB operates at Layer 4 and routes only on IP address and port
  • NLB provides one static IP per AZ (optionally an Elastic IP you assign); ALB uses DNS-based addressing with IPs that change
  • Use NLB for UDP traffic, extreme throughput, very low latency, or static IP requirements
  • ALB integrates with AWS WAF; NLB does not. Never put an internet-facing API behind an NLB without an additional WAF layer
  • Connection draining prevents in-flight request drops during deployments. Tune the deregistration delay for your workload's request duration

Why Load Balancer Type Matters

AWS offers three current-generation load balancer types, each operating at a different layer of the networking stack. Choosing the wrong type means you either cannot implement the routing logic you need (an NLB cannot do path-based routing) or you pay for protocol inspection you do not need (an ALB where all you need is fast TCP pass-through). The decision is architectural: changing load balancer type later requires DNS changes, listener reconfiguration, and potentially application-level changes.

| Feature | ALB (Layer 7) | NLB (Layer 4) | GLB (Layer 3/4) |
|---|---|---|---|
| Layer | Application (HTTP/HTTPS/gRPC/WebSocket) | Transport (TCP/UDP/TLS) | Network (IP packets) |
| Routing logic | Path, host, header, query string, method | IP, port, protocol | Routes all traffic to security appliances |
| Use case | HTTP microservices, API gateways, WebSockets | High-throughput TCP/UDP, low latency, static IPs | Third-party firewalls, IDS/IPS, deep packet inspection |
| Latency overhead | ~1–2ms (HTTP parsing) | <1ms (pass-through) | Dependent on appliance |
| Static IP | No (DNS-based) | Yes (one static IP per AZ; Elastic IP optional) | No |
| TLS termination | Yes (ACM integration) | Yes (TLS listener), or pass-through with a TCP listener | No (appliance handles it) |
| WAF integration | Yes (AWS WAF) | No | No |
| Sticky sessions | Yes (duration-based or application cookie) | Yes (source IP) | N/A |
| WebSocket | Native support | Yes (transparent) | N/A |
| gRPC | Yes (native) | Yes (as TCP) | N/A |
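
The comparison above can be condensed into a rough decision helper. This is only a sketch: the function name `choose_lb_type` and its three arguments are assumptions made for this example, not an AWS tool, and a real decision weighs more factors (WAF, cost, TLS handling) than these.

```shell
#!/usr/bin/env bash
# Hypothetical decision helper distilled from the comparison table.
# Args: protocol (http|tcp|udp), needs_static_ip (yes|no),
#       needs_inline_appliance (yes|no)
choose_lb_type() {
  local protocol="$1" static_ip="$2" appliance="$3"

  if [ "$appliance" = "yes" ]; then
    echo "GLB"        # all traffic must pass through security appliances
  elif [ "$protocol" = "udp" ] || [ "$protocol" = "tcp" ]; then
    echo "NLB"        # Layer 4 traffic: an ALB cannot carry it
  elif [ "$static_ip" = "yes" ]; then
    echo "NLB+ALB"    # assumes HTTP routing is still needed behind the static IPs
  else
    echo "ALB"        # plain HTTP(S) with content-based routing
  fi
}

choose_lb_type http no no
choose_lb_type udp no no
choose_lb_type http yes no
```

The order of the checks encodes a priority (appliance inspection first, then protocol, then static IPs); a different workload might reasonably reorder them.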

The traffic inspection analogy

ALB is a smart receptionist who reads your visitor badge (HTTP headers), checks which department you need (path-based routing), and directs you to the right floor. NLB is a fast revolving door — it checks nothing except the address on the envelope (IP + port) and passes you through at maximum speed. GLB is a security checkpoint that funnels everyone through a third-party X-ray machine (security appliance) before letting them into the building.

ALB: Layer 7 Routing in Practice

The Application Load Balancer is the right choice for the vast majority of HTTP-based workloads. Its ability to route based on request content — path, hostname, headers, query parameters — is what makes microservices architectures practical without an additional API gateway for every use case.

ALB routing rules you will actually use

  • Path-based routing: /api/* → API service, /static/* → static content. One ALB, multiple target groups. Route /api/v1/* to the backend API ECS service and /static/* to a target group that serves static assets. An ALB cannot forward to an S3 bucket directly, so S3-hosted assets are usually served through CloudFront or through IP targets that point at an S3 interface VPC endpoint. Path rules eliminate the need for a separate load balancer per service in many architectures.
  • Host-based routing: api.example.com → API, app.example.com → frontend. Multiple domains behind one ALB. Each listener rule matches on the Host header. This is cheaper than running one ALB per subdomain (an ALB costs roughly $16–18/month base plus about $0.008 per LCU-hour, depending on region).
  • Header/query routing: X-Version: v2 → canary target group. Canary deployments at the load balancer level. A header or query-string condition sends only the requests that carry the marker to the new version. To shift a fixed share of traffic (for example 5%) regardless of headers, use a forward action with weighted target groups instead. Neither approach requires a service mesh.
  • Redirect rules: HTTP → HTTPS (301). Enforce HTTPS without code changes. Give the port 80 listener a redirect action to HTTPS that keeps the same host, path, and query. Redirect actions can also rewrite the host, which covers apex-to-www redirects.
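
For a percentage-based canary, one option is a forward action with weighted target groups. The sketch below only builds and prints the action JSON; the helper name `weighted_forward_action` and the canary target group ARN are hypothetical, and the ARNs are placeholders in the same style as the rest of this lesson.

```shell
#!/usr/bin/env bash
# Build the JSON for a weighted forward action (stable vs canary).
# Args: stable target group ARN, canary target group ARN, canary weight (0-100)
weighted_forward_action() {
  local stable_arn="$1" canary_arn="$2" canary_weight="$3"
  local stable_weight=$(( 100 - canary_weight ))
  printf '[{"Type":"forward","ForwardConfig":{"TargetGroups":[{"TargetGroupArn":"%s","Weight":%d},{"TargetGroupArn":"%s","Weight":%d}]}}]\n' \
    "$stable_arn" "$stable_weight" "$canary_arn" "$canary_weight"
}

# Placeholder ARNs for illustration only.
STABLE_TG="arn:aws:elasticloadbalancing:us-east-1:123:targetgroup/api-tg/abc"
CANARY_TG="arn:aws:elasticloadbalancing:us-east-1:123:targetgroup/api-canary-tg/def"

ACTIONS="$(weighted_forward_action "$STABLE_TG" "$CANARY_TG" 5)"
echo "$ACTIONS"

# To apply it (requires AWS credentials and a real listener ARN):
#   aws elbv2 create-rule --listener-arn "$LISTENER_ARN" --priority 20 \
#     --conditions '[{"Field":"path-pattern","Values":["/api/*"]}]' \
#     --actions "$ACTIONS"
```

Keeping the weights in one helper makes it easy to step the canary from 5 to 25 to 100 by changing a single argument.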

ALB connection draining prevents in-flight request drops during deployments

Deregistration delay is on by default at 300 seconds; tune it per target group. When a target is deregistered (e.g., during an ECS rolling update), the ALB stops sending new requests to it but lets in-flight requests finish until the delay expires. If the delay is too short, or the container exits before it elapses, in-flight requests fail with 502 errors. Reduce the delay to around 30s for stateless APIs with short requests; keep 300s for long-lived connections like WebSockets.
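
As a minimal sketch of that tuning, the snippet below picks a delay from the workload type and prints the corresponding AWS CLI call. The helper `dereg_delay_for` and its two workload categories are assumptions for illustration; `deregistration_delay.timeout_seconds` is the target group attribute being set, and the ARN is a placeholder.

```shell
#!/usr/bin/env bash
# Pick a deregistration delay (seconds) from the workload type.
# The two categories and their values mirror the guidance above.
dereg_delay_for() {
  case "$1" in
    stateless)  echo 30 ;;    # short requests: drain quickly
    long-lived) echo 300 ;;   # WebSockets and similar: keep the default
    *) echo "unknown workload type: $1" >&2; return 1 ;;
  esac
}

# Placeholder target group ARN.
TG_ARN="arn:aws:elasticloadbalancing:us-east-1:123:targetgroup/api-tg/abc"
DELAY="$(dereg_delay_for stateless)"

# Printed rather than executed, so the sketch runs without AWS access.
echo aws elbv2 modify-target-group-attributes \
  --target-group-arn "$TG_ARN" \
  --attributes "Key=deregistration_delay.timeout_seconds,Value=${DELAY}"
```

Remove the leading `echo` to apply the change against a real target group.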

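ALB access logs are not free: you pay for the S3 storage

Enabling ALB access logs records every request. The load balancer batches entries into compressed files and delivers them to S3 every 5 minutes, so the cost comes from stored volume rather than from one PUT per request. At high traffic (10k req/s = 864M req/day), that storage adds up. Use Athena to query the logs on demand rather than shipping them to CloudWatch Logs, where ingestion is considerably more expensive per GB. ALB access logs cannot be sampled, so to contain cost either enable them only where you need them (for example, on staging while debugging) or apply an S3 lifecycle rule that expires old log objects.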
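
The request-volume arithmetic can be checked directly. In the sketch below, the 10k req/s rate comes from the text; the 500-byte average log entry is an assumption chosen only to make the estimate concrete, and the result is the uncompressed size (delivered files are compressed, so stored volume is smaller).

```shell
#!/usr/bin/env bash
# Back-of-the-envelope access log volume.
REQ_PER_SEC=10000
SECONDS_PER_DAY=86400
BYTES_PER_ENTRY=500      # ASSUMPTION: average uncompressed log line size

REQ_PER_DAY=$(( REQ_PER_SEC * SECONDS_PER_DAY ))
GB_PER_DAY=$(( REQ_PER_DAY * BYTES_PER_ENTRY / 1000000000 ))

echo "requests/day: ${REQ_PER_DAY}"
echo "uncompressed GB/day: ${GB_PER_DAY}"
```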

alb-operations.sh
# Create an ALB listener rule for path-based routing.
# Route /api/* to the API target group, everything else to the frontend.
# Priority 10: lower number = higher priority. The default rule (no condition
# match) has priority "default" and is evaluated last.
aws elbv2 create-rule \
  --listener-arn arn:aws:elasticloadbalancing:us-east-1:123:listener/app/my-alb/abc/xyz \
  --priority 10 \
  --conditions '[{"Field":"path-pattern","Values":["/api/*"]}]' \
  --actions '[{"Type":"forward","TargetGroupArn":"arn:aws:elasticloadbalancing:us-east-1:123:targetgroup/api-tg/abc"}]'

# Check target group health: list every target that is not healthy.
# Unhealthy targets are a common signal behind 5xx errors, so check this first.
aws elbv2 describe-target-health \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:123:targetgroup/api-tg/abc \
  --query 'TargetHealthDescriptions[?TargetHealth.State!=`healthy`]'

# View ALB access logs with Athena (after enabling logging to S3)
# CREATE EXTERNAL TABLE alb_logs (...) PARTITIONED BY (year, month, day, hour)
# SELECT request_url, target_status_code, COUNT(*) as count
# FROM alb_logs WHERE target_status_code = '502'
# GROUP BY 1, 2 ORDER BY 3 DESC LIMIT 20;

NLB: When Raw Performance Matters

The Network Load Balancer operates at Layer 4 (TCP/UDP/TLS). It does not inspect HTTP headers or parse paths, so it adds very little overhead and forwards connections with very low latency. It can also preserve the client source IP at the network level, whereas an ALB terminates the connection and passes the client address in the X-Forwarded-For header; where NLB client IP preservation is not available, Proxy Protocol v2 carries the address instead. Use NLB when latency, throughput, or source IP preservation is a requirement.

NLB use cases that actually appear in production

  • Static IP addresses required (regulatory, firewall allowlisting). ALB addresses are resolved through DNS and change over time. An NLB has one static IP per AZ, and for an internet-facing NLB you can assign your own Elastic IP per AZ. Clients that allowlist IP addresses (legacy on-premises systems, financial counterparties) require NLB.
  • Gaming, VoIP, IoT: UDP protocol required. ALB does not support UDP. NLB handles both UDP and TCP. DNS servers, STUN/TURN servers, time synchronisation (NTP), and IoT telemetry protocols commonly use UDP.
  • End-to-end TLS with mutual TLS (mTLS) passthrough. With a TCP listener, the NLB forwards the encrypted stream to the target untouched. The application handles the TLS handshake, enabling mTLS client certificate validation without the load balancer being in the trust chain.
  • Extreme throughput: millions of requests per second. NLB is designed to handle millions of requests per second. As a rough rule of thumb, plan for an ALB to handle on the order of 60,000 new connections per second per AZ before sharding across additional ALBs; treat that figure as approximate, not as a documented hard limit. For high-frequency trading, streaming ingest, and real-time bidding systems, NLB is the right choice.
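
For the static IP case, an internet-facing NLB can be created with one Elastic IP allocation per AZ subnet. The sketch below assembles the `--subnet-mappings` arguments and prints the command; the helper name `subnet_mappings`, the load balancer name `my-nlb`, and the subnet and allocation IDs are all placeholders for illustration.

```shell
#!/usr/bin/env bash
# Turn "subnet:allocation" pairs into --subnet-mappings arguments,
# one Elastic IP allocation per AZ subnet.
subnet_mappings() {
  local pair out=""
  for pair in "$@"; do
    out="${out:+$out }SubnetId=${pair%%:*},AllocationId=${pair#*:}"
  done
  echo "$out"
}

# Placeholder subnet and Elastic IP allocation IDs.
MAPPINGS="$(subnet_mappings "subnet-aaa:eipalloc-111" "subnet-bbb:eipalloc-222")"

# Printed rather than executed, so the sketch runs without AWS access.
# $MAPPINGS is left unquoted on purpose so each mapping becomes its own argument.
echo aws elbv2 create-load-balancer \
  --name my-nlb --type network --scheme internet-facing \
  --subnet-mappings $MAPPINGS
```

Allocate the Elastic IPs first and hand their addresses to the counterparty; they then stay stable for the lifetime of the allocations.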

NLB cannot do path-based routing: this is not a configuration, it is a design constraint

A team migrated from an ALB to an NLB to get static Elastic IP addresses without realising they relied on path-based routing rules (/api vs /admin). NLB routes purely on IP address and port; it cannot read the HTTP path. If you need both static IPs and path-based routing, chain the two: put an NLB in front of the ALB by registering the ALB as the target of an ALB-type target group on the NLB. The NLB provides the static IPs and the ALB provides the HTTP routing. This adds latency and cost but satisfies both requirements.
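
The chaining pattern comes down to three CLI calls, sketched here in dry-run form. Everything named below is a placeholder: the `run` wrapper and `chain_nlb_to_alb` function are hypothetical helpers, the ARNs and VPC ID are made up, and in practice the target group ARN would come from the output of the first command. The sketch also assumes the ALB already has a listener on port 443.

```shell
#!/usr/bin/env bash
# Dry-run wrapper: prints each command; set DRY_RUN=0 to execute for real.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Placeholders: replace with real identifiers before running with DRY_RUN=0.
VPC_ID="vpc-PLACEHOLDER"
ALB_ARN="arn:aws:elasticloadbalancing:us-east-1:123:loadbalancer/app/my-alb/abc"
NLB_ARN="arn:aws:elasticloadbalancing:us-east-1:123:loadbalancer/net/my-nlb/def"
ALB_TG_ARN="arn:aws:elasticloadbalancing:us-east-1:123:targetgroup/alb-tg/ghi"

chain_nlb_to_alb() {
  # 1. Target group whose target is an ALB (target type "alb", protocol TCP).
  run aws elbv2 create-target-group --name alb-tg --target-type alb \
    --protocol TCP --port 443 --vpc-id "$VPC_ID"
  # 2. Register the ALB itself as the target.
  run aws elbv2 register-targets --target-group-arn "$ALB_TG_ARN" \
    --targets "Id=$ALB_ARN"
  # 3. TCP listener on the NLB that forwards to that target group.
  run aws elbv2 create-listener --load-balancer-arn "$NLB_ARN" \
    --protocol TCP --port 443 \
    --default-actions "Type=forward,TargetGroupArn=$ALB_TG_ARN"
}

chain_nlb_to_alb
```

TLS still terminates on the ALB in this layout; the NLB only passes TCP through, which is why its listener and target group use the TCP protocol.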

Health Checks, Sticky Sessions, and Connection Draining

The operational details that catch teams off guard are not in the routing logic — they are in health checks, stickiness, and draining. Getting these wrong causes intermittent 5xx errors that are very hard to debug.

Debugging intermittent 502/503 errors on an ALB

  1. Check target group health with aws elbv2 describe-target-health. Targets in the "unhealthy" or "draining" state are the first suspects.
  2. Check the health check configuration: which path, protocol, and port is the ALB using, and does the application return a status code the matcher accepts (200 by default) on that path? A 302 redirect on the health check path marks the target unhealthy unless the matcher is widened to include it.
  3. Check the deregistration delay: if you see 502s only during deployments, the draining window is too short (targets are terminated before in-flight requests complete).
  4. Check security group rules: the target's security group must allow inbound traffic from the ALB's security group on both the traffic port and the health check port, and the ALB's security group must allow the matching outbound traffic. If you changed the app port and forgot to update the SG rule, health checks fail.
  5. Check slow start: by default, new targets receive their full share of traffic immediately. If your app takes 30 seconds to warm up, enable slow start mode to ramp traffic gradually.
  6. Check listener certificate expiry: an expired listener certificate shows up as TLS handshake failures on the client side rather than as ALB-generated 5xx responses, so look here when clients report errors that the ALB metrics do not show. ACM renews certificates automatically, but renewal fails if the DNS validation records have been removed.
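
Step 2 is easy to reason about with a small matcher check. The function below is a sketch written for this lesson (the name `matches_matcher` is hypothetical); it mimics how a health check success-code matcher such as "200", "200,202", or "200-399" would treat a given response code, and it is not the load balancer's own implementation.

```shell
#!/usr/bin/env bash
# Return 0 if an HTTP status code satisfies a health check matcher.
# Handles the matcher forms "200", "200,202" and "200-399".
matches_matcher() {
  local code="$1" matcher="$2" part lo hi
  local IFS=','
  for part in $matcher; do
    case "$part" in
      *-*) lo="${part%-*}"; hi="${part#*-}"
           if [ "$code" -ge "$lo" ] && [ "$code" -le "$hi" ]; then return 0; fi ;;
      *)   if [ "$code" -eq "$part" ]; then return 0; fi ;;
    esac
  done
  return 1
}

matches_matcher 200 "200"     && echo "200 vs matcher 200: healthy"
matches_matcher 302 "200"     || echo "302 vs matcher 200: unhealthy"
matches_matcher 302 "200-399" && echo "302 vs matcher 200-399: healthy"

# To see the real settings for a target group (requires AWS credentials):
#   aws elbv2 describe-target-groups --target-group-arns "$TG_ARN" \
#     --query 'TargetGroups[0].[HealthCheckPath,HealthCheckPort,HealthCheckProtocol,Matcher.HttpCode]'
```

Widening the matcher hides the redirect; pointing the health check at a path that returns 200 without authentication is usually the cleaner fix.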

Sticky sessions are a scaling trap

Sticky sessions tie a user to a specific target for the duration of the session. This undermines horizontal scaling: if a target with 1,000 sticky sessions is overloaded, the ALB cannot move those sessions to less-loaded targets, and a newly added target only picks up new sessions. The correct fix is to move session state out of the application tier (into ElastiCache/Redis) so any target can handle any request. Use stickiness only as a temporary measure during stateful application migrations, never as a permanent architecture choice.
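
A quick worked example makes the imbalance concrete. The numbers below (2,000 sessions, scaling from two targets to three) are assumptions chosen for illustration, and the model is deliberately simplified: it assumes every existing session stays pinned and is equally expensive.

```shell
#!/usr/bin/env bash
# Per-target session counts after scaling out from OLD_TARGETS to NEW_TARGETS.
SESSIONS=2000
OLD_TARGETS=2
NEW_TARGETS=3

# With stickiness, existing sessions stay pinned to the original targets.
STICKY_LOAD_PER_OLD_TARGET=$(( SESSIONS / OLD_TARGETS ))
STICKY_LOAD_ON_NEW_TARGET=0

# With session state externalised, any target can serve any session.
BALANCED_LOAD=$(( (SESSIONS + NEW_TARGETS - 1) / NEW_TARGETS ))   # ceiling division

echo "sticky:   old targets ${STICKY_LOAD_PER_OLD_TARGET} each, new target ${STICKY_LOAD_ON_NEW_TARGET}"
echo "balanced: at most ${BALANCED_LOAD} per target"
```

In the sticky case the scale-out relieves nothing until old sessions expire, which is the trap the paragraph above describes.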

How this might come up in interviews

This topic comes up in cloud engineer and DevOps engineer interviews, often in the context of "design an internet-facing API" or "how do you achieve zero-downtime deployments." The NLB vs ALB distinction also appears in security reviews when WAF integration is required.

Common questions:

  • When would you use an NLB over an ALB?
  • What is the difference between Layer 4 and Layer 7 load balancing?
  • How does connection draining work and why does it matter for zero-downtime deployments?
  • Your ALB is returning 502 errors intermittently. Walk me through how you would debug this.
  • A customer requires a static IP address for your API. Your application needs path-based routing. What do you do?

Clarifying questions to ask: What protocols does the service use (HTTP/HTTPS, TCP, UDP)? Are static IPs required? What are the throughput and latency SLOs? Are there WAF or DDoS protection requirements? Does the application use WebSockets or gRPC?

Strong answer: Mentions WAF integration as an ALB-only feature immediately when the use case involves internet-facing traffic. Explains the NLB → ALB chaining pattern for static IP + routing requirements. Describes connection draining as an operational concern for deployments, not just a feature checkbox.

Red flags: Recommends NLB for all workloads because "it is faster." Does not know that NLB cannot do path-based routing. Cannot explain why the ALB security group needs to allow traffic to the target security group on the health check port.

🧠 Mental Model

💡 Analogy

ALB is a smart receptionist at a company reception desk. She reads your visitor badge, asks who you are here to see, checks the directory, and escorts you to the correct department. She understands English (HTTP). NLB is a fast revolving door — it does not read your badge, does not ask questions, just physically moves you through the building entrance at maximum throughput. It knows only which building you came to (IP address and port). GLB is a security checkpoint with a third-party X-ray machine: everyone must pass through it for inspection before entering, and the checkpoint itself is provided by a specialised security vendor.

⚡ Core Idea

ALB reads HTTP content (path, host, headers) to make routing decisions; NLB reads only IP address and port and passes through without inspection. Use ALB when you need content-based routing, WAF integration, or WebSocket support. Use NLB when you need static IPs, UDP, sub-millisecond latency, or extreme throughput. Use GLB when all traffic must pass through third-party security appliances.

🎯 Why It Matters

Choosing ALB when you need NLB's static IPs — or NLB when you need ALB's path-based routing — requires a complete architecture change to fix. The decision affects DNS configuration, security group rules, WAF policies, access logging, and application-level IP handling. Understanding the Layer 4 vs Layer 7 distinction and its practical implications is foundational knowledge for any cloud engineer designing production workloads.
