The practical differences between Application Load Balancer (Layer 7), Network Load Balancer (Layer 4), and Gateway Load Balancer (Layer 3) — when to use each, their routing capabilities, performance characteristics, and the mistakes teams make when they choose wrong.
AWS offers three distinct load balancer types, each operating at a different layer of the networking stack. Choosing the wrong type means you either cannot implement the routing logic you need (NLB for path-based routing) or you overpay for protocol inspection you do not need (ALB for raw TCP performance). The decision is architectural — changing load balancer type later requires DNS changes, listener reconfiguration, and potential application-level changes.
| Feature | ALB (Layer 7) | NLB (Layer 4) | GLB (Layer 3/4) |
|---|---|---|---|
| Layer | Application (HTTP/HTTPS/gRPC/WebSocket) | Transport (TCP/UDP/TLS) | Network (IP packets) |
| Routing logic | Path, host, header, query string, method | IP, port, protocol | Routes all traffic to security appliances |
| Use case | HTTP microservices, API gateways, WebSockets | High-throughput TCP/UDP, low latency, static IPs | Third-party firewalls, IDS/IPS, deep packet inspection |
| Latency overhead | ~1–2ms (HTTP parsing) | <1ms (pass-through) | Dependent on appliance |
| Static IP | No (DNS-based) | Yes (Elastic IP per AZ) | No |
| TLS termination | Yes (ACM integration) | Yes (TLS listener; use a TCP listener for passthrough) | No (appliance handles it) |
| WAF integration | Yes (AWS WAF) | No | No |
| Sticky sessions | Yes (duration-based or application cookie) | Yes (source IP) | N/A |
| WebSocket | Native support | Yes (transparent) | N/A |
| gRPC | Yes (native) | Yes (as TCP) | N/A |
The traffic inspection analogy
ALB is a smart receptionist who reads your visitor badge (HTTP headers), checks which department you need (path-based routing), and directs you to the right floor. NLB is a fast revolving door — it checks nothing except the address on the envelope (IP + port) and passes you through at maximum speed. GLB is a security checkpoint that funnels everyone through a third-party X-ray machine (security appliance) before letting them into the building.
The Application Load Balancer is the right choice for the vast majority of HTTP-based workloads. Its ability to route based on request content — path, hostname, headers, query parameters — is what makes microservices architectures practical without an additional API gateway for every use case.
ALB routing rules you will actually use
ALB connection draining prevents in-flight request drops during deployments
Tune the deregistration delay (default 300s, consider 30s for fast deployments) on target groups. When a target is deregistered (e.g., during an ECS rolling update), the ALB stops sending new requests to it but waits up to the configured delay for in-flight requests to complete. If a container terminates before its in-flight requests finish, clients see 502 errors. Reduce the delay to 30s for stateless APIs; keep 300s for long-lived connections like WebSockets.
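A minimal sketch of changing the deregistration delay with the AWS CLI. The attribute key `deregistration_delay.timeout_seconds` is the target group attribute this setting maps to; the helper name `set_deregistration_delay` and the ARN in the usage comment are placeholders for illustration, not names from this lesson.

```shell
# Hypothetical helper: set the deregistration delay (in seconds) on one target group.
# Arguments: $1 = target group ARN, $2 = delay in seconds (the default is 300).
set_deregistration_delay() {
  tg_arn="$1"
  seconds="$2"
  aws elbv2 modify-target-group-attributes \
    --target-group-arn "$tg_arn" \
    --attributes "Key=deregistration_delay.timeout_seconds,Value=${seconds}"
}

# Example with a placeholder ARN: 30s for a stateless API
# set_deregistration_delay "arn:aws:elasticloadbalancing:us-east-1:123:targetgroup/api-tg/abc" 30
```

Wrapping the call in a function keeps the delay a per-service decision: a stateless API and a WebSocket service can pass different values to the same helper.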
ALB access logs are not free: you pay for the S3 storage they consume
Enabling ALB access logs records every request and delivers the entries to S3 in compressed batch files. At high traffic (10k req/s = 864M req/day), this adds meaningful S3 storage cost. Query the logs on demand with Athena rather than copying them into CloudWatch Logs, which costs considerably more to ingest. Alternatively, enable logs only on staging for debugging. ALB access logging is all-or-nothing per load balancer, so any sampling in production has to happen downstream of S3.
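The 10k req/s figure can be checked with shell arithmetic. The sketch below also estimates raw daily log volume; the 500 bytes per entry is an assumed, illustrative value (real entry sizes vary with URL and user-agent length, and the files stored in S3 are compressed), and both function names are hypothetical.

```shell
# Requests per day for a steady request rate given in requests per second.
requests_per_day() {
  echo $(( $1 * 86400 ))
}

# Rough uncompressed log volume per day in GiB (integer, rounded down).
# Arguments: $1 = requests per second, $2 = ASSUMED average bytes per log entry.
log_gib_per_day() {
  rps="$1"
  bytes_per_entry="$2"
  echo $(( rps * 86400 * bytes_per_entry / 1073741824 ))
}

requests_per_day 10000      # prints 864000000
log_gib_per_day 10000 500   # prints 402 (under the assumed 500 bytes per entry)
```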
```shell
# Create an ALB listener rule for path-based routing
# Route /api/* to the API target group, everything else to the frontend
# Priority 10: lower number = higher priority. Default rule (no condition match) has priority "default"
aws elbv2 create-rule \
  --listener-arn arn:aws:elasticloadbalancing:us-east-1:123:listener/app/my-alb/abc/xyz \
  --priority 10 \
  --conditions '[{"Field":"path-pattern","Values":["/api/*"]}]' \
  --actions '[{"Type":"forward","TargetGroupArn":"arn:aws:elasticloadbalancing:us-east-1:123:targetgroup/api-tg/abc"}]'

# Check target group health. Unhealthy targets cause 502s, so check this first when seeing 5xx errors
aws elbv2 describe-target-health \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:123:targetgroup/api-tg/abc \
  --query 'TargetHealthDescriptions[?TargetHealth.State!=`healthy`]'

# View ALB access logs with Athena (after enabling logging to S3)
# CREATE EXTERNAL TABLE alb_logs (...) PARTITIONED BY (year, month, day, hour)
# SELECT request_url, target_status_code, COUNT(*) as count
# FROM alb_logs WHERE target_status_code = '502'
# GROUP BY 1, 2 ORDER BY 3 DESC LIMIT 20;
```
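The Network Load Balancer operates at Layer 4 (TCP/UDP). It does not inspect HTTP headers or parse paths, so it adds very little overhead. It passes packets through with sub-millisecond latency and can preserve the client source IP at the network level, unlike ALB, which terminates the client connection and passes the client address to targets in the X-Forwarded-For header. (Proxy Protocol is an NLB feature, for targets that cannot see the preserved source IP.) Use NLB when latency, throughput, or source IP preservation is a requirement.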
NLB use cases that actually appear in production
NLB cannot do path-based routing — this is not a configuration, it is a design constraint
A team migrated from an ALB to an NLB to get static Elastic IP addresses without realising they relied on path-based routing rules (/api vs /admin). NLB routes purely on IP address and port — it cannot read the HTTP path. If you need both static IPs and path-based routing, the solution is: NLB → ALB (ALB behind NLB on a fixed port) — the NLB provides the static IP, the ALB provides the HTTP routing. This adds latency and cost but solves both requirements.
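The NLB → ALB chaining described above can be sketched with two AWS CLI calls: create a target group whose target type is `alb`, then register the existing ALB in it. The helper names and every ARN, name, and ID passed to them are placeholders; creating the NLB itself and its TCP listener is left out, and the target group port is assumed to match the ALB listener port.

```shell
# Hypothetical helper: create a target group that an NLB listener can forward to,
# with an ALB as its target. Arguments: $1 = name, $2 = VPC ID, $3 = ALB listener port.
create_alb_target_group() {
  name="$1"
  vpc_id="$2"
  port="$3"
  aws elbv2 create-target-group \
    --name "$name" \
    --target-type alb \
    --protocol TCP \
    --port "$port" \
    --vpc-id "$vpc_id"
}

# Hypothetical helper: register an existing ALB in that target group.
# Arguments: $1 = target group ARN, $2 = ALB ARN.
register_alb_behind_nlb() {
  tg_arn="$1"
  alb_arn="$2"
  aws elbv2 register-targets \
    --target-group-arn "$tg_arn" \
    --targets "Id=${alb_arn}"
}
```

The NLB listener stays a plain TCP listener on the fixed port, so all HTTP routing rules continue to live on the ALB.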
The operational details that catch teams off guard are not in the routing logic — they are in health checks, stickiness, and draining. Getting these wrong causes intermittent 5xx errors that are very hard to debug.
Debugging intermittent 502/503 errors on an ALB
1. Check target group health: aws elbv2 describe-target-health. Any "unhealthy" or "draining" targets are likely causing the 502s.
2. Check health check configuration: what path, protocol, and port is the ALB using? Is the application returning 200 on that path? With the default success code of 200, a 302 redirect on the health check path causes the target to be marked unhealthy.
3. Check deregistration delay: if you see 502s only during deployments, the draining window is too short (targets are terminated before in-flight requests complete).
4. Check security group rules: the target's security group must allow inbound traffic from the ALB's security group on both the traffic port and the health check port, and the ALB's security group must allow the matching outbound traffic. If you changed the app port and forgot to update the SG rule, health checks fail.
5. Check slow start: by default, new targets receive their full share of traffic immediately. If your app takes 30 seconds to warm up, enable slow start mode to ramp traffic gradually.
6. Check listener certificate expiry: an expired ACM certificate breaks the TLS handshake, so clients see certificate errors rather than an HTTP 502 from the ALB. Certificate Manager auto-renews, but if DNS validation has broken, renewal fails and is easy to miss.
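Three of the checks in the list above map onto single AWS CLI calls. The sketch below wraps them in helpers; the helper names are hypothetical and the ARNs are supplied by the caller, while the attribute key and `--query` fields follow the `elbv2` and `acm` command output as I understand it.

```shell
# Show the health check settings the load balancer actually uses for a target group.
# Argument: $1 = target group ARN.
show_health_check_config() {
  aws elbv2 describe-target-groups \
    --target-group-arns "$1" \
    --query 'TargetGroups[0].{Protocol:HealthCheckProtocol,Port:HealthCheckPort,Path:HealthCheckPath,SuccessCodes:Matcher.HttpCode}'
}

# Ramp traffic to newly registered targets over a warm-up window.
# Arguments: $1 = target group ARN, $2 = duration in seconds (30-900; 0 disables slow start).
enable_slow_start() {
  aws elbv2 modify-target-group-attributes \
    --target-group-arn "$1" \
    --attributes "Key=slow_start.duration_seconds,Value=$2"
}

# Show expiry and renewal state for the ACM certificate attached to a listener.
# Argument: $1 = certificate ARN.
show_certificate_status() {
  aws acm describe-certificate \
    --certificate-arn "$1" \
    --query 'Certificate.{Status:Status,NotAfter:NotAfter,RenewalStatus:RenewalSummary.RenewalStatus}'
}
```

Running `show_health_check_config` first is usually the cheapest step: it shows whether the path and success codes match what the application really returns.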
Sticky sessions are a scaling trap
Sticky sessions tie a user to a specific target for the duration of the session. This prevents horizontal scaling: if a target with 1,000 sticky sessions is overloaded, ALB cannot move those sessions to less-loaded targets. The correct fix is to move session state out of the application tier (into ElastiCache/Redis) so any target can handle any request. Use stickiness only as a temporary measure during stateful application migrations, never as a permanent architecture choice.
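Once session state lives outside the application tier, stickiness can be switched off at the target group. A minimal sketch, assuming the `stickiness.enabled` target group attribute; the helper name is hypothetical and the ARN is supplied by the caller.

```shell
# Hypothetical helper: turn off sticky sessions on one target group so the
# load balancer is free to send any request to any healthy target.
# Argument: $1 = target group ARN.
disable_stickiness() {
  aws elbv2 modify-target-group-attributes \
    --target-group-arn "$1" \
    --attributes "Key=stickiness.enabled,Value=false"
}
```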
This topic comes up in cloud engineer and DevOps engineer interviews, often in the context of "design an internet-facing API" or "how do you achieve zero-downtime deployments." The NLB vs ALB distinction also appears in security reviews when WAF integration is required.
Common questions:
Try this question: What protocols does the service use (HTTP/HTTPS, TCP, UDP)? Are static IPs required? What are the throughput and latency SLOs? Are there WAF or DDoS protection requirements? Does the application use WebSockets or gRPC?
Strong answer: Mentions WAF integration as an ALB-only feature immediately when the use case involves internet-facing traffic. Explains the NLB → ALB chaining pattern for static IP + routing requirements. Describes connection draining as an operational concern for deployments, not just a feature checkbox.
Red flags: Recommends NLB for all workloads because "it is faster." Does not know that NLB cannot do path-based routing. Cannot explain why the ALB security group needs to allow traffic to the target security group on the health check port.
Key takeaways
💡 Analogy
ALB is a smart receptionist at a company reception desk. She reads your visitor badge, asks who you are here to see, checks the directory, and escorts you to the correct department. She understands English (HTTP). NLB is a fast revolving door — it does not read your badge, does not ask questions, just physically moves you through the building entrance at maximum throughput. It knows only which building you came to (IP address and port). GLB is a security checkpoint with a third-party X-ray machine: everyone must pass through it for inspection before entering, and the checkpoint itself is provided by a specialised security vendor.
⚡ Core Idea
ALB reads HTTP content (path, host, headers) to make routing decisions; NLB reads only IP address and port and passes through without inspection. Use ALB when you need content-based routing, WAF integration, or HTTP-aware handling of WebSocket and gRPC traffic. Use NLB when you need static IPs, UDP, sub-millisecond latency, or extreme throughput. Use GLB when all traffic must pass through third-party security appliances.
🎯 Why It Matters
Choosing ALB when you need NLB's static IPs — or NLB when you need ALB's path-based routing — requires a complete architecture change to fix. The decision affects DNS configuration, security group rules, WAF policies, access logging, and application-level IP handling. Understanding the Layer 4 vs Layer 7 distinction and its practical implications is foundational knowledge for any cloud engineer designing production workloads.