Container networking is built on virtual ethernet pairs, network namespaces, bridges, and iptables/eBPF rules. Every pod-to-pod communication, every Service ClusterIP, and every CNI plugin decision builds on these primitives. Understanding how packets actually move is the difference between guessing at network problems and knowing exactly which hop to inspect.
Know that every pod gets a unique IP. Pods on the same node communicate via a virtual bridge; pods on different nodes communicate via the CNI plugin, either through an overlay (VXLAN) or native routing (BGP). Kubernetes requires pod-to-pod communication without NAT.
Trace the packet path from pod A to pod B using ip route, ip addr, and arp on a node. Understand veth pairs and how they connect pod network namespaces to the node bridge. Debug cross-node pod communication failures by checking each hop.
Evaluate CNI plugins by networking model and security properties. Design NetworkPolicy for zero-trust pod networking. Debug MTU mismatches. Understand eBPF-based networking vs iptables-based networking and their scale implications.
Own the CNI selection decision. Evaluate security properties: ARP isolation, encryption, multi-tenant isolation. Design network architecture for PCI or HIPAA workloads. Evaluate Cilium vs Calico vs VPC-native networking against specific compliance requirements.
1. Attacker deploys a malicious pod to a node shared with Tenant B pods.
2. The attacker pod sends gratuitous ARP replies claiming Tenant B's pod IP, poisoning the node ARP cache.
3. Traffic from other pods destined for Tenant B's IP is redirected to the attacker pod via the poisoned ARP entry.
4. The attacker reads plaintext HTTP API calls, including authentication tokens.
5. An anomalous traffic pattern is detected -- the attacker pod is receiving traffic for IPs it does not own.
The question this raises
If every pod has its own network namespace, how can one pod intercept traffic intended for another -- and what layer of the network stack is missing protection?
A pod on Node A cannot reach a pod on Node B, but pods on the SAME node communicate fine. Both pods show Running with IPs assigned. What is the most likely cause?
Lesson outline
The Kubernetes networking model: flat IPs, no NAT
Kubernetes requires that every pod can communicate with every other pod directly using its IP, without NAT. This is fundamentally different from Docker's default NAT model where containers use private IPs and the host rewrites addresses. The flat pod network makes routing simpler but requires the CNI plugin to create and maintain routes across all nodes so that 10.244.1.7 is reachable from any pod in the cluster.
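The flat model is directly observable on a live cluster (assuming kubectl access; the CIDR values below are illustrative and will differ in your cluster):

```shell
# Each node is assigned a slice of the cluster pod CIDR
$ kubectl get nodes -o custom-columns='NAME:.metadata.name,POD_CIDR:.spec.podCIDR'
NAME     POD_CIDR
node-1   10.244.0.0/24
node-2   10.244.1.0/24

# Pod IPs come straight from those per-node ranges -- no NAT, reachable cluster-wide
$ kubectl get pods -o wide --all-namespaces
```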
veth pair
Use for: Virtual ethernet cable with two ends. One end is eth0 inside the pod network namespace. The other end is on the host and connects to the bridge. Packets entering one end exit the other -- exactly like a physical network cable.
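The mechanics can be reproduced by hand on any Linux box (root required; the names and IP below are illustrative, not what a real CNI plugin would pick):

```shell
# Create a veth pair and push one end into a fresh network namespace,
# mimicking what a CNI plugin does for each new pod
$ ip netns add demo-pod
$ ip link add veth-host type veth peer name veth-pod
$ ip link set veth-pod netns demo-pod
$ ip netns exec demo-pod ip link set veth-pod name eth0
$ ip netns exec demo-pod ip addr add 10.244.0.99/24 dev eth0
$ ip netns exec demo-pod ip link set eth0 up
$ ip link set veth-host up
# Packets sent from eth0 inside demo-pod now emerge on veth-host
```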
Linux bridge (cni0)
Use for: Virtual Layer 2 switch on each node. All pods on the node connect to this bridge. The bridge forwards packets using MAC addresses and ARP -- exactly like a physical Ethernet switch in a data center.
VXLAN tunnel
Use for: Encapsulates pod-to-pod traffic in UDP packets for cross-node communication. Creates a virtual Layer 2 network over the physical Layer 3 network. The flannel.1 or calico-vxlan interfaces handle encapsulation and decapsulation transparently.
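To demystify the flannel.1 interface, a standalone VXLAN device can be created manually (root required; VNI 1 and port 8472 mirror Flannel's defaults, and `eth0` is assumed to be the physical NIC):

```shell
# Create a VXLAN interface riding on the physical NIC, like flannel.1
$ ip link add vxlan-demo type vxlan id 1 dev eth0 dstport 8472
$ ip link set vxlan-demo up
# The kernel automatically accounts for the encapsulation headers:
# the device MTU drops by 50 bytes relative to the 1500-byte NIC
$ ip -d link show vxlan-demo | grep -o 'mtu [0-9]*'
mtu 1450
```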
CNI plugin binary
Use for: Executed by kubelet to set up networking for each new pod. Creates the veth pair, assigns the IP from pod CIDR, adds routes to the bridge, and configures the overlay tunnel entries. Runs as a privileged binary on each node, not as a daemon.
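kubelet locates the plugin through a config file in /etc/cni/net.d/ on each node. A minimal Flannel-style conflist looks roughly like this (field values are illustrative of a default Flannel install):

```json
{
  "name": "cbr0",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "flannel",
      "delegate": { "hairpinMode": true, "isDefaultGateway": true }
    },
    {
      "type": "portmap",
      "capabilities": { "portMappings": true }
    }
  ]
}
```

The `type` field names the plugin binary kubelet executes from /opt/cni/bin/ when a pod is created or deleted.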
NODE 1 (physical IP: 192.168.1.10) NODE 2 (physical IP: 192.168.1.11)
Pod A (10.244.0.5) Pod B (10.244.1.7)
+---------------------------+ +---------------------------+
| eth0: 10.244.0.5 | | eth0: 10.244.1.7 |
| route: 0.0.0.0 via .0.1 | | route: 0.0.0.0 via .1.1 |
+-----------|---------------+ +-----------|---------------+
| veth pair | veth pair
v ^
+---------------------------+ +---------------------------+
| cni0 bridge: 10.244.0.1 | | cni0 bridge: 10.244.1.1 |
| route: 10.244.1.0/24 | | |
| -> via flannel.1 | | |
+-----------|---------------+ +-----------|---------------+
| VXLAN encapsulate | VXLAN decapsulate
v ^
+---------------------------+ +---------------------------+
| flannel.1 (VXLAN iface) | UDP --> | flannel.1 (VXLAN iface) |
| src=192.168.1.10:8472 | | dst=192.168.1.11:8472 |
| inner: 0.5 -> 1.7 | | inner: 0.5 -> 1.7 |
+---------------------------+ +---------------------------+
| ^
v |
[Physical Network: 192.168.1.0/24] ----------+
NOTE: Cloud security groups MUST allow UDP 8472 (Flannel) or 4789 (Calico/Cilium)
between all node IPs -- this is the most common cross-node networking failure.

The VXLAN tunnel wraps pod IP packets (inner) inside UDP packets (outer), using node IPs for transport. The inner packet preserves the original pod IPs throughout -- no NAT at any hop.
Container networking misconceptions
Pod A sends a packet to Pod B on a different node
Misconception: “Pods use NAT -- the packet goes out through the node IP, gets rewritten at the node, and the destination node rewrites it back. Like how home network devices share one public IP.”
Reality: “Pods communicate using their actual pod IPs with no NAT. The VXLAN tunnel encapsulates the pod-IP packet in a UDP packet using node IPs for transport, then unwraps it on the destination node, delivering the original pod-IP packet. Pod IPs are preserved end-to-end.”
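This is directly observable with tcpdump on a node: capture on the physical NIC and you see only node-IP UDP traffic; capture on the VXLAN interface and you see the untouched pod IPs (interface names assume a Flannel setup):

```shell
# Outer view -- physical NIC: node IPs and UDP port 8472, payload encapsulated
$ tcpdump -ni eth0 'udp port 8472'
# Inner view -- VXLAN interface: original pod IPs, nothing rewritten
$ tcpdump -ni flannel.1 'host 10.244.0.5'
```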
A NetworkPolicy is applied to a namespace
Misconception: “The Kubernetes API server enforces the NetworkPolicy, blocking traffic at the control plane level when pods try to communicate.”
Reality: “NetworkPolicy is enforced by the CNI plugin running on each node, which translates the policy into iptables rules or eBPF programs on that node. The API server stores the policy object but does not enforce it. If your CNI plugin does not support NetworkPolicy (e.g., Flannel), the policies are stored but silently ignored -- no blocking occurs.”
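The enforcement artifacts are visible on the node itself. With an iptables-based CNI such as Calico, policies materialize as per-pod chains; with Cilium, as attached eBPF programs (exact chain and program names vary by CNI version):

```shell
# iptables-based CNI (e.g. Calico): policies become 'cali-' chains
$ iptables-save | grep -c 'cali-'
# eBPF-based CNI (Cilium): policies live in BPF programs attached to interfaces
$ bpftool net show
```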
Every hop a cross-node packet takes
1. Pod A sends to 10.244.1.7 -- the container process opens a socket. The kernel in Pod A's network namespace looks up its routing table. Route: 0.0.0.0/0 via 10.244.0.1 (the cni0 bridge gateway). Packet exits eth0 (pod side of veth pair).
2. veth pair delivers to cni0 bridge -- the packet crosses the veth pair and arrives at the host namespace side. The bridge sees destination 10.244.1.7 -- not a local pod (local pods are 10.244.0.x). Bridge forwards to the host routing table.
3. Host routing table selects VXLAN interface -- the node's routing table has: 10.244.1.0/24 via dev flannel.1 (added by the CNI plugin when the cluster was set up or when a new node joined). Packet handed to flannel.1.
4. VXLAN encapsulation for cross-node transport -- flannel.1 looks up its FDB (forwarding database) to find which node owns 10.244.1.0/24 (Node 2 at 192.168.1.11). It encapsulates the pod IP packet in UDP: outer src=192.168.1.10 dst=192.168.1.11, UDP port 8472. This packet travels over the physical network.
5. Destination node decapsulates and delivers -- Node 2 receives UDP on port 8472. The VXLAN module decapsulates it, recovering the inner pod IP packet (src=10.244.0.5, dst=10.244.1.7). Host routing table forwards to cni0 bridge. ARP finds the MAC of Pod B via its veth pair. Pod B receives the packet on eth0.
```shell
# Get pod IPs and node assignments
$ kubectl get pods -o wide
NAME    READY   IP           NODE
pod-a   1/1     10.244.0.5   node-1
pod-b   1/1     10.244.1.7   node-2

# From pod-a, test connectivity to pod-b
$ kubectl exec -it pod-a -- ping 10.244.1.7
# If this fails, narrow down the layer:

# ON NODE-1: check that the routing table has a route to pod-b's subnet
$ ip route show | grep 10.244.1
10.244.1.0/24 via 10.244.1.0 dev flannel.1 onlink
# Missing? CNI plugin may not have set up routes for node-2
```

The route 10.244.1.0/24 via flannel.1 is added by the CNI plugin. If it is missing, the CNI did not successfully register the node or its routes.

```shell
# ON NODE-1: check VXLAN FDB (which node owns which pod subnet)
$ bridge fdb show dev flannel.1
52:c5:3a:b3:e4:f1 dst 192.168.1.11 self permanent
# Maps MAC to node-2 IP -- this is how VXLAN knows where to send
```

The VXLAN FDB maps remote pod MAC addresses to the node IP that owns them. It is populated by the CNI control plane (the Flannel daemon) watching for new nodes.

```shell
# ON NODE-1: can node-1 reach node-2 on the VXLAN port?
$ nc -vzu 192.168.1.11 8472
# If this fails: cloud security group blocking UDP 8472 -- MOST COMMON CAUSE
```

UDP port 8472 (Flannel) or 4789 (Calico/Cilium VXLAN) MUST be allowed between all node IPs in cloud security groups. This is the #1 cause of cross-node pod networking failures.

```shell
# Check for MTU issues (VXLAN adds ~50 bytes of overhead)
$ ping -M do -s 1450 192.168.1.11
# If 1450 fails but 1400 works: MTU mismatch
# Fix: set pod MTU to 1450 in the CNI config (default for VXLAN overlay)
```
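The ~50-byte figure used in the MTU check is just the sum of the VXLAN encapsulation headers. A quick sketch of the arithmetic (assuming IPv4 and untagged Ethernet):

```shell
# VXLAN overhead = outer IPv4 + outer UDP + VXLAN header + inner Ethernet
outer_ip=20; outer_udp=8; vxlan_hdr=8; inner_eth=14
overhead=$((outer_ip + outer_udp + vxlan_hdr + inner_eth))
node_mtu=1500
pod_mtu=$((node_mtu - overhead))
echo "overhead=${overhead} pod_mtu=${pod_mtu}"
```

This is why a VXLAN-overlay pod network defaults to a 1450-byte pod MTU on a 1500-byte physical network; with IPv6 outer headers or jumbo frames the numbers shift accordingly.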
Blast radius: container networking failure modes
No NetworkPolicy -- flat network allows any pod to reach any pod
```yaml
# Default: NO NetworkPolicy in namespace
# Any pod anywhere in the cluster can reach the production database.
# A compromised pod in 'dev' namespace connects to 'production' DB.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: production-database
  namespace: production
spec:
  template:
    spec:
      containers:
      - name: postgres
        image: postgres:15
        ports:
        - containerPort: 5432
# No NetworkPolicy exists -- any pod in the cluster can connect
```

```yaml
# Default-deny all ingress, then allow only specific sources
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: database-access-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      role: database
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          env: production   # Only production namespace
      podSelector:
        matchLabels:
          app: api-server   # Only api-server pods
    ports:
    - port: 5432
      protocol: TCP
  egress: []  # No egress allowed (DB only responds, never initiates)
```

Without NetworkPolicy, the pod network is flat -- every pod can reach every other pod regardless of namespace. NetworkPolicy adds L3/L4 firewall rules enforced by the CNI plugin using iptables/eBPF on each node. Apply a default-deny policy per namespace and explicitly whitelist only required connections. Note: Flannel does not enforce NetworkPolicy -- you need Calico, Cilium, or Weave for enforcement.
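The recommended per-namespace baseline is a catch-all default-deny, which allow-list policies like database-access-policy then punch specific holes in (standard NetworkPolicy semantics; apply one per namespace):

```yaml
# Deny all ingress and egress for every pod in the namespace:
# an empty podSelector matches all pods, and listing a policyType
# with no corresponding rules means nothing is allowed
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
```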
| CNI Plugin | Networking Model | NetworkPolicy | Performance | Complexity |
|---|---|---|---|---|
| Flannel | VXLAN overlay | No (needs Calico policy overlay) | Medium (VXLAN overhead) | Low -- simple to operate |
| Calico | BGP or VXLAN | Yes (iptables or eBPF) | High (BGP native: no overlay) | Medium -- BGP config for native mode |
| Cilium | eBPF (no iptables) | Yes (L3/L4/L7 via eBPF) | Highest (bypasses conntrack) | High -- modern kernel required |
| AWS VPC CNI | VPC-native (ENI) | Yes (via Calico or native policy) | Highest (native VPC routing) | Medium -- AWS-only, IP exhaustion risk |
| Weave Net | VXLAN or fast datapath | Yes (iptables) | Medium | Low -- automated mesh setup |
How pod-to-pod traffic moves
📖 What the exam expects
Each pod has a network namespace with a veth pair: one end (eth0) in the pod namespace, one end on the host connected to the cni0 bridge. Cross-node traffic uses CNI overlay (VXLAN encapsulates pod IP packets in UDP) or BGP routing (Calico native mode routes pod IPs directly). No NAT between pods.
Appears in network troubleshooting questions, CNI selection discussions, and security architecture interviews. "Trace a packet from Pod A to Pod B on a different node" is a classic senior Kubernetes interview question.
Strong answer: Mentioning conntrack table exhaustion (Cloudflare 2019 incident). Knowing Cilium bypasses iptables with eBPF. Discussing MTU and VXLAN overhead. Being able to use ip route, ip addr, and bridge fdb on a node to trace packet paths.
Red flags: Thinking pods use NAT to communicate (Kubernetes requires direct pod-to-pod routing). Not knowing what a veth pair is. Believing NetworkPolicy is enforced by the API server (it is enforced by the CNI plugin on each node).