Kubernetes Networking Deep Dive: Surviving the Overlay Chaos in Production
Let's be honest: Kubernetes networking is where the abstraction leaks like a sieve. You deploy a Service, everything looks green in the dashboard, but curl times out and you're suddenly staring at three thousand lines of iptables rules wondering where your packet went to die.
I’ve spent the last month debugging a microservices cluster for a fintech client in Oslo. The symptoms? Intermittent 502 errors and latency spikes that didn't show up in application APM tools. The culprit wasn't code; it was the network overlay choking on packet encapsulation because the underlying virtual machines were fighting for CPU cycles.
In this deep dive, we aren't looking at "Hello World." We are looking at how packets actually move in K8s v1.15+, why your CNI plugin choice matters more than you think, and why running this on subpar infrastructure is a death sentence for performance.
The Flat Network Lie
Kubernetes promises a flat network structure: every pod gets an IP, and every pod can talk to every other pod without NAT. It sounds elegant. Under the hood, it is a complex beast of routing tables, bridges, and encapsulation.
When you run kubectl get pods -o wide, you see IPs like 10.244.1.5. Those IPs don't exist on your physical router. They exist inside the virtual network space created by your CNI (Container Network Interface) plugin.
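You can see the split for yourself from any worker node: the pod subnet is routed through the CNI's virtual device, not your physical gateway. A minimal sketch, assuming Flannel's default 10.244.0.0/16 pod CIDR and its flannel.1 VXLAN device:
# Pod IPs come from the CNI's address space, invisible to your upstream router
kubectl get pods -o wide
# On the node: pod subnets are routed via the overlay device, not eth0
ip route | grep 10.244
# Output looks roughly like:
# 10.244.1.0/24 via 10.244.1.0 dev flannel.1 onlink
# 10.244.2.0/24 via 10.244.2.0 dev flannel.1 onlink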
The CNI War: Flannel vs. Calico
In 2019, if you aren't using a managed cloud CNI, you are likely deciding between Flannel and Calico. This isn't just a preference; it's an architectural decision.
- Flannel (VXLAN): The default for many. It encapsulates Layer 2 frames inside UDP packets (Layer 4). It is simple but adds overhead. Every packet is wrapped and unwrapped. If your VPS has weak CPU performance or "noisy neighbors" stealing cycles, this encapsulation/decapsulation process (encap/decap) introduces jitter.
- Calico (BGP): Uses the Border Gateway Protocol to distribute routing information. No encapsulation (in pure Layer 3 mode). It is faster, but requires an underlying network that allows BGP peering or at least doesn't block the traffic.
Pro Tip: If you are running on CoolVDS, I recommend testing Calico with IPIP encapsulation disabled if your network permits it, or sticking to a high-performance VXLAN backend. Our KVM instances provide the raw CPU performance needed to handle encap/decap without the latency spikes seen on OpenVZ containers.
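If you do disable IPIP, the setting lives on Calico's IPPool resource. A minimal sketch, assuming Calico v3.x managed with calicoctl; the CIDR is a placeholder, so match it to your cluster's pod CIDR:
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 192.168.0.0/16   # placeholder: use your cluster's pod CIDR
  ipipMode: Never        # pure Layer 3 routing via BGP, no encapsulation overhead
  natOutgoing: true
Apply it with calicoctl apply -f and confirm the mode actually changed with calicoctl get ippool -o wide.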
Service Discovery: IPVS is the New King
Until recently, kube-proxy used iptables to handle Service routing. When traffic hit a Service IP, the kernel ran through a list of rules to forward packets to a Pod. This works for 50 services. It fails hard at 5,000 services.
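You can gauge the scale of the problem on your own nodes by counting the chains kube-proxy programs; on large clusters this runs into tens of thousands of rules. A quick check, assuming kube-proxy is still in iptables mode:
# Count the Service/endpoint NAT rules kube-proxy has programmed
iptables-save -t nat | grep -c 'KUBE-SVC\|KUBE-SEP'
# Even dumping the ruleset gets slow once the count explodes
time iptables-save -t nat > /dev/null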
As of Kubernetes 1.11, IPVS (IP Virtual Server) mode for kube-proxy is generally available, and in late 2019 you should be using it. IPVS uses kernel hash tables instead of linear rule lists, so lookup time stays effectively constant, O(1), regardless of cluster size.
To enable IPVS mode in kube-proxy, you need to edit its configuration (on kubeadm clusters, the kube-proxy ConfigMap in kube-system). Here is how we enforce it on our clusters:
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  excludeCIDRs: null
  minSyncPeriod: 0s
  scheduler: "rr"  # Round Robin is usually fine; try 'lc' (Least Connection) for long-lived sockets
  strictARP: false
  syncPeriod: 30s
Before applying this, ensure your nodes have the kernel modules loaded:
# Check for IPVS modules
lsmod | grep ip_vs
# If missing, load them now; add them to /etc/modules-load.d/ to persist across reboots (path varies by distro)
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
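After restarting kube-proxy, verify the mode actually switched; if the modules are missing, kube-proxy quietly falls back to iptables. A quick check, assuming the default metrics port 10249 and ipvsadm installed on the node:
# kube-proxy reports its active mode on the metrics endpoint
curl -s localhost:10249/proxyMode
# expected output: ipvs
# List the virtual servers and real servers IPVS is balancing
ipvsadm -Ln | head -20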
Ingress: The Gatekeeper
Exposing services via NodePort or LoadBalancer is fine for testing, but in production you need an Ingress Controller. NGINX remains the battle-tested standard here. It terminates SSL and routes traffic based on Host headers.
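For reference, a host-based route with TLS termination looks roughly like this. A sketch using the networking.k8s.io/v1beta1 API available since 1.14; the hostname, namespace, and service name are placeholders:
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: shop-ingress
  namespace: production
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  tls:
  - hosts:
    - shop.example.no
    secretName: shop-tls           # created beforehand from your certificate
  rules:
  - host: shop.example.no
    http:
      paths:
      - path: /
        backend:
          serviceName: storefront   # placeholder Service name
          servicePort: 80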
However, a common mistake is neglecting the keep-alive settings and buffer sizes, which leads to dropped connections under load. Here is a production-ready snippet for the nginx-configuration ConfigMap that keeps throughput high for a Norwegian e-commerce site we host:
kind: ConfigMap
apiVersion: v1
metadata:
  name: nginx-configuration
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
data:
  keep-alive: "75"
  keep-alive-requests: "1000"
  upstream-keepalive-connections: "64"
  worker-processes: "auto"
  # Crucial for maximizing I/O on CoolVDS NVMe instances
  log-format-upstream: '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" $request_length $request_time [$proxy_upstream_name] $upstream_addr $upstream_response_length $upstream_response_time $upstream_status $req_id'
Troubleshooting: When It Breaks
When a pod can't reach another pod, don't guess. Use nsenter to step into the pod's network namespace directly from the node. This bypasses the container runtime abstraction and lets you use the node's tools.
- Find the Process ID (PID) of the container:
docker inspect --format '{{.State.Pid}}' <container-id>
- Enter the namespace:
nsenter -t <PID> -n ip addr show
If you see the interface but no traffic flows, check the MTU (Maximum Transmission Unit). A common issue with overlay networks is that the VXLAN header adds 50 bytes. If your physical interface is 1500 MTU and your CNI tries to push 1500 MTU packets inside the tunnel, they get dropped or fragmented. Set your CNI MTU to 1450 to be safe.
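Verifying the MTU takes thirty seconds with ping's don't-fragment flag. A quick probe, assuming Flannel's default device name; 1422 bytes of payload plus 28 bytes of ICMP/IP headers equals 1450 on the wire:
# Check what MTU the overlay device was actually configured with
ip link show flannel.1 | grep mtu
# From inside the pod's namespace (reuse the PID from nsenter above):
# probe with the don't-fragment bit set; the target is a placeholder peer pod IP
nsenter -t <PID> -n ping -M do -s 1422 -c 3 10.244.2.7
If the 1422-byte probe succeeds but a 1472-byte one does not, the 50-byte VXLAN overhead is eating your headroom exactly as described.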
The Physical Layer: Why "Where" Matters
You can optimize iptables and tune NGINX all day, but you cannot configure your way out of physics. Network latency is a killer for microservices. If Service A calls Service B, which calls Service C, a 20ms latency between nodes compounds rapidly.
This is particularly relevant for Norwegian businesses targeting local customers. Routing traffic through Frankfurt or Amsterdam adds unnecessary milliseconds on every hop. You need data residency within Norway, not just for GDPR compliance, but because the speed of light puts a hard floor on round-trip times.
Latency and Etcd
Kubernetes stores its state in etcd. Etcd uses the Raft consensus algorithm, which is extremely sensitive to disk write latency (fsync). If your disk is slow, the leader election times out, and your cluster goes down. I have seen entire clusters fail because they were running on standard SATA SSDs or, heaven forbid, spinning rust.
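Before blaming Kubernetes, benchmark the fsync path the way the etcd maintainers suggest: the 99th percentile fdatasync latency should stay under roughly 10ms. A sketch using fio (install it first, and point --directory at the disk that backs /var/lib/etcd):
# Approximate etcd's WAL pattern: small sequential writes with fdatasync after each
fio --name=etcd-fsync-test --rw=write --ioengine=sync --fdatasync=1 \
    --directory=/var/lib/etcd-test --size=22m --bs=2300
# Read the fsync/fdatasync percentiles in the output; a p99 above ~10ms spells trouble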
This is why we standardized on NVMe storage at CoolVDS. The I/O wait is negligible.
| Feature | Standard VPS | CoolVDS NVMe |
|---|---|---|
| Storage Latency | 2-10ms | <0.5ms |
| Network Drivers | VirtIO (often unoptimized) | VirtIO-Net (Tuned) |
| Virtualization | Container/OpenVZ | KVM (Kernel Isolation) |
Conclusion: Own Your Traffic
Kubernetes networking in 2019 is powerful, but it assumes you have the underlying hardware to support it. Don't layer complex overlays on top of congested, oversold infrastructure.
Whether you are adhering to strict data privacy regulations or just want your API to respond in under 50ms, the foundation is everything. Stop fighting the "noisy neighbors" on cheap shared hosting.
Ready to build a cluster that actually performs? Deploy a high-performance KVM instance in Oslo with CoolVDS today and see the difference NVMe makes to your etcd convergence times.