
Kubernetes Networking in Production: Surviving the Packet Jungle

Most developers treat Kubernetes networking as a black box. You define a Service, maybe an Ingress, and magic happens. But when you are running a high-frequency trading algorithm or a high-traffic media server targeting users in Oslo, "magic" is just another word for "latency I don't understand yet."

I've spent too many nights debugging CrashLoopBackOff caused not by application code, but by obscure packet drops and MTU mismatches. The abstraction leak in Kubernetes is real. If you don't understand the underlying plumbing—iptables, eBPF, VXLAN, and the physical network of your provider—you are building on quicksand.

The CNI Battlefield: Overlay vs. Direct Routing

The Container Network Interface (CNI) is where the rubber meets the road. In 2024, if you are still using the default flannel setup without thinking, you are leaving performance on the table. The choice usually boils down to an overlay network (encapsulation) or direct routing.

Overlay networks (like VXLAN) are easy to set up. They encapsulate packets, creating a virtual network on top of the physical one. But encapsulation costs CPU cycles and adds roughly 50 bytes of header overhead to every packet. In a high-throughput scenario, that overhead, plus the fragmentation you get if you don't shrink the MTU to match, kills performance.

Direct routing (often BGP-based, like Calico or Cilium in native routing mode) maps Pod IPs directly to the underlay network. This is where CoolVDS shines. Because our instances provide raw, unadulterated access to high-performance KVM network drivers, you can push packet processing speeds that rival bare metal.

Code Block: Checking Your CNI Mode

If you are running Cilium, verify you aren't accidentally falling back to encapsulation when you intended direct routing:

kubectl -n kube-system exec -ti cilium-x8s9f -- cilium status --verbose

# Look for this section:
# KubeProxyReplacement:   Strict   [PKS, GKE, EKS, AKS]
# IPv4 BIG TCP:           Enabled
# BandwidthManager:       Enabled
# Routing:                Network (Native) 
# Masquerading:           eBPF
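
Running Calico instead? The equivalent sanity check is the IP pool: IPIPMODE and VXLANMODE should both read Never when you intend native routing. The pool name and CIDR below are the common defaults; yours may differ.

calicoctl get ippool default-ipv4-ippool -o wide

# NAME                  CIDR            NAT    IPIPMODE   VXLANMODE   DISABLED
# default-ipv4-ippool   10.244.0.0/16   true   Never      Never       false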

Pro Tip: If your K8s nodes span multiple data centers (e.g., Oslo and Stockholm), calculate your MTU carefully. The standard 1500 bytes drops to 1450 or less once VXLAN adds its headers. If your application tries to push 1500-byte payloads, fragmentation will destroy your latency. And if your provider supports jumbo frames (we do on private LANs), set the CNI MTU accordingly, e.g. around 8900 on a 9000-byte underlay to leave headroom for encapsulation.
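
Code Block: Verifying the Effective MTU

A quick sanity check is to compare what the node and pod interfaces report, then probe the path with the DF bit set. Interface names and the target IP below are examples; substitute your own.

# On the node: what does the underlay interface report?
ip link show eth0 | grep -o 'mtu [0-9]*'

# Inside a pod (via kubectl debug or exec): what did the CNI assign?
ip link show eth0 | grep -o 'mtu [0-9]*'

# Probe with "don't fragment" set. 1472 = 1500 minus 28 bytes of IP/ICMP headers.
# If this fails but smaller sizes succeed, something on the path is eating your MTU.
ping -M do -s 1472 -c 3 10.0.20.15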

eBPF: The Death of iptables

For years, Kubernetes relied on iptables to manage Service VIPs (Virtual IPs). It works fine for 50 services. But I've seen a cluster with 5,000 services grind to a halt because the kernel had to traverse a massive linear list of rules for every single packet.

By March 2024, eBPF (Extended Berkeley Packet Filter) has become the gold standard. Tools like Cilium use eBPF to bypass iptables entirely, handling routing logic directly in the kernel. This reduces latency and CPU overhead significantly.
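
Code Block: Counting kube-proxy's iptables Rules

You can see the scale problem on any node running kube-proxy in iptables mode: every Service multiplies the NAT rules the kernel has to walk. A rough way to eyeball it:

# Service-related NAT rules programmed by kube-proxy
sudo iptables-save -t nat | grep -c 'KUBE-SVC'

# Total NAT rules, for comparison
sudo iptables-save -t nat | wc -l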

Here is a comparison of how packet processing methods stack up:

Feature        | iptables (Legacy) | IPVS (Better)   | eBPF (Best)
Scalability    | O(n) - Linear     | O(1) - Constant | O(1) - Constant
Latency        | High at scale     | Low             | Lowest
Observability  | Limited           | Standard        | Deep (Layer 7)

War Story: The "Norwegian" Firewall Issue

We recently migrated a large logistics platform to a Kubernetes cluster hosted on standard cloud instances. The requirement was strict: all data related to Norwegian shipments had to stay within the EEA, and latency to the Oslo warehouse terminals had to be sub-20ms. The app was working fine in staging but timed out in production.

The culprit? conntrack table exhaustion. The application opened thousands of short-lived connections to external APIs. The default Linux kernel settings on the nodes were too conservative.
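
Code Block: Confirming conntrack Exhaustion

Before touching sysctls, confirm the diagnosis. The kernel is blunt about it when the table overflows; these checks work on any node with the nf_conntrack module loaded.

# Current vs. maximum tracked connections
cat /proc/sys/net/netfilter/nf_conntrack_count
cat /proc/sys/net/netfilter/nf_conntrack_max

# The smoking gun: the kernel logs drops when the table is full
dmesg | grep 'nf_conntrack: table full'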

We fixed it by tuning the kernel parameters directly on the nodes. This is why having root access to a high-performance VPS is critical. You can't just rely on managed K8s defaults.

Code Block: Tuning Kernel for High Concurrency

# /etc/sysctl.d/99-k8s-networking.conf

# Increase the range of ephemeral ports
net.ipv4.ip_local_port_range = 1024 65535

# Allow reuse of sockets in TIME_WAIT state for new connections
net.ipv4.tcp_tw_reuse = 1

# Increase max open files
fs.file-max = 2097152

# Increase conntrack table size (Crucial for high load K8s)
net.netfilter.nf_conntrack_max = 524288

# Reduce keepalive time
net.ipv4.tcp_keepalive_time = 600

Apply with sysctl --system (a plain sysctl -p only reads /etc/sysctl.conf, not the drop-in directory). If you are on a locked-down managed service, good luck getting support to change these. On CoolVDS, you own the kernel.

Ingress and Security: Don't Expose Naked Pods

Exposing a Service via NodePort or LoadBalancer is fine for testing, but in production, you need an Ingress Controller. NGINX is the standard, but configuration matters. We see attacks targeting Nordic infrastructure constantly. If you aren't rate-limiting at the edge, your pods will drown.
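
Code Block: Rate Limiting at the Ingress

With the community ingress-nginx controller, per-client rate limiting is an annotation away. The hostname, Service name, and limits below are illustrative; tune them against your actual traffic profile.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: frontend
  namespace: production
  annotations:
    # Per-client-IP limits enforced by ingress-nginx
    nginx.ingress.kubernetes.io/limit-rps: "20"
    nginx.ingress.kubernetes.io/limit-connections: "10"
    # Burst allowance = limit-rps * burst-multiplier
    nginx.ingress.kubernetes.io/limit-burst-multiplier: "3"
spec:
  ingressClassName: nginx
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend
            port:
              number: 80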

A robust NetworkPolicy is your second line of defense. By default, K8s allows all traffic. This is a security nightmare. If one pod is compromised, the attacker can scan your entire internal network.

Code Block: A "Deny-All" Default Policy

Start with this and whitelist only what is necessary:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress

Then, allow traffic only from the Ingress controller to your frontend pods. (Because the deny-all policy also blocks Egress, remember to add a separate rule permitting DNS to kube-dns, or in-cluster name resolution breaks.)

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: frontend
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          # This label is set automatically on every namespace since K8s 1.21
          kubernetes.io/metadata.name: ingress-nginx
    ports:
    - protocol: TCP
      port: 80
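
Code Block: Smoke-Testing the Policies

To prove the policies actually bite (your CNI must enforce NetworkPolicy; Calico and Cilium both do), launch a throwaway pod in the namespace and try to reach the frontend, assuming a Service named frontend sits in front of those pods:

# This should time out: the test pod is not the ingress controller, so frontend's
# ingress rule rejects it, and the deny-all policy blocks the test pod's egress anyway.
kubectl run np-test -n production --rm -it --restart=Never --image=busybox:1.36 -- \
  wget -qO- -T 2 http://frontend

# Traffic arriving via the ingress-nginx namespace should still flow normally.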

The Hardware Layer: Why Your VPS Provider Matters

You can tune sysctl and optimize eBPF maps all day, but if the physical interface (NIC) is saturated or the hypervisor is stealing CPU cycles, your latency will spike. This is the "noisy neighbor" effect common in budget hosting.

At CoolVDS, we isolate resources aggressively. When you spin up a node for your cluster, you are getting dedicated NVMe storage bandwidth and guaranteed CPU time. For Kubernetes networking, where packet processing is CPU-intensive (especially with encryption/decryption in mesh setups like Istio), this stability is non-negotiable.

Furthermore, data sovereignty is critical. With servers located in Europe, we help you comply with GDPR and local regulations like those enforced by Datatilsynet. You know exactly where your packets are physically flowing.

Debugging Network Latency

When things go wrong, kubectl logs isn't enough. You need to get into the pod's network namespace. Since many production containers (distroless) don't have shells, use kubectl debug to attach an ephemeral container that ships with network tools.

Code Block: Ephemeral Debug Container

kubectl debug -it pod/backend-api-7f8b9c --image=nicolaka/netshoot --target=backend-api -- bash

# Now inside the pod's network namespace:
tcpdump -i eth0 -n "port 8080 and host 10.244.1.5"

# Check for latency/retransmissions
ss -ti

Conclusion

Kubernetes networking is powerful, but it's not magic. It requires a solid understanding of Linux networking fundamentals and an infrastructure that doesn't get in your way. Whether you are running a microservices mesh or a monolithic legacy app, the underlying hardware determines your ceiling.

Don't let I/O wait times or hypervisor lag kill your cluster's performance. Build your infrastructure on a foundation designed for speed and sovereignty.

Ready to see what raw performance looks like? Deploy a CoolVDS NVMe instance in Oslo today and benchmark your network throughput.