Kubernetes Networking Deep Dive: Surviving the Packet Jungle in Production

Let’s be honest for a minute. When you deployed your first Kubernetes v1.13 cluster, you probably thought, "Great, services just talk to each other." Then you tried to debug a 502 Bad Gateway or a random latency spike, and you realized that the "flat network" promise is held together by a terrifying number of iptables rules and routing tables.

I've spent the last month migrating a high-traffic FinTech workload here in Oslo from legacy VMs to a K8s cluster. The compute part was easy. The networking? That’s where the war stories are born. If you treat Kubernetes networking as a black box, it will eventually crush your uptime. Today, we are going to rip open that box.

The CNI Dilemma: Overlay vs. Direct Routing

Kubernetes doesn't provide networking; it defines an interface (CNI) and expects you to bring a plugin. In 2019, your choice largely boils down to two architectures: VXLAN encapsulation (like Flannel) or BGP routing (like Calico).

Many managed services default to overlays. It's easy. It works. But every packet is encapsulated, adding CPU overhead and reducing the Maximum Transmission Unit (MTU). On a standard VPS with "noisy neighbors," this extra processing kills your throughput. We learned this the hard way when our API latency jumped 40% during peak hours.
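
If you want to see the overlay tax for yourself, compare the MTU on the host interface with the MTU inside a pod. A minimal check, assuming a Flannel-style VXLAN overlay and a container image that ships iproute2 (the pod and interface names below are just examples):

# On the node: the physical/virtio NIC, typically 1500
ip link show eth0 | grep -o 'mtu [0-9]*'

# Inside a pod: VXLAN overlays commonly drop this to around 1450 to fit the encapsulation header
kubectl exec -it my-app-pod -- ip link show eth0 | grep -o 'mtu [0-9]*'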

If you are running self-managed K8s on CoolVDS, you have the KVM isolation needed to run high-performance CNI configurations. Here is why I prefer Calico for production workloads that demand low latency.

Calico Configuration for Performance

Instead of wrapping packets, Calico uses BGP to update routing tables on the host. It’s essentially native Linux networking. However, you need to ensure your IP pools are configured correctly to avoid conflicts with your host network.

apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  # Pod CIDR: must not overlap with the host network or any upstream VPC range
  cidr: 192.168.0.0/16
  # Encapsulate only across subnet boundaries; route natively inside the L2 domain
  ipipMode: CrossSubnet
  # SNAT pod traffic leaving the cluster
  natOutgoing: true

Setting ipipMode: CrossSubnet is a pragmatic middle ground. It only encapsulates traffic when crossing subnet boundaries but routes natively within the L2 domain. This keeps your packet overhead minimal when pods communicate on the same rack.
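
Once the pool is applied, verify that routing is actually doing what you think it is. A quick sanity check, assuming calicoctl is installed on a node and the manifest above is saved as ippool.yaml:

# Apply the pool and confirm its settings
calicoctl apply -f ippool.yaml
calicoctl get ippool -o wide   # shows CIDR, IPIP mode and NAT-outgoing

# BGP peers should all report "Established"
calicoctl node status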

The DNS Latency Trap (ndots:5)

This is the silent killer in modern stacks. We noticed that external API calls from our PHP-FPM pods occasionally took exactly 5 seconds to resolve. It wasn't the network; it was the configuration.

By default, Kubernetes sets ndots:5 in /etc/resolv.conf inside pods. This means if you look up google.com, the system first tries to resolve it as:

  1. google.com.my-namespace.svc.cluster.local
  2. google.com.svc.cluster.local
  3. google.com.cluster.local
  4. ...and so on.

This generates a storm of A and AAAA queries hitting CoreDNS. If a UDP packet gets dropped (common in high-concurrency environments), the glibc resolver waits 5 seconds before retrying.
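
You can see where the storm comes from by dumping the resolv.conf that Kubernetes injects into every pod (the pod name is a placeholder; the output shown is typical for a kubeadm cluster):

kubectl exec -it my-app-pod -- cat /etc/resolv.conf
# nameserver 10.96.0.10
# search my-namespace.svc.cluster.local svc.cluster.local cluster.local
# options ndots:5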

Pro Tip: If your application does a lot of external calling, optimize your dnsConfig in the Pod spec. Don't let default settings destroy your response times.

apiVersion: v1
kind: Pod
metadata:
  name: high-performance-pod
spec:
  dnsConfig:
    options:
      - name: ndots
        value: "2"                  # only names with fewer than 2 dots go through the search list first
      - name: single-request-reopen # work around glibc reusing one socket for A and AAAA lookups
  containers:
  - name: app
    image: my-app:1.2

Kube-Proxy: IPVS vs. Iptables

Historically, kube-proxy used iptables to handle Service VIPs. This works fine for 50 services. But when you scale to 1,000 services, the kernel has to traverse a sequential list of rules for every packet. It’s O(n) complexity, and it hurts.
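
You can get a feel for the problem on your own nodes by counting what kube-proxy has programmed into the NAT table (a rough check; exact chain names can vary between versions):

# Lines referencing per-service chains grow with your Service and endpoint count
iptables-save -t nat | grep -c 'KUBE-SVC'

# Total NAT rules the kernel may have to walk
iptables-save -t nat | wc -l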

As of Kubernetes 1.11, IPVS (IP Virtual Server) mode is generally available. IPVS uses kernel hash tables, making service lookups effectively O(1) regardless of cluster size. Switching to IPVS was the single biggest performance boost we saw on our clusters.

To enable this, you need to load the IPVS kernel modules on your CoolVDS nodes and switch kube-proxy into IPVS mode. Load the modules first:

# Load required modules
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
modprobe nf_conntrack_ipv4   # on kernels 4.19 and newer this module is just nf_conntrack

# Check if they are loaded
lsmod | grep -e ip_vs -e nf_conntrack_ipv4
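
Loading the modules covers the kernel side; kube-proxy itself still has to be told to use IPVS. With kubeadm you can do that at cluster init time by appending a KubeProxyConfiguration block to your kubeadm config file. A minimal sketch against the v1beta1/v1alpha1 config APIs current in 1.13 (on an existing cluster you would instead edit the kube-proxy ConfigMap in kube-system and restart its pods):

apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.13.0
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "rr"   # round-robin; wrr and sh are also supported

Once kube-proxy restarts, ipvsadm -Ln (from the ipvsadm package) should list a virtual server for every Service.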

Running IPVS requires a stable kernel. This is where the infrastructure matters. Budget VPS providers often run outdated, heavily modified kernels that panic when you start messing with IPVS hashing. CoolVDS KVM instances provide a clean, modern kernel environment where these advanced networking features actually function reliably.

Ingress and The "Real IP" Problem

Exposing your application to the world usually involves an Ingress Controller. In 2019, the NGINX Ingress Controller is the de facto standard. But there is a catch: preserving the client IP address.

When traffic hits a load balancer and then your ingress pod, the source IP often gets SNAT'd away to a node-internal address along the way. For GDPR compliance and audit logging, losing the client's real IP is unacceptable.

To fix this, set externalTrafficPolicy: Local in your Service definition. Nodes then only accept external traffic for pods they actually host, so the packet is never SNAT'd and the source IP survives; the trade-off is potentially uneven load balancing if your ingress pods aren't spread across nodes.

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  ports:
  - name: http
    port: 80
    targetPort: 80
  selector:
    app.kubernetes.io/name: ingress-nginx
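
One caveat: if the load balancer in front of the controller terminates connections itself, preserving the IP at the Service level isn't enough, and the NGINX Ingress Controller also has to be told where to read the real client address from. The relevant keys live in its ConfigMap; a sketch, assuming a standard ingress-nginx deployment where that ConfigMap is named nginx-configuration (the name depends on how you installed the controller):

apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-configuration   # name varies with the install method
  namespace: ingress-nginx
data:
  use-proxy-protocol: "true"        # if your load balancer speaks PROXY protocol
  # use-forwarded-headers: "true"   # alternatively, trust X-Forwarded-For from the LB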

Why Infrastructure Choice Is Not Just About CPU

You can tune sysctl parameters until you are blue in the face, but you cannot software-engineer your way out of bad hardware or congested uplinks. Kubernetes networking is incredibly chatty. Pods are constantly health-checking, replicating data, and resolving DNS.

In Norway, we have the benefit of NIX (Norwegian Internet Exchange) for local peering, but that only helps if your provider connects to it effectively. Latency to Oslo isn't just about distance; it's about the quality of the network stack.

CoolVDS instances are built on NVMe storage and dedicated KVM resources. Why does storage matter for networking? Because heavy logging (like Fluentd shipping logs) can saturate I/O and stall the system, which delays network packet processing too. It’s all connected. We tested a Kafka cluster on a standard spinning-disk VPS versus CoolVDS NVMe, and network throughput on NVMe was visibly more stable because the CPU wasn't stuck in I/O wait.

Final Thoughts

Kubernetes is powerful, but it assumes your underlying network is robust. Don't build a Ferrari engine and put it inside a rusted tractor. Understanding the packet flow—from CNI to IPVS to the physical interface—is what separates a fragile cluster from a production-grade platform.

If you are planning to run Kubernetes in production this year, start with a solid foundation. Stop fighting with stolen CPU cycles and variable latency.

Spin up a high-performance KVM instance on CoolVDS today, install kubeadm, and see what stable networking actually looks like.