Stop Debugging Blindly: A Deep Dive into Kubernetes Networking & CNI Performance

You deployed a Service. The Pods are Running. But curl times out.

Welcome to the most frustrating layer of the cloud native stack. In 2020, everyone wants to run Kubernetes, but few understand what happens to a packet when it leaves eth0 inside a container. It doesn't just vanish into the ether; it traverses a maze of network namespaces, bridges, iptables rules, and encapsulation headers.

I have spent the last three weeks debugging a microservices cluster that had intermittent 500ms latency spikes. The code was fine. The database was idle. The culprit? A misconfigured CNI plugin fighting for CPU cycles on a noisy-neighbor VPS. If you are running K8s on cheap, shared hosting, you are already losing packets.

The Overlay Tax: Why Your CPU Matters

Unless you are running flat networking (BGP routing directly to pods), you are likely using an overlay network like VXLAN or IPIP. Every packet sent between nodes must be encapsulated, sent over the wire, and decapsulated.
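
You can see whether you are paying this tax by looking at the node's interfaces. A quick check (interface names vary by CNI; flannel.1 and tunl0 are the common defaults for Flannel and Calico):

# Overlay interfaces an encapsulating CNI creates on each node
ip -d link show type vxlan   # e.g. flannel.1 or vxlan.calico
ip -d link show type ipip    # e.g. tunl0 for Calico in IPIP mode

# Note the MTU: VXLAN costs roughly 50 bytes per packet, IPIP roughly 20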

This process costs CPU cycles. On a dedicated CoolVDS instance with isolated cores, this is negligible. On a shared cloud instance where 'vCPUs' are overcommitted, packet processing gets queued behind someone else's PHP script. This creates jitter.

Pro Tip: Check your si (software interrupt) usage in top. If it is consistently high during traffic spikes, your networking stack is CPU-bound.
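
A quick way to watch this without much extra tooling (mpstat needs the sysstat package; the rest is stock):

# Per-CPU softirq counters; NET_RX climbing rapidly on one core means packet processing is CPU-bound
watch -n1 "grep -E 'NET_(RX|TX)' /proc/softirqs"

# Or the %soft column per CPU, refreshed every second
mpstat -P ALL 1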

Choosing Your Weapon: Calico vs. Flannel

Stop using Flannel just because it is the default in many tutorials. It is simple, yes, but it relies heavily on VXLAN encapsulation, which adds overhead. For production workloads in Europe, where every millisecond to the NIX (Norwegian Internet Exchange) counts, we prefer Calico.

Calico can run in two modes:

  • IPIP (IP in IP): Lower overhead than VXLAN.
  • BGP (No Encap): Wire speed. The pod IP is routable across the network.

If your host allows BGP peering (which we support on advanced CoolVDS setups), turn off encapsulation entirely. If not, use IPIP. Here is how you verify what mode Calico is running in:

calicoctl get ippool -o yaml

Look for ipipMode. If it says Always, you are encapsulating. If you are on a private LAN within our Oslo datacenter, consider switching to CrossSubnet to only encapsulate traffic traversing routers.
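
A minimal sketch of that switch, assuming the stock pool name default-ipv4-ippool (keep the cidr identical to your existing pool; 192.168.0.0/16 here is just the common default):

apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 192.168.0.0/16
  ipipMode: CrossSubnet
  natOutgoing: true

Apply it with calicoctl apply -f ippool.yaml. From then on, only traffic that crosses a subnet boundary gets wrapped in IPIP; same-subnet traffic leaves the node unencapsulated.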

The iptables Bottleneck and the IPVS Solution

By default, Kubernetes uses iptables to implement Services. When you create a Service, kube-proxy appends a long chain of NAT rules to match and forward traffic.

This works fine for 50 services. It is a disaster for 5,000. I have seen iptables-save dumps take 3 seconds to run on clusters with massive service counts. The kernel has to iterate through these rules sequentially.
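
You can put a number on the pain before touching anything. On any node running kube-proxy:

# How many NAT rules has kube-proxy programmed?
iptables-save -t nat | grep -c 'KUBE-'

# Time a full dump; multi-second results mean every Service change is already expensive
time iptables-save > /dev/null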

The Fix: Switch kube-proxy to IPVS mode.

IPVS (IP Virtual Server) is a kernel-level load balancer that uses hash tables. It is O(1) complexity, meaning lookup time is constant regardless of how many services you have. This technology has been in the Linux kernel for years, but K8s support became stable recently (GA in 1.11).

To enable this, edit your kube-proxy ConfigMap (usually in the kube-system namespace):

apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  strictARP: true
  scheduler: "rr"  # Round Robin

Don't forget to ensure the IPVS kernel modules are loaded on your worker nodes before restarting kube-proxy:

# Load modules
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh

# Verify
lsmod | grep ip_vs
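
To make those modules survive a reboot and to confirm kube-proxy actually switched over, something like the following works on systemd-based distros (on older kernels you may also need nf_conntrack_ipv4):

# Persist the modules; systemd-modules-load reads this directory at boot
cat <<EOF > /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
EOF

# After restarting kube-proxy, the virtual servers should show up here (requires the ipvsadm package)
ipvsadm -Ln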

The GDPR Reality: Data Locality in 2020

We cannot talk about networking without addressing the Schrems II ruling from July. The Privacy Shield is dead. If your K8s cluster ingests user data and pipes it to a persistent volume hosted on a US-owned cloud provider, you are in a legal minefield.

Latency is physics; compliance is law. Hosting on CoolVDS in Norway solves both. Your data stays within the EEA (European Economic Area), protected by Norwegian privacy laws, and your packets hit the local internet exchanges without crossing the Atlantic.

Debugging DNS: The 5-Second Delay

If your application throws random timeouts exactly 5 seconds after a request starts, you have a conntrack race condition. This is a classic issue in Linux kernels when using SNAT/DNAT heavily (which K8s does).

It often happens because glibc's resolver sends A and AAAA DNS lookups simultaneously from the same socket. If one of them loses the conntrack insertion race (or the table is simply full) and gets dropped, the resolver waits a full 5 seconds before retrying.

To mitigate this, force TCP for DNS or use the single-request-reopen option in your Pod's dnsConfig:

apiVersion: v1
kind: Pod
metadata:
  name: ubuntu-debug
spec:
  dnsConfig:
    options:
      - name: single-request-reopen
  containers:
  - name: ubuntu
    image: ubuntu:20.04
    command: ["sleep", "3600"]

Alternatively, ensure your underlying host has massive connection tracking limits. On CoolVDS NVMe instances, we tune sysctl defaults to handle high-concurrency workloads out of the box.
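
If you manage your own nodes, the relevant knobs look roughly like this (1048576 is just an example ceiling; size it to your RAM):

# Current usage vs. limit
sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max

# Raise the ceiling now and persist it across reboots
sysctl -w net.netfilter.nf_conntrack_max=1048576
echo 'net.netfilter.nf_conntrack_max = 1048576' > /etc/sysctl.d/99-conntrack.conf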

Real-World Ingress Tuning

Nginx Ingress is the standard. But the defaults are for compatibility, not speed. If you are serving high-traffic APIs, you need to tune the buffers and keepalives.

Here is a snippet from a production nginx-configuration ConfigMap I deployed last week for a client dealing with heavy POST requests:

data:
  # Increase buffer size for large headers/payloads
  proxy-buffer-size: "16k"
  proxy-buffers: "4 32k"
  proxy-busy-buffers-size: "64k"
  
  # Keepalive performance
  keep-alive: "65"
  upstream-keepalive-connections: "100"
  upstream-keepalive-timeout: "30"
  
  # Security & Latency
  ssl-protocols: "TLSv1.2 TLSv1.3"
  worker-processes: "auto"
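
After applying the ConfigMap, the controller reloads nginx on the fly. You can check the rendered config to confirm the values took effect (namespace and pod name depend on how you installed the controller):

kubectl -n ingress-nginx get pods
kubectl -n ingress-nginx exec <controller-pod> -- nginx -T | grep -E 'proxy_buffer|keepalive'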

Why Infrastructure Underlies Everything

You can optimize your CNI, switch to IPVS, and tune Nginx until you are blue in the face. But if the hypervisor below you is stealing cycles (Steal Time) or the storage I/O is saturated, your Kubernetes cluster will feel sluggish.

Kubernetes is an orchestration engine, not a magician. It needs raw, consistent compute power.

Feature             Generic VPS               CoolVDS NVMe
Storage latency     High (network storage)    Low (local NVMe)
CPU isolation       Shared / noisy            Dedicated resources
Network location    Often unknown routing     Direct peering in Oslo

Don't let slow I/O or network jitter kill your SEO or user experience. Kubernetes requires a solid foundation.

Ready to run K8s the way it was meant to be run? Deploy a high-performance instance on CoolVDS today and feel the difference a properly tuned network stack makes on dedicated hardware.