Stop Debugging Blindly: A Deep Dive into Kubernetes Networking & CNI Performance
You deployed a Service. The Pods are Running. But curl times out.
Welcome to the most frustrating layer of the cloud native stack. In 2020, everyone wants to run Kubernetes, but few understand what happens to a packet when it leaves eth0 inside a container. It doesn't just vanish into the ether; it traverses a maze of network namespaces, bridges, iptables rules, and encapsulation headers.
I have spent the last three weeks debugging a microservices cluster that had intermittent 500ms latency spikes. The code was fine. The database was sleeping. The culprit? A misconfigured CNI plugin fighting for CPU cycles on a noisy neighbor VPS. If you are running K8s on cheap, shared hosting, you are already losing packets.
The Overlay Tax: Why Your CPU Matters
Unless you are running flat networking (BGP routing pod IPs directly), you are likely using an overlay network like VXLAN or IPIP. Every packet sent between nodes must be encapsulated, sent over the wire, and decapsulated.
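Not sure whether your cluster is encapsulating at all? The tunnel interfaces your CNI creates on each node are a quick tell. A rough check (interface names vary by plugin; flannel.1 and tunl0 are the usual suspects):
# VXLAN devices: Flannel's flannel.1, Calico's vxlan.calico
ip -d link show type vxlan
# IPIP tunnels: Calico's tunl0
ip -d link show type ipip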
This process costs CPU cycles. On a dedicated CoolVDS instance with isolated cores, this is negligible. On a shared cloud instance where 'vCPUs' are overcommitted, packet processing gets queued behind someone else's PHP script. This creates jitter.
Pro Tip: Check your si (software interrupt) usage in top. If it is consistently high during traffic spikes, your networking stack is CPU-bound.
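On most distributions, mpstat (from the sysstat package) breaks softirq time out per core, and /proc/softirqs shows which interrupts are burning it:
# %soft is the softirq column; watch it per core during a traffic spike
mpstat -P ALL 1
# NET_RX / NET_TX are the packet-processing softirqs
watch -n1 'grep -E "NET_RX|NET_TX" /proc/softirqs'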
Choosing Your Weapon: Calico vs. Flannel
Stop using Flannel just because it is the default in many tutorials. It is simple, yes, but it relies heavily on VXLAN encapsulation which has overhead. For production workloads in Europe, where every millisecond to the NIX (Norwegian Internet Exchange) counts, we prefer Calico.
Calico can run in two modes:
- IPIP (IP in IP): Lower overhead than VXLAN.
- BGP (No Encap): Wire speed. The pod IP is routable across the network.
If your host allows BGP peering (which we support on advanced CoolVDS setups), turn off encapsulation entirely. If not, use IPIP. Here is how you verify what mode Calico is running in:
calicoctl get ippool -o yaml
Look for ipipMode. If it says Always, you are encapsulating. If you are on a private LAN within our Oslo datacenter, consider switching to CrossSubnet to only encapsulate traffic traversing routers.
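Switching is just a matter of editing the IPPool and re-applying it. A minimal sketch, assuming the usual default pool name default-ipv4-ippool (check calicoctl get ippool for yours):
# Export the pool, flip IPIP from Always to CrossSubnet, re-apply
calicoctl get ippool default-ipv4-ippool -o yaml > pool.yaml
sed -i 's/ipipMode: Always/ipipMode: CrossSubnet/' pool.yaml
calicoctl apply -f pool.yaml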
The iptables Bottleneck and the IPVS Solution
By default, Kubernetes uses iptables to implement Services. When you create a Service, kube-proxy appends a long list of NAT rules that forward traffic to the backing Pods.
This works fine for 50 services. It is a disaster for 5,000. I have seen iptables-save dumps take 3 seconds to run on clusters with massive service counts. The kernel has to iterate through these rules sequentially.
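You can gauge how bad it is on your own nodes. These counts are a rough proxy for lookup cost (kube-proxy programs roughly one KUBE-SVC chain per Service):
# Total NAT rules kube-proxy has programmed
iptables-save -t nat | wc -l
# How many of them belong to Services
iptables-save -t nat | grep -c KUBE-SVC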
The Fix: Switch kube-proxy to IPVS mode.
IPVS (IP Virtual Server) is a kernel-level load balancer that uses hash tables. It is O(1) complexity, meaning lookup time is constant regardless of how many services you have. This technology has been in the Linux kernel for years, but K8s support became stable recently (GA in 1.11).
To enable this, edit your kube-proxy ConfigMap (usually in the kube-system namespace):
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  strictARP: true
  scheduler: "rr" # Round Robin
Don't forget to ensure the IPVS kernel modules are loaded on your worker nodes before restarting kube-proxy:
# Load modules
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
# Verify
lsmod | grep ip_vs
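Then restart kube-proxy and confirm it actually switched, because it falls back to iptables if the modules are missing. The commands below assume the stock kube-proxy DaemonSet and the default metrics port (10249):
# Restart kube-proxy so it picks up the new ConfigMap
kubectl -n kube-system rollout restart daemonset kube-proxy
# On a worker node: should print "ipvs"
curl -s localhost:10249/proxyMode
# ipvsadm lists the hash-table backed virtual servers IPVS is managing
ipvsadm -Ln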
The GDPR Reality: Data Locality in 2020
We cannot talk about networking without addressing the Schrems II ruling from July. The Privacy Shield is dead. If your K8s cluster is ingesting user data and piping it to a persistent volume hosted on a US-owned cloud provider, you are in a legal minefield.
Latency is physics; compliance is law. Hosting on CoolVDS in Norway solves both. Your data stays within the EEA (European Economic Area), protected by Norwegian privacy laws, and your packets hit the local internet exchanges without crossing the Atlantic.
Debugging DNS: The 5-Second Delay
If your application throws random timeouts exactly 5 seconds after a request starts, you have a conntrack race condition. This is a classic issue in Linux kernels when using SNAT/DNAT heavily (which K8s does).
It often happens when glibc's resolver sends the A and AAAA lookups in parallel over the same socket. A race in conntrack's NAT handling can drop one of the two UDP packets, and the resolver then sits out its full 5-second timeout before retrying.
To mitigate this, force TCP for DNS or use the single-request-reopen option in your Pod's dnsConfig:
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu-debug
spec:
  dnsConfig:
    options:
      - name: single-request-reopen
  containers:
    - name: ubuntu
      image: ubuntu:20.04
      command: ["sleep", "3600"]
Alternatively, ensure your underlying host has generous connection-tracking limits (net.netfilter.nf_conntrack_max). On CoolVDS NVMe instances, we tune sysctl defaults to handle high-concurrency workloads out of the box.
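The insert_failed counter from conntrack-tools is the tell-tale sign of the race, and nf_conntrack_max is the ceiling you care about. The value below is illustrative, not a one-size-fits-all recommendation:
# A climbing insert_failed count means packets are being dropped in the NAT path
conntrack -S | grep -o 'insert_failed=[0-9]*'
# Current usage vs. limit
sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max
# Raise the ceiling (illustrative value)
sysctl -w net.netfilter.nf_conntrack_max=1048576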
Real-World Ingress Tuning
Nginx Ingress is the standard. But the defaults are for compatibility, not speed. If you are serving high-traffic APIs, you need to tune the buffers and keepalives.
Here is a snippet from a production nginx-configuration ConfigMap I deployed last week for a client dealing with heavy POST requests:
data:
  # Increase buffer size for large headers/payloads
  proxy-buffer-size: "16k"
  proxy-buffers: "4 32k"
  proxy-busy-buffers-size: "64k"
  # Keepalive performance
  keep-alive: "65"
  upstream-keepalive-connections: "100"
  upstream-keepalive-timeout: "30"
  # Security & Latency
  ssl-protocols: "TLSv1.2 TLSv1.3"
  worker-processes: "auto"
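The controller reloads nginx on the fly after a ConfigMap change, and you can check that the values made it into the rendered config. The namespace and deployment names below are the common defaults for the community ingress-nginx install, so adjust them to match yours:
# Dump the live nginx config from the controller and grep for the tuned directives
kubectl -n ingress-nginx exec deploy/ingress-nginx-controller -- \
  nginx -T | grep -E 'proxy_buffer_size|keepalive_timeout'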
Why Infrastructure Underlies Everything
You can optimize your CNI, switch to IPVS, and tune Nginx until you are blue in the face. But if the hypervisor below you is stealing cycles (Steal Time) or the storage I/O is saturated, your Kubernetes cluster will feel sluggish.
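Steal time is easy to check: the st column in vmstat (or %st in top) is how many cycles the hypervisor is handing to other tenants instead of you:
# Five one-second samples; a consistently non-zero "st" column means noisy neighbors
vmstat 1 5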
Kubernetes is an orchestration engine, not a magician. It needs raw, consistent compute power.
| Feature | Generic VPS | CoolVDS NVMe |
|---|---|---|
| Storage Latency | High (Network Storage) | Low (Local NVMe) |
| CPU Isolation | Shared / Noisy | Dedicated Resources |
| Network Location | Often unknown routing | Direct peering in Oslo |
Don't let slow I/O or network jitter kill your SEO or user experience. Kubernetes requires a solid foundation.
Ready to run K8s the way it was meant to be run? Deploy a high-performance instance on CoolVDS today and see the difference strictARP makes on dedicated hardware.