Kubernetes Networking Explained: CNI, IPVS, and Debugging Production Clusters
Let’s be honest: Kubernetes networking is usually the layer that keeps us up at night. The scheduling logic is elegant, but the networking stack—pod-to-pod communication, service discovery, and ingress—is a complex beast of iptables rules, routing tables, and encapsulation. I recently spent three days debugging a microservices setup for a client in Oslo where random packets were dropping between the frontend and the payment gateway. The culprit wasn't code; it was a saturated conntrack table on the host nodes.
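If you suspect the same failure mode, the conntrack counters on each node are the first place to look. A minimal sketch of the checks (the limit set at the end is illustrative; size the table against your node's RAM):

# How close the node is to the connection-tracking limit
sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max

# The kernel logs this line when it starts dropping packets
dmesg | grep "nf_conntrack: table full"

# Temporary relief while you hunt down the source of connection churn
sysctl -w net.netfilter.nf_conntrack_max=262144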
If you treat Kubernetes networking as a black box, you will eventually face an outage you can't explain. Today, we are tearing that box open. We will look at choosing the right CNI (Container Network Interface), the performance implications of the new IPVS mode in kube-proxy, and why your choice of underlying VPS provider in Norway impacts your overlay network overhead more than you think.
The CNI Jungle: Flannel vs. Calico in 2018
When you initialize a cluster with kubeadm init, you aren't done until you apply a CNI plugin. In 2018, the two heavyweights are Flannel and Calico. Your choice here dictates your network performance.
Flannel is the simple choice. It typically uses VXLAN, which encapsulates Layer 2 Ethernet frames inside UDP packets. It works everywhere, but that encapsulation comes with a CPU cost. Every packet leaving a pod is wrapped, sent across the wire, and unwrapped. On a high-traffic node, this CPU overhead adds up.
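For reference, the backend is chosen in Flannel's net-conf.json, which ships inside the kube-flannel ConfigMap. A sketch of the default VXLAN configuration (the pod CIDR is the usual example value):

{
  "Network": "10.244.0.0/16",
  "Backend": {
    "Type": "vxlan"
  }
}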
Calico, on the other hand, can run in pure Layer 3 mode using BGP (Border Gateway Protocol). No encapsulation headers, just pure routing. If your underlying infrastructure supports it, this is the performance winner.
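With Calico 3.x the encapsulation mode lives on the IP pool itself. Roughly, a pool with IPIP disabled looks like this when applied with calicoctl (the pool name and CIDR are assumptions; match them to your cluster):

apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 192.168.0.0/16
  ipipMode: Never
  natOutgoing: true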
Pro Tip: If you are running on a provider that blocks BGP or filters unknown MAC addresses (common in cheap shared clouds), you might be forced into VXLAN. CoolVDS KVM instances provide the isolation needed to run these protocols without the "noisy neighbor" interference that plagues standard shared hosting.
Configuring Calico for Policy Enforcement
One major reason we lean towards Calico at the moment is support for NetworkPolicy. Flannel (by default) just connects things; Calico secures them. With GDPR now in full effect as of May, you cannot have a database pod accepting connections from just anywhere in the cluster.
Here is a baseline NetworkPolicy that denies all ingress traffic by default, a configuration that belongs in every namespace you deploy:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
Once applied, you explicitly whitelist traffic. This follows the "Zero Trust" model that Datatilsynet (The Norwegian Data Protection Authority) effectively mandates for sensitive data handling.
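As an example of that whitelisting, here is a sketch of a policy that lets only pods labelled app: payment-api reach the database pods on the Postgres port (the labels and port are assumptions; adjust them to your own manifests):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-payment-to-db
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: database
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: payment-api
    ports:
    - protocol: TCP
      port: 5432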
The Shift to IPVS (IP Virtual Server)
This is the most exciting development in Kubernetes 1.11 (stable since June). Traditionally, kube-proxy has used iptables to implement Service VIPs: when you create a Service, it writes iptables rules that redirect traffic to the backend pods.
The problem? iptables is a linear list. If you have 5,000 services, the kernel has to traverse a massive chain of rules for every new connection. O(n) complexity kills performance at scale.
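You can get a rough sense of the damage on your own nodes by counting what kube-proxy has already programmed into the nat table:

# Per-service chains kube-proxy maintains
iptables-save -t nat | grep -c 'KUBE-SVC'

# Total nat rules the kernel may have to walk for a new connection
iptables-save -t nat | wc -l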
IPVS is a kernel-space load balancer based on hash tables. It has O(1) complexity. It doesn't care if you have 5 services or 5,000; the lookup time is virtually the same. To enable this in your cluster, you need to ensure the IPVS kernel modules are loaded on your worker nodes before starting kube-proxy.
# Load required kernel modules
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
# Check if they are loaded
lsmod | grep -e ip_vs -e nf_conntrack_ipv4
If you are managing your own control plane, you then configure kube-proxy with mode: "ipvs". The latency difference in high-churn environments is noticeable.
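With kubeadm that means editing the kube-proxy ConfigMap in kube-system. A sketch of the relevant KubeProxyConfiguration fields (the scheduler is an assumption; rr is the default round-robin):

apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "rr"

After the kube-proxy pods restart, ipvsadm -Ln on a worker should list one virtual server per Service IP. If it comes back empty, kube-proxy has silently fallen back to iptables mode, usually because the modules above were never loaded.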
The Hardware Reality: Latency and Virtualization
Software optimization assumes the hardware underneath is reliable. This is where many DevOps engineers go wrong: they spend weeks tuning sysctl.conf parameters, then deploy on oversold VPS hosts.
Kubernetes adds layers: Pod -> CNI -> Host Network -> Physical Interface. If you are using VXLAN, you are adding packet fragmentation risks and CPU overhead for encapsulation. If the physical host underneath your VM is stealing CPU cycles (Steal Time) because the hosting provider oversold the CPU, your network throughput crashes. It doesn't matter how good your Calico config is if the hypervisor isn't scheduling your VM's CPU instructions fast enough.
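Two quick checks expose both problems with nothing beyond standard tooling (flannel.1 is the default VXLAN device; the pod IP and payload size are examples for a 1450-byte overlay MTU):

# Steal time: the st column shows CPU cycles the hypervisor took away from your VM
vmstat 1 5

# VXLAN adds roughly 50 bytes of headers; confirm the overlay MTU was lowered to match
ip -d link show flannel.1

# Verify the path MTU end-to-end with fragmentation forbidden (1422 + 28 bytes of headers = 1450)
ping -M do -s 1422 -c 3 10.244.1.10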
| Feature | Standard Container VPS | CoolVDS (KVM) |
|---|---|---|
| Virtualization | Container-based (LXC/OpenVZ) | Hardware-assisted (KVM) |
| Kernel Access | Shared Kernel (Restricted) | Dedicated Kernel (Full Control) |
| IPVS Support | Often blocked | Native Support |
| IOPS Consistency | Fluctuates (noisy neighbors) | Consistent |