Kubernetes Networking Autopsy: Killing Latency Before It Kills Your Pods

Let’s be honest: Kubernetes networking is where 90% of "stable" clusters eventually go to die. You deploy your microservices, everything looks green in the dashboard, but your latency to Oslo is fluctuating wildly, and you have random TCP timeouts that leave no trace in the application logs.

I recently spent three sleepless nights debugging a cluster for a Norwegian fintech client. They were blaming the database code. The real culprit? A default VXLAN configuration on a noisy public cloud provider that was dropping encapsulated packets whenever the neighbor VM spiked its CPU.

Kubernetes is not magic. It is just Linux namespaces, bridges, and a terrifying number of iptables rules glued together. If you don't understand the plumbing, you will drown in it. Today, we are going deep into the CNI wars, the iptables vs. IPVS debate, and why your hosting provider's hardware matters more than your manifest files.

The CNI Battlefield: Calico, Flannel, or Cilium?

As of May 2021, the Container Network Interface (CNI) landscape is crowded. Your choice here dictates your cluster's performance ceiling.

1. Flannel: The "Just Works" Trap

Flannel is simple. It creates a flat overlay network (usually VXLAN). It is great for a Raspberry Pi cluster in your basement. It is terrible for high-throughput production. The encapsulation overhead is real, and it lacks support for Network Policies. If you are running a business in a post-Schrems II Europe, you need security controls, not just connectivity.

2. Calico: The Industry Standard

Calico gives you options. You can use encapsulation (IPIP or VXLAN) or, if your underlying network supports it, unencapsulated BGP peering. For most VPS setups where Layer 2 access is restricted, you will likely use IPIP.

Here is a snippet from a standard calico.yaml configuration ensuring we handle MTU correctly (a common source of packet drops):

kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: calico-node
  namespace: kube-system
spec:
  template:
    spec:
      containers:
        - name: calico-node
          env:
            # Auto-detect the BGP IP address.
            - name: IP
              value: "autodetect"
            # Set the MTU to the VPS interface MTU minus encapsulation overhead (1500 - 20 = 1480 for IPIP)
            - name: FELIX_IPINIPMTU
              value: "1480"

3. Cilium: The eBPF Future

Cilium is gaining serious traction this year. By bypassing iptables and using eBPF (Extended Berkeley Packet Filter) inside the kernel, it offers visibility and speed that pure iptables setups can't match. If you are pushing high packet rates, look at Cilium.
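
If you want to kick the tires, the Helm chart is the quickest route. A minimal sketch, assuming Helm 3 and a 1.9-series release; chart values change between versions, so verify them against the docs for whatever you install:

helm repo add cilium https://helm.cilium.io/
helm repo update
# Install Cilium as the CNI in kube-system. The chart also exposes a
# kubeProxyReplacement value if you want eBPF to take over Service routing
# from kube-proxy entirely (check the accepted values for your version).
helm install cilium cilium/cilium --namespace kube-system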

The Bottleneck: iptables vs. IPVS

By default, Kubernetes kube-proxy uses iptables mode. This means every Service you create adds a chain of rules. When you hit 5,000 services, the kernel has to traverse a sequential list of rules to route a packet. This is O(n) complexity. It is slow.
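
You can see the scale of the problem on any busy node. A quick sketch (assumes kube-proxy is in iptables mode and uses the standard KUBE- chain prefixes):

# Count the per-Service chains kube-proxy has programmed
iptables-save -t nat | grep -c '^-A KUBE-SVC'
# Total NAT rules the kernel may have to walk for a packet
iptables-save -t nat | wc -l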

IPVS (IP Virtual Server) uses hash tables. It is O(1). It doesn't care if you have 10 services or 10,000. Switch your kube-proxy to IPVS mode immediately if you scale.

To verify what mode you are running:

kubectl logs -n kube-system -l k8s-app=kube-proxy | grep "Using"

If you see "Using iptables Proxier", it’s time to edit your ConfigMap.

Pro Tip: IPVS requires specific kernel modules to be loaded on the host system before kube-proxy starts. On a CoolVDS instance, we ensure the kernel is prepped, but always check lsmod | grep ip_vs before applying the change.
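
If you need to prep the modules by hand, here is a minimal sketch (module names assume a reasonably recent kernel; older kernels ship nf_conntrack_ipv4 instead of nf_conntrack):

# Load the IPVS modules kube-proxy needs, plus conntrack
for mod in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack; do
  modprobe "$mod"
done
# Persist across reboots (systemd-based distros)
printf '%s\n' ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack > /etc/modules-load.d/ipvs.conf
# Confirm they are loaded
lsmod | grep ip_vs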

Security: The Norway & GDPR Factor

Running a cluster in 2021 without Network Policies is negligence. With the strict enforcement of GDPR and Datatilsynet watching, you cannot allow all pods to talk to all pods. If your frontend is compromised, it shouldn't have direct access to your billing database.

This NetworkPolicy denies all ingress traffic to the namespace by default, forcing you to whitelist specific paths:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress

Once applied, you explicitly open ports only for what is necessary:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 5432

Why Infrastructure Dictates Network Performance

Here is the uncomfortable truth: You can tune sysctls all day, but if your Virtual Private Server (VPS) suffers from "noisy neighbor" syndrome, your network latency will spike.

Network processing in Linux is CPU-intensive. When a packet arrives, an interrupt is triggered. If your host's CPU is busy running someone else's PHP script, that packet sits in the ring buffer. Eventually, the buffer overflows and the packet is dropped.
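
Two quick checks from inside the guest make this visible (a sketch; the thresholds are rules of thumb, not hard limits):

# The "st" column is CPU time stolen by the hypervisor;
# anything persistently above a couple of percent will show up as latency
vmstat 1 5
# Per-CPU softnet stats: the second hex column counts packets dropped
# because the backlog queue was already full
cat /proc/net/softnet_stat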

Performance Metric | Standard VPS | CoolVDS Architecture
CPU Steal | High (shared resources) | Near zero (dedicated allocation)
Disk I/O (etcd) | SATA/SAS SSD mix | Pure NVMe (crucial for the K8s API)
Latency (Oslo) | Variable (congested routes) | < 2 ms (optimized peering)

When we designed the CoolVDS platform, we specifically looked at the demands of modern orchestration like Kubernetes. K8s relies heavily on etcd for state. Etcd is incredibly sensitive to disk latency. If your disk fsync takes too long, the API server hangs, and network updates (like new endpoint slices) get delayed.
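
You can measure this yourself with fio, using roughly the small-write, fsync-heavy pattern etcd's WAL produces (a sketch; /var/lib/etcd is an assumption, point --directory at the disk etcd actually uses):

# Small sequential writes with fdatasync after every write, like etcd's WAL.
# Watch the fsync/fdatasync latency percentiles; the 99th percentile
# should stay in the single-digit milliseconds.
fio --name=etcd-fsync-test \
    --directory=/var/lib/etcd \
    --rw=write --ioengine=sync --fdatasync=1 \
    --size=22m --bs=2300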

Debugging Network Latency

If you suspect underlying network issues, stop looking at Grafana and get into the shell.

1. Check for dropped packets on the interface:

netstat -i

Look at the RX-DRP and TX-DRP columns. Non-zero numbers here usually mean the VPS can't process packets fast enough.

2. Check Conntrack usage:

Kubernetes fills the connection tracking table rapidly. If this fills up, new connections are dropped silently.

sysctl net.netfilter.nf_conntrack_count
sysctl net.netfilter.nf_conntrack_max

If you are close to the limit, increase it in /etc/sysctl.conf:

net.netfilter.nf_conntrack_max = 262144
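
Apply it without waiting for a reboot (a sketch; on some distros the nf_conntrack module must be loaded before this sysctl key exists):

# Push the new limit into the running kernel
sysctl -w net.netfilter.nf_conntrack_max=262144
# Re-read /etc/sysctl.conf so the persistent value matches
sysctl -p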

The Data Residency Advantage

Since the Schrems II ruling last year, many Norwegian dev teams are scrambling to move workloads out of US-controlled cloud regions. Latency isn't just about physics anymore; it's about legal compliance. Hosting on CoolVDS ensures your data sits physically in Oslo, subject to Norwegian law, reducing your GDPR exposure significantly compared to the hyperscalers.

Final Thoughts

Kubernetes networking is complex, but it is deterministic. It fails for reasons you can find if you look deep enough. Don't accept packet loss as "normal." Don't accept 50ms latency between pods as "overhead."

We built CoolVDS to handle the I/O storms that Kubernetes generates. If you are tired of fighting your provider's CPU steal while trying to debug a CNI mesh, it is time for a change.

Ready to stabilize your cluster? Deploy a high-performance NVMe KVM instance on CoolVDS today and see what 0% CPU steal feels like.