Kubernetes Networking is Not Magic. It's Just Packet Encapsulation Hell.
Let’s get one thing straight immediately: Kubernetes networking is not magic. It is a terrifyingly complex abstraction layer sitting on top of Linux primitives that have existed for decades. If you think you can deploy a production cluster without understanding iptables chains or the overhead of VXLAN, you are going to have a bad time when—not if—a pod loses connectivity at 3 AM.
I’ve spent the last six months debugging a microservices architecture for a fintech client in Oslo. They were complaining about "random" 502 errors and high latency. The developers blamed the code. The network engineers blamed the firewall. The reality? It was a default CNI configuration choking on packet encapsulation overhead, compounded by noisy neighbor issues on their previous budget VPS provider.
In this deep dive, we are ripping off the abstraction layer. We will look at CNI choices available right now in early 2020, why kube-proxy mode matters, and how to configure your cluster on CoolVDS to minimize the latency penalties that kill performance.
The CNI Battlefield: Flannel vs. Calico
When you initialize a cluster with kubeadm, you have a choice. Many tutorials blindly tell you to apply Flannel because it's "simple." Simplicity has a cost.
Flannel creates a VXLAN overlay network. Every packet leaving a pod is encapsulated in a UDP packet, sent across the physical network, and decapsulated on the target node. That is CPU overhead. That is MTU fragmentation risk. On a high-throughput cluster, it’s a bottleneck.
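You can see that tax directly on a Flannel node. The VXLAN header eats roughly 50 bytes of every packet, so the overlay interface advertises a smaller MTU than the physical NIC (interface names assume a stock Flannel install on eth0):

# Physical NIC, typically 1500 bytes
ip link show eth0 | grep -o 'mtu [0-9]*'
# Flannel's VXLAN device, typically 1450 bytes after the header overhead
ip link show flannel.1 | grep -o 'mtu [0-9]*'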
Calico, on the other hand, can run in pure Layer 3 mode using BGP (Border Gateway Protocol). No encapsulation. Just routing. If your underlying infrastructure supports it—and CoolVDS KVM instances absolutely do—you get near-metal networking speeds.
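Once Calico is running (installation below), you can confirm the nodes really are exchanging routes over BGP rather than tunnelling. A quick check, assuming the calicoctl binary is present on a node:

# Each node-to-node mesh peer should show state "up" with "Established" in the info column
sudo calicoctl node status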
Here is how you apply Calico 3.11 (the current stable release) correctly, ensuring you set the MTU to match the interface of the underlying VPS:
kubectl apply -f https://docs.projectcalico.org/v3.11/manifests/calico.yaml
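# The same manifest ships a calico-config ConfigMap whose veth_mtu key sets the
# pod interface MTU (key name as of the v3.11 manifest; verify it in your copy).
# With pure BGP and no encapsulation you can usually match the host NIC at 1500;
# lower it if you later enable IP-in-IP or VXLAN. Restart the calico-node pods
# afterwards so the CNI config regenerates.
kubectl edit configmap calico-config -n kube-system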
# CRITICAL: Check your IP_AUTODETECTION_METHOD if you have multiple interfaces
# Edit the daemonset:
kubectl set env daemonset/calico-node -n kube-system IP_AUTODETECTION_METHOD=interface=eth0

Pro Tip: If you are hosting in Norway and dealing with GDPR data, use Calico's GlobalNetworkPolicy to strictly whitelist traffic. A default-deny policy is the only way to satisfy Datatilsynet auditors that you actually control your data flow.
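A minimal sketch of what that default-deny might look like, assuming calicoctl is configured against your cluster's datastore. The policy name and selector scope are illustrative; a truly global deny would also catch kube-system and break cluster plumbing, so we carve it out using the namespace label Calico stamps on every workload endpoint.

apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: default-deny-app-ingress
spec:
  # Exclude kube-system so DNS, the CNI and the control plane keep working
  selector: projectcalico.org/namespace not in {"kube-system"}
  types:
  - Ingress

Apply it with calicoctl apply -f and nothing reaches your workloads until you write explicit allow rules.

The iptables vs. IPVS Debate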
By default, Kubernetes uses iptables to implement Services. When a Service has a ClusterIP, kube-proxy writes iptables rules to trap traffic destined for that IP and redirect it to a Pod.
This works fine for 50 services. It works okay for 500. But if you are running a large cluster with thousands of services, iptables becomes a nightmare. It is a sequential list. To match a rule, the kernel has to traverse the list. This is O(n) complexity. I've seen rule updates take seconds to apply on busy clusters.
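You can gauge how deep that list already is on one of your nodes, since everything kube-proxy programs lives in chains prefixed with KUBE-:

# Rough count of kube-proxy's NAT rules on this node
iptables-save -t nat | grep -c 'KUBE-'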
The solution in 2020 is IPVS (IP Virtual Server). It uses a hash table for lookups—O(1) complexity. It is faster, more stable, and supports better load balancing algorithms like least-connection.
Enabling IPVS Mode
First, ensure the IPVS kernel modules are loaded on your CoolVDS node (our images have these available by default):
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
modprobe nf_conntrack_ipv4

Next, edit your kube-proxy ConfigMap to switch modes:
kubectl edit configmap kube-proxy -n kube-system

Look for the mode setting and change it from "" (which defaults to iptables) to "ipvs":
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
...

Kill the kube-proxy pods so they restart with the new configuration. Your service resolution latency just dropped significantly.
Ingress Controllers and the "Double Hop"
So you have your internal networking sorted. Now traffic needs to get in. You are likely using the NGINX Ingress Controller. It is the workhorse of the industry.
A common mistake is allowing traffic to hit any node in the cluster, which then routes it to the node running the Ingress Controller, which then routes it to the application. This is the "hairpin" or "double hop."
To fix this, use the externalTrafficPolicy: Local setting in your LoadBalancer service definition. This forces traffic to only land on nodes that are actually running the Ingress pod. It preserves the client source IP (critical for security logs) and removes a network hop.
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  ports:
  - name: http
    port: 80
    targetPort: 80
  selector:
    app.kubernetes.io/name: ingress-nginx

The Hardware Reality: Why "Cloud" Often Fails K8s
You can tune software all day, but you cannot tune away physics. Kubernetes is essentially a distributed database application (etcd) masquerading as an orchestrator. If etcd writes are slow, the API server hangs. If the API server hangs, deployments fail and leader elections time out.
This is where the difference between "cheap VPS" and CoolVDS becomes apparent. etcd is incredibly sensitive to disk write latency (fsync). Many providers put you on shared spinning rust or oversold SSDs where "noisy neighbors" steal your IOPS.
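You can measure this yourself before trusting any provider. The usual etcd guidance is that 99th-percentile fdatasync latency on the data disk should stay under roughly 10ms; here is a rough fio sketch (the directory and sizes are illustrative, mimicking etcd's small sequential WAL writes):

# Small sequential writes with an fsync after every write, like etcd's WAL
mkdir -p /var/lib/etcd-bench
fio --name=etcd-fsync-test --rw=write --ioengine=sync --fdatasync=1 \
    --directory=/var/lib/etcd-bench --size=22m --bs=2300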
At CoolVDS, we use NVMe storage arrays. The latency difference isn't just a benchmark number; it's the difference between a cluster that heals itself instantly and one that enters a crash loop back-off state during a traffic spike.
Furthermore, network latency to your users matters. If your primary market is Norway, hosting in a US-East region adds 80-100ms of latency to every handshake. By hosting in our Oslo-connected data centers, you are leveraging direct peering at NIX (Norwegian Internet Exchange). Your handshake times drop to <10ms for local users.
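You do not have to take latency claims on faith, either. A quick check of TCP connect and TLS handshake times from a client in Norway (substitute your own endpoint):

curl -so /dev/null -w 'TCP connect: %{time_connect}s  TLS handshake: %{time_appconnect}s\n' \
    https://your-app.example.no/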
Network Policy: The GDPR Firewall
Since we are operating in Europe, we have to talk about security. By default, Kubernetes allows all pods to talk to all other pods. This is a security disaster waiting to happen and a compliance violation under GDPR.
You must implement NetworkPolicy resources. Here is a strict policy that denies all ingress traffic to a namespace unless explicitly allowed. This should be your starting point for every namespace:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress

Once this is applied, you then layer on specific allowances. This creates a whitelist model that is secure by design.
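For instance, here is a sketch of an allowance that lets only the ingress controller reach your frontend pods. The pod label and the namespace label are assumptions about your own manifests; you have to label the ingress-nginx namespace yourself (e.g. name=ingress-nginx), since namespaces carry no automatic name label.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-ingress-controller
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: ingress-nginx
    ports:
    - protocol: TCP
      port: 8080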
Summary
Kubernetes offers immense power, but it demands respect for the networking layer. To build a robust platform in 2020:
- Use Calico for efficiency and BGP routing capabilities.
- Switch kube-proxy to IPVS mode to handle scale.
- Use externalTrafficPolicy: Local to reduce hops and preserve client source IPs.
- Host on infrastructure that guarantees NVMe I/O and low network latency.
Your infrastructure is the foundation of your reliability. Don't build a skyscraper on a swamp. If you are ready to run Kubernetes on infrastructure that respects the packet, deploy your master node on CoolVDS today.