If I had a krone for every time a developer told me, "It works on Minikube but times out in production," I could retire to a cabin in Lofoten. But I don't, because usually, the problem isn't the code. It's the network. Specifically, the misunderstood, abstracted, and often abused networking model of Kubernetes.
It is March 2018. Kubernetes 1.10 is just around the corner, and while the orchestration war is practically won, the networking battle is still raging in the trenches. Most people deploy a cluster using kubeadm, slap on a CNI plugin they found in a tutorial, and pray. That works for a blog. It does not work for high-availability systems handling Norwegian financial data or latency-sensitive APIs.
Let's rip open the abstraction layer. We are going to look at what actually happens to a packet when it enters your node, why iptables is becoming a bottleneck, and why your choice of infrastructure provider (and their CPU capabilities) matters more than you think.
The Lie of the "Flat Network"
Kubernetes mandates a flat network structure: every pod can talk to every other pod without NAT. On paper, that sounds simple and fast. In practice, on most VPS environments, it is achieved through an overlay network—encapsulation. Your packet isn't just a packet; it's a packet wrapped inside a VXLAN or IPIP header, shipped across the wire, and unwrapped at the destination.
This process, encapsulation and decapsulation, costs CPU cycles. On a noisy public cloud with "burstable" (read: stolen) CPU credits, this latency varies wildly. This is where the "Battle-Hardened" engineer chooses their weapons carefully.
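You can watch this happening on the wire. A quick check from the node itself—this assumes a Flannel-style VXLAN backend, which uses UDP port 8472 on Linux by default, and a host interface named eth0 (adjust both for your setup):

# Capture encapsulated pod traffic leaving the node
tcpdump -ni eth0 -c 5 udp port 8472

# Inspect the VXLAN device Flannel creates, including its MTU
ip -d link show flannel.1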
Choosing Your CNI: Flannel vs. Calico
The Container Network Interface (CNI) determines how this wiring happens. In 2018, you usually have two main contenders for general use:
- Flannel: The simple option. It creates a VXLAN overlay. It works, but it's not efficient at scale.
- Calico: The professional option. It can run in pure Layer 3 mode using BGP, avoiding encapsulation overhead if your underlying network supports it.
If you are running on CoolVDS, where we give you raw, high-performance KVM instances, you have the CPU headroom to handle VXLAN encapsulation without sweating. However, for pure performance, Calico is usually the superior choice because it also offers Network Policies—essential for the upcoming GDPR enforcement in May.
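If you do go with Calico, verify that it is actually peering over BGP rather than silently falling back to encapsulation. A quick check—this assumes calicoctl is installed on the node:

# See which CNI configuration the kubelet picked up
ls /etc/cni/net.d/

# Show BGP peering status for this node
calicoctl node status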
Configuring Calico for Performance
Don't just apply the default manifest. If you are running a cluster across a Nordic WAN or even just inside our Oslo datacenter, you need to tune the MTU (Maximum Transmission Unit). Many manifests default to 1440, and if your host interface supports a full 1500-byte MTU, that is wasted space on every packet: with Calico's IPIP encapsulation you can raise the pod MTU to 1480 (1500 minus the 20-byte IPIP header), and in pure BGP mode with no encapsulation you can use 1500 outright.
Here is a snippet of how we define the CNI configuration in /etc/cni/net.d/10-calico.conflist on a production node (the plugins-list format belongs in a .conflist file, not .conf):
{
  "name": "k8s-pod-network",
  "cniVersion": "0.3.0",
  "plugins": [
    {
      "type": "calico",
      "log_level": "info",
      "datastore_type": "kubernetes",
      "nodename": "oslo-node-01",
      "mtu": 1480,
      "ipam": {
        "type": "calico-ipam"
      },
      "policy": {
        "type": "k8s"
      },
      "kubernetes": {
        "kubeconfig": "/etc/cni/net.d/calico-kubeconfig"
      }
    },
    {
      "type": "portmap",
      "snat": true,
      "capabilities": {"portMappings": true}
    }
  ]
}
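Before committing to a value, confirm what the host interface actually supports. The interface name and the remote node IP below are placeholders:

# Check the MTU of the underlying host interface
ip link show eth0 | grep -o 'mtu [0-9]*'

# Verify the largest payload that crosses the node network without fragmentation
# (1472 bytes of payload + 28 bytes of ICMP/IP headers = 1500)
ping -M do -s 1472 -c 3 <remote-node-ip>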
The `iptables` Nightmare and the IPVS Hope
Here is the scenario that wakes me up at 3 AM. You have a cluster with 5,000 services. Suddenly, latency spikes. The CPUs are pegged, but application load is low. Why?
The culprit is kube-proxy. By default, it uses iptables to route traffic to Services, and iptables rules are evaluated sequentially—the kernel walks the chain rule by rule until it finds a match. With 5,000 services, that is thousands of rules to traverse for every single packet.
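You can measure the blast radius on your own nodes. A rough count of the NAT rules kube-proxy has programmed is a decent proxy for how much work every packet triggers (run as root on a node):

# Count kube-proxy's Service and endpoint rules in the NAT table
iptables-save -t nat | grep -cE 'KUBE-(SVC|SEP)'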
Pro Tip: If you are running Kubernetes 1.9 or newer (which you should be), switch kube-proxy to IPVS mode. IPVS uses hash tables instead of linear lists. The lookup time is constant O(1), regardless of whether you have 10 services or 10,000.
To enable this, you need to ensure the IPVS kernel modules are loaded on your CoolVDS instance:
# Load required modules
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
modprobe nf_conntrack_ipv4
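Those modprobe calls do not survive a reboot. A minimal way to persist them on a systemd-based distro, assuming the standard modules-load.d mechanism:

# Load the IPVS modules automatically at boot
cat <<EOF > /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4
EOF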
Then, edit your kube-proxy ConfigMap:
kubectl edit configmap kube-proxy -n kube-system
Find the mode setting and change it from "" (which defaults to iptables) to "ipvs":
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
Restart your kube-proxy pods. The difference in service-to-service latency is measurable in milliseconds, which adds up significantly in microservices architectures.
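A sketch of that restart plus a sanity check that IPVS is actually programming virtual servers—the label selector assumes a kubeadm-style deployment, and ipvsadm must be installed on the node:

# Recreate the kube-proxy pods so they pick up the new mode
kubectl -n kube-system delete pods -l k8s-app=kube-proxy

# On a node: list the IPVS virtual servers and their backends
ipvsadm -Ln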
Securing the Traffic: GDPR is Coming
We are two months away from GDPR enforcement (May 25, 2018). If you have a database pod that accepts connections from the entire cluster, you are failing "Privacy by Design." You must restrict traffic.
Kubernetes NetworkPolicies are the firewall rules for pods. If you used Flannel earlier, you are out of luck; it doesn't support them. This is why we recommended Calico. Here is a policy that strictly isolates a Postgres database so only the backend service can talk to it:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-access-control
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: postgres
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: backend-api
    ports:
    - protocol: TCP
      port: 5432
Without this, if your frontend is compromised, your database is wide open. On CoolVDS, we see many clients using separate VLANs for traditional servers, but in Kubernetes, NetworkPolicies are your VLANs.
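Better yet, deny all ingress in the namespace by default and whitelist only what you need—then a forgotten pod is closed off rather than wide open. A minimal sketch, applied via a heredoc so you can paste it straight into a shell:

kubectl apply -n production -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}
  policyTypes:
  - Ingress
EOF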
Why Infrastructure Matters: The I/O Bottleneck
Networking is not just about packets; it's about buffering. When your NGINX Ingress Controller gets hit with a DDoS or a flash sale, it writes logs. It buffers requests. If your underlying storage is a slow spinning disk or a shared SATA SSD with "noisy neighbors," your network throughput collapses because the I/O wait time spikes.
This is the dirty secret of budget VPS providers. They oversell storage I/O. You might have a Gigabit port, but you can't fill it because your disk can't keep up with the logging.
At CoolVDS, we utilize local NVMe storage for this exact reason. In a Kubernetes environment, etcd (the brain of the cluster) requires extremely low write latency. If etcd writes slow down, the API server slows down, and eventually, nodes flap and the cluster degrades.
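If you want to know whether a disk is actually good enough for etcd, measure fsync latency, not raw throughput. A rough fio test along the lines of what the etcd community recommends—the directory is an assumption, point it at the disk that will hold /var/lib/etcd:

# Sequential writes with an fdatasync after each one, roughly what etcd's WAL does
mkdir -p /var/lib/etcd-test
fio --name=etcd-disk-test --directory=/var/lib/etcd-test \
    --rw=write --bs=2300 --size=22m --ioengine=sync --fdatasync=1

Look at the fsync latency percentiles in the output; the commonly cited target is a 99th percentile well under 10 ms.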
Benchmarking Network Throughput
Don't take my word for it. Spin up two pods on different nodes and run iperf3. If you see high jitter or throughput drops under load, check your host's steal time (st in top).
# Run the server in one terminal, then the client from another
kubectl run -it --rm --restart=Never iperf-server --image=networkstatic/iperf3 -- -s
kubectl get pod iperf-server -o wide   # note the pod IP
kubectl run -it --rm --restart=Never iperf-client --image=networkstatic/iperf3 -- -c <server-pod-ip>
On a proper host, you should see stable transfer rates. If you see fluctuations, your "cloud" is throttling you.
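To check steal time without staring at top, vmstat works fine—the last column of the CPU group is st:

# Watch the st column; anything consistently above a few percent means a noisy host
vmstat 1 5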
Final Thoughts
Kubernetes networking in 2018 is powerful, but it is not hands-off. You need to understand CNI, you need to migrate to IPVS, and you absolutely need to lock down traffic with NetworkPolicies before Datatilsynet (the Norwegian Data Protection Authority) comes knocking.
But above all, software cannot fix broken hardware. A high-performance cluster requires high-performance backing. Don't let slow I/O or stolen CPU cycles kill your SEO or your uptime.
Ready to build a cluster that actually performs? Deploy a high-frequency NVMe instance on CoolVDS in 55 seconds and see the difference raw power makes.