Kubernetes Networking Deep Dive: Optimizing CNI & eBPF for Low-Latency Architectures
Most tutorials tell you that Kubernetes networking is "abstracted away." They lie. If you have ever stared at a CrashLoopBackOff status caused by a DNS timeout or watched your throughput tank because of VXLAN encapsulation overhead, you know that the network layer is where clusters go to die.
I have spent the last decade debugging packet drops at 3 AM. In 2024, deploying a cluster isn't just about `kubectl apply`. It is about squeezing every microsecond out of the Linux kernel to handle high-throughput workloads. This isn't a "Hello World" guide. This is a breakdown of how to build a network stack that doesn't fold under pressure, specifically tailored for the Nordic infrastructure landscape.
The CNI Decision: Why eBPF is Non-Negotiable in 2024
For years, iptables was the hammer we used for everything. But at scale, it is a disaster: rules are matched sequentially, so lookup cost is O(n) in the number of Services, and packet processing gets slower with every Service you add. In a recent project migrating a fintech platform in Oslo, we saw kube-proxy consuming 20% of CPU just managing rules.
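You can see the scale of the problem on any busy node. A quick, rough diagnostic (run on the node itself; the KUBE- prefix is standard kube-proxy rule naming):

# Count kube-proxy's NAT rules. Thousands here means every packet
# walks a long sequential chain before it finds its backend.
iptables-save -t nat | grep -c 'KUBE-'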
The solution is eBPF (extended Berkeley Packet Filter). By March 2024, tools like Cilium had matured enough to replace kube-proxy entirely. This lets us attach networking logic directly to kernel hooks in a sandboxed VM, skipping both the long iptables chains and the userspace detours.
Pro Tip: When running on high-performance infrastructure like CoolVDS, disable the default kube-proxy during cluster initialization. Let your CNI handle the Service implementation for a noticeable drop in latency.
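With kubeadm, skipping kube-proxy is one flag at init time. A minimal sketch, assuming a kubeadm-based install:

# Don't deploy the kube-proxy addon; the CNI's eBPF datapath
# will implement Services instead.
kubeadm init --skip-phases=addon/kube-proxy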
Deploying Cilium with Strict Mode
Don't just install the defaults. You want kube-proxy replacement running in strict mode, with network policies enforced by default. Here is the configuration we use for production clusters peering at NIX (the Norwegian Internet Exchange):
helm install cilium cilium/cilium --version 1.15.1 \
--namespace kube-system \
--set kubeProxyReplacement=true \
--set k8sServiceHost=${API_SERVER_IP} \
--set k8sServicePort=${API_SERVER_PORT} \
--set bpf.masquerade=true \
--set hubble.relay.enabled=true \
--set hubble.ui.enabled=true
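After the rollout, it is worth confirming that the eBPF kube-proxy replacement is actually active rather than silently falling back:

# The agent reports the replacement mode in its status output
kubectl -n kube-system exec ds/cilium -- cilium status | grep KubeProxyReplacement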
The Overlay Tax: VXLAN vs. Direct Routing
By default, many CNIs use VXLAN to encapsulate traffic between nodes. This adds headers to every packet, reducing your MTU (Maximum Transmission Unit) and burning CPU cycles for encapsulation/decapsulation. On a standard cloud provider, you are often forced into this.
However, if you are renting proper VPS instances in Norway where you control the underlying interface, you should aim for Direct Routing. The CNI installs routes for the pod CIDRs straight into the Linux routing table, so pods on different nodes communicate without any encapsulation.
To make this work, your underlying infrastructure must allow layer 2 adjacency or BGP peering. This is why we deploy on CoolVDS NVMe instances; the network stack is clean enough to handle BGP via MetalLB without the provider's firewall dropping "unknown" packets.
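With Cilium, dropping the tunnel is a Helm upgrade away. A sketch, assuming your pod network fits inside 10.0.0.0/8 and all nodes share a layer 2 segment:

# Switch from the default tunnel to native routing
helm upgrade cilium cilium/cilium --version 1.15.1 \
  --namespace kube-system \
  --reuse-values \
  --set routingMode=native \
  --set ipv4NativeRoutingCIDR=10.0.0.0/8 \
  --set autoDirectNodeRoutes=true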
Calculating the Correct MTU
If you must use an overlay, setting the wrong MTU is the #1 cause of "random" connection resets: oversized packets are silently dropped mid-stream. If your host interface is 1500 bytes and VXLAN adds 50 bytes of headers, your Pod MTU must be 1450.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cni-configuration
  namespace: kube-system
data:
  cni-config: |-
    {
      "name": "k8s-pod-network",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "calico",
          "mtu": 1450,
          "log_level": "info",
          "datastore_type": "kubernetes",
          "nodename": "__KUBERNETES_NODE_NAME__"
        }
      ]
    }
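To verify the math end to end, send a do-not-fragment ping between pods on different nodes at the largest payload that fits: 1450 minus 28 bytes of IP and ICMP headers is 1422. A hypothetical check (pod name and target IP are placeholders, and the image must ship ping):

# 1422 bytes of payload + 28 bytes of headers = 1450, exactly our pod MTU.
# If this hangs while smaller payloads work, something on the path has a lower MTU.
kubectl exec -it netshoot-pod -- ping -c 3 -M do -s 1422 10.244.1.15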
Ingress and Load Balancing on "Bare Metal"
In a managed cloud, you request a LoadBalancer and get an expensive external IP. On a VPS or bare-metal setup, you are on your own. MetalLB is the standard here. It allows your nodes to announce an external IP address for services using ARP (Layer 2) or BGP.
For a typical setup in a Norwegian datacenter where you have a block of IPs assigned to your CoolVDS account, Layer 2 mode is sufficient and robust.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: production-pool
  namespace: metallb-system
spec:
  addresses:
  - 185.xxx.xxx.10-185.xxx.xxx.20
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: production-advertisement
  namespace: metallb-system
This configuration makes your Kubernetes services accessible directly from the internet without a reverse proxy bottleneck, provided your VPS provider has clean routing tables.
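Once the pool exists, any Service of type LoadBalancer is assigned an address from it automatically. A minimal sketch, assuming a Deployment labeled app: web is already running:

apiVersion: v1
kind: Service
metadata:
  name: web
  namespace: default
spec:
  type: LoadBalancer   # MetalLB picks an IP from production-pool
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080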
The Norwegian Context: Latency and Compliance
Data residency is not a buzzword in Europe; it is a legal requirement. With GDPR and the fallout from Schrems II, ensuring traffic stays within the EEA (European Economic Area) is critical. But beyond legality, there is physics.
Routing traffic from Oslo to a US-based cloud control plane and back adds 80-100ms of latency. Running your control plane and worker nodes on CoolVDS in Norway keeps local latency under 5ms. For database replication or real-time trading data, that difference is the entire ballgame.
Enforcing Data Residency with NetworkPolicies
You can enforce "digital borders" using Kubernetes NetworkPolicies. Here is a strict policy that denies all egress except allowlisted IP blocks and DNS (effectively blocking non-EU destinations if you map your CIDRs correctly).
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-egress-eu
  namespace: sensitive-workloads
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 10.0.0.0/8 # Internal Cluster Traffic
    - ipBlock:
        cidr: 185.0.0.0/16 # Local Norwegian Subnets
  - ports:
    - protocol: TCP
      port: 53
    - protocol: UDP
      port: 53
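A quick smoke test, using a hypothetical deployment in the namespace with curl available: destinations inside the allowlist respond, everything else times out.

# Allowed: internal cluster traffic
kubectl -n sensitive-workloads exec deploy/api -- curl -s -m 3 http://10.0.12.34
# Blocked: an external IP outside the allowlisted CIDRs (curl exits 28 on timeout)
kubectl -n sensitive-workloads exec deploy/api -- curl -s -m 3 https://1.1.1.1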
War Story: The DNS UDP Conntrack Race Condition
We recently audited a cluster suffering from intermittent 5-second delays on API calls. The culprit wasn't the app code; it was a well-documented race in the Linux kernel's conntrack table: glibc fires the A and AAAA DNS queries in parallel over the same UDP socket, both packets race through conntrack insertion at once, one gets dropped, and the resolver sits out its full 5-second timeout.
The fix was to deploy NodeLocal DNSCache, which puts a caching resolver on every node and installs NOTRACK rules so DNS traffic bypasses conntrack entirely, and to serialize the racing lookups on the client side. This reduced DNS resolution time from spikes of 5000ms down to a consistent 2ms.
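The client-side half can be rolled out per pod via dnsConfig. A sketch, assuming glibc-based images; `single-request-reopen` is a glibc resolver option and does not exist in musl-based images like Alpine:

apiVersion: v1
kind: Pod
metadata:
  name: api
spec:
  containers:
  - name: api
    image: registry.example.com/api:1.0   # placeholder image
  dnsConfig:
    options:
    - name: single-request-reopen   # serialize A/AAAA lookups to dodge the race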
Conntrack tuning at this level requires access to host kernel configuration, and many shared hosting providers lock it down. CoolVDS runs full KVM virtualization, which gives you the kernel access required to tune `sysctl` parameters for high-load networking.
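For reference, these are the kinds of conntrack knobs we end up touching on busy nodes. The values are starting points we have used, not universal truths; load-test before committing them:

# /etc/sysctl.d/99-k8s-net.conf -- apply with: sysctl --system
# Headroom for bursty connection churn
net.netfilter.nf_conntrack_max = 1048576
# Reap TIME_WAIT conntrack entries faster
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 30
# Larger accept backlog for busy listeners
net.core.somaxconn = 65535
# More ephemeral ports for egress-heavy workloads
net.ipv4.ip_local_port_range = 10240 65535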
Conclusion
Kubernetes networking is brittle if you treat it as a black box. By leveraging eBPF with Cilium, removing overlay overhead, and hosting on infrastructure that respects local peering and data sovereignty, you build a system that is resilient, compliant, and fast.
Don't let a slow network layer strangle your container orchestration. Build your cluster on infrastructure designed for raw throughput. Deploy a high-performance node on CoolVDS today and see what sub-millisecond latency actually feels like.