Kubernetes Networking Deep Dive: Optimizing Packet Flow for Nordic Workloads

Most Kubernetes clusters deployed in Europe today are running on default networking settings. If you are running a hobby project, that's fine. If you are handling financial transactions in Oslo or streaming data across the Nordics, defaults are a liability.

I recently audited a cluster for a fintech client based in Stavanger. Their support team was being plagued by "random" timeout errors. The application code was fine. The database was healthy. The culprit? A mismatch in MTU settings between the overlay network and the underlying VPS infrastructure, causing massive packet fragmentation.

This is the reality of Kubernetes networking. It is not magic; it is encapsulation wrapped in complexity. In this guide, we are going to strip away the abstraction and look at how to architect a network stack that respects the laws of physics and the requirements of the Norwegian market.

1. The CNI Battlefield: eBPF vs. Iptables

By February 2024, the debate between traditional iptables-based routing and eBPF (Extended Berkeley Packet Filter) is largely settled for high-performance workloads. While Flannel or standard Calico are reliable, they rely heavily on iptables. As your service count grows into the thousands, iptables rule evaluation becomes a sequential bottleneck: lookup cost is O(N) in the number of rules, so every new Service makes every packet a little more expensive to route.
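
If you want to see how large that sequential rule list has already grown on your own nodes, a rough check (assuming shell access to a node where kube-proxy runs in iptables mode) is to count the service-related entries in the nat table:

# Count the KUBE-SVC/KUBE-SEP rules kube-proxy has programmed.
# Packets destined for Service IPs may have to walk a chunk of this list sequentially.
iptables-save -t nat | grep -cE 'KUBE-SVC|KUBE-SEP'

On clusters with thousands of Services this count quickly reaches the tens of thousands, which is exactly the linear scan that eBPF sidesteps.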

For high-throughput systems, we lean heavily towards Cilium. It bypasses the iptables bottleneck by operating directly in the kernel using eBPF. This is crucial for reducing latency, a key metric when your users are pinging from varying distances across Norway.

Deploying Cilium without Kube-Proxy

To truly reduce overhead, replace kube-proxy entirely. Here is how we configure Cilium to handle all service load balancing on a CoolVDS instance running KVM:

helm install cilium cilium/cilium --version 1.14.6 \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=${API_SERVER_IP} \
  --set k8sServicePort=${API_SERVER_PORT} \
  --set bpf.masquerade=true \
  --set ipam.mode=kubernetes
Pro Tip: When running on virtualized hardware, ensure your underlying kernel supports eBPF fully. CoolVDS standardizes on Kernel 6.x for our NVMe instances specifically to support these advanced networking features without kernel panics. Many legacy providers are still stuck on 4.x kernels that choke on complex BPF maps.
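
Once the chart is up, confirm that Cilium has actually taken over service handling before you remove kube-proxy from your manifests. A quick check, assuming the agent runs as the usual cilium DaemonSet in kube-system:

# Ask an agent pod whether the eBPF kube-proxy replacement is active
kubectl -n kube-system exec ds/cilium -- cilium status | grep KubeProxyReplacement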

2. The MTU Trap: Fragmentation Kills Performance

This is where most setups go wrong. The standard Ethernet MTU is 1500 bytes. If you use an overlay network (like VXLAN or IPIP), the encapsulation adds headers (typically 50 bytes for VXLAN). If your Pod attempts to send a 1500-byte packet, it gets wrapped, exceeds the physical interface MTU, and fragments.

Fragmentation doubles your packet count and spikes CPU usage. Worse, some firewalls drop fragments silently.

You must calculate the overhead and set the CNI MTU accordingly. For a VXLAN setup on a standard WAN interface:

# Calico ConfigMap Example
kind: ConfigMap
apiVersion: v1
metadata:
  name: calico-config
  namespace: kube-system
data:
  veth_mtu: "1450" # 1500 - 50 bytes overhead

To check whether your current nodes are affected, inspect the kernel's IP statistics:

netstat -s | grep "fragments"

If that number is climbing, you have a configuration problem. On CoolVDS, our internal network supports Jumbo Frames (MTU 9000) in specific private zones, which allows for massive throughput between database and application nodes without fragmentation risks. However, for public traffic, stick to a conservative MTU inside the pod.
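
To test whether a given MTU actually survives the path end to end, send a do-not-fragment probe between two nodes. A small sketch, assuming a 1450-byte pod MTU (subtract 28 bytes of IP and ICMP headers to get the ping payload; <peer-node-ip> is a placeholder):

# DF bit set: 1450 MTU minus 28 bytes of headers = 1422-byte payload
ping -M do -s 1422 -c 4 <peer-node-ip>

If the probe fails while a smaller payload goes through, something on the path cannot carry your chosen MTU.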

3. Latency, NIX, and Geography

Physics is the only hard constraint. The round-trip time (RTT) from Oslo to Frankfurt is roughly 15-20 ms; from Oslo to a datacenter in Oslo, it is under 1 ms. For a single request this is negligible. For a microservices architecture where one frontend request fans out into 50 internal API calls, that latency compounds aggressively, as the sketch below illustrates.
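
To make that compounding concrete, here is a rough sketch that fires a chain of sequential calls and sums the wall-clock cost; $SERVICE_URL is a placeholder for whichever internal endpoint you want to sample:

# Sum the latency of 50 sequential calls. At ~1 ms per call the chain stays
# around 50 ms; at 15-20 ms per call the same chain approaches a full second.
for i in $(seq 1 50); do
  curl -s -o /dev/null -w "%{time_total}\n" "$SERVICE_URL"
done | awk '{sum += $1} END {printf "50 sequential calls: %.3fs total\n", sum}'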

If your target audience is Norway, hosting data in Germany or the US is a performance tax. Furthermore, strict adherence to Schrems II and GDPR compliance suggests keeping data within the EEA, and ideally, within the country of operation to satisfy local redundancy requirements from the Datatilsynet.

Optimizing CoreDNS for Local Resolution

DNS latency is often the silent killer. Out of the box, CoreDNS ships with conservative cache settings and short timeouts. If your service IPs are relatively stable, adjust your `Corefile` to cache more aggressively.

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
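
After reloading CoreDNS, measure resolution latency from inside the cluster rather than from your laptop. A quick sketch using a throwaway pod; the busybox image and the lookup target are just examples:

# Time a lookup against the cluster DNS from a temporary pod
kubectl run dns-probe --rm -it --restart=Never --image=busybox:1.36 -- \
  time nslookup kubernetes.default.svc.cluster.local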

4. Ingress: Nginx Tuning for High Concurrency

The standard Nginx Ingress Controller configuration is designed for compatibility, not speed. It ships with conservative keepalive settings and churns through connections to your upstream Pods. For a high-traffic shop, you need to tune `keep-alive-requests` and the `upstream-keepalive-connections` pool.

Here is a production-ready snippet for your Nginx Ingress Controller ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-configuration
  namespace: ingress-nginx
data:
  keep-alive: "75"
  keep-alive-requests: "10000"
  worker-processes: "auto"
  worker-cpu-affinity: "auto"
  upstream-keepalive-connections: "100"
  upstream-keepalive-timeout: "32"
  compute-full-forwarded-for: "true"
  use-forwarded-headers: "true"

The `compute-full-forwarded-for` setting is critical when behind a Load Balancer to ensure the logs reflect the real client IP, a requirement for security auditing.
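
Before trusting these values under load, confirm they were actually rendered into the controller's nginx.conf. A quick check, assuming the standard ingress-nginx-controller Deployment name:

# Verify the tuned directives made it into the generated config
kubectl -n ingress-nginx exec deploy/ingress-nginx-controller -- \
  grep -E 'keepalive_requests|worker_processes' /etc/nginx/nginx.conf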

5. IPVS Mode: The Alternative to eBPF

If you aren't ready to jump to Cilium/eBPF and want to stick with standard tools, at least switch `kube-proxy` to IPVS mode. IPVS (IP Virtual Server) is a kernel-level load balancer that uses hash tables rather than linear lists.

To enable this, ensure the `ip_vs` kernel modules are loaded on your host (standard on CoolVDS images):

# Load modules
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh

# Edit kube-proxy config
kubectl edit configmap kube-proxy -n kube-system
# Set mode: "ipvs"
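
After editing the ConfigMap, kube-proxy must be restarted to pick up the new mode, and ipvsadm will show whether virtual servers are actually being programmed. A sketch assuming a kubeadm-style cluster where kube-proxy runs as a DaemonSet (install ipvsadm on the node if it is missing):

# Roll kube-proxy so it re-reads the ConfigMap
kubectl -n kube-system rollout restart daemonset kube-proxy

# List IPVS virtual servers; your Service ClusterIPs should appear here
ipvsadm -Ln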

Feature      | iptables (default)   | IPVS                 | eBPF (Cilium)
Scalability  | O(N), slow at scale  | O(1), constant time  | O(1), fastest
Complexity   | Low                  | Medium               | High (kernel dependency)
Throughput   | Standard             | High                 | Maximum

Conclusion

Kubernetes networking is about making trade-offs between complexity and performance. For most Norwegian businesses operating under strict SLAs, the default settings are insufficient. Moving to eBPF or IPVS, fixing MTU fragmentation, and hosting close to your users (Oslo/Europe) are non-negotiable steps.

We built the network stack at CoolVDS to eliminate the "noisy neighbor" packet loss that plagues cheaper VPS providers. When you are ready to stop debugging network flakes and start shipping code, we provide the raw, unthrottled NVMe infrastructure your cluster demands.

Stop guessing your latency. Deploy a test node in our datacenter and run iperf3 yourself.
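
If you want a number rather than a promise, a basic throughput test between two nodes looks like this (replace <coolvds-node-ip> with the test node's address):

# On the CoolVDS test node
iperf3 -s

# From your current environment: 4 parallel streams for 30 seconds
iperf3 -c <coolvds-node-ip> -P 4 -t 30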