Stop Blaming DNS: A Realist's Guide to Kubernetes Networking in 2018
It is 3 AM. Your pager is screaming because the production API latency just spiked to 500ms. Your first thought is, as always, "It's DNS." But deeper inspection shows DNS is resolving fine. The nodes are up. The pods are running.
The problem is your network fabric. If you are running Kubernetes on cheap, oversold VPS hosting where a crypto-miner next door is eating your CPU cycles, you are fighting a losing battle. In the Nordic market, where users expect millisecond responsiveness—especially with traffic routing through NIX (the Norwegian Internet Exchange)—network performance isn't a luxury. It is the baseline.
I have spent the last six months migrating a major Oslo-based e-commerce platform from monolithic VMs to Kubernetes 1.10 (and now testing 1.11). Here is the unvarnished truth about packet flow, CNI plugins, and why your choice of infrastructure defines your uptime.
The Overlay Tax: Flannel vs. Calico
When you spin up a cluster, the default choice is often Flannel. It’s simple. It works. It encapsulates packets in VXLAN and sends them on their way. But encapsulation isn't free. There is CPU overhead for every packet wrap and unwrap. On a high-throughput system, that overhead compounds.
For serious production workloads in 2018, we switched to Calico. Why? Because Calico can operate in pure Layer 3 mode using BGP, avoiding the encapsulation overhead entirely if your underlying network supports it. This connects your pods directly to the network fabric.
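If you are not sure which camp your cluster is in, the node's network devices and routing table will tell you. A quick sketch (calicoctl is assumed to be installed for the last check):

# A flannel.1 VXLAN device means every packet is being encapsulated
ip -d link show flannel.1
# Calico in BGP mode installs plain routes (proto bird) to the pod CIDRs
ip route | grep bird
# And the BGP sessions to the other nodes should show Established
calicoctl node status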
Configuring Calico for Performance
If you are deploying on CoolVDS KVM instances, you have the kernel control required to tune the IP stack. Here is a snippet from our calico.yaml configuration that pins interface auto-detection to the right NIC, which is critical when a node has multiple virtual interfaces:
# Calico DaemonSet configuration snippet
- name: IP_AUTODETECTION_METHOD
  value: "interface=eth0"
- name: FELIX_IPV6SUPPORT
  value: "false"
- name: CALICO_IPV4POOL_IPIP
  value: "Always" # Change to 'CrossSubnet' or 'Off' if nodes are L2 adjacent
Pro Tip: If your nodes are on the same L2 segment (like a private LAN within CoolVDS), turn IPIP to Off. You will see raw throughput increase by 15-20% immediately. Don't waste CPU cycles encapsulating traffic that doesn't need to leave the subnet.
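Before and after flipping that switch, confirm the routes changed and measure raw node-to-node throughput. A minimal sketch, assuming iperf3 is installed on both nodes (10.0.0.12 is a placeholder for the peer's private IP):

# With IPIP off, pod routes should go via eth0, not the tunl0 tunnel
ip route | grep -c tunl0   # expect 0 once IPIP is disabled
# Raw throughput between two nodes on the private LAN
iperf3 -s                        # run on node A
iperf3 -c 10.0.0.12 -t 30        # run on node B (placeholder IP)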
The iptables Bottleneck & The Rise of IPVS
Here is the reality check: kube-proxy usually defaults to iptables mode. This works fine for 50 services. But when you scale to 500 or 1,000 services, iptables becomes a nightmare. It uses a sequential list of rules. Every packet has to traverse this list until it finds a match. It is O(n) complexity.
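You can gauge how bad it already is on your own nodes, since every chain kube-proxy programs carries a KUBE- prefix:

# Count the NAT rules kube-proxy has programmed for Services
iptables-save -t nat | grep -c '^-A KUBE'
# Total lines the kernel may have to walk
iptables-save -t nat | wc -l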
With Kubernetes 1.11 (released just last month), IPVS (IP Virtual Server) mode has finally gone GA (General Availability). IPVS is built on top of netfilter but uses hash tables. That means O(1) complexity. It doesn't matter if you have 10 services or 10,000; the lookup time is constant.
To enable this, you need to ensure your underlying Linux kernel has the modules loaded. On a managed hosting provider that locks down the kernel, you are stuck. On CoolVDS, we just load them:
# Load required modules for IPVS
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
modprobe nf_conntrack_ipv4
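Verify they loaded and make the change survive a reboot (the modules-load.d path assumes a systemd-based distro):

# Confirm the IPVS modules are present
lsmod | grep -e '^ip_vs' -e '^nf_conntrack_ipv4'
# Persist across reboots on systemd distros
cat <<EOF > /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4
EOF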
Then, update your kube-proxy config map:
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "rr" # Round Robin
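kube-proxy only reads this at startup, so bounce the pods and confirm IPVS actually took over. A sketch, assuming a kubeadm-style cluster where kube-proxy runs as a DaemonSet labelled k8s-app=kube-proxy and ipvsadm is installed:

# Restart kube-proxy so it re-reads the ConfigMap
kubectl -n kube-system delete pod -l k8s-app=kube-proxy
# Each Service should now show up as an IPVS virtual server
ipvsadm -Ln
# The kube-ipvs0 dummy interface should carry the ClusterIPs
ip addr show kube-ipvs0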
War Story: The GDPR Latency Trap
May 25, 2018, changed everything. We had a client who needed to ensure specific customer data never left Norway. They were using a global cloud provider's managed Kubernetes service. The issue? The ingress controller was routing traffic through a load balancer in Frankfurt before hitting the pods in Oslo. Not only was this a potential compliance headache under the new Datatilsynet guidelines, but it also added 30ms of latency.
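This kind of detour is easy to spot once you look for it. A rough sketch (api.example.no is a placeholder hostname):

# Trace the path from an Oslo vantage point; a Frankfurt hop is the smoking gun
mtr --report --report-cycles 10 api.example.no
# See where the time actually goes on a real request
curl -o /dev/null -s -w 'connect: %{time_connect}s  TTFB: %{time_starttransfer}s\n' https://api.example.no/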
We migrated the cluster to CoolVDS instances hosted directly in Oslo. We set up an Nginx Ingress Controller using hostNetwork: true to bypass the extra network hop.
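The change itself is one patch away. A sketch only; the namespace and Deployment name depend on how you installed the controller:

# Run the controller directly in the node's network namespace
kubectl -n ingress-nginx patch deployment nginx-ingress-controller --patch \
  '{"spec":{"template":{"spec":{"hostNetwork":true,"dnsPolicy":"ClusterFirstWithHostNet"}}}}'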
Nginx Ingress Tuning
Standard Nginx configs are too polite. For a high-traffic site, you need to open the floodgates.
data:
  worker-processes: "4"
  max-worker-connections: "10240"
  keep-alive: "60"
  upstream-keepalive-connections: "100"
  # Fix '413 Request Entity Too Large' errors for uploads
  proxy-body-size: "50m"
  # Keep the TLS handshake cheap: modern AEAD ciphers, TLS 1.2 only
  ssl-ciphers: "ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256"
  ssl-protocols: "TLSv1.2"
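After applying the ConfigMap, double-check that the controller actually rendered the values. A sketch; replace <ingress-controller-pod> with your controller's pod name:

# Confirm the tuned values landed in the generated nginx.conf
kubectl -n ingress-nginx exec <ingress-controller-pod> -- \
  grep -E 'worker_connections|keepalive_timeout|client_max_body_size' /etc/nginx/nginx.conf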
By keeping the traffic local to Norway and stripping out the external load balancer hop, we reduced average request time from 85ms to 12ms. That is the difference between a conversion and a bounce.
Kernel Tuning: Don't Let Defaults Kill You
Linux defaults are designed for general-purpose computing, not high-performance container orchestration. If you are running a database inside K8s (which is brave, but common), you need to tune your sysctl settings.
On a locked-down VPS, you can't touch these. On a proper KVM instance, you add this to /etc/sysctl.conf:
# Allow more connections
net.core.somaxconn = 32768
net.ipv4.tcp_max_syn_backlog = 8192
# Increase port range for massive concurrent connections
net.ipv4.ip_local_port_range = 1024 65535
# Enable forwarding (Critical for K8s routers)
net.ipv4.ip_forward = 1
# BBR Congestion Control (Kernel 4.9+ required)
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr
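Apply it and confirm the kernel accepted BBR (it only shows up if the running kernel was built with the module):

# Load the new settings and verify BBR is active
sysctl -p /etc/sysctl.conf
sysctl net.ipv4.tcp_available_congestion_control
sysctl net.ipv4.tcp_congestion_control   # should print: bbr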
Applying BBR congestion control in 2018 is a massive advantage for mobile users on spotty 4G networks around the fjords.
Why Infrastructure Choice is the Root Cause
You can have the cleanest YAML manifests in the world, but if your neighbor on the physical host is saturating the NIC, your packets will queue. This is the "Noisy Neighbor" effect.
This is why for production Kubernetes, we only use CoolVDS. It’s not just about marketing; it’s about physics. CoolVDS uses KVM virtualization, which provides better isolation than OpenVZ or LXC containers used by budget providers. When you need NVMe I/O performance for your etcd cluster (latency sensitive!) or raw network throughput for your Ingress, you cannot afford shared kernel contention.
| Feature | Standard VPS | CoolVDS (KVM) |
|---|---|---|
| Kernel Access | Read-Only / Shared | Full Control (Load IPVS, BBR) |
| Isolation | Container based | Hardware Virtualization |
| Network Stack | Shared buffers | Dedicated VirtIO drivers |
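For etcd in particular, the number that decides your cluster's health is fsync latency on the data disk. A quick check with fio (assuming fio is installed and /var/lib/etcd is where your etcd data actually lives):

# Measure write+fsync latency on the etcd data directory
fio --name=etcd-disk --directory=/var/lib/etcd --size=22m --bs=2300 \
    --rw=write --ioengine=sync --fdatasync=1 --runtime=60 --time_based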
Final Thoughts
Kubernetes in 2018 is powerful, but it exposes network complexity that we used to hide behind hardware load balancers. If you are serious about DevOps:
- Move away from iptables to IPVS if you are on K8s 1.11+.
- Use a CNI like Calico that understands BGP.
- Host in Norway to keep latency low and Datatilsynet happy.
- Stop deploying on shared-kernel containers.
Don't let your infrastructure be the reason your pods crash. Spin up a KVM instance on CoolVDS, tune your kernel, and watch your latency drop.