Kubernetes Networking Deep Dive: Packet Flow, CNI Wars, and Why Your Overlay Network is Slow
Let’s cut the marketing noise. Kubernetes networking isn’t magic. It is a complex layer of iptables rules, routing tables, and encapsulation protocols held together by hope and bash scripts. I’ve spent the last three weeks debugging a cluster that kept dropping packets between microservices only during peak hours. The culprit wasn't code—it was a default MTU setting colliding with an overlay network on a budget VPS provider.
If you are running Kubernetes in production in 2022, you cannot afford to treat the network as a black box. Whether you are serving high-traffic APIs in Oslo or managing data pipelines across Europe, understanding the path a packet takes from an Ingress Controller to a Pod is mandatory.
The CNI Jungle: Calico vs. Cilium (2022 Edition)
The Container Network Interface (CNI) is where the rubber meets the road. In the Nordic hosting market, we see two dominant players right now: Calico and Cilium.
Calico is the industry workhorse. It uses BGP for routing and acts like a traditional router. It is stable, predictable, and we see it on 80% of clusters migrating to CoolVDS. However, as of late 2022, Cilium is eating its lunch by leveraging eBPF (Extended Berkeley Packet Filter) to bypass iptables entirely. Iptables was never designed for the churn of dynamic container scheduling. When you have 5,000 services, iptables becomes a linear bottleneck.
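A quick way to gauge how much rule churn kube-proxy is generating (assuming it runs in the default iptables mode) is to count the Service and endpoint chains in the NAT table on a worker node, as root:
# Each Service and endpoint gets its own KUBE-SVC/KUBE-SEP chain; tens of thousands
# of matches here means every new connection walks a long, linear rule list.
iptables-save -t nat | grep -cE 'KUBE-(SVC|SEP)'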
The Encapsulation Tax
Unless you are running BGP directly with your top-of-rack switches (rare in virtualized environments), you are likely using an overlay network like VXLAN or IPIP. This encapsulates your packet inside another packet. This process consumes CPU cycles.
Pro Tip: If your hosting provider over-provisions CPU (stealing cycles from you), your network throughput drops because the kernel can't encapsulate packets fast enough. We configured CoolVDS KVM slices with dedicated CPU pinning options specifically to prevent this "noisy neighbor" network lag.
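You can check for stolen cycles yourself: the "st" column in vmstat (or %st in top) should sit at or near zero on a node with dedicated CPU:
# Sample CPU counters once per second, five times; a persistent non-zero "st" column
# means the hypervisor is taking cycles away from your node.
vmstat 1 5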
The Hidden Killer: MTU Fragmentation
This is the most common configuration error I see. The standard internet MTU is 1500 bytes. VXLAN adds a 50-byte header. If your physical host interface is 1500, and your Pod interface is 1500, the encapsulated packet becomes 1550 bytes. The physical switch drops it, or fragmentation occurs, killing performance.
You must configure your CNI to account for the overhead. Here is how we verify the interface MTU on the host node before deploying:
ip -d link show eth0 | grep mtu
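As an extra sanity check, a don't-fragment ping sized to exactly fill a 1500-byte frame (1472 bytes of payload plus 28 bytes of IP and ICMP headers) fails loudly if anything on the path is clamping the MTU; the target below is a placeholder for a peer node in your cluster:
# DF-flagged ping: errors out instead of silently fragmenting
ping -M do -s 1472 -c 3 <peer-node-ip>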
If your host is 1500, your Calico configuration needs to look like this:
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: calico-node
  namespace: kube-system
spec:
  template:
    spec:
      containers:
        - name: calico-node
          env:
            - name: FELIX_IPINIPMTU
              value: "1480" # Leave room for the 20-byte IPIP header
            - name: FELIX_VXLANMTU
              value: "1450" # Leave room for the 50-byte VXLAN overhead
Setting this incorrectly results in sporadic connection resets that are incredibly difficult to debug.
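When you suspect this is happening, watching the host uplink for ICMP "fragmentation needed" messages (type 3, code 4) is usually faster than staring at application logs; this assumes the uplink is eth0:
# Oversized encapsulated packets hitting a 1500-byte link show up here
tcpdump -ni eth0 'icmp[0] == 3 and icmp[1] == 4'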
Service Discovery and DNS Latency
In Kubernetes, DNS is not just for finding Google.com; it’s how your frontend finds your backend. By default, K8s uses CoreDNS. I've seen latency spikes in clusters simply because ndots:5 (the default search configuration) forces the resolver to query multiple search domains before finding the actual service.
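One mitigation, sketched below with placeholder names and images, is to lower ndots per workload via dnsConfig so that lookups for fully qualified names stop cycling through the search list:
apiVersion: v1
kind: Pod
metadata:
  name: api-client            # illustrative name
spec:
  dnsConfig:
    options:
      - name: ndots
        value: "2"            # default is 5; fewer wasted search-domain queries
  containers:
    - name: app
      image: nginx:1.23       # placeholder image
Appending a trailing dot to the hostname in your application config (backend.default.svc.cluster.local.) achieves the same effect without touching the pod spec.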
If you are running a high-load setup, standard CoreDNS settings are insufficient. You need to tune the Corefile and the upstream behavior. Below is a production-grade CoreDNS config map optimized for high throughput:
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
Notice the max_concurrent and cache settings. On a generic VPS with slow I/O, CoreDNS can choke on logging or caching operations. This is why CoolVDS ships NVMe storage as standard, even for system logs, so that I/O wait times never impact name resolution.
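To see whether your resolution path is actually fast, a throwaway pod running a crude latency loop is often all you need; the image below is just a convenient placeholder:
# Five sequential lookups against the cluster DNS; each should come back in a few milliseconds
kubectl run dns-probe --rm -it --restart=Never --image=busybox:1.35 -- \
  sh -c 'for i in 1 2 3 4 5; do time nslookup kubernetes.default.svc.cluster.local; done'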
Optimizing Kernel Parameters for K8s
Linux defaults are tuned for a modest web server from 2010, not a 2022 Kubernetes node routing gigabits of traffic. You need to touch `sysctl`. Be careful here; changing these on a live system can disrupt connectivity.
We recommend applying the following tuning via a DaemonSet or Cloud-Init script on your worker nodes. These settings increase the connection tracking table (crucial for NAT) and allow for faster TCP recycling.
# /etc/sysctl.d/k8s-net.conf
# Increase the connection tracking table size
net.netfilter.nf_conntrack_max = 1000000
# Expire idle established flows after 1 day (kernel default is 5 days)
net.netfilter.nf_conntrack_tcp_timeout_established = 86400
# Allow more pending connections
net.core.somaxconn = 32768
# Expand the range of ephemeral ports
net.ipv4.ip_local_port_range = 1024 65000
# Reuse sockets in TIME_WAIT state for new outbound connections
# (tcp_tw_reuse, unlike the long-removed tcp_tw_recycle, is safe for most setups)
net.ipv4.tcp_tw_reuse = 1
# Increase TCP buffer sizes for high-speed local networks (like NIX peering)
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
To apply this immediately:
sysctl -p /etc/sysctl.d/k8s-net.conf
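If you go the DaemonSet route, a minimal sketch could look like the following; the name, namespace, and image are illustrative, and the container simply replays the same values with sysctl -w on each node:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-sysctl-tuner        # illustrative name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: node-sysctl-tuner
  template:
    metadata:
      labels:
        app: node-sysctl-tuner
    spec:
      hostNetwork: true          # so net.* writes land in the host's network namespace
      containers:
        - name: tuner
          image: busybox:1.35    # placeholder image
          securityContext:
            privileged: true     # required to write host sysctls under /proc/sys
          command: ["sh", "-c"]
          args:
            - |
              sysctl -w net.netfilter.nf_conntrack_max=1000000
              sysctl -w net.netfilter.nf_conntrack_tcp_timeout_established=86400
              sysctl -w net.core.somaxconn=32768
              sysctl -w net.ipv4.ip_local_port_range="1024 65000"
              sysctl -w net.ipv4.tcp_tw_reuse=1
              sysctl -w net.core.rmem_max=16777216
              sysctl -w net.core.wmem_max=16777216
              while true; do sleep 3600; done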
The Infrastructure Layer: Why CoolVDS Wins
You can tune software all day, but you cannot tune physics. Kubernetes control plane components, specifically etcd, are extremely sensitive to disk write latency. If etcd writes take longer than a few milliseconds, heartbeats get missed, leader elections churn, and the cluster stalls until a stable leader is re-established. I have seen this happen repeatedly on shared "cloud" platforms that throttle disk IOPS.
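A common way to sanity-check this, borrowed from the etcd community, is to benchmark fdatasync latency with fio; an invocation along these lines (the directory is a placeholder on the etcd volume) gives a quick read:
# Writes small blocks with an fdatasync after each, mimicking etcd's WAL pattern
fio --rw=write --ioengine=sync --fdatasync=1 \
    --directory=/var/lib/etcd-bench --size=22m --bs=2300 --name=etcd-fsync-probe
# Check the reported fsync/fdatasync percentiles: the 99th percentile should sit
# in the low single-digit milliseconds on a disk fit for etcd.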
This is where the choice of hosting provider becomes a technical decision, not just a financial one. At CoolVDS, we don't use spinning rust or shared SATA SSDs for our high-performance tiers. We use NVMe directly attached to the PCI bus. For a K8s cluster node, this means:
- Etcd Stability: Write latencies consistently under 2ms.
- Faster Image Pulls: Docker images extract faster, improving pod startup time.
- Compliance: For our Norwegian clients, data residency is critical. Ensuring your data sits on servers physically located in Oslo or nearby ensures compliance with strict interpretations of GDPR and Schrems II.
Ingress and Local Peering
Finally, how does traffic get into the cluster? In 2022, the Gateway API is the future, but the Ingress Controller (specifically Nginx or Traefik) is the present. If your target audience is in Norway, latency matters.
Hosting outside of the region adds 20-30ms of latency. Hosting on CoolVDS, which peers directly at NIX (Norwegian Internet Exchange), keeps latency to domestic users often below 5ms. When configuring your Ingress, ensure you are preserving the client source IP to enable proper rate limiting and geographic filtering.
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local # Preserves Client IP
  ports:
    - name: http
      port: 80
      targetPort: http
    - name: https
      port: 443
      targetPort: https
Setting externalTrafficPolicy: Local drops packets if the pod isn't on the node receiving traffic, so ensure you have a robust external LoadBalancer or run Ingress as a DaemonSet.
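A quick way to confirm the policy took effect is to tail the controller logs and check that real client addresses appear instead of node-internal SNAT addresses (this assumes the stock ingress-nginx deployment name):
kubectl -n ingress-nginx logs deploy/ingress-nginx-controller --tail=20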
Conclusion
Kubernetes networking is unforgiving of weak infrastructure. A dropped packet in an overlay network looks like an application timeout to your users. By combining precise CNI configuration, kernel tuning, and the raw IO performance of CoolVDS NVMe instances, you build a foundation that doesn't just survive peak load—it ignores it.
Stop fighting CrashLoopBackOff caused by slow I/O. Spin up a rock-solid K8s node on CoolVDS today and see the difference dedicated resources make.