Kubernetes Networking Deep Dive: Configuring CNI and Ingress for Production in 2018

Let’s be honest. Kubernetes solves deployment, but it makes networking a headache. If you are reading this in June 2018, you've likely spent the last week staring at iptables-save output wondering why your pods can't resolve DNS. The simplified "IP-per-pod" promise is great until you actually have to implement it across a distributed cluster.

I’ve spent the last month migrating a legacy monolithic stack for a fintech client in Oslo. We learned—painfully—that the abstraction layer has a cost. GDPR is now in full effect (as of two weeks ago), meaning we can't just let traffic flow freely. We need boundaries, we need encryption, and we need speed. Here is how we architect K8s networking without losing our minds or our packet throughput.

The CNI Battlefield: Flannel vs. Calico

The Container Network Interface (CNI) determines how your pods talk. In 2018, if you aren't choosing a CNI explicitly, you are doing it wrong.

Flannel is the default for many. It’s simple. It creates a VXLAN overlay. But VXLAN encapsulation burns CPU cycles. If you are running on older hardware or noisy shared VPS neighbors, your packet encapsulation latency will skyrocket. We saw 20% CPU steal on a budget host last year just from network overhead.

Calico, on the other hand, operates at Layer 3 using BGP. It routes packets without encapsulation when nodes are on the same subnet. This is where the infrastructure provider matters. If your hosting provider allows BGP peering or has a flat L2 network between VMs, Calico is vastly superior for performance.
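
If you want to confirm that BGP peering is actually established (rather than Calico quietly falling back to IP-in-IP), calicoctl can tell you from the node itself. A quick check, assuming calicoctl v3.1 is installed on the host and pointed at your datastore:

# Shows the BGP mesh status for this node (run as root on the node)
sudo calicoctl node status
# Shows whether the IP pool uses IPIP encapsulation or native routing
calicoctl get ippool -o wide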

Here is a snippet from our Calico configuration (v3.1) applied via kubectl:

apiVersion: v1
kind: ConfigMap
metadata:
  name: calico-config
  namespace: kube-system
data:
  # Typha offloads Felix's watches from the Kubernetes API and is essential
  # for clusters > 50 nodes; "none" disables it for smaller clusters
  typha_service_name: "none"
  # Configure the MTU based on your underlying network interface
  veth_mtu: "1440"

Pro Tip: Check your MTU. The standard Ethernet MTU is 1500. If you wrap packets in VXLAN (Flannel), you add headers. If the inner packet + header > 1500, fragmentation occurs. Fragmentation kills performance. Set your CNI MTU to 1450 or lower unless your provider supports Jumbo Frames.
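
A quick way to verify the numbers is a do-not-fragment ping between two pods at the full pod MTU. This assumes iputils ping (the BusyBox ping in Alpine doesn't support -M), and 10.244.x.x stands in for a real pod IP:

# Inside a pod: check the MTU the CNI actually assigned to the veth
ip link show eth0
# DF-bit ping at full pod MTU: 1412 bytes of payload + 28 bytes of headers = 1440
ping -M do -s 1412 10.244.x.x
# "Frag needed" or 100% loss means the overlay overhead no longer fits in the physical MTU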

Ingress: Stop Using NodePort

I see this in production configs constantly: services exposed via NodePort. This opens a port on every single node in your cluster. It’s a security nightmare and a routing mess.

The standard in 2018 is the NGINX Ingress Controller. It acts as the single entry point, terminating SSL and routing based on Host headers. With the new GDPR requirements, we force HTTPS everywhere. Here is a standard production Ingress resource for Kubernetes 1.10:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: production-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
    # Force SSL redirect for compliance
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    # Increase buffer size for large headers
    nginx.ingress.kubernetes.io/proxy-buffer-size: "16k"
spec:
  tls:
  - hosts:
    - api.coolvds-demo.no
    secretName: tls-secret
  rules:
  - host: api.coolvds-demo.no
    http:
      paths:
      - path: /
        backend:
          serviceName: backend-service
          servicePort: 80
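
One thing that trips people up: the tls-secret referenced above has to exist before NGINX can terminate SSL. Assuming you already have a certificate and key in PEM format (the file paths here are placeholders), it is one kubectl command, and a plain-HTTP curl confirms the forced redirect:

# Create the TLS secret the Ingress references (paths are placeholders)
kubectl create secret tls tls-secret \
  --cert=api.coolvds-demo.no.crt \
  --key=api.coolvds-demo.no.key
# Plain HTTP should now answer with a 301/308 redirect to HTTPS
curl -I http://api.coolvds-demo.no/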

The Hidden Killer: etcd Latency

Networking isn't just about moving user data; it's about cluster state. Kubernetes stores all of its configuration and state in etcd, which uses the Raft consensus algorithm and must fsync every write to disk before acknowledging it. If disk latency is high, those fsyncs slow down; if they slow down enough, heartbeats and leader elections start timing out. To the rest of the cluster, the network appears to "flap."

We debugged a cluster last week where pods were randomly restarting. The culprit? Slow disk I/O on the master nodes causing etcd timeouts. The network overlay (Flannel) couldn't update its routing tables fast enough.
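
When etcd is the suspect, ask it directly before digging through dashboards. The certificate paths below assume a kubeadm-style setup with TLS-enabled etcd; adjust them (or drop the TLS flags) to match your deployment:

# Check etcd health over the v3 API from a master node
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint health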

If Prometheus is scraping etcd on your master nodes, you can check the fsync latency directly:

# Check the 99th percentile of fsync duration
promql> histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m]))

If this value is above 10ms, your cluster is unstable. This is why we strictly use CoolVDS NVMe instances for master nodes. Spinning rust or standard SATA SSDs often cannot sustain the IOPS required for a busy etcd cluster, especially when you have multiple controllers fighting for I/O.
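
If you want hard numbers for your own disks, fio can approximate etcd's write pattern (small writes, each followed by an fdatasync). This is a rough sketch rather than an official etcd benchmark; the directory and sizes are just sensible defaults:

# Benchmark fdatasync latency with an etcd-like write pattern
mkdir -p /var/lib/etcd-bench
fio --name=etcd-fsync-test --directory=/var/lib/etcd-bench \
    --rw=write --ioengine=sync --fdatasync=1 --size=22m --bs=2300
# Check the fsync/fdatasync percentiles in the output; p99 should stay well under 10ms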

Network Policies: The GDPR Firewall

By default, Kubernetes allows all pods to talk to all pods. In a multi-tenant environment, or even just a frontend/backend split, this fails the "Data Protection by Design" requirement of GDPR Art. 25.

We implement a "Default Deny" policy in every namespace. You must explicitly allow traffic. It’s annoying, but necessary.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress

Once applied, ping stops working between pods. You then layer on specific allow rules. It forces developers to understand exactly what dependencies their microservices have.
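
A typical allow rule then looks something like this. The labels (app: frontend, app: backend) and the port are illustrative, so match them to your own manifests:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 80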

Latency Matters: The Norwegian Context

Your code can be optimized, but the speed of light is constant. If your users are in Norway, hosting in Frankfurt or London adds 20-30ms of round-trip time (RTT). For a REST API with multiple sequential calls, that adds up to noticeable lag.
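
You can measure this yourself from a client machine; the hostname is a placeholder for your own endpoint:

# Average RTT over 20 packets
ping -c 20 api.coolvds-demo.no
# Per-hop latency report, useful for spotting where the round trip leaves the country
mtr -rwc 50 api.coolvds-demo.no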

Routing through the NIX (Norwegian Internet Exchange) in Oslo keeps traffic local. When we deployed our clusters on CoolVDS (which peers directly at NIX), we dropped our average API response time from 85ms to 12ms compared to our previous provider in Ireland. That is not code optimization; that is just physics.

Troubleshooting Checklist

Before you blame the firewall, run these commands inside a debug container (Alpine with net-tools):

  1. nslookup kubernetes.default - Does internal DNS work? (Checks CoreDNS/Kube-DNS).
  2. ip route - Is the gateway correct?
  3. netstat -rn - Check the routing table for CNI conflicts.
  4. curl -v --connect-timeout 5 10.244.x.x - Direct pod-to-pod IP test.
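
If you don't have a debug pod handy, a throwaway Alpine pod does the job. On Kubernetes 1.10, --restart=Never gives you a bare pod instead of a Deployment; the image tag and package list are just what we happen to use:

kubectl run netdebug --rm -it --image=alpine:3.7 --restart=Never -- sh
# Then, inside the pod:
apk add --no-cache curl bind-tools iproute2 net-tools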

Kubernetes networking in 2018 is complex. It relies heavily on iptables (which is getting creaky) and overlay networks that demand CPU and low-latency storage. Don't let your infrastructure be the bottleneck. Configure your Ingress correctly, lock down your namespaces, and run your masters on NVMe storage.

Need a cluster that doesn't choke on I/O wait? Deploy a CoolVDS NVMe instance in Oslo today and see the difference etcd stability makes.