Kubernetes in Production: Survival Guide for 2017

Let’s be honest: running minikube start on your MacBook is not infrastructure. It is a toy. I have seen too many confident dev teams push their Docker containers to a live environment only to watch the cluster implode the moment traffic hits 500 requests per second. Suddenly, latency spikes, pods go into CrashLoopBackOff, and you are left staring at journalctl logs at 3:00 AM while your CTO asks why the "self-healing" system isn't healing.

We are currently seeing a massive shift here in Oslo. Everyone wants to abandon their monoliths for microservices. But the reality of running Kubernetes (K8s) v1.6 in production on standard VPS infrastructure is brutal. If you treat your cluster like a standard web server, it will fail.

The Etcd Bottleneck: Why Disk I/O is King

The brain of your Kubernetes cluster is etcd. It stores the state of the entire system. If etcd is slow, your API server is slow. If the API server is slow, your kubectl commands time out, and controllers stop scheduling pods. It is a cascading failure.

In 2017, the biggest mistake I see is running etcd on standard magnetic storage or cheap, shared SSDs with "noisy neighbors." Etcd writes to disk synchronously. It must fsync every write. If your disk latency (fsync duration) exceeds 10ms, your cluster becomes unstable. If spikes approach etcd's default heartbeat interval of 100ms, followers start timing out and triggering spurious leader elections.

This is where hardware selection becomes non-negotiable. You cannot abstract away physics. We benchmarked this extensively at CoolVDS using fio to simulate etcd workloads.

The Benchmark Command:

fio --rw=write --ioengine=sync --fdatasync=1 --directory=test-data --size=22m --bs=2300 --name=mytest

On a standard shared VPS from a budget provider, the 99th percentile latency often spikes to 40ms. On CoolVDS NVMe instances, which utilize direct KVM passthrough to NVMe storage, we consistently see fsync latencies under 2ms. This is the difference between a cluster that scales and one that breaks.

Pro Tip: Always separate your etcd data directory to a dedicated partition or disk if possible. If you are running a stacked control plane (etcd + master on the same node), ensure your VPS has dedicated I/O throughput. Do not share resources with a log-heavy application.
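
As a sketch, here is what that separation looks like in a self-hosted etcd static pod manifest. The image tag, paths, and mount point are illustrative (etcd v3.1.x is what Kubernetes 1.6 targets); adapt them to your layout:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: etcd
  namespace: kube-system
spec:
  containers:
  - name: etcd
    image: quay.io/coreos/etcd:v3.1.5
    command:
    - etcd
    - --data-dir=/var/lib/etcd
    volumeMounts:
    - name: etcd-data
      mountPath: /var/lib/etcd
  volumes:
  - name: etcd-data
    hostPath:
      # Dedicated NVMe mount, not the shared root disk
      path: /mnt/etcd-nvme
```

The point is the hostPath: it lives on its own disk, so a log-heavy neighbor cannot inflate your fsync latency.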

Networking: CNI Plugins and Latency

Kubernetes networking is not magic; under the hood it is mostly routing tables plus a complex web of iptables rules maintained by kube-proxy. With the v1.6 release, we are seeing RBAC (Role-Based Access Control) move to beta, which is great for security, but the real performance killer is usually the CNI (Container Network Interface) plugin choice.

If you are routing traffic within Norway—say, keeping data local to Oslo to minimize hops to the NIX (Norwegian Internet Exchange)—overlay networks can introduce overhead. We often see teams default to Flannel with VXLAN. It’s easy to set up, but the encapsulation overhead costs you CPU cycles and throughput.

For high-performance production workloads, consider Calico in BGP mode. It routes packets without encapsulation. However, this requires a network infrastructure that allows BGP peering, which not all hosting providers support. If you are stuck on Layer 2, ensure your MTU settings are correct to avoid packet fragmentation.
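
If you do stay with Flannel on a flat Layer 2 network, its host-gw backend skips encapsulation entirely and installs plain routes instead. A minimal sketch of the net-conf, assuming the common 10.244.0.0/16 pod CIDR:

```yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
data:
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "host-gw"
      }
    }
```

The caveat: host-gw requires all nodes to be on the same L2 segment, because each node becomes a next-hop gateway for its pod subnet.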

Configuring Calico for Performance

When deploying Calico, make sure the ConfigMap shipped with the manifest matches your environment — in particular the Typha setting and the MTU:

apiVersion: v1
kind: ConfigMap
metadata:
  name: calico-config
  namespace: kube-system
data:
  # Typha is needed for scaling beyond 50 nodes
  typha_service_name: "none"
  # Configure the MTU based on your underlying network
  veth_mtu: "1440"
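
The IP pool itself is managed with calicoctl rather than the ConfigMap. Here is a hedged sketch for Calico v2.x (calicoctl v1.x resource format) that disables IP-in-IP encapsulation for BGP mode; the CIDR is illustrative:

```yaml
# Apply with: calicoctl apply -f pool.yaml
apiVersion: v1
kind: ipPool
metadata:
  cidr: 192.168.0.0/16
spec:
  ipip:
    # No encapsulation: packets are routed natively via BGP
    enabled: false
  nat-outgoing: true
```

With ipip disabled, pod-to-pod traffic across nodes carries no encapsulation overhead at all — but again, only if your provider lets the nodes route or peer directly.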

Persistence in a Stateless World

The concept of "stateless" microservices is a lie. Somewhere, there is a database. In Kubernetes, we use PersistentVolumes (PV) and PersistentVolumeClaims (PVC). In the past (v1.2 era), we had to manually provision disks. Now, with StorageClasses, we can automate this.

However, automation does not mean performance. If you run a MySQL or PostgreSQL pod inside K8s, the underlying storage driver matters. We utilize the KVM VirtIO drivers to ensure that when a pod claims storage, it gets near-native speeds.

Here is a standard StorageClass definition we use for high-IOPS workloads:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: nvme-fast
# Static provisioning: the PVs are pre-created on local NVMe disks,
# so no provisioner parameters are needed
provisioner: kubernetes.io/no-provisioner
# We force the binding to happen only when the pod is scheduled
# This ensures the pod lands on the node where the data actually exists
volumeBindingMode: WaitForFirstConsumer
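
A database pod then claims from this class through a PVC. The name and size below are examples, not prescriptions:

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mysql-data
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: nvme-fast
  resources:
    requests:
      storage: 20Gi
```

Because binding waits for the first consumer, the claim stays Pending until the MySQL pod is scheduled, and the scheduler places it on a node that actually has the NVMe-backed volume.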

The "Norwegian Context": Data Sovereignty & Compliance

We are approaching a massive shift in data privacy. The GDPR (General Data Protection Regulation) is looming for 2018, and Datatilsynet (The Norwegian Data Protection Authority) is already tightening the screws. If you are hosting personal data of Norwegian citizens, "cloud" is not a valid answer for location.

You need to know exactly where your bits live. Using a US-based hyperscaler often means you cannot guarantee the data stays within the EEA (European Economic Area) legally, despite what their marketing says about "zones." By using a local provider like CoolVDS, where the physical servers are in our Oslo data center, you satisfy the physical residency requirement immediately.

Audit Logs are Mandatory

With Kubernetes 1.6, you must enable audit logging to satisfy compliance requirements. By default, the API server writes no audit log at all. You need to pass these flags to the kube-apiserver:

--audit-log-path=/var/log/kubernetes/audit.log
--audit-log-maxage=30
--audit-log-maxbackup=10
--audit-log-maxsize=100

Then, ensure you have a log shipper (like Fluentd) picking these up and sending them to a secure, immutable storage location.
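
As a sketch, the Fluentd side can be as simple as a tail source, shown here embedded in a ConfigMap. The name, tag, and pos_file path are arbitrary choices; note that the basic audit log in 1.6 is plain text lines, not JSON:

```yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: fluentd-audit
  namespace: kube-system
data:
  audit.conf: |
    <source>
      @type tail
      path /var/log/kubernetes/audit.log
      pos_file /var/log/fluentd-k8s-audit.pos
      tag k8s.audit
      # The 1.6 "basic" audit format is one plain-text line per event
      format none
    </source>
```

From there, route the k8s.audit tag to append-only storage; an audit trail you can silently rewrite is worthless to an auditor.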

Resource Limits: Stop the CPU Stealing

One of the most common reasons for instability is the lack of requests and limits. If you don't set these, a single memory leak in one Java container can trigger the kernel's OOM (Out of Memory) killer on the node, and the kubelet itself can be among the casualties.

But there is a nuance: CPU Throttling. If you set a CPU limit, the Linux kernel CFS (Completely Fair Scheduler) will throttle your container hard if it bursts, even if the host has free CPU. This causes "micro-stalls" in your application that look like network latency.

My recommendation for latency-sensitive apps (like Nginx ingress): Set CPU requests equal to CPU limits. This gives you a Guaranteed QoS (Quality of Service) class in Kubernetes.

resources:
  requests:
    memory: "512Mi"
    cpu: "1000m" # 1 dedicated core
  limits:
    memory: "512Mi"
    cpu: "1000m" # 1 dedicated core

This configuration prevents overcommitment. It requires a hosting provider that doesn't oversell their CPU. This is why we are strict about resource allocation at CoolVDS; when you buy a core, it’s a core.

Summary: Build it Right, or Don't Build it

Kubernetes is powerful, but it effectively turns you into your own ISP. You are responsible for the network, the storage, and the compute. In 2017, the software is maturing rapidly, but the hardware requirements are static and demanding.

  1. Latency Kills: Use NVMe for etcd. Always.
  2. Network Matters: Understand your CNI and MTU.
  3. Compliance is Coming: Prepare for GDPR now by securing your data location.
  4. No Overcommit: Use Guaranteed QoS for critical pods.

If you are ready to build a production cluster that survives beyond the first traffic spike, you need infrastructure that respects the physics of computing. Don't let slow I/O kill your SEO or your uptime.

Deploy your master nodes on CoolVDS High-Frequency Compute today and see the difference NVMe makes to your API server response times.