The Orchestration Headache: Choosing Your Poison in Late 2016
If you have been awake for the last six months, you know the container ecosystem is currently a bloodbath. We used to just run scripts with Chef or Puppet. Now, if you aren't decomposing your monolith into microservices and throwing them into a scheduler, you're apparently "doing it wrong."
I spent the last weekend debugging a split-brain scenario in an etcd cluster that took down a production e-commerce site. It wasn't pretty. That experience solidified a simple truth: Orchestration is expensive. It costs CPU cycles, it costs network overhead, and it costs sanity.
Today, we look at the two main contenders fighting for your terminal: The newly integrated Docker Swarm Mode (released with Docker 1.12) and the behemoth that is Kubernetes 1.4. We will analyze this through the lens of a System Administrator running infrastructure here in Norway, where data sovereignty and latency to the NIX (Norwegian Internet Exchange) actually matter.
The Contenders
1. Docker Swarm Mode (Native)
Docker finally woke up and realized nobody wants to run a separate container for Swarm. With 1.12, it is baked into the engine. It is arguably the slickest developer experience right now.
The Pros:
- Zero Setup: You type docker swarm init and you are the manager. Done.
- Security: Mutual TLS is automatic. Rotating certificates is automatic.
- Backward Compatibility: It still looks like the Docker API we know.
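To see the mTLS story in action: the join token doubles as the certificate bootstrap, and the rotation policy is one flag away. A quick sketch, run on the manager (the 30-day interval is just an example):

# Print the join command for workers; the token is the TLS bootstrap secret
docker swarm join-token worker

# Tighten certificate rotation from the 90-day default to 30 days
docker swarm update --cert-expiry 720h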
2. Kubernetes 1.4
Google's gift to the world. Version 1.4 just dropped, finally introducing kubeadm to make installation less like performing open-heart surgery. It is powerful, but it assumes you are Google-scale from day one.
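For the curious, bootstrapping with kubeadm (still alpha in 1.4, so expect rough edges) looks roughly like this; the token and IP below are placeholders printed by init:

# On the master node
kubeadm init

# On each worker, using the token that init printed
kubeadm join --token <token> <master-ip>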
The Pros:
- Pods: The concept of grouping containers (shared IP/storage) is superior to single containers for complex apps.
- Health Checks: Liveness and Readiness probes are mature (see the sketch after this list).
- Community: Everyone is building plugins for K8s.
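As a taste of the probe support, here is a minimal sketch of a throwaway pod carrying both probe types; the pod name, paths, and timings are illustrative, not gospel:

# Feed a minimal pod manifest to kubectl via stdin (pod name is arbitrary)
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
  - name: nginx
    image: nginx:1.11.5-alpine
    livenessProbe:    # restart the container if this check fails
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
    readinessProbe:   # hold traffic back until this check passes
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 2
EOF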
Technical Deep Dive: The Deployment Reality
Let's look at the actual work required to get a redundant Nginx service up and running.
Docker Swarm Approach
Swarm is imperative. You tell it what to do right now.
# Initialize the swarm on the manager node
docker swarm init --advertise-addr 10.0.0.5
# Create an overlay network for isolation
docker network create --driver overlay app_net
# Deploy the service with 3 replicas
docker service create \
--name frontend \
--replicas 3 \
--network app_net \
--publish 80:80 \
nginx:alpine
This is clean. However, the routing mesh (which routes traffic hitting any node to the correct container) adds a slight latency penalty. We measured about 1-2ms overhead per request in our benchmarks on standard SATA SSDs. On CoolVDS NVMe instances, this overhead is negligible because the I/O wait time doesn't compound the network delay.
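If you want to sanity-check the mesh penalty yourself, time a request against a node that hosts a replica versus one that only forwards. The first IP is from the example above; the second is a placeholder for a replica-free node:

# Direct hit on a node running a replica
curl -o /dev/null -s -w "%{time_total}\n" http://10.0.0.5/
# Hit a node with no local replica -- the routing mesh forwards it
curl -o /dev/null -s -w "%{time_total}\n" http://10.0.0.6/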
Kubernetes Approach
Kubernetes is declarative. You give it a file describing the desired state.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: frontend-deployment
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.11.5-alpine
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: frontend-service
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  type: NodePort
You then apply this with kubectl apply -f nginx.yaml. It is verbose. It is complex. But it is self-healing in a way Swarm struggles to match when things get really weird (like partial network partitions).
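After the apply, a couple of commands confirm the rollout actually converged (assuming the names from the manifest above):

# Wait for the deployment to reach 3 ready replicas
kubectl rollout status deployment/frontend-deployment
# Confirm the pods spread across nodes and the NodePort was assigned
kubectl get pods -l app=nginx -o wide
kubectl get svc frontend-service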
The Hidden Killer: Storage and I/O
Here is the war story. We had a client trying to run a Galera MySQL cluster inside Docker containers across three different hosts. They were using a cheap VPS provider with standard spinning disks (HDD).
The cluster fell apart every time they ran a backup. Why? I/O Starvation.
Containers share the host kernel. When one container hammers the disk, every other container on that host queues behind it for I/O. Orchestration tools depend on etcd or internal Raft consensus logs. If writing to that log takes too long because your database is hogging the disk IOPS, the node misses its heartbeats and is marked "dead" by the cluster manager. Rescheduling begins. Chaos ensues.
Pro Tip: Always tune your scheduler to respect I/O limits, or better yet, ensure your underlying storage can handle the random read/write patterns of container workloads.
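Docker has shipped per-container blkio throttles since 1.10, so you can at least fence in the obvious offenders. A sketch (the device path, rates, and image name are assumptions; note that docker service create does not expose these flags in 1.12, so this is plain docker run):

# Throttle a backup job so it cannot starve the consensus log
docker run -d --name nightly-backup \
  --device-write-iops /dev/sda:800 \
  --device-write-bps /dev/sda:50mb \
  backup-image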
If you are running databases in containers (which I still recommend against unless you really know what you are doing), you must tweak your my.cnf to respect the container limits:
[mysqld]
# Ensure InnoDB doesn't eat all RAM meant for other containers
innodb_buffer_pool_size = 2G
# Prevent disk thrashing on checkpoints
innodb_io_capacity = 1000
innodb_flush_method = O_DIRECT
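Then pair that config with a hard memory cap on the service itself, so the scheduler knows what the container actually needs. A sketch with Swarm; the 3g figure is an assumption that leaves headroom above the buffer pool for connection buffers:

# Cap the service so the 2G buffer pool plus overhead stays bounded
docker service create --name db \
  --limit-memory 3g \
  --env MYSQL_ROOT_PASSWORD=changeme \
  --mount type=volume,source=dbdata,target=/var/lib/mysql \
  mysql:5.7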
Why Infrastructure Choice is Critical
This brings us to the platform. You can have the best Kubernetes manifest in the world, but if your underlying hypervisor is oversubscribing CPU, your pods will experience "steal time." This is where the CPU says it's giving you cycles, but the hypervisor is actually giving them to your noisy neighbor.
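Checking for steal takes ten seconds, so do it before blaming your code; anything consistently above a couple of percent in the st column means you are sharing more than you think:

# One-shot: look at the "st" value in the Cpu(s) line
top -bn1 | grep "Cpu(s)"
# Better: sample %steal over a few seconds (mpstat ships with sysstat)
mpstat 1 5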
At CoolVDS, we see this constantly with refugees from budget hosts. We use KVM (Kernel-based Virtual Machine) for strict isolation. Unlike OpenVZ containers (which are basically just fancy chroots), KVM gives you a real kernel. This is mandatory for running Docker properly. You cannot run Docker inside a shared OpenVZ container reliably.
Performance Benchmark: Random Write 4k
| Environment | IOPS | Latency |
|---|---|---|
| Budget VPS (SATA) | ~450 | 15ms |
| CoolVDS (NVMe) | ~25,000+ | 0.05ms |
When you have 50 microservices trying to write logs simultaneously, that IOPS number is the difference between a snappy API and a 502 Bad Gateway.
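Numbers like these come from a standard fio 4k random-write run; here is the incantation if you want to test your own box (results will vary with disk, filesystem, and hypervisor):

# 60-second 4k random-write test with a queue depth of 32
fio --name=randwrite --ioengine=libaio --direct=1 \
  --rw=randwrite --bs=4k --size=1g --numjobs=1 \
  --iodepth=32 --runtime=60 --time_based --group_reporting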
The Norwegian Context: Latency & Compliance
We are seeing tighter scrutiny from Datatilsynet regarding where data is stored. With Safe Harbor invalidated last year and its replacement, Privacy Shield, still unproven, keeping data within Norwegian borders is becoming a hard requirement for many CTOs.
Running your orchestration cluster on CoolVDS guarantees your data sits in Oslo. Furthermore, latency to the NIX is minimal. If your users are in Norway, don't route them through a datacenter in Frankfurt just to save 50 kroner a month.
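Measuring this takes thirty seconds; pick a target your users actually hit (the domain below is just a placeholder for something peered at NIX):

# Round-trip time from your box to a Norwegian endpoint
ping -c 20 vg.no
# Or trace the whole path and spot where the milliseconds pile up
mtr --report --report-cycles 20 vg.no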
Verdict: Which One to Choose?
Choose Docker Swarm if:
- You have a team of fewer than 5 people.
- You want to move from docker-compose to production in an afternoon.
- You don't need complex ingress controllers.
Choose Kubernetes if:
- You need to manage stateful workloads (databases) with specific mounting requirements; the PetSet alpha in 1.4 targets exactly this.
- You have a dedicated DevOps engineer who understands iptables.
- You are preparing for massive scale.
Regardless of your choice, orchestration adds overhead. Don't compound it with slow hardware. You need high-frequency CPUs and NVMe storage to run these stacks effectively.
Ready to build your cluster? Deploy a high-performance KVM instance on CoolVDS in under 55 seconds and see the difference NVMe makes to your build times.