Kubernetes vs. Docker Swarm vs. Nomad: The 2021 Orchestration Battle Report
Let's cut the marketing noise. It is February 2021, and if I see one more LinkedIn post claiming Kubernetes is the "easy" solution for a startup with three developers and a monolith PHP app, I'm going to rm -rf / my production cluster. The industry is suffering from severe tool-fetishism. We are deploying complexity we don't understand to solve problems we don't have.
I've spent the last six months migrating a fintech workload from a messy set of shell scripts to a containerized environment. The latency requirements were strict: data had to stay in Norway due to the recent Schrems II ruling, and the API response time had to be under 50ms. We evaluated the "Big Three" orchestrators: Kubernetes (K8s), Docker Swarm, and HashiCorp Nomad.
Here is the war report from the trenches. No fluff. Just configurations, benchmarks, and the cold hard truth about why your orchestrator crashes when your underlying storage is trash.
The Contenders
1. Kubernetes (The 800lb Gorilla)
Kubernetes won the war. We know this. With the release of v1.20 late last year, it deprecated the Docker shim, pushing us all toward containerd. It is powerful, extensible, and absolutely exhausting to manage if you don't have a dedicated platform team.
The Reality: K8s is not an orchestrator; it is a framework for building platforms. If you run a vanilla kubeadm cluster, you are responsible for the control plane, the CNI (Calico/Flannel), the CSI, and the ingress controllers.
2. Docker Swarm (The "Just Works" Option)
Despite Mirantis acquiring Docker Enterprise, Swarm mode is still alive in 2021. It is embedded in the Docker engine. You type docker swarm init and you have a cluster. For 80% of the teams I consult for in Oslo, this is actually what they need, even if they are ashamed to admit it.
3. Nomad (The UNIX Philosophy)
HashiCorp's Nomad is a scheduler. That's it. It doesn't care if you run Docker containers, Java JARs, or raw exec binaries. It is a single binary that integrates perfectly with Consul and Vault.
The Hidden Killer: Etcd Latency
Most DevOps engineers focus on CPU and RAM. They forget the single most critical metric for orchestration stability: Disk Write Latency (fsync). Kubernetes relies on etcd to store the state of the cluster. Docker Swarm uses its own Raft implementation.
If your underlying VPS storage is slow, etcd will time out. Leader elections will fail. Your API server will hang. I've seen entire clusters in production enter a "split-brain" scenario because the cheap hosting provider was throttling IOPS on their "SSD" tier.
Pro Tip:etcdis incredibly sensitive to disk latency. A generic cloud instance with network-attached storage (NAS) often introduces 2-10ms of latency per write. This is fatal for high-load clusters. You need local NVMe storage to guarantee the sub-millisecond writesetcdcraves.
Benchmarking Your Infrastructure
Before you even pick K8s or Swarm, test your disk. If you are deploying on CoolVDS, you are already running on local NVMe, which is why we use it for our control plane nodes. But let's prove it.
Run fio to simulate an etcd write load (sequential writes, syncing every time):
fio --rw=write --ioengine=sync --fdatasync=1 \
--directory=test-data --size=22m --bs=2300 \
--name=mytest
On a standard budget VPS from a German giant (naming no names), I got 650 IOPS and 8ms latency. Acceptable for a blog, fatal for K8s.
On a CoolVDS NVMe instance, I consistently hit 15,000+ IOPS with 0.3ms latency. That is the difference between a cluster that heals itself and a pager going off at 3 AM.
Configuration Deep Dive
Kubernetes: Taming the Beast
If you choose Kubernetes in 2021, you must optimize etcd if you run it yourself. Here is a snippet from a production etcd.yaml override I use to handle potential network jitter between Oslo and external European nodes:
# /etc/kubernetes/manifests/etcd.yaml
apiVersion: v1
kind: Pod
metadata:
name: etcd
namespace: kube-system
spec:
containers:
- command:
- etcd
- --advertise-client-urls=https://192.168.1.10:2379
- --heartbeat-interval=100
- --election-timeout=1000 # Increased from default 1000ms if network is shaky
- --snapshot-count=10000
- --wal-dir=/var/lib/etcd/wal
- --data-dir=/var/lib/etcd
image: k8s.gcr.io/etcd:3.4.13-0
volumeMounts:
- mountPath: /var/lib/etcd
name: etcd-data
Note the election-timeout. If your infrastructure has "noisy neighbors" stealing CPU cycles, the heartbeat fails, and the cluster re-elects a leader. This causes downtime. CoolVDS provides dedicated CPU threads on higher plans, preventing this CPU steal issue.
Docker Swarm: Simplicity Wins
For a project last month involving a Python scraper fleet, we skipped K8s. The complexity overhead wasn't worth it. Here is the entire stack definition:
version: "3.8"
services:
scraper:
image: registry.coolvds.com/internal/scraper:v2
deploy:
replicas: 12
update_config:
parallelism: 2
delay: 10s
restart_policy:
condition: on-failure
environment:
- TARGET_GEO=NO
networks:
- internal_net
networks:
internal_net:
driver: overlay
To deploy this? One command: docker stack deploy -c docker-compose.yml scraper_stack. No Helm charts, no RBAC headaches. If you operate within Norway and have a small team, this is often the pragmatic choice.
Nomad: The Performance King
Nomad shines when you need to mix Docker with legacy binaries. It is also much lighter on resources. While K8s eats 1GB+ of RAM just to exist, Nomad consumes roughly 50MB. This density allows us to pack more worker jobs onto a single CoolVDS instance, lowering the Total Cost of Ownership (TCO).
job "redis-cache" {
datacenters = ["oslo-dc1"]
type = "service"
group "cache" {
count = 3
task "redis" {
driver = "docker"
config {
image = "redis:6.0"
port_map {
db = 6379
}
}
resources {
cpu = 500
memory = 256
}
}
}
}
Comparison Matrix
| Feature | Kubernetes | Docker Swarm | Nomad |
|---|---|---|---|
| Learning Curve | Steep | Low | Moderate |
| Resource Overhead | High | Low | Very Low |
| State Storage | etcd (needs NVMe) | Raft (internal) | Consul (optional) |
| Best For | Enterprise, Microservices | Small teams, fast deploy | Mixed workloads, High Perf |
The Compliance Angle: Schrems II & GDPR
Since the Schrems II ruling last year, transferring personal data to US-owned cloud providers has become a legal minefield. Datatilsynet (The Norwegian Data Protection Authority) has been clear: you need to control where your data lives.
Running your orchestration layer on CoolVDS means your data sits on servers physically located in Europe, under European jurisdiction. You aren't just getting low latency to the NIX (Norwegian Internet Exchange); you are buying peace of mind for your Data Protection Officer.
Final Verdict
The best orchestrator is the one your team can debug at 3 AM. For 80% of you, that might actually be Docker Swarm. For the 20% needing scale, Kubernetes is the standard.
But remember: Software cannot fix hardware limitations. You can have the most beautiful Kubernetes manifests in the world, but if your underlying disk I/O chokes during a traffic spike, your pod liveness probes will fail, and your site will go down.
Stop building castles on sand. Build them on rock-solid NVMe.
Need a battle-ready cluster? Deploy a high-performance KVM instance on CoolVDS today and see what 0.3ms disk latency does for your etcd stability.