
Orchestration Wars 2021: Kubernetes vs. Docker Swarm vs. Nomad – A Real-World Survival Guide

Let's be honest: most of you don't need Kubernetes. I've seen engineering teams in Oslo burn three months of runway trying to deploy a simple CRUD app on a self-hosted K8s cluster, only to realize they were spending more time managing etcd than shipping code. In the 2021 landscape, complexity is the enemy of reliability. If your orchestration layer requires a dedicated team just to keep the lights on, you've already lost.

I have spent the last decade fixing broken clusters across Europe. I have seen Docker Swarm save startups and Kubernetes bury them. Today, we are going to look at the three main contenders for container orchestration—Kubernetes (K8s), Docker Swarm, and HashiCorp Nomad—through the lens of actual production requirements, not resume-driven development.

The 800-Pound Gorilla: Kubernetes (v1.21)

Kubernetes has won the marketing war. With the recent release of v1.21 in April 2021, it is more stable than ever. It is the standard for a reason: the ecosystem is massive. If you have a microservices architecture with 50+ services, need complex autoscaling, sidecars for service meshes (like Istio), and granular RBAC, K8s is the only serious choice.

However, the cost is operational overhead. The control plane is heavy. If you are running K8s on cheap, oversold VPS hosting with spinning disks, you will crash. Kubernetes relies heavily on etcd for state, and etcd is notoriously sensitive to disk latency.

Pro Tip: If your etcd fsync latency exceeds 10ms, your cluster stability will degrade. This is why we insist on NVMe storage for all CoolVDS instances. Spinning rust cannot handle the random write patterns of a busy API server.
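
Before you blame Kubernetes, benchmark the disk. One common way to approximate etcd's write pattern is fio with per-write fdatasync; the directory and sizes below are just an example, not gospel:

# Create a scratch directory on the volume etcd would use, then simulate
# its WAL: small sequential writes with an fdatasync after each one
mkdir -p /var/lib/etcd-bench
fio --name=etcd-fsync-test --directory=/var/lib/etcd-bench \
    --rw=write --ioengine=sync --fdatasync=1 --size=22m --bs=2300

# Read the sync latency percentiles in the output; the 99th percentile
# should stay comfortably under 10ms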

Here is a standard deployment configuration. Notice the verbosity required just to get a container running:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: coolvds-app
  labels:
    app: core-system
spec:
  replicas: 3
  selector:
    matchLabels:
      app: core-system
  template:
    metadata:
      labels:
        app: core-system
    spec:
      containers:
      - name: nginx
        image: nginx:1.19.10
        ports:
        - containerPort: 80
        resources:
          limits:
            memory: "128Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /  # stock nginx returns 404 on /healthz; probe the default page instead
            port: 80
          initialDelaySeconds: 3
          periodSeconds: 3

You apply this, and then you still need a Service, an Ingress, and a CNI plugin just to talk to the world. It is powerful, but it is heavy.
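
If you would rather not write yet another manifest, kubectl can generate the Service for you. A quick sketch, assuming the Deployment above is saved as deployment.yaml:

# Apply the Deployment and expose it inside the cluster
kubectl apply -f deployment.yaml
kubectl expose deployment coolvds-app --port=80 --target-port=80

# Confirm the pods are up and passing their probes
kubectl get pods -l app=core-system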

The Pragmatic Choice: Docker Swarm

Despite rumors of its death, Docker Swarm is still alive and kicking in 2021. For teams of 2 to 10 developers, Swarm is often the superior choice. It is integrated directly into the Docker engine. There is no separate binary to install. If you can write a docker-compose.yml file, you can manage a Swarm cluster.

I recently migrated a logistics company in Bergen from a broken K8s setup to Swarm. Their deployment time went from 15 minutes to 45 seconds. Why? Because Swarm treats the cluster as a single Docker engine. It lacks advanced features like custom resource definitions (CRDs), but it excels at simplicity.

Deploying a stack is a single command:

# Initialize the swarm on the manager node
docker swarm init --advertise-addr 10.0.0.1

# Deploy the stack
docker stack deploy -c docker-compose.yml production_stack
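
Day-two operations stay just as terse. A short sketch; the service name below is hypothetical, since Swarm prefixes services with the stack name:

# List services in the stack and their replica status
docker stack services production_stack

# Scale a service without touching the compose file
docker service scale production_stack_web=5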

However, Swarm struggles with stateful workloads. If you need complex persistent volume claims tied to specific storage backends, Swarm's limitations become painful.
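
The usual workaround is to pin stateful services to the node that owns their local volume with placement constraints. A minimal sketch; the hostname and service name are placeholders:

# Keep the database on the node where its volume physically lives
docker service update --constraint-add 'node.hostname==swarm-node-01' production_stack_db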

The Unix Philosophy: HashiCorp Nomad

Nomad is the dark horse. It follows the Unix philosophy: do one thing and do it well. Unlike K8s, which tries to be an OS for the cloud, Nomad is just a scheduler. It can schedule containers, but it can also schedule raw Java JARs, binaries, or QEMU virtual machines.

For a project last year involving legacy binaries that couldn't be containerized easily, Nomad was a lifesaver. It is a single binary. The resource footprint is tiny compared to Kubelet. You can run a Nomad agent on a 512MB VPS without it choking.

A Nomad job specification looks distinct but readable:

job "coolvds-worker" {
  datacenters = ["dc1"]
  type = "service"

  group "cache" {
    count = 1
    task "redis" {
      driver = "docker"
      config {
        image = "redis:6.2"
        port_map {
          db = 6379
        }
      }
      resources {
        cpu    = 500
        memory = 256
      }
    }
  }
}
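
Running it is just as simple. A minimal sketch, assuming the spec above is saved as coolvds-worker.nomad and you have an agent to talk to:

# Throwaway single-node agent for local testing (never for production)
nomad agent -dev &

# Validate, submit, and inspect the job
nomad job validate coolvds-worker.nomad
nomad job run coolvds-worker.nomad
nomad job status coolvds-worker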

The Hardware Reality: Latency and IOPS

Orchestrators are software. They cannot fix hardware limitations. The most common cause of orchestration failure I see in 2021 is IOPS starvation. When you pack 20 containers onto a single node, and they all start writing logs or updating databases simultaneously, the underlying storage queue fills up. The orchestrator thinks the node is dead because the health checks time out.
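
Before you trust the orchestrator's verdict that a node is dead, look at the disk queue yourself. A quick sketch; the node name is a placeholder:

# Watch per-device utilization, await, and queue depth in real time
iostat -dx 1

# On Kubernetes, see whether the kubelet itself is reporting pressure
kubectl describe node worker-01 | grep -i pressure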

This is where the "Noisy Neighbor" effect destroys performance. On shared cloud platforms, your K8s node might be sharing a disk with a neighbor mining crypto. Your latency spikes, pods get evicted, and you wake up at 4 AM.

At CoolVDS, we specifically configure our KVM hypervisors to ensure dedicated I/O throughput. We use NVMe storage arrays because in 2021, SATA SSDs are simply not fast enough for high-density container clusters. If you are running a database inside Kubernetes (which is risky but common), you need the hardware to back it up.

Performance Tuning for Orchestrators

Whether you choose K8s, Swarm, or Nomad, you must tune the Linux kernel. The defaults are conservative general-purpose settings, not tuned for high-throughput container networking. Add these to your /etc/sysctl.conf:

# Increase the limit of open file descriptors
fs.file-max = 2097152

# Improve networking for high connection rates
net.core.somaxconn = 65535
net.ipv4.tcp_max_tw_buckets = 1440000
net.ipv4.ip_local_port_range = 1024 65000

# Increase virtual memory areas for Elasticsearch/Java apps
vm.max_map_count = 262144
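
Reload and verify; the kernel quietly clamping a value you thought you set is a classic source of confusion:

# Apply /etc/sysctl.conf without a reboot
sysctl -p

# Spot-check that the value actually stuck
sysctl net.core.somaxconn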

Data Sovereignty and The Nordic Edge

We cannot ignore the elephant in the room: Schrems II. Since the ruling last year, hosting data on US-owned clouds has become a legal minefield for European companies. The Norwegian Datatilsynet is watching. If your orchestration layer is automagically replicating data to a US-east region, you are non-compliant.

Running your own cluster on a Norwegian provider like CoolVDS simplifies this. Data stays in Oslo. You know exactly where the physical drive sits. For dev teams in the Nordics, this also means lower latency. Pinging a server in Oslo from Trondheim is roughly 10-15ms. Pinging AWS Frankfurt is 35-40ms. Those milliseconds add up when you are doing synchronous replication.
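
Don't take anyone's latency numbers on faith, ours included. Measure from where your team actually sits; the address below is a placeholder for your own endpoint:

# Round-trip latency from your workstation to the cluster
ping -c 20 203.0.113.10

# Per-hop breakdown if the numbers look suspicious
mtr --report --report-cycles 20 203.0.113.10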

Comparison Matrix

Feature              | Kubernetes             | Docker Swarm          | Nomad
Learning Curve       | Steep (months)         | Low (days)            | Medium (weeks)
Maintenance Overhead | High                   | Low                   | Low
Scalability          | Massive (5,000+ nodes) | Medium (~1,000 nodes) | Massive (10,000+ nodes)
Storage Integration  | Excellent (CSI)        | Poor                  | Good (CSI support in beta)

Final Verdict

If you are building the next Netflix, use Kubernetes. If you are a team of five building a SaaS product, start with Docker Swarm. If you have a weird mix of legacy binaries and Docker containers, use Nomad.

But regardless of the software, ensure your foundation is solid. Don't let slow I/O kill your SEO or your uptime. Deploy your cluster on infrastructure that respects the physics of latency.

Ready to test your cluster performance? Spin up a high-performance NVMe instance on CoolVDS in under 55 seconds and see the difference raw power makes.