Stop Hardcoding IPs: Dynamic Service Discovery with Consul 0.5
If I have to manually edit an nginx.conf upstream block one more time because we added a new backend node, I might just unplug the server rack. It is 2015. We are past the era of static infrastructure, yet I still see sysadmins maintaining spreadsheets of IP addresses like it's 1999.
When you are running a couple of VPS instances, hardcoding IPs works. When you scale to twenty nodes across different zones to handle a traffic spike, static configuration is a liability. It breaks. It requires a human to SSH in. It causes downtime.
We need automatic service discovery. We need the infrastructure to be self-aware. Enter Consul.
The Problem with "Standard" Load Balancing
In a traditional setup, your load balancer (Nginx or HAProxy) looks like this:
upstream myapp {
    server 10.0.0.1:80;
    server 10.0.0.2:80;
}

What happens when 10.0.0.1 suffers a kernel panic? Nginx might try it a few times before marking it down, causing 502 Bad Gateway errors for your users. What happens when you spin up 10.0.0.3? Nothing, until you manually update the config and reload the service.
In the fast-moving DevOps world, this latency is unacceptable. We are deploying Docker containers (version 1.7 just dropped, and it's looking stable) that live for hours, not years. We cannot chase IPs.
Why Consul beats Zookeeper and Etcd
There are other options. Zookeeper is the old guard, but setting up a JVM cluster just for key-value storage feels like bringing a tank to a knife fight. Etcd is promising, but it's primarily a key-value store.
Consul (currently v0.5) is superior for operations because it treats Service Discovery as a first-class citizen, not an afterthought. It provides:
- DNS Interface: You can query db.service.consul and get the IP of the master node. No API integration required for legacy apps.
- Health Checks: Consul doesn't just list services; it checks if they are alive. If a node fails a script check, it is removed from the DNS pool immediately.
- Multi-Datacenter: It works across WAN links out of the box.
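To see the DNS interface for yourself, query the agent's built-in DNS server directly (it listens on port 8600 by default). The web service name below is just the example registered later in this post:

dig @127.0.0.1 -p 8600 web.service.consul

dig @127.0.0.1 -p 8600 web.service.consul SRV

The SRV query returns the port as well as the address, which is handy for services that don't sit on a well-known port.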
Pro Tip: Do not run a single Consul server. You need a quorum. For a production cluster on CoolVDS, run 3 or 5 server nodes to handle the Raft consensus protocol. If you lose quorum, you lose the cluster.
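Once your server nodes are up, a quick sanity check that the quorum actually formed is to list the cluster members from any node:

consul members

You should see all three (or five) servers reported as alive before you point anything important at the cluster.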
Architecture: The "Consul-Template" Pattern
The most robust way to implement this right now is using consul-template. This daemon watches the Consul cluster for changes and re-renders configuration files on the fly.
Here is the workflow:
- New Node: You spin up a new KVM instance on CoolVDS. It boots and starts the Consul agent.
- Registration: The agent tells the cluster: "I am here, I am running 'web-app', and my IP is X."
- Detection: The load balancer, running consul-template, sees the new node.
- Reconfiguration: The template engine rewrites /etc/nginx/nginx.conf and reloads Nginx (see the template sketch below). Zero downtime.
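Here is a minimal sketch of what that template could look like. It assumes the service is registered as web (as in the definition later in this post) and that you render a dedicated upstream include rather than the whole nginx.conf; the file paths are illustrative:

upstream myapp {
  {{range service "web"}}server {{.Address}}:{{.Port}};
  {{end}}
}

On the load balancer, consul-template watches the cluster, renders the file, and triggers the reload:

consul-template -consul 127.0.0.1:8500 \
  -template "/etc/consul-templates/upstream.ctmpl:/etc/nginx/conf.d/upstream.conf:service nginx reload"

One caveat: if every backend fails its health check, the rendered upstream block will be empty and Nginx will refuse to reload it, so keep a fallback server entry or guard the reload command.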
The Implementation
First, get the agent running (Ubuntu 14.04 LTS example). The flags below are for the server nodes that form the quorum; app-only backend nodes run the same binary in client mode, dropping -server and -bootstrap-expect and simply joining an existing server. In production, swap /tmp/consul for a persistent data directory, since the Raft log lives there.

consul agent -server -bootstrap-expect 3 -data-dir /tmp/consul -node=agent-one -bind=10.0.0.1

Next, define a service definition in /etc/consul.d/web.json:
{
"service": {
"name": "web",
"tags": ["production"],
"port": 80,
"check": {
"script": "curl localhost:80 >/dev/null 2>&1",
"interval": "10s"
}
}
}

If that curl command fails, Consul removes the node. Simple.
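To pick that definition up, the agent on a backend node just needs to load the config directory and join the cluster. A rough sketch, with placeholder addresses (10.0.0.3 for the backend, 10.0.0.1 for a server to join):

consul agent -data-dir /tmp/consul -node=web-01 -bind=10.0.0.3 -config-dir /etc/consul.d -join 10.0.0.1

You can then confirm the registration through the HTTP API on any agent:

curl http://localhost:8500/v1/catalog/service/web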
Why Infrastructure Stability Matters for Raft
Consul relies on the Raft consensus algorithm. Raft is incredibly sensitive to timing and network latency. If your underlying virtualization platform steals CPU cycles (common in oversold OpenVZ hosting) or has high network jitter, the cluster members will think the leader is dead. They will force a re-election.
This causes a "flapping" cluster. Your service discovery goes haywire.
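An easy way to see whether this is happening to you: ask any agent who the current leader is, and watch whether the answer keeps changing. The status endpoint is cheap enough to poll every second:

watch -n 1 'curl -s http://localhost:8500/v1/status/leader'

A stable cluster returns the same server address over and over; a flapping one cycles through leaders.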
This is where the choice of hosting provider becomes architectural, not just financial. At CoolVDS, we strictly use KVM virtualization. We don't oversell CPU cores. When you run a consensus protocol across our Oslo datacenter, the latency is flat. You get the stability of bare metal with the flexibility of a VPS.
Data Sovereignty in Norway
For those of us operating in Europe, keeping data within the EEA is becoming critical. While the Safe Harbor framework is currently in place, privacy advocates are already challenging it. Storing your service maps and topology data on servers physically located in Norway (covered by the Norwegian Personal Data Act / Personopplysningsloven) adds a layer of legal safety that US-centric clouds cannot guarantee.
Next Steps
Manual configuration is a bottleneck. Automation is the only way to scale.
- Install Consul 0.5.2 on your nodes.
- Set up consul-template on your load balancer.
- Kill a backend node and watch Nginx recover automatically in under 10 seconds.
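For that last step, the drill is simple, assuming the upstream include path from the sketch above:

# On one backend node: stop whatever is serving port 80 so the health check fails
sudo service nginx stop

# On the load balancer: watch the rendered upstream shrink
watch -n 1 cat /etc/nginx/conf.d/upstream.conf

Within a check interval or two, the dead node disappears from the upstream block and Nginx is reloaded without you touching a thing.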
If you need a sandbox to test this cluster, don't run it on your laptop. Network conditions there are too perfect. Deploy three small instances on CoolVDS today and see how robust service discovery behaves in a real production environment.