
Stop Hardcoding IPs: Dynamic Service Discovery with Consul 0.5


If I have to manually edit an nginx.conf upstream block one more time because we added a new backend node, I might just unplug the server rack. It is 2015. We are past the era of static infrastructure, yet I still see sysadmins maintaining spreadsheets of IP addresses like it's 1999.

When you are running a couple of VPS instances, hardcoding IPs works. When you scale to twenty nodes across different zones to handle a traffic spike, static configuration is a liability. It breaks. It requires a human to SSH in. It causes downtime.

We need automatic service discovery. We need the infrastructure to be self-aware. Enter Consul.

The Problem with "Standard" Load Balancing

In a traditional setup, your load balancer (Nginx or HAProxy) looks like this:

upstream myapp {
    server 10.0.0.1:80;
    server 10.0.0.2:80;
}

What happens when 10.0.0.1 suffers a kernel panic? Nginx only notices through failed requests (passive checks governed by max_fails and fail_timeout), so some users see slow responses or 502 Bad Gateway errors before the node is marked down. What happens when you spin up 10.0.0.3? Nothing, until you manually update the config and reload the service.

In the fast-moving DevOps world, this latency is unacceptable. We are deploying Docker containers (version 1.7 just dropped, and it's looking stable) that live for hours, not years. We cannot chase IPs.

Why Consul beats Zookeeper and Etcd

There are other options. Zookeeper is the old guard, but setting up a JVM cluster just for key-value storage feels like bringing a tank to a knife fight. Etcd is promising, but it's primarily a key-value store.

Consul (currently v0.5) is superior for operations because it treats Service Discovery as a first-class citizen, not an afterthought. It provides:

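  • DNS Interface: You can query db.service.consul and get the IPs of the healthy nodes that provide the db service. To reach one role only, register a tag and query it, for example master.db.service.consul for instances tagged master. No API integration required for legacy apps.
  • Health Checks: Consul doesn't just list services; it checks if they are alive. If a service fails its check (a script, for example), it drops out of DNS results as soon as the failure is recorded, which is within about one check interval.
  • Multi-Datacenter: It works across WAN links out of the box.

Pro Tip: Do not run a single Consul server in production. You need a quorum, meaning a majority of the servers. For a production cluster on CoolVDS, run 3 or 5 server nodes to handle the Raft consensus protocol. If you lose quorum, the cluster cannot elect a leader or accept changes until a majority is back.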
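
Why 3 or 5 and not 2 or 4? Raft needs a strict majority of servers to agree, so an even-sized cluster buys no extra failure tolerance. A quick sketch of the arithmetic (the function names are mine, not part of any Consul tooling):

```python
def quorum(servers: int) -> int:
    """Smallest strict majority of a Raft cluster with this many servers."""
    return servers // 2 + 1


def failures_tolerated(servers: int) -> int:
    """How many servers can be lost while a majority is still reachable."""
    return servers - quorum(servers)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(f"{n} servers: quorum {quorum(n)}, "
              f"tolerates {failures_tolerated(n)} failure(s)")
```

Three servers survive one failure and five survive two. Four servers still survive only one, so the fourth node adds cost without adding safety.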

Architecture: The "Consul-Template" Pattern

The most robust way to implement this right now is using consul-template. This daemon watches the Consul cluster for changes and re-renders configuration files on the fly.

Here is the workflow:

  1. New Node: You spin up a new KVM instance on CoolVDS. It boots and starts the Consul agent.
  2. Registration: The agent tells the cluster: "I am here, I am running 'web-app', and my IP is X."
  3. Detection: The load balancer, running consul-template, sees the new node.
  4. Reconfiguration: The template engine rewrites /etc/nginx/nginx.conf and reloads Nginx. Zero downtime.
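
To make steps 3 and 4 concrete, here is a rough Python sketch of what the load balancer side does: ask Consul for the instances of a service whose checks pass, then render an upstream block from the answer. This is not how consul-template is implemented (it uses Go templates and long-polling queries); it only illustrates the idea. It assumes a local agent serving the HTTP API on the default port 8500, and the names CONSUL_URL, fetch_passing and render_upstream are made up for this example.

```python
import json
import urllib.request

# Assumption: a local Consul agent serves the HTTP API on the default port.
CONSUL_URL = "http://127.0.0.1:8500"


def fetch_passing(service: str) -> list:
    """Return health entries for instances of `service` whose checks pass."""
    url = f"{CONSUL_URL}/v1/health/service/{service}?passing"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def render_upstream(name: str, entries: list) -> str:
    """Render an Nginx upstream block from Consul health entries."""
    servers = sorted(
        f"    server {e['Node']['Address']}:{e['Service']['Port']};"
        for e in entries
    )
    if not servers:
        # Nginx rejects an upstream block with no servers, so fall back
        # to a dead address that yields a 502 instead of a broken config.
        servers = ["    server 127.0.0.1:65535;  # no healthy instances"]
    return "upstream %s {\n%s\n}\n" % (name, "\n".join(servers))
```

On the load balancer you would call render_upstream("myapp", fetch_passing("web")), write the result to the Nginx config only if it changed, and then reload Nginx. consul-template handles that watch, write and reload loop for you, which is why I recommend it over a homegrown script.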

The Implementation

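First, get the agent running (Ubuntu 14.04 LTS example). The command below starts one of the three server nodes. The -config-dir flag tells the agent to load definitions from /etc/consul.d, and the data directory should survive a reboot, so keep it out of /tmp (the /var/lib/consul path here is just my choice). Backend nodes run the same command without -server and -bootstrap-expect, plus -join pointing at one of the servers.

consul agent -server -bootstrap-expect 3 -data-dir /var/lib/consul -config-dir /etc/consul.d -node=agent-one -bind=10.0.0.1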

Next, define a service definition in /etc/consul.d/web.json:

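{
  "service": {
    "name": "web",
    "tags": ["production"],
    "port": 80,
    "check": {
      "script": "curl -f http://localhost:80/ >/dev/null 2>&1",
      "interval": "10s"
    }
  }
}

If that curl command exits with an error, Consul marks the service as failing and stops returning this node for web in DNS and health-filtered queries. The -f flag matters: without it, curl exits 0 even when the server answers with an HTTP 500. Simple.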
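
Consul reads the result of a script check from its exit code: 0 is passing, 1 is a warning, and anything else is critical. If a curl one-liner gets too cramped, a small script gives you finer control. The sketch below is one way to do it in Python; the function names and the choice to treat HTTP 429 as a warning are my own assumptions, not anything Consul prescribes.

```python
import urllib.error
import urllib.request

# Exit codes that Consul script checks understand.
PASSING, WARNING, CRITICAL = 0, 1, 2


def exit_code_for(status):
    """Map an HTTP status (None if the request failed) to an exit code."""
    if status is None:
        return CRITICAL
    if 200 <= status < 400:
        return PASSING
    if status == 429:
        # Assumption: "too many requests" means degraded, not dead.
        return WARNING
    return CRITICAL


def probe(url, timeout=2.0):
    """Return the HTTP status for `url`, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # refused, timed out, DNS failure and so on


def main(argv):
    """Probe the URL given as the first argument (default: local port 80)."""
    target = argv[1] if len(argv) > 1 else "http://localhost:80/"
    return exit_code_for(probe(target))

# To use this as a check script, end the file with:
#   import sys; sys.exit(main(sys.argv))
```

Point the "script" field of the service definition at wherever you save the file. A service in the warning state stays in DNS results; only critical removes it.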

Why Infrastructure Stability Matters for Raft

Consul relies on the Raft consensus algorithm. Raft is sensitive to timing and network latency because followers expect regular heartbeats from the leader. If your underlying virtualization platform steals CPU cycles (common in oversold OpenVZ hosting) or has high network jitter, heartbeats arrive late and the cluster members will think the leader is dead. They will force a re-election.

This causes a "flapping" cluster. Your service discovery goes haywire.

This is where the choice of hosting provider becomes architectural, not just financial. At CoolVDS, we strictly use KVM virtualization. We don't oversell CPU cores. When you run a consensus protocol across our Oslo datacenter, the latency is flat. You get the stability of bare metal with the flexibility of a VPS.

Data Sovereignty in Norway

For those of us operating in Europe, keeping data within the EEA is becoming critical. While the Safe Harbor framework is currently in place, privacy advocates are already challenging it. Storing your service maps and topology data on servers physically located in Norway (covered by the Norwegian Personal Data Act / Personopplysningsloven) adds a layer of legal safety that US-centric clouds cannot guarantee.

Next Steps

Manual configuration is a bottleneck. Automation is the only way to scale.

  1. Install Consul 0.5.2 on your nodes.
  2. Set up consul-template on your load balancer.
  3. Kill a backend node and watch Nginx drop it automatically, typically within about one check interval (10 seconds with the check shown earlier).

If you need a sandbox to test this cluster, don't run it on your laptop. Network conditions there are too perfect. Deploy three small instances on CoolVDS today and see how robust service discovery behaves in a real production environment.
