Scaling Beyond the Single Box: High Availability with HAProxy on CentOS 6

Let’s be honest for a second. If you are running your business-critical application on a single server, you aren't hosting; you're gambling. I’ve seen it a dozen times: a marketing campaign goes live, traffic spikes by 400%, and that lonely Apache instance creates a bottleneck that no amount of caching can fix. The server chokes, the site goes 503, and your reputation takes a nosedive.

It is 2012. We don't have to tolerate single points of failure anymore. While hardware load balancers like F5 Big-IP are fantastic if you have the budget of a small nation, the rest of us in the Linux trenches have a weapon that is arguably more flexible and infinitely cheaper: HAProxy.

In this guide, I’m going to show you how to set up a robust Layer 4/7 load balancer using HAProxy 1.4 on CentOS 6. We will look at distributing traffic across multiple web nodes and why the underlying virtual infrastructure—specifically the stability provided by providers like CoolVDS—is just as critical as your configuration file.

The Architecture: Why Decouple?

The goal is simple: stop treating your web servers as pets and start treating them as cattle: disposable, interchangeable workers. The intelligence lives in the Load Balancer (LB).

When you place an HAProxy instance in front of two or more backend web servers, you gain:

  • Redundancy: If Web Node A dies, HAProxy's health checks detect it and route everything to Web Node B within seconds.
  • Scalability: Need more power? Spin up a new VPS on CoolVDS, add the IP to the config, and reload (see the one-liner after this list). No downtime.
  • Maintenance: You can take a server offline for kernel patching without waking up your users.
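
On CentOS 6, the init script's reload action uses HAProxy's -sf soft-restart flag: a new process takes over the listening sockets while the old one finishes its in-flight connections. Assuming you have already prepared the new backend node, adding it is as painless as:

[root@lb01 ~]# vi /etc/haproxy/haproxy.cfg    # add the new "server" line
[root@lb01 ~]# service haproxy reload
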
Pro Tip: Latency kills conversions. If your target audience is in Norway, hosting your Load Balancer in Germany or the US adds unnecessary milliseconds to the handshake. CoolVDS offers low latency endpoints directly in Oslo, peering locally via NIX (Norwegian Internet Exchange). Keep the hops short.

Step 1: Installation on CentOS 6

First, we need the software. The stock CentOS 6 repositories carry the stable 1.4 branch, so no third-party repo is required; just make sure the system is current before you install.

[root@lb01 ~]# yum update -y
[root@lb01 ~]# yum install haproxy
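
Double-check what landed and make sure the balancer comes back after a reboot (the exact 1.4.x point release you get depends on your repo snapshot):

[root@lb01 ~]# haproxy -v
[root@lb01 ~]# chkconfig haproxy on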

Once installed, don't start it yet. We need to configure it for high concurrency first. By default, Linux caps open file descriptors per process, and since every proxied connection consumes two of them (one client-side, one server-side), that cap is effectively a connection limit.
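
You can check the shell default quickly; 1024 is typical on a stock install. Don't panic if that looks low: when running as root, HAProxy raises its own limit based on maxconn at startup, and the global section also accepts an explicit ulimit-n directive if you want to pin it.

[root@lb01 ~]# ulimit -n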

Step 2: The Configuration (The Meat)

The configuration file lives at /etc/haproxy/haproxy.cfg. I recommend backing up the default one and starting fresh.

Here is a battle-tested configuration for a standard HTTP cluster. This setup uses Round Robin balancing but includes a `cookie` directive for session persistence (sticky sessions)—crucial if you are running PHP applications like Magento or WordPress where losing the session means losing the shopping cart.

global
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    # Turn on stats socket for dynamic inspection
    stats socket /var/lib/haproxy/stats

defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

frontend  main_http
    bind *:80
    default_backend             web_servers

backend web_servers
    balance     roundrobin
    cookie      SERVERID insert indirect nocache
    option      httpchk HEAD /health_check.php HTTP/1.0
    server      web01 192.168.10.2:80 check cookie s1
    server      web02 192.168.10.3:80 check cookie s2

Breaking Down the Directives

`balance roundrobin`: This rotates requests sequentially. Request 1 goes to web01, Request 2 to web02. It's fair and simple.

`option httpchk`: This is vital. HAProxy periodically sends a HEAD request to /health_check.php on each backend. Any 2xx or 3xx response keeps the server in rotation. If your database crashes and the PHP script starts returning 500s, HAProxy marks that server DOWN after three consecutive failures (the default fall count) and stops sending users there.
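
You can exercise the same check by hand from the load balancer. This assumes you have actually deployed a health_check.php on each node; what it tests internally is up to you, but the usual pattern is to return 200 only when the application and its database are reachable:

[root@lb01 ~]# curl -I http://192.168.10.2/health_check.php

Look for HTTP/1.1 200 OK on the first line. Note that curl -I issues a HEAD request, the same method HAProxy uses.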

`cookie SERVERID`: This injects a cookie into the user's browser. If a user lands on `web01`, they stay on `web01`, which prevents users from being logged out mid-session as they browse. The insert keyword adds the cookie to the response, indirect strips it from requests before they reach the backend, and nocache tells intermediate proxies not to cache the tagged response.
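
To watch persistence in action, hit the frontend on the balancer itself and inspect the response headers:

[root@lb01 ~]# curl -sI http://127.0.0.1/ | grep -i set-cookie

You should see something like SERVERID=s1. Replay the request with that cookie attached and every response will come from web01.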

Step 3: Kernel Tuning for Heavy Loads

Configuring the application is only half the battle. If your OS limits TCP connections, HAProxy will hit a wall. In a high-traffic environment, you can run out of ephemeral ports.

Edit your /etc/sysctl.conf to widen the ephemeral port range and let the kernel reuse sockets stuck in TIME_WAIT:

net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 15
net.core.somaxconn = 4096

Run sysctl -p to apply. This ensures that during a DDoS attack or a legitimate flash crowd, your load balancer doesn't collapse under the weight of zombie connections.
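
To see whether TIME_WAIT buildup is actually biting you, count the sockets before and after the change:

[root@lb01 ~]# netstat -ant | grep -c TIME_WAIT

Tens of thousands here on a busy balancer is your cue that the tuning above is earning its keep.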

The Storage Bottleneck: Why I/O Matters

You might think, "It's just a load balancer, it doesn't write to disk." Wrong. Every log line HAProxy emits ends up on disk via syslog, and if you are logging every HTTP request for analytics (which you should be), a slow hard drive causes I/O wait, which steals CPU cycles from the packet forwarding process.
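
Don't guess about I/O wait; measure it. The iostat tool (from the sysstat package) will show it live:

[root@lb01 ~]# yum install -y sysstat
[root@lb01 ~]# iostat -x 2

Keep an eye on the %iowait column and the per-device await figures. Sustained double-digit %iowait on a load balancer is a red flag.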

This is where infrastructure choice becomes a strategic decision. Legacy VPS providers often pile 50 tenants onto a single SATA RAID array. The result is "Noisy Neighbor" syndrome. If your neighbor decides to compile a kernel or run a backup, your load balancer lags.

We use CoolVDS for our setups because they utilize SSD storage arrays and KVM virtualization. KVM provides strict resource isolation—unlike OpenVZ, where memory is often oversold. When you are balancing 5,000 req/s, you need guaranteed CPU cycles and fast I/O access, or you introduce latency before the request even hits your web server.

A Warning on Compliance (Personopplysningsloven)

Operating in Norway means respecting data privacy laws. Under the Personal Data Act (Personopplysningsloven), you are responsible for where your user data flows.

If you use a US-based cloud load balancer, you are technically exporting traffic metadata out of the EEA. By deploying your own HAProxy instance on a Norwegian VPS, you maintain full data sovereignty. The IP termination happens in Oslo, the logs stay in Oslo, and you stay compliant with the Datatilsynet guidelines.

Verifying the Setup
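
Before touching the service, have HAProxy parse the configuration and report any mistakes:

[root@lb01 ~]# haproxy -c -f /etc/haproxy/haproxy.cfg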

Start the service:

[root@lb01 ~]# service haproxy start
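
One CentOS 6 gotcha: HAProxy only emits logs to the syslog socket (we pointed it at 127.0.0.1 local2 in the global section), so nothing lands on disk until rsyslog is told to listen on UDP and route that facility. A minimal drop-in, assuming the stock rsyslog 5 that ships with CentOS 6:

# /etc/rsyslog.d/haproxy.conf
$ModLoad imudp
$UDPServerAddress 127.0.0.1
$UDPServerRun 514
local2.*    /var/log/haproxy.log

Restart rsyslog to pick it up:

[root@lb01 ~]# service rsyslog restart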

Now, watch the logs in real-time while you generate some traffic:

[root@lb01 ~]# tail -f /var/log/haproxy.log

If you configured the stats socket, you can also install socat (it lives in the EPEL repository) and query the status directly from the command line without needing a web interface:

[root@lb01 ~]# echo "show info" | socat stdio /var/lib/haproxy/stats

Conclusion

Complexity is the enemy of uptime, but redundancy is its best friend. By placing HAProxy in front of your stack, you gain the ability to sleep at night knowing that a single server failure won't take down your business.

However, software is only as good as the iron it runs on. Don't let slow I/O or oversold CPU steal your performance. For production workloads, I rely on the consistent low latency and SSD performance of CoolVDS.

Ready to harden your infrastructure? Deploy a KVM instance on CoolVDS today and start building a stack that actually stays up.