
Scaling Beyond the Single Server: High Availability with HAProxy 1.4

It’s 3:00 AM. Your pager goes off. The monitoring system is screaming because your primary web server just hit MaxClients in Apache and locked up. Again. If you are running a high-traffic site in 2010, relying on a single box—no matter how much RAM you shove into it—is a ticking time bomb. Vertical scaling has a ceiling, and you just hit it.

We see this constantly at CoolVDS. A client gets featured on Digg or Slashdot, their traffic graphs go vertical, and their single LAMP stack melts. The solution isn't just "buy a bigger server." The solution is horizontal scaling with a smart load balancer.

Enter HAProxy.

Why Software Load Balancing?

Historically, if you wanted load balancing, you bought a hardware appliance like a Cisco LocalDirector or an F5 BIG-IP. They cost as much as a luxury car. But the game has changed. With the release of HAProxy 1.4 this year, we now have a stable, production-ready software solution that can handle tens of thousands of concurrent connections on standard hardware.

Unlike Apache, which uses a process-based or threaded model that eats RAM for breakfast, HAProxy uses an event-driven, single-process model. It is extremely lightweight. You can run it on a modest VPS instance, and it will saturate a gigabit link before it runs out of CPU.

The Architecture: Decoupling the Front-end

The goal is simple: Stop exposing your application servers directly to the public internet. Place a CoolVDS instance running HAProxy in front of them. This gives you:

  • Fault Tolerance: If App-Server-1 dies, HAProxy routes traffic to App-Server-2 instantly.
  • Maintenance Freedom: Take a backend server offline for kernel updates without downtime.
  • Better Security: Your backend servers can live on a private network (LAN), unreachable from the outside world.

Configuration: The Meat and Potatoes

Let’s assume you are running CentOS 5.5. First, grab the latest 1.4 packages. Don't use the default repositories; they often carry ancient versions. Compile from source if you have to; you want the http-server-close option introduced in 1.4 for proper HTTP keep-alive handling.

Here is a battle-tested haproxy.cfg snippet for a standard web cluster:

global
    log 127.0.0.1   local0
    maxconn 4096
    user haproxy
    group haproxy
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    retries 3
    option  redispatch
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

listen webfarm 0.0.0.0:80
    mode http
    stats enable
    stats uri /haproxy?stats
    balance roundrobin
    option http-server-close
    option forwardfor
    cookie SERVERID insert indirect nocache
    server web01 192.168.1.10:80 check cookie s1
    server web02 192.168.1.11:80 check cookie s2
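The `check` keyword above defaults to a plain TCP connect test, which will happily mark a server "up" even when Apache is wedged. HAProxy 1.4 can probe an HTTP URI instead, and you can designate a spare machine that only takes traffic when everything else is down. A hedged sketch of how you might extend the `listen` block (the /status URI and the web03 address are illustrative, not part of the config above):

```
    # Probe an application-level URI instead of a bare TCP connect;
    # anything other than a 2xx/3xx response marks the server down
    option httpchk GET /status HTTP/1.0
    # web03 receives traffic only when web01 and web02 are both down
    server web03 192.168.1.12:80 check cookie s3 backup
```

Your /status page should touch the database so a dead MySQL backend also takes the web server out of rotation.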

Breaking Down the Config

The balance roundrobin directive is key here. It rotates requests sequentially. However, if you are running a PHP application with local sessions (like Magento or vBulletin), you need session stickiness. That’s what cookie SERVERID insert does. It injects a cookie so the user sticks to the same backend server for their session duration.
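To make the interaction between roundrobin and cookie stickiness concrete, here is a small Python sketch of the decision logic. This is an illustration of the concept, not HAProxy's actual implementation; the class and method names are invented for this example.

```python
# Illustrative sketch (not HAProxy source): roundrobin rotation plus
# an inserted SERVERID cookie produces sticky sessions.

class Balancer:
    def __init__(self, servers):
        self.servers = list(servers)
        self.next_index = 0

    def route(self, cookies):
        """Pick a backend for one request, given the client's cookie jar."""
        # If the client presents a valid SERVERID cookie, honor it.
        chosen = cookies.get("SERVERID")
        if chosen in self.servers:
            return chosen, cookies
        # Otherwise rotate to the next backend and inject the cookie,
        # mirroring "cookie SERVERID insert".
        chosen = self.servers[self.next_index % len(self.servers)]
        self.next_index += 1
        new_cookies = dict(cookies)
        new_cookies["SERVERID"] = chosen
        return chosen, new_cookies

balancer = Balancer(["web01", "web02"])
# A new client (empty jar) lands on web01 via roundrobin...
server, jar = balancer.route({})
# ...and later requests carrying the cookie stick to web01,
# even though pure roundrobin would have rotated to web02.
server_again, _ = balancer.route(jar)
print(server, server_again)
```

The trade-off is real: stickiness skews load when a few clients are much busier than the rest, so only use it when your application stores session state locally.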

Pro Tip: Watch your TIME_WAIT sockets. On high-traffic Linux nodes, you might run out of ephemeral ports. Tune your sysctl settings: net.ipv4.tcp_tw_reuse = 1 inside /etc/sysctl.conf. It saves lives.
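A minimal sketch of that tuning, assuming a stock CentOS 5 kernel (the port range value is a common starting point, not from the original text):

```
# /etc/sysctl.conf -- reuse sockets stuck in TIME_WAIT for new
# outbound connections from the balancer
net.ipv4.tcp_tw_reuse = 1
# widen the ephemeral port range the balancer can draw from
net.ipv4.ip_local_port_range = 1024 65535
```

Apply the changes with sysctl -p, then watch `netstat -an | grep TIME_WAIT | wc -l` under load to confirm the pressure drops.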

The Hardware Reality: IOPS Matter

Software is only half the battle. Virtualization platform choice is the other half. Many hosting providers oversell their nodes using OpenVZ, where you share the kernel with everyone else. If your neighbor decides to run a massive `tar` backup, your load balancer starts lagging.

At CoolVDS, we strictly use KVM (Kernel-based Virtual Machine). This provides true hardware virtualization and isolation. Furthermore, we are aggressively rolling out Solid State Drives (SSDs) across our fleet. In 2010, spinning rust (mechanical HDDs) is becoming the primary bottleneck for database performance. While SSD storage is premium, the random I/O performance is orders of magnitude faster than SAS 15k drives. For a database backend, this difference is night and day.

The Norwegian Context: Latency and Law

If your user base is in Oslo, Bergen, or Trondheim, hosting in Germany or the US is a mistake. The speed of light is a hard limit. Peering at NIX (Norwegian Internet Exchange) ensures your latency stays in the single digits.

Additionally, we must talk about compliance. Under the Norwegian Personopplysningsloven (Personal Data Act) and the EU Data Protection Directive, you are responsible for where your customer data lives. Keeping your data on servers physically located in Norway simplifies your compliance significantly compared to navigating the complex Safe Harbor framework required for US hosting.

Final Thoughts

Load balancing isn't just for Google or Facebook anymore. With tools like HAProxy 1.4 and affordable KVM instances, you can build a resilient, redundant architecture today. Don't wait for the next crash.

Ready to harden your infrastructure? Spin up a KVM instance on CoolVDS today and test HAProxy on our high-performance network.