
Survive the Slashdot Effect: High-Availability Load Balancing with HAProxy 1.4 on Linux

Enterprise Load Balancing on a Startup Budget

It starts with a trickle. Then a link hits Digg or Slashdot, and suddenly your top command shows load averages spiking past 20.0. Your single Apache server, configured with the standard Prefork MPM, runs out of RAM, swaps to disk, and dies. If you are running a business in Norway, that downtime isn't just embarrassing; it is losing you kroner by the second.

I have seen this scenario play out in data centers from Oslo to Frankfurt. The solution isn't just throwing more RAM at a single box—that is a dead end. The solution is horizontal scaling. Today, we are going to look at the exact architecture used by the big players, adapted for your VPS environment: HAProxy.

While hardware load balancers like F5 BIG-IP cost more than a luxury car, HAProxy (High Availability Proxy) offers comparable performance for free. Version 1.4, released earlier this year, finally brings stable client-side keep-alive support, making it a viable replacement for hardware solutions.

The Architecture: Why Latency Kills Conversion

Before we touch the config files, let's talk about physics. If your customers are in Oslo, but your servers are in Texas, you are fighting the speed of light. You add 150ms of latency before the first packet is even acknowledged. Add a load balancer, a database query, and page rendering time, and your site feels sluggish.

This is why we host on CoolVDS. Their infrastructure is physically located in Norway, peering directly at NIX (Norwegian Internet Exchange). Pings from local ISPs are often under 10ms. When you are splitting traffic between multiple web nodes, that low latency network fabric is critical. You cannot balance load effectively if the internal network is congested.
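Before committing to a location, it is worth measuring this yourself from the networks your customers actually use. A quick sketch with the standard tools (substitute your own hostname for the placeholder below):

$ ping -c 10 www.example.no      # round-trip time from your connection
$ traceroute www.example.no      # hop count and where the latency accumulates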

The Setup

We will configure a simple but robust cluster:

  • Load Balancer (LB1): HAProxy + Keepalived (Master)
  • Web Node 1 (Web1): Apache/Nginx
  • Web Node 2 (Web2): Apache/Nginx

This setup assumes you are running CentOS 5.5 or Debian Lenny. We will focus on compiling HAProxy from source to ensure we get the latest 1.4 features, as most repositories are still stuck on the ancient 1.3 branch.

Step 1: Compiling HAProxy 1.4

Don't rely on yum or apt-get here unless you have verified the version. We need 1.4 for the improved polling mechanisms.

$ wget http://haproxy.1wt.eu/download/1.4/src/haproxy-1.4.8.tar.gz
$ tar xzvf haproxy-1.4.8.tar.gz
$ cd haproxy-1.4.8
$ make TARGET=linux26 CPU=native USE_PCRE=1
$ sudo make install

Note the TARGET=linux26. This optimizes the build for the Linux 2.6 kernel, enabling epoll support. If you run this on a generic VPS provider that oversells CPUs, epoll won't save you. On CoolVDS instances, which offer guaranteed CPU cycles and KVM virtualization, this allows you to handle thousands of concurrent connections with minimal context switching.
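Once the compile finishes, confirm that epoll actually made it into the binary. haproxy -vv prints the build options and the polling mechanisms that were compiled in (the path below assumes the default /usr/local prefix used by make install):

$ /usr/local/sbin/haproxy -vv
# Look for the "Available polling systems" section of the output; epoll
# should be listed and reported as usable on a 2.6 kernel.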

Step 2: The Configuration (haproxy.cfg)

Create /etc/haproxy/haproxy.cfg. We will use the leastconn algorithm, which directs new traffic to the server with the fewest active connections—perfect for long-running web sessions.

global
    log 127.0.0.1   local0
    maxconn 40960
    user haproxy
    group haproxy
    daemon
    # Spread checks to avoid spiking load on backends
    spread-checks 5

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    retries 3
    option  redispatch
    maxconn 2000
    # contimeout/clitimeout/srvtimeout are deprecated keywords in 1.4;
    # the timeout forms below are the preferred syntax
    timeout connect 5000
    timeout client  50000
    timeout server  50000

listen webfarm 0.0.0.0:80
    mode http
    stats enable
    stats uri /haproxy?stats
    stats realm Haproxy\ Statistics
    stats auth admin:SecurePass2010
    balance leastconn
    # httpclose forces one request per connection; on 1.4 you can swap in
    # "option http-server-close" to keep client-side keep-alive instead
    option httpclose
    option forwardfor
    cookie JSESSIONID prefix
    
    server web1 192.168.1.10:80 cookie A check inter 2000 rise 2 fall 5
    server web2 192.168.1.11:80 cookie B check inter 2000 rise 2 fall 5
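One gotcha with a source install: the haproxy user referenced in the global section does not exist yet, and HAProxy will complain at startup when it tries to drop privileges. Create it, validate the configuration, then start the daemon; a sketch assuming the /usr/local prefix from Step 1:

$ sudo groupadd haproxy
$ sudo useradd -r -g haproxy haproxy
$ /usr/local/sbin/haproxy -c -f /etc/haproxy/haproxy.cfg    # syntax check only
$ sudo /usr/local/sbin/haproxy -f /etc/haproxy/haproxy.cfg  # "daemon" in the config backgrounds it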

Dissecting the Config

The option forwardfor line is vital. Without it, your backend Apache logs will show the load balancer's IP for every request, making your analytics useless. This option adds an X-Forwarded-For header to each request, carrying the real client IP.
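Apache will not log that header by itself, though; you have to tell it to. A minimal sketch for an Apache 2.2 LogFormat (file paths vary by distro: /etc/httpd/conf/httpd.conf on CentOS, /etc/apache2/ on Debian):

# Log the X-Forwarded-For header instead of the connecting peer's IP
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" proxied
CustomLog /var/log/httpd/access_log proxied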

We also enabled the stats page. This is your cockpit. It shows you exactly how many sessions are active and if a server has been marked as "DOWN" by the health checks.
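The same data can be scraped from the command line, which is handy for scripts and monitoring checks: appending ;csv to the stats URI returns it in CSV form. A quick check from the balancer itself, using the credentials defined above:

$ curl -su admin:SecurePass2010 'http://localhost/haproxy?stats;csv'
# Each backend server appears as a row; the status column shows UP or DOWN.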

Step 3: High Availability with Keepalived

A load balancer is a single point of failure. If the HAProxy box dies, your site goes dark. To fix this, we use VRRP (Virtual Router Redundancy Protocol) via Keepalived. This allows a "floating IP" to shift instantly from a master server to a backup server.

Pro Tip: Many budget hosting providers block Multicast traffic, which VRRP relies on. You end up with a "Split Brain" scenario where both servers think they are Master, causing IP conflicts. CoolVDS supports private VLANs and Multicast, ensuring your failover actually works when you need it.
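Once Keepalived is running (next step), you can verify that the advertisements actually reach the other node. VRRP is IP protocol 112, multicast to 224.0.0.18 roughly once a second, so tcpdump makes blocked multicast obvious:

$ sudo tcpdump -n -i eth0 'ip proto 112'
# Run this on the BACKUP node; silence while the MASTER is up usually means
# the multicast traffic is being filtered somewhere along the path.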

Install Keepalived and configure /etc/keepalived/keepalived.conf on the Master:

vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}

vrrp_instance VI_1 {
    interface eth0
    state MASTER
    virtual_router_id 51
    priority 101
    virtual_ipaddress {
        10.0.0.100
    }
    track_script {
        chk_haproxy
    }
}

On the backup server, change state MASTER to state BACKUP and priority 101 to priority 100. Now, if HAProxy crashes, the IP 10.0.0.100 automatically moves to the backup node.
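With both nodes up, test the failover by hand. Keepalived attaches the VIP as a secondary address, which plain ifconfig will not display, so use the ip tool:

$ ip addr show eth0 | grep 10.0.0.100      # should match on the MASTER only
$ sudo killall haproxy                     # simulate an HAProxy crash on the MASTER
# A few seconds later the same grep on the BACKUP node should show the VIP;
# restart HAProxy on the MASTER and the address fails back once chk_haproxy passes again.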

Kernel Tuning for Heavy Loads

Out of the box, the Linux networking stack is tuned for modest traffic. For a load balancer, we need to open the floodgates. Edit /etc/sysctl.conf:

# Allow reuse of sockets in TIME_WAIT state for new outbound connections
net.ipv4.tcp_tw_reuse = 1
# Caution: tcp_tw_recycle can silently drop connections from clients behind
# NAT (shared source IP, differing TCP timestamps); remove it if that applies
net.ipv4.tcp_tw_recycle = 1

# Increase the range of ephemeral ports
net.ipv4.ip_local_port_range = 1024 65000

# Protect against SYN flood attacks
net.ipv4.tcp_syncookies = 1

Run sysctl -p to apply. These settings are crucial when you are handling thousands of short-lived HTTP connections.
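To see whether TIME_WAIT sockets are actually piling up under load (and whether these settings are earning their keep), count them on the balancer; netstat ships with both CentOS 5 and Lenny:

$ netstat -ant | grep -c TIME_WAIT
$ cat /proc/sys/net/ipv4/ip_local_port_range    # confirm the wider port range took effect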

Data Integrity and Privacy

Operating in Norway means adhering to strict standards. The Datatilsynet (Data Inspectorate) takes the Personal Data Act seriously. When you architect a solution like this, ensure your logs (which contain IP addresses) are rotated frequently and stored securely.
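How the rotation looks depends on where your syslog daemon writes the local0 facility; assuming it lands in /var/log/haproxy.log, a minimal logrotate policy might look like this (a sketch, with the syslog pid path varying per distro, so adjust retention and paths to your own setup):

# /etc/logrotate.d/haproxy
/var/log/haproxy.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    postrotate
        /bin/kill -HUP `cat /var/run/syslogd.pid 2>/dev/null` 2>/dev/null || true
    endscript
}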

Using a domestic provider like CoolVDS simplifies this. Your data stays within Norwegian borders, simplifying compliance with EU directives compared to hosting with US giants where "Safe Harbor" is the only (and increasingly shaky) protection.

Performance: The Disk I/O Factor

Even with a perfect load balancer, your backend databases can become a bottleneck. While most providers are still spinning 7.2k SATA drives, high-performance setups demand better. We recommend using servers backed by 15k SAS RAID-10 arrays or the emerging enterprise SSD technology available on select CoolVDS plans. Low disk latency prevents your database threads from locking up, ensuring the load balancer doesn't have to queue requests.
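If you suspect the disks rather than the network, watch device-level latency while the site is under load. With the sysstat package installed, iostat reports the average wait time per I/O request:

$ iostat -x 5
# Watch the "await" (ms per request) and "%util" columns; a database volume
# pegged near 100% util with high await is your real bottleneck.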

Conclusion

Scale is not about magic; it is about architecture. By placing HAProxy in front of your application, you decouple the client connection from the application logic. You gain stability, the ability to perform maintenance without downtime, and resilience against traffic spikes.

But software is only half the equation. You need hardware that doesn't steal your CPU cycles and a network that respects the physics of latency.

Ready to build your cluster? Don't settle for oversold hosting. Deploy a high-performance, KVM-based instance on CoolVDS today and see the difference raw power makes.