Surviving the Slashdot Effect: Robust Load Balancing with HAProxy
It is 3:00 AM. Your pager is buzzing. Nagios just sent a critical alert: Load Average: 25.04. Your single Apache server is thrashing swap because the marketing team sent a newsletter to 50,000 users at once. If you are still trying to solve this by editing httpd.conf and increasing MaxClients, you are fighting a losing battle.
Vertical scaling—throwing more RAM and CPU at a single box—has a ceiling. The only way to survive a serious traffic spike, whether it's from a Digg frontpage feature or a busy holiday shopping season in Norway, is horizontal scaling.
Forget expensive hardware load balancers like F5 BIG-IP. Unless you have an enterprise budget, they are overkill. Today, we are deploying HAProxy (High Availability Proxy). It is free, open-source, and arguably more stable than the hardware it replaces.
Why HAProxy?
HAProxy is a strictly event-driven, non-blocking engine. In plain English: it can handle thousands of concurrent connections without eating up your memory. While Apache dedicates a process or thread to every connection (bloating memory usage), HAProxy handles all of its connections inside a single process with minimal per-connection overhead.
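At CoolVDS, we see clients try to balance traffic using DNS Round Robin. Do not do this. Resolvers and browsers cache DNS records and do not always honor TTLs, so a dead node's address can keep circulating long after you remove it from the zone. If one of two web nodes dies, roughly half your users will see a "Connection Refused" error, potentially for hours. HAProxy checks backend health and removes dead nodes automatically.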
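To make the health-check idea concrete, here is a minimal shell sketch of a TCP connect probe, which is roughly what HAProxy's default health check does for each backend. The function names `probe` and `live_nodes` are hypothetical, and the sketch assumes bash (for its /dev/tcp pseudo-device) and the coreutils `timeout` command are installed. HAProxy does this internally and far more efficiently; the script only illustrates the principle.

```shell
#!/bin/sh
# probe HOST PORT: succeed if a TCP connection opens within 2 seconds.
# Assumption: bash and coreutils' timeout are available on this machine.
probe() {
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# live_nodes HOST:PORT...: print only the nodes that accepted a connection.
live_nodes() {
    for node in "$@"; do
        if probe "${node%:*}" "${node##*:}"; then
            echo "$node"
        fi
    done
}

# Example call (addresses are illustrative):
#   live_nodes 192.168.1.10:80 192.168.1.11:80
```

A node that refuses the connection or does not answer within the timeout is simply left out of the output, which is the same decision a load balancer makes when it takes a server out of rotation.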
The Architecture
We are going to move from a single point of failure to a redundant setup:
- Load Balancer (LB01): A lightweight CoolVDS VPS running HAProxy.
- Web Nodes (WEB01, WEB02): Two standard VPS instances running Apache/PHP.
- Database: A separate MySQL node (out of scope for today, but essential).
Pro Tip: Network latency matters. Ensure your Load Balancer and Web Nodes are in the same datacenter. If your target audience is in Oslo, hosting in a US datacenter adds 100ms+ latency before the request even hits your server. Keep it local to the NIX (Norwegian Internet Exchange) for sub-10ms response times.
Configuration: The haproxy.cfg
First, install HAProxy. On CentOS 5, it is in the extras repository, or you can compile from source (1.3.18 is a stable version at the time of writing).
yum install haproxy
Here is a production-ready configuration. We are using the leastconn algorithm, which sends new traffic to the server with the fewest active connections, rather than just rotating blindly (Round Robin).
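global
    # Send logs to the local syslog daemon (UDP), facility local0
    log 127.0.0.1 local0
    maxconn 4096
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    # Let a retry go to another server if the original one is down
    option redispatch
    maxconn 2000
    # Timeouts are in milliseconds
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

listen webfarm 0.0.0.0:80
    mode http
    stats enable
    stats uri /haproxy?stats
    # Placeholder credentials: change these before deploying
    stats auth admin:SuperSecretPass
    balance leastconn
    cookie SERVERID insert indirect nocache
    option httpclose
    # Add an X-Forwarded-For header carrying the client IP
    option forwardfor
    # "check" enables health checks on each server
    server web01 192.168.1.10:80 cookie A check
    server web02 192.168.1.11:80 cookie B check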
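Before putting a config like this into service, it is worth syntax-checking it, and once it is live, confirming that the SERVERID cookie is actually being set. The sketch below shows one way to do both from the shell. The function names, the /etc/haproxy/haproxy.cfg path, the `haproxy` init script name, and the example URL are assumptions for illustration; `haproxy -c -f` is HAProxy's own check mode.

```shell
#!/bin/sh
# check_and_restart [CFG]: syntax-check the config and restart only if it parses.
# Assumptions: default config path and an init script named "haproxy".
check_and_restart() {
    cfg=${1:-/etc/haproxy/haproxy.cfg}
    if haproxy -c -f "$cfg"; then
        service haproxy restart
    else
        echo "config check failed for $cfg; not restarting" >&2
        return 1
    fi
}

# serverid_from_headers: read HTTP response headers on stdin and print the
# value of the first SERVERID cookie found, if any.
serverid_from_headers() {
    tr -d '\r' \
        | sed -n 's/^[Ss]et-[Cc]ookie: *SERVERID=\([^;]*\).*/\1/p' \
        | head -n 1
}

# Example against a running balancer (the URL is a placeholder):
#   curl -s -o /dev/null -D - http://lb01.example.com/ | serverid_from_headers
```

If the second command prints A or B, the balancer is tagging responses with the cookie values from the two server lines. An empty result means no SERVERID cookie was seen in that response.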
Breaking Down the Config
- balance leastconn: Crucial for long sessions. If Web01 gets stuck processing a heavy PHP script, HAProxy sends the next user to Web02.
- cookie SERVERID: This enables Session Stickiness. If a user logs into your osCommerce store on Web01, they must stay on Web01. HAProxy injects a cookie to track this.
- stats uri: This creates a web interface showing you exactly how much traffic is hitting each node.
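The difference between leastconn and Round Robin is easy to see in a toy model. The sketch below is not how HAProxy implements either algorithm; it is a simplified illustration with hypothetical function names. Each input line is a server name followed by its current number of active connections.

```shell
#!/bin/sh
# Input on stdin: one "server active_connections" pair per line.

# pick_leastconn: print the server with the fewest active connections.
pick_leastconn() {
    sort -k2,2n | head -n 1 | cut -d' ' -f1
}

# pick_roundrobin N: print the server for request number N (0-based),
# rotating through the list and ignoring load entirely.
pick_roundrobin() {
    awk -v n="$1" '{ name[NR - 1] = $1 } END { print name[n % NR] }'
}

# web01 is busy with 40 connections, web02 has only 3.
printf 'web01 40\nweb02 3\n' | pick_leastconn     # prints web02
printf 'web01 40\nweb02 3\n' | pick_roundrobin 0  # prints web01
```

Round Robin hands request 0 to web01 even though it is already overloaded, while leastconn steers the new request to the idle node.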
Infrastructure Matters: The Underlying Iron
Software optimization is useless if your host oversells the physical CPU. Many budget providers pack hundreds of OpenVZ containers onto a single server. If one neighbor runs a backup script, your I/O wait shoots through the roof.
This is why we use Xen virtualization exclusively at CoolVDS. It provides better isolation than container-based virtualization: RAM is reserved, not shared. When you are balancing high loads, you need consistent disk I/O.
A Note on Disk Speed
While most of the web runs on standard 7.2k RPM SATA drives, we are seeing a shift toward 15k RPM SAS drives and the emerging Enterprise SSD technology for database nodes. If your MySQL queries are locking up, no amount of load balancing will save you. You need faster disks. In our Oslo facility, we prioritize low-latency I/O configurations specifically to prevent the database from becoming the bottleneck.
Legal Compliance (Norwegian Context)
Operating in Norway requires adherence to the Personopplysningsloven (Personal Data Act). When you introduce a load balancer, you are introducing a new point where IP addresses (personal data) are processed. Ensure your HAProxy logs are rotated frequently and that you are not storing access logs longer than necessary, per Datatilsynet guidelines.
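As one possible way to enforce a retention limit, the sketch below deletes rotated HAProxy log files older than a given number of days. Everything specific here is an assumption: the function name is hypothetical, the haproxy.log.* naming depends on how your syslog and rotation are set up, and the 14-day figure in the example is purely illustrative, not a legal recommendation. It relies on `find` supporting `-maxdepth`, as GNU find does.

```shell
#!/bin/sh
# purge_old_logs DIR DAYS: delete rotated HAProxy logs in DIR that were
# last modified more than DAYS days ago. The active haproxy.log is not
# matched by the pattern and is left alone.
purge_old_logs() {
    find "$1" -maxdepth 1 -type f -name 'haproxy.log.*' \
        -mtime +"$2" -exec rm -f {} +
}

# Example call, e.g. from a daily cron job (path and age are illustrative):
#   purge_old_logs /var/log 14
```

Whatever retention period you settle on, keeping it in one scripted place makes it easier to show what is actually stored and for how long.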
Final Thoughts
Building a cluster used to require a Cisco certification. Now, with HAProxy and reliable VPS hosting, you can build a setup that handles millions of hits per day for the price of a dinner at Aker Brygge.
Do not wait for the crash. Deploy a test load balancer today. If you need a sandbox, spin up a CoolVDS instance; our Xen nodes boot in under 60 seconds.