I still remember the first time a client's site made it to the front page of Digg. It should have been a celebration. Instead, it was a funeral. Their single Apache server, configured with the default httpd.conf, hit its MaxClients limit in four minutes. The server didn't just slow down; it locked up so hard we had to issue a hard reboot via the IPMI console.
If you are running a serious business in 2009, relying on a single box to handle both your database and your web serving is negligence. Hardware fails. Traffic spikes happen. If you want to sleep through the night, you need architecture, not just a bigger server.
Enter HAProxy. While an F5 BIG-IP hardware load balancer costs as much as a new car, HAProxy offers enterprise-grade load balancing for free—if you know how to configure it.
The Architecture of Availability
The goal is simple: decouple the request entry point from the application logic. By placing a lightweight HAProxy instance in front of two (or more) web servers, you gain three things immediately:
- Fault Tolerance: If Web-01 dies, HAProxy routes traffic to Web-02 instantly.
- Scalability: You can add Web-03 without changing DNS records.
- Performance: HAProxy manages connections more efficiently than Apache Prefork.
For this setup, we assume you are running CentOS 5.3 or Debian Lenny. You will need a dedicated VDS for the load balancer. Since HAProxy is CPU-light but network-heavy, this is where the underlying infrastructure matters. A CoolVDS instance with a clean 100Mbps uplink to NIX (Norwegian Internet Exchange) is ideal here because we need low latency packet forwarding, not raw number-crunching power.
Configuring HAProxy 1.3
First, install HAProxy. It's in the standard repositories, but I recommend compiling version 1.3.15+ from source to get the latest stability patches.
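A minimal build looks roughly like this — the exact tarball name is an assumption (check haproxy.1wt.eu for the current 1.3 stable and substitute the version):

```shell
# fetch and unpack the 1.3 source (adjust the version as needed)
wget http://haproxy.1wt.eu/download/1.3/src/haproxy-1.3.15.10.tar.gz
tar xzf haproxy-1.3.15.10.tar.gz
cd haproxy-1.3.15.10

# TARGET=linux26 enables epoll on 2.6 kernels; without it HAProxy
# falls back to poll/select and scales far worse under load
make TARGET=linux26 CPU=generic
make install   # installs to /usr/local/sbin/haproxy

# confirm the version you just built
haproxy -v
```

If you would rather stay on packages, pin the repository version and verify it is at least 1.3.15 before trusting it in production.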
Here is a battle-tested /etc/haproxy/haproxy.cfg that handles session persistence (crucial for PHP applications) and health checking:
global
    log 127.0.0.1 local0
    maxconn 4096
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

listen web-farm 0.0.0.0:80
    mode http
    stats enable
    stats uri /haproxy?stats
    balance roundrobin
    cookie SERVERID insert indirect nocache
    option httpclose
    option forwardfor
    server web01 192.168.1.10:80 cookie A check
    server web02 192.168.1.11:80 cookie B check
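One gotcha with the log 127.0.0.1 local0 line: a daemonized HAProxy can only log via syslog over UDP, and stock syslog daemons ignore UDP by default. A sketch for sysklogd on CentOS 5 (Lenny ships rsyslog, which uses different directives; the file paths below are the stock CentOS ones):

```shell
# /etc/sysconfig/syslog (CentOS 5): add -r so syslogd accepts
# UDP log messages on port 514
SYSLOGD_OPTIONS="-m 0 -r"

# /etc/syslog.conf: route the local0 facility to its own file
#   local0.*    /var/log/haproxy.log

# apply the change and watch requests stream in
service syslog restart
tail -f /var/log/haproxy.log
```

Without this, HAProxy runs fine but you are flying blind the night something breaks.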
Breaking Down the Config
The balance roundrobin directive ensures traffic is distributed evenly. However, PHP sessions are usually stored locally in /var/lib/php/session. If a user hits Web-01 for login and Web-02 for the dashboard, they will be logged out.
The cookie SERVERID insert line fixes this. HAProxy injects a cookie into the browser, ensuring the user sticks to the same backend server for their session duration. No complex memcached session sharing required (though you should look into that for the future).
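When you do outgrow sticky sessions, the memcached route looks roughly like this — a php.ini fragment assuming the PECL memcache extension is installed on both web servers, with a hypothetical memcached box at 192.168.1.20:

```ini
; /etc/php.ini on web01 and web02: store sessions in a shared
; memcached instance instead of the local filesystem
session.save_handler = memcache
session.save_path = "tcp://192.168.1.20:11211"
```

With sessions centralized, you can drop the cookie directives entirely and let roundrobin distribute every request.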
Pro Tip: Handle the File Descriptors
HAProxy will crash if it runs out of file descriptors. The default Linux per-process limit is 1024. In /etc/sysctl.conf, make sure fs.file-max leaves plenty of headroom, and run ulimit -n 65535 before starting the daemon. We configure this by default on CoolVDS templates because debugging "Too many open files" at 3 AM is miserable.
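A quick sketch of the checks involved — the 65535 figure matches the maxconn math above, and the limits.conf lines assume HAProxy runs as the haproxy user:

```shell
# kernel-wide ceiling on open files; raise fs.file-max in
# /etc/sysctl.conf if this is anywhere near your target
cat /proc/sys/fs/file-max

# per-process soft limit for the current shell (default: 1024)
ulimit -n

# raise it for this shell before launching the daemon
ulimit -n 65535

# to make it permanent, add to /etc/security/limits.conf:
#   haproxy  soft  nofile  65535
#   haproxy  hard  nofile  65535
```

Remember that each proxied request burns two descriptors (client side and server side), so size the limit at well over twice your global maxconn.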
The Hardware Reality Check
Virtualization has come a long way, but noisy neighbors are still the enemy of load balancers. If you are on a cheap, oversold VPS where the host node is thrashing its disks, your load balancer will introduce latency. This defeats the purpose.
You need consistent I/O. While we are starting to see early SSD adoption in the enterprise, a robust RAID-10 setup with 15k RPM SAS drives is still the reliability king for 2009. At CoolVDS, we prioritize disk I/O scheduling so that your syslog writes don't block your network packets. If your load balancer lags, your whole infrastructure lags.
Compliance and Geography
For our Norwegian clients, physical location is not just about ping times—it's about the law. Under the Personopplysningsloven (Personal Data Act of 2000), you have strict obligations regarding where customer data is processed.
Hosting your front-end load balancer in the US (to save a few kroner) while your database is in Oslo puts you in a gray area regarding data transfer mechanisms like Safe Harbor. It is cleaner, faster, and legally safer to keep the entire stack within the EEA, preferably in Norway where the Datatilsynet has jurisdiction. Latency from Oslo to a datacenter in Germany might be 30ms, but latency to a CoolVDS node in Oslo is <2ms. For a high-transaction e-commerce store, that speed difference directly impacts conversion rates.
Final Thoughts
Redundancy is insurance. You hope you never need the second web server to take the full load, but when the primary drive fails or the kernel panics, you will be glad it's there.
Don't wait for the crash. Spin up a secondary web node and a load balancer today. If you need a sandbox to test your HAProxy config, a CoolVDS instance can be provisioned in minutes, giving you a clean, dedicated environment to break things before you go live.