Surviving the Digg Effect: High-Availability Load Balancing with HAProxy
It starts with a slow page load. Then the connection timeouts begin. Finally, your SSH session lags, and you realize your single Apache server has hit MaxClients. It's swapping to disk. It's dying. And it's happening right in the middle of your biggest marketing push.
I've been there. Last month, a client's e-commerce site got featured on a major Norwegian news portal. Their single server architecture melted in minutes. We fixed it, but it was a painful lesson.
You don't need a $20,000 F5 BIG-IP hardware appliance to fix this. You need HAProxy. It's the open-source load balancer handling traffic for heavyweights like Reddit, and today I'm going to show you how to set it up on a CoolVDS Xen instance.
The Architecture: Decoupling for Stability
Stop serving static assets, PHP processing, and database queries from one box. That is a single point of failure. The goal is horizontal scalability.
We will place a lightweight HAProxy load balancer in front of two (or more) web servers (Web01 and Web02). If Web01 dies, HAProxy's health checks notice within a few seconds and route new traffic to Web02. Most of your users never see the failure.
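How fast is the failover in practice? Here is a back-of-the-envelope sketch, assuming HAProxy's documented defaults of a 2000 ms check interval (`inter`) and 3 consecutive failed checks (`fall`) before a server is marked down. The function name is made up for illustration, and the real figure also depends on how long each failed check takes to time out.

```shell
#!/bin/sh
# Rough time (ms) before a dead backend is pulled from rotation:
# check interval multiplied by the number of failed checks required.
# detection_window_ms is a hypothetical helper, not part of HAProxy.
detection_window_ms() {
    inter_ms=$1   # "inter" on the server line (default 2000)
    fall=$2       # "fall" on the server line (default 3)
    echo $(( inter_ms * fall ))
}

detection_window_ms 2000 3   # defaults: prints 6000
detection_window_ms 1000 2   # "check inter 1000 fall 2": prints 2000
```

Tightening `inter` and `fall` on the server lines shortens the window, at the cost of more check traffic and a higher chance of flapping on a briefly slow backend.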
Pro Tip: Latency kills conversion. Placing your load balancer in Germany while your target market is in Norway adds unnecessary milliseconds. CoolVDS instances are peered directly at NIX (Norwegian Internet Exchange) in Oslo. Keep the hops short.
Step 1: Installation on CentOS 5
We assume you have a clean CoolVDS instance running CentOS 5.2 or 5.3. Unlike Apache's prefork model, HAProxy is event-driven. It doesn't fork a process for every connection, so it can handle thousands of concurrent connections with a tiny memory footprint.
First, grab the latest stable release from the 1.3 branch (currently 1.3.18) and compile it, or install from EPEL if you prefer RPMs. Compiling gives you more control; with the EPEL repository enabled, the RPM route is a one-liner:
yum install haproxy
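Whichever route you take, it is worth validating the config before the daemon ever sees traffic. A minimal sketch follows, assuming the RPM ships an init script named haproxy and that the config lives at /etc/haproxy/haproxy.cfg. The `run` and `deploy_haproxy` helpers are invented here so you can do a dry run first.

```shell
#!/bin/sh
# Assumed config path; override with CFG=/some/other/path if yours differs.
CFG="${CFG:-/etc/haproxy/haproxy.cfg}"

# run: hypothetical helper. Prints the command when DRY_RUN=1, else executes it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# deploy_haproxy: hypothetical wrapper around three real commands.
deploy_haproxy() {
    # -c parses the config and exits without serving traffic
    run haproxy -c -f "$CFG" || return 1
    # start at boot (assumes the package installed an init script)
    run chkconfig haproxy on
    run service haproxy restart
}

# Usage:
#   DRY_RUN=1 deploy_haproxy   # show what would run
#   deploy_haproxy             # run it for real, as root
```

If the syntax check fails, the function stops before touching the running service, which is the point of doing it in this order.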
Step 2: The Configuration (haproxy.cfg)
Edit /etc/haproxy/haproxy.cfg. This is where the magic happens. We need to define the listener and the backend farm.
global
    log 127.0.0.1 local0
    maxconn 4096
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    # "option redispatch" replaces the deprecated bare "redispatch" keyword
    option redispatch
    maxconn 2000
    # timeouts are in milliseconds
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

listen webfarm 0.0.0.0:80
    mode http
    stats enable
    # placeholder credentials: change these before going live
    stats auth admin:password
    balance roundrobin
    cookie SERVERID insert indirect nocache
    option httpchk HEAD /check.txt HTTP/1.0
    server web01 192.168.1.10:80 cookie A check
    server web02 192.168.1.11:80 cookie B check
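One gotcha with the `log 127.0.0.1 local0` line: HAProxy sends its logs to syslog over UDP, and the stock syslogd on CentOS 5 does not listen on the network unless it is started with `-r`. The sketch below appends a rule for the local0 facility. The log file name /var/log/haproxy.log and the function name are my own choices, not anything HAProxy requires.

```shell
#!/bin/sh
# add_haproxy_log_rule: hypothetical helper that appends a local0 rule to a
# syslog config file unless one is already present (safe to run twice).
add_haproxy_log_rule() {
    conf=$1                              # e.g. /etc/syslog.conf
    logfile=${2:-/var/log/haproxy.log}   # assumed destination
    grep -q '^local0\.' "$conf" 2>/dev/null && return 0
    printf 'local0.*\t%s\n' "$logfile" >> "$conf"
}

# Usage (as root):
#   add_haproxy_log_rule /etc/syslog.conf
#   Then add -r to SYSLOGD_OPTIONS in /etc/sysconfig/syslog, for example
#   SYSLOGD_OPTIONS="-m 0 -r", and run: service syslog restart
```

Without this step HAProxy runs fine but logs nothing, which is easy to miss until the day you need the logs.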
Breaking Down the Config
- balance roundrobin: Hands each new request to the next server in turn. Simple and effective for stateless apps.
- cookie SERVERID: This is critical for PHP applications, which by default keep session data in files on the local server. HAProxy injects a cookie so a user stays on the same server for the whole session. Without this, their shopping cart empties the moment they get bounced to a different server.
- option httpchk: HAProxy doesn't just guess if the server is up; it checks. Create a simple check.txt file in your web root. If HAProxy can't fetch it, it pulls the server out of rotation.
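The health check can be set up and tested by hand along these lines. The web root /var/www/html is the stock Apache location on CentOS but is an assumption about your setup, and both function names are invented for this example. The probe sends a HEAD request much like the one HAProxy issues.

```shell
#!/bin/sh
# create_check_file: hypothetical helper; drops the file HAProxy will request.
create_check_file() {
    webroot=${1:-/var/www/html}   # assumed DocumentRoot
    echo "OK" > "$webroot/check.txt"
}

# probe_backend: sends a HEAD request for /check.txt and prints the HTTP
# status code. HAProxy treats 2xx and 3xx responses as healthy.
probe_backend() {
    curl -s -o /dev/null -I --http1.0 -w '%{http_code}\n' "http://$1/check.txt"
}

# Usage:
#   create_check_file /var/www/html   # on web01 and web02
#   probe_backend 192.168.1.10:80     # from the load balancer
```

A handy side effect: renaming or deleting check.txt on one node takes it out of rotation for maintenance without touching the HAProxy config.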
Why Underlying Hardware Matters
Software load balancing is efficient, but it still depends on I/O. With option httplog enabled, every request produces a log line, and at high traffic that adds up to a lot of disk writes.
Many VPS providers oversell their storage I/O, putting you on crowded SATA drives with "noisy neighbors." When their backup script runs, your load balancer chokes. At CoolVDS, we utilize enterprise-grade 15k RPM SAS RAID-10 arrays. We don't oversell IOPS. When you are routing 500 requests per second, that disk speed keeps the request queue empty.
Data Sovereignty and Datatilsynet
Operating in Norway isn't just about latency; it's about legality. Under the Personal Data Act (Personopplysningsloven), you have strict obligations regarding where your customer data lives.
Hosting your infrastructure on US-based servers (under the Patriot Act) creates a compliance headache. By keeping your load balancer and web nodes within our Oslo datacenter, you simplify compliance with Datatilsynet regulations. Your data stays on Norwegian soil.
The Verdict
Scaling isn't about buying a bigger server. It's about smart architecture. With HAProxy and a cluster of CoolVDS instances, you can handle traffic spikes that would crush a dedicated server costing five times as much.
Don't wait for the next downtime to upgrade your infrastructure. Spin up a test environment today.
Need a stable rock for your load balancer? Deploy a CoolVDS instance in Oslo. Provisioning takes just 60 seconds.