
Survive the Digg Effect: High-Availability Load Balancing with HAProxy 1.3

I still remember the first time a client's site made it to the front page of Digg. It should have been a celebration. Instead, it was a funeral. Their single Apache server, configured with the default httpd.conf, hit its MaxClients limit in four minutes. The server didn't just slow down; it locked up so hard we had to issue a hard reboot via the IPMI console.

If you are running a serious business in 2009, relying on a single box to handle both your database and your web serving is negligence. Hardware fails. Traffic spikes happen. If you want to sleep through the night, you need architecture, not just a bigger server.

Enter HAProxy. While F5 Big-IP hardware load balancers cost as much as a new car, HAProxy offers enterprise-grade load balancing for free—if you know how to configure it.

The Architecture of Availability

The goal is simple: decouple the request entry point from the application logic. By placing a lightweight HAProxy instance in front of two (or more) web servers, you gain three things immediately:

  1. Fault Tolerance: If Web-01 dies, HAProxy routes traffic to Web-02 instantly.
  2. Scalability: You can add Web-03 without changing DNS records.
  3. Performance: HAProxy manages connections more efficiently than Apache Prefork.
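To make points 1 and 2 concrete, here is a minimal Python sketch (plain Python, not HAProxy code) of round-robin scheduling with health checks: a backend marked down is skipped, which is exactly the behaviour the `check` keyword buys you. The server names are illustrative.

```python
class Backend:
    def __init__(self, name):
        self.name = name
        self.up = True          # flipped to False when a health check fails

class RoundRobinBalancer:
    def __init__(self, backends):
        self.backends = backends
        self.index = 0

    def pick(self):
        # Walk the ring once; return the first healthy backend.
        for _ in range(len(self.backends)):
            backend = self.backends[self.index]
            self.index = (self.index + 1) % len(self.backends)
            if backend.up:
                return backend
        raise RuntimeError("no backends available")

web01, web02 = Backend("web01"), Backend("web02")
lb = RoundRobinBalancer([web01, web02])

print([lb.pick().name for _ in range(4)])   # ['web01', 'web02', 'web01', 'web02']
web01.up = False                            # Web-01 just died
print([lb.pick().name for _ in range(3)])   # ['web02', 'web02', 'web02']
```

Note that adding a third backend is one line in the pool, which is the whole point: capacity changes never touch DNS.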

For this setup, we assume you are running CentOS 5.3 or Debian Lenny. You will need a dedicated VDS for the load balancer. Since HAProxy is CPU-light but network-heavy, this is where the underlying infrastructure matters. A CoolVDS instance with a clean 100Mbps uplink to NIX (Norwegian Internet Exchange) is ideal here because we need low latency packet forwarding, not raw number-crunching power.

Configuring HAProxy 1.3

First, install HAProxy. It's in the standard repositories, but I recommend compiling version 1.3.15+ from source to get the latest stability patches.

Here is a battle-tested /etc/haproxy/haproxy.cfg that handles session persistence (crucial for PHP applications) and health checking:

global
    log 127.0.0.1 local0
    maxconn 4096
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

listen web-farm 0.0.0.0:80
    mode http
    stats enable
    stats uri /haproxy?stats
    balance roundrobin
    cookie SERVERID insert indirect nocache
    option httpclose
    option forwardfor
    server web01 192.168.1.10:80 cookie A check
    server web02 192.168.1.11:80 cookie B check

Breaking Down the Config

The balance roundrobin directive ensures traffic is distributed evenly. However, PHP sessions are usually stored locally in /var/lib/php/session. If a user hits Web-01 for login and Web-02 for the dashboard, they will be logged out.

The cookie SERVERID insert line fixes this. HAProxy injects a cookie into the browser, ensuring the user sticks to the same backend server for their session duration. No complex memcached session sharing required (though you should look into that for the future).
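If you want to see the mechanism in isolation, here is a toy model in Python (not HAProxy internals) of what `cookie SERVERID insert indirect` does: a request without the cookie is balanced round-robin and gets SERVERID set in the response; a request carrying it always lands on the same backend. The cookie values mirror the `cookie A` / `cookie B` lines in the config above.

```python
BACKENDS = {"A": "web01", "B": "web02"}

def route(cookies, rr_state):
    """Return (backend, cookie_to_set). rr_state is a one-item list used
    as a mutable round-robin counter."""
    sid = cookies.get("SERVERID")
    if sid in BACKENDS:
        return BACKENDS[sid], None              # sticky: honour the cookie
    ids = sorted(BACKENDS)
    sid = ids[rr_state[0] % len(ids)]           # no cookie: round-robin
    rr_state[0] += 1
    return BACKENDS[sid], sid                   # and inject SERVERID

state = [0]
server, cookie = route({}, state)               # login request, no cookie yet
print(server, "-> Set-Cookie: SERVERID=%s" % cookie)
server, _ = route({"SERVERID": cookie}, state)  # dashboard request, cookie sent
print(server)                                   # same backend, session intact
```

The `indirect` keyword in the real config means the cookie is consumed by HAProxy and never forwarded to the backend, so your PHP application never sees it.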

Pro Tip: Handle the File Descriptors
HAProxy will crash if it runs out of file descriptors, and the default per-process limit on Linux is 1024. Raise the system-wide ceiling via fs.file-max in /etc/sysctl.conf, and run ulimit -n 65535 in the init script before starting the daemon. We configure this by default on CoolVDS templates because debugging "Too many open files" at 3 AM is miserable.
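To check what limit a daemon on your box would actually inherit, here is a quick look using Python's standard resource module. The headroom arithmetic is an assumption on my part (sized against the maxconn 4096 in the config above), not an HAProxy-published formula.

```python
# Quick check of the descriptor limit a daemon started from this shell
# would inherit. On most 2009-era distros the soft limit is 1024 --
# far too low for maxconn 4096.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft, "| hard limit:", hard)

# Each proxied connection needs roughly two descriptors (client side +
# backend side), plus sockets for health checks, stats and logging.
needed = 4096 * 2 + 256
if soft < needed:
    print("too low -- run 'ulimit -n 65535' before starting haproxy")
```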

The Hardware Reality Check

Virtualization has come a long way, but noisy neighbors are still the enemy of load balancers. If you are on a cheap, oversold VPS where the host node is thrashing its disks, your load balancer will introduce latency. This defeats the purpose.

You need consistent I/O. While we are starting to see early SSD adoption in the enterprise, a robust RAID-10 setup with 15k RPM SAS drives is still the reliability king for 2009. At CoolVDS, we prioritize disk I/O scheduling so that your syslog writes don't block your network packets. If your load balancer lags, your whole infrastructure lags.

Compliance and Geography

For our Norwegian clients, physical location is not just about ping times—it's about the law. Under the Personopplysningsloven (Personal Data Act of 2000), you have strict obligations regarding where customer data is processed.

Hosting your front-end load balancer in the US (to save a few kroner) while your database is in Oslo puts you in a gray area regarding data transfer mechanisms like Safe Harbor. It is cleaner, faster, and legally safer to keep the entire stack within the EEA, preferably in Norway where the Datatilsynet has jurisdiction. Latency from Oslo to a datacenter in Germany might be 30ms, but latency to a CoolVDS node in Oslo is <2ms. For a high-transaction e-commerce store, that speed difference directly impacts conversion rates.

Final Thoughts

Redundancy is insurance. You hope you never need the second web server to take the full load, but when the primary drive fails or the kernel panics, you will be glad it's there.

Don't wait for the crash. Spin up a secondary web node and a load balancer today. If you need a sandbox to test your HAProxy config, a CoolVDS instance can be provisioned in minutes, giving you a clean, dedicated environment to break things before you go live.
