Scaling Past the Breaking Point: HAProxy Load Balancing on Linux
It starts with a slow page load. Then a timeout. Then, the dreaded 503 Service Unavailable. If you are running a single Apache instance for a growing e-commerce site or a media portal here in Norway, you are sitting on a ticking time bomb. I’ve seen it happen too many times: a marketing email goes out, traffic spikes, and the server melts because MaxClients was reached.
Hardware upgrades only delay the inevitable. You don't need a bigger server; you need more servers working in unison. In January 2011, the smartest way to handle this isn't expensive hardware load balancers like F5 Big-IP—it's open-source, lightweight, and battle-tested: HAProxy.
The Architecture: Why HAProxy?
HAProxy (High Availability Proxy) creates a funnel. Instead of traffic hitting your web server directly, it hits the HAProxy instance first. HAProxy then distributes that traffic across a backend pool of web servers. If one server dies, HAProxy detects it and stops sending traffic there. Your users never know the difference.
For a Norwegian business targeting local customers, architecture matters. Placing your load balancer in a datacenter with direct peering to NIX (Norwegian Internet Exchange) in Oslo ensures that the extra hop adds negligible latency. This is why we architect CoolVDS on top of premium low-latency infrastructure; milliseconds cost money.
The Setup
Let's assume you are running CentOS 5.5 or Debian Lenny. We will use HAProxy 1.4, which brought us significant performance improvements and better health checks.
Scenario:
- Load Balancer (LB01): Public IP (e.g., 85.x.x.x)
- Web Server A (WEB01): Internal IP 10.0.0.10
- Web Server B (WEB02): Internal IP 10.0.0.11
Installation and Configuration
On the load balancer node, install the package. If it's not in your base repo, you might need the EPEL repository for CentOS.
yum install haproxy
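On Debian Lenny the equivalent is apt-get install haproxy, though the stock package there may lag behind the 1.4 branch, so verify what you actually got. A quick sketch, assuming the standard distro packages and init scripts:

# Debian Lenny alternative (the packaged version may still be 1.3)
apt-get install haproxy

# Either way: confirm the 1.4 branch and make it survive a reboot
haproxy -v
chkconfig haproxy on
# On Debian, you may need ENABLED=1 in /etc/default/haproxy instead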
Now, we need to edit /etc/haproxy/haproxy.cfg. Most default configs are garbage for high traffic. Wipe it and start with this production-ready baseline:
global
    log 127.0.0.1 local0
    maxconn 4096
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

listen webfarm 0.0.0.0:80
    mode http
    stats enable
    stats uri /haproxy?stats
    stats realm Haproxy\ Statistics
    stats auth admin:securepass123
    balance roundrobin
    option httpclose
    option forwardfor
    cookie JSESSIONID prefix
    server web01 10.0.0.10:80 cookie A check
    server web02 10.0.0.11:80 cookie B check
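Before you point DNS at the load balancer, let HAProxy parse the config and bring the service up. A quick sanity-check sketch (the stats credentials match the config above):

# Syntax check, then start
haproxy -c -f /etc/haproxy/haproxy.cfg
service haproxy start    # or /etc/init.d/haproxy start on Debian

# Confirm the stats page answers
curl -u admin:securepass123 "http://85.x.x.x/haproxy?stats"

One gotcha: the log 127.0.0.1 local0 line only produces output if your syslog daemon accepts UDP on localhost. On CentOS 5 that usually means adding -r to SYSLOGD_OPTIONS in /etc/sysconfig/syslog and routing local0.* to its own file.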
Breaking Down the Config
- balance roundrobin: Distributes requests sequentially across the backends. Use balance source if you need the same client IP to always hit the same server without cookies, but for web apps, sticking via cookies is usually better.
- option forwardfor: This is critical. Since the web servers only ever see the load balancer's IP, this adds the X-Forwarded-For header so your Apache logs can show the real visitor IP (see the log format sketch below).
- check: HAProxy polls port 80 on each backend. If a server stops answering, it is pulled from the rotation after a few consecutive failed checks (three by default) and put back once it recovers.
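The X-Forwarded-For header does nothing until Apache actually logs it. A minimal sketch for httpd.conf on the backend nodes; the format name combined_xff is just an illustrative label:

LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined_xff
CustomLog logs/access_log combined_xff

If your application also needs the real client IP (not just the logs), mod_rpaf can rewrite REMOTE_ADDR from that header.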
Persistence and Sessions
One of the biggest pain points in moving to a cluster is session handling. If a user logs in on WEB01 and their next click lands on WEB02, they may get logged out, because by default sessions live in /tmp on each server's local disk.
The configuration above uses cookie JSESSIONID prefix. HAProxy injects a cookie telling the browser which server handled the request. If WEB01 processed the login, HAProxy ensures WEB01 gets the next request. This is the "sticky session" method. Ideally, you should move sessions to a shared memcached backend or a database, but sticky sessions are a fast fix for legacy PHP applications.
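When you are ready to retire sticky sessions, pointing PHP at a shared memcached node is a small change. A sketch for php.ini, assuming the PECL memcache extension and a hypothetical session host at 10.0.0.20:

; Store sessions in shared memcached instead of local /tmp
session.save_handler = memcache
session.save_path    = "tcp://10.0.0.20:11211"

With that in place, any backend can serve any request, and you can drop the cookie directives from the HAProxy config.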
The Hardware Reality: IOPS and Latency
Software optimization is only half the battle. Virtualization overhead can kill load balancer performance if the underlying host is oversold. In 2011, many providers are still cramming OpenVZ containers onto SATA drives.
At CoolVDS, we take a different approach. We use KVM (Kernel-based Virtual Machine) for true hardware virtualization. This prevents the "noisy neighbor" effect where another customer's runaway script steals your CPU cycles. Furthermore, for database backends, we are aggressively rolling out enterprise-grade SSD storage. While standard SAS drives struggle at 150 IOPS, our SSD arrays are pushing thousands. If you are doing serious database work behind your load balancer, spinning rust is your bottleneck.
Pro Tip: To survive a SYN flood (a classic DDoS pattern against web frontends), you also need to tune sysctl.conf. Enable net.ipv4.tcp_syncookies = 1 and increase your SYN backlog. HAProxy can handle the connection management, but the kernel needs to allow it.
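A starting point for /etc/sysctl.conf on the load balancer; treat the numbers as assumptions to test against your own traffic, not gospel:

# /etc/sysctl.conf -- connection handling under load
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 8192
net.core.somaxconn = 8192
net.ipv4.ip_local_port_range = 1024 65535

Apply with sysctl -p, and watch dmesg for the kernel's SYN-flood warnings during an attack.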
Data Sovereignty and Datatilsynet
Operating in Norway means adhering to the Personal Data Act (Personopplysningsloven), which Datatilsynet enforces. Unlike hosting in the US under the Patriot Act, hosting locally in Oslo keeps you aligned with the EU Data Protection Directive as Norway applies it through the EEA. When you balance traffic, ensure all backend nodes are within the same legal jurisdiction to avoid cross-border data transfer headaches.
Whether you are running a high-traffic forum or a corporate intranet, reliability is not optional. Configure HAProxy, tune your kernel, and choose a host that respects the physics of latency.
Ready to stabilize your infrastructure? Deploy a KVM instance on CoolVDS today and get direct root access in under 2 minutes.