The 3:00 AM Wake-Up Call
If you have been in the trenches of system administration long enough, you know the sound of a pager going off when a marketing campaign succeeds a little too well. Last week, a client of mine running a Magento storefront got featured on a major Norwegian news portal. Traffic spiked from 50 concurrent users to 3,000 in under two minutes. Their single Apache server didn't just stall; it fell over and died from MaxClients exhaustion and RAM swapping.
The solution wasn't to throw more RAM at a single box. The solution was horizontal scaling with a robust load balancer. While Nginx is gaining traction as a reverse proxy, for pure, unadulterated packet shifting and complex Layer 7 logic, HAProxy 1.4 remains the weapon of choice for serious infrastructure. It is the gold standard for a reason.
Why HAProxy?
HAProxy (High Availability Proxy) is a single-threaded, event-driven, non-blocking engine. It doesn't fork processes like Apache does, meaning it can handle tens of thousands of concurrent connections with a negligible memory footprint. In a market where we are fighting for every millisecond of latency—especially here in Norway where routing via NIX (Norwegian Internet Exchange) is critical for local speed—efficiency is paramount.
But software is only half the battle. You can have the most optimized haproxy.cfg in the world, but if your VPS provider is putting you on oversubscribed storage, your logs will block I/O, and your latency will spike. This is why for this setup, I am using a CoolVDS instance. They use strict KVM virtualization (no OpenVZ noisy neighbors) and their disk I/O throughput is actually sufficient to handle verbose logging under load without choking the CPU.
Prerequisites & Installation
We are assuming a standard CentOS 6.2 environment. While you can compile from source (and often should for the absolute latest 1.4.x patches), for the sake of maintainability, we will grab it from the EPEL repository.
First, ensure your system is up to date and grab the tools:
yum install epel-release -y
yum install haproxy -y
chkconfig haproxy on

Before we touch the config, we need to prep the kernel. Linux default settings are tuned for desktop usage, not for a high-traffic load balancer. We need to allow the process to bind to non-local IP addresses (essential for failover setups) and handle more open files.
Pro Tip: Never deploy a load balancer without tuning sysctl.conf. You will hit the `nf_conntrack` limit before you run out of CPU.

Edit /etc/sysctl.conf:
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
# Increase system-wide file descriptors
fs.file-max = 100000
# Raise the connection-tracking table so traffic bursts don't start dropping packets
net.ipv4.netfilter.ip_conntrack_max = 131072

Apply changes with sysctl -p.
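A quick sanity check after the reload never hurts. A minimal sketch, assuming the nf_conntrack module is loaded (it is as soon as iptables state tracking is in use):

sysctl fs.file-max net.ipv4.ip_nonlocal_bind

# Current conntrack usage versus the ceiling we just raised
cat /proc/sys/net/netfilter/nf_conntrack_count /proc/sys/net/netfilter/nf_conntrack_max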
The Configuration Strategy
We are going to configure HAProxy to sit in front of two web servers (Backend A and Backend B). We will use Round Robin for distribution, but with a twist: we will use cookie insertion for session persistence. This is mandatory for applications like Magento or PHP sessions, otherwise, users will get logged out as they bounce between servers.
Here is a battle-tested /etc/haproxy/haproxy.cfg:
global
log 127.0.0.1 local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4096
user haproxy
group haproxy
daemon
# Spreading health checks ensures we don't hammer all backends simultaneously
spread-checks 5
stats socket /var/lib/haproxy/stats
defaults
mode http
log global
option httplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
# The Frontend: Where traffic hits
frontend main_webapp
bind *:80
acl url_static path_beg -i /static /images /javascript /stylesheets
acl url_static path_end -i .jpg .gif .png .css .js
use_backend static if url_static
default_backend app_servers
# The Backend: Where traffic goes
backend app_servers
balance roundrobin
cookie SERVERID insert indirect nocache
option httpchk GET /health_check.php
server web01 192.168.10.2:80 check cookie s1
server web02 192.168.10.3:80 check cookie s2
backend static
balance roundrobin
server static01 192.168.10.4:80 check

Dissecting the Logic
There are a few critical directives here that separate the amateurs from the pros:
option http-server-close: In HTTP/1.1, keep-alive is great, but on a load balancer, holding backend connections open for too long wastes slots. This option keeps keep-alive on the client side while closing the server-side connection after each response.

cookie SERVERID insert: HAProxy injects a cookie named `SERVERID`. If a user lands on `web01`, their browser stores `s1`. HAProxy reads this on the next request and routes them back to `web01`. Simple, effective persistence.

option httpchk: Do not just check if port 80 is open. The Apache process might be a zombie with the port still listening. We request a specific PHP file (`/health_check.php`); if that script doesn't return HTTP 200, the node is pulled from rotation.
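Two quick checks from the load balancer verify both behaviors; the IPs and the health-check path below are the ones from the sample config, so substitute your own:

# The health check must answer with HTTP 200, not merely accept the TCP connection
curl -is http://192.168.10.2/health_check.php | head -n 1

# The first response through the frontend should set the persistence cookie
curl -is http://127.0.0.1/ | grep -i "Set-Cookie: SERVERID"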
Handling the Logs
In Norway, under the Personopplysningsloven, we have specific obligations regarding data retention. You need to know what is happening, but you also need to respect privacy. HAProxy logs are incredibly verbose.
By default, HAProxy sends logs to syslog UDP. You need to configure rsyslog to catch them. Create a file at /etc/rsyslog.d/haproxy.conf:
local2.* /var/log/haproxy.log

Then restart rsyslog: service rsyslog restart.
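One caveat: rsyslog will only catch these messages if it is listening for UDP input on localhost. The log 127.0.0.1 local2 line sends everything as UDP datagrams (handy, since the chrooted HAProxy process has no access to /dev/log). A minimal sketch of the UDP input, placed at the top of the same drop-in file or in /etc/rsyslog.conf, using the legacy rsyslog 5 syntax shipped with CentOS 6:

$ModLoad imudp
$UDPServerAddress 127.0.0.1
$UDPServerRun 514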
Warning on Disk I/O: This is where cheap VPS providers fail. Writing thousands of log lines per second requires low I/O wait times. If your disk queue length spikes, HAProxy can stutter. I’ve benchmarked CoolVDS SSD instances against standard SAS drives, and the difference in log write latency is noticeable. When you are pushing 500 requests per second, that SSD speed prevents the logging daemon from becoming a bottleneck.
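On the retention side, a simple logrotate policy keeps the disk (and your Personopplysningsloven exposure) under control. A sketch for /etc/logrotate.d/haproxy; the four-week retention here is an example, not legal advice:

/var/log/haproxy.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
    postrotate
        /bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
    endscript
}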
Monitoring with the Stats Page
HAProxy 1.4 includes a built-in statistics dashboard. It is ugly, but it gives you real-time data on sessions, byte rates, and server health. Secure it properly, or the whole world will see your infrastructure.
Add this to your haproxy.cfg:
listen stats *:1936
stats enable
stats uri /
stats hide-version
stats auth admin:SuperSecretPassword123

Reload the service: service haproxy reload. Now navigate to port 1936. You will see your backend health in real-time. If a server turns red, you know exactly where to look.
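Beyond the password, I prefer to keep port 1936 off the public internet entirely. A minimal sketch using the stock iptables service; 203.0.113.10 stands in for your admin workstation's address:

iptables -A INPUT -p tcp --dport 1936 -s 203.0.113.10 -j ACCEPT
iptables -A INPUT -p tcp --dport 1936 -j DROP
service iptables save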
Conclusion: Stability is a Choice
Load balancing isn't just about handling traffic; it's about sleeping at night. By decoupling the entry point from the application logic, you gain the ability to patch servers without downtime (drain a node, patch it, put it back).
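That drain-patch-return cycle does not even require touching the config. A sketch using the stats socket declared in the global section; this assumes the socket line is extended to stats socket /var/lib/haproxy/stats level admin (the default level refuses state changes), that socat is installed, and that your 1.4 build supports the disable/enable server socket commands:

# Take web01 out of rotation before patching
echo "disable server app_servers/web01" | socat stdio /var/lib/haproxy/stats

# Patch and reboot web01, then put it back
echo "enable server app_servers/web01" | socat stdio /var/lib/haproxy/stats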
Remember, a load balancer is a force multiplier. It multiplies your capacity, but it also multiplies the impact of a slow network or poor I/O. Do not build a Ferrari engine and put it on bicycle tires. Ensure your underlying infrastructure—specifically your network throughput and disk speed—is up to the task. For my Norwegian deployments, the combination of HAProxy 1.4 and the low-latency, KVM-based environment of CoolVDS has proven to be the architecture that survives the storm.
Next Steps: Check your current MaxClients setting in Apache. If you are near the limit, it's time to deploy a load balancer before your next traffic spike makes the decision for you.
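A rough way to gauge how close you already are, assuming a stock prefork Apache on CentOS:

# Configured ceiling
grep -ri "^[^#]*MaxClients" /etc/httpd/conf /etc/httpd/conf.d

# Worker processes currently running; compare against that ceiling at peak traffic
ps -C httpd --no-headers | wc -l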