Surviving the Traffic Spike: High Availability with HAProxy and Keepalived
There is no sound more terrifying to a sysadmin than silence. When the pager stops buzzing, the logs stop scrolling, and the latency flatlines not because it's fast, but because the connection timed out—that is when the panic sets in. I've been there. In 2010, I watched a perfectly good marketing campaign turn into a PR disaster because a single LAMP server choked on the connection limit the moment the newsletter went out.
If you are still serving your production application from a single VPS, you are gambling with your uptime. It doesn't matter how much RAM you throw at Apache; eventually, the MaxClients directive will be your bottleneck.
Today, we aren't just tweaking configs; we are building a fortress. We are going to deploy HAProxy 1.4 for Layer 7 load balancing, backed by Keepalived for IP failover. This is the exact architecture I use for high-traffic clients targeting the Norwegian market, ensuring that even if a server melts, the site stays up.
The Architecture: Why HAProxy?
Hardware load balancers like F5 Big-IP are fantastic if you have the budget of a bank. For the rest of us, HAProxy is the industry standard. It is an event-driven, non-blocking engine that can push gigabits of traffic on modest hardware. Unlike Nginx (which is great, but currently lacks some of the advanced health-check features found in HAProxy's stable branch), HAProxy is purely focused on moving packets efficiently.
Our setup will look like this:
- 2x Load Balancers (LB01, LB02): Running HAProxy + Keepalived.
- Floating IP (VIP): The public IP that floats between LBs.
- 2x Web Nodes (Web01, Web02): Running Nginx or Apache.
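Sketched out, the request path looks like this (the IPs match the configs used later in this article):

```
                Internet
                    |
         VIP 192.168.1.100
      (floats via VRRP between)
        /                  \
     LB01                  LB02
(HAProxy, MASTER)    (HAProxy, BACKUP)
        \                  /
  Web01 10.0.0.10    Web02 10.0.0.11
```

Only the node currently holding the VIP answers traffic; the other load balancer sits idle until failover.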
Pro Tip: When choosing a hosting provider for this setup, avoid OpenVZ containers. You need kernel-level control for sysctl tuning and IP binding. We use CoolVDS because their KVM virtualization guarantees that our resources aren't stolen by a noisy neighbor, and they allow the custom networking required for VRRP (Keepalived). Plus, their latency to NIX (Norwegian Internet Exchange) is consistently under 2ms.
Step 1: Installing the Tools
We are using Ubuntu 12.04 LTS (Precise Pangolin). It’s stable, supported until 2017, and has the packages we need in the repo.
sudo apt-get update
sudo apt-get install haproxy keepalived
By default, HAProxy is disabled in Ubuntu. You need to enable it in /etc/default/haproxy:
ENABLED=1
Step 2: Configuring HAProxy 1.4
Edit /etc/haproxy/haproxy.cfg. We are going to configure it to listen on port 80 and balance traffic between our web nodes. Note that HAProxy 1.4 has no native SSL termination at all (that arrives in the 1.5-dev branch); you'd need stunnel in front of it. For this guide, we are assuming SSL is handled at the web server level or stripped before hitting the LB.
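Quick aside before the main config: if you do need HTTPS today, a minimal stunnel sketch in front of HAProxy looks like this. It is a sketch, not a hardened config, and the certificate path is a placeholder for your own PEM file:

```
; /etc/stunnel/stunnel.conf -- decrypt on 443, hand plain HTTP to HAProxy
cert = /etc/ssl/private/example.com.pem   ; placeholder; point at your cert+key PEM

[https]
accept  = 443
connect = 127.0.0.1:80
```

With that aside out of the way, on to the HAProxy config itself.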
global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    maxconn 4096
    user haproxy
    group haproxy
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    retries 3
    option  redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

listen webfarm 0.0.0.0:80
    mode http
    stats enable
    stats uri /haproxy?stats
    stats realm Strictly\ Private
    stats auth admin:password123
    balance roundrobin
    option httpclose
    option forwardfor
    server web01 10.0.0.10:80 check
    server web02 10.0.0.11:80 check
Critical Setting: option forwardfor is mandatory. Without it, your backend web servers will see all traffic coming from the load balancer's IP, not the actual client's IP. This ruins your analytics and makes IP-based banning impossible.
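To verify the header is actually reaching the backends, log it there and inspect. Assuming your backend appends the X-Forwarded-For value as a final quoted field in its access log (a common LogFormat customization; the sample line below is hypothetical), extracting it is a one-liner:

```shell
# Hypothetical access-log line with X-Forwarded-For logged as the last quoted field
line='10.0.0.5 - - [01/May/2012:10:00:00 +0200] "GET / HTTP/1.1" 200 512 "-" "curl/7.22" "203.0.113.7"'

# Split on double quotes; the real client IP is the second-to-last field
client_ip=$(printf '%s\n' "$line" | awk -F'"' '{print $(NF-1)}')
echo "$client_ip"   # the visitor's address, not the load balancer's 10.0.0.x
```

If you see the LB's internal IP here instead, option forwardfor isn't active or your backend isn't logging the header.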
Step 3: High Availability with Keepalived
If your single HAProxy server dies, your site goes down. That defeats the purpose. We use Keepalived to manage a Virtual IP (VIP). This IP address is shared between LB01 (Master) and LB02 (Backup).
You must enable non-local binding in /etc/sysctl.conf so the backup server can bind to an IP it doesn't currently "own":
net.ipv4.ip_nonlocal_bind = 1
Run sysctl -p to apply it.
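You can confirm the kernel accepted the change by reading the live value straight out of /proc:

```shell
# 1 means non-local binding is allowed; 0 is the kernel default
cat /proc/sys/net/ipv4/ip_nonlocal_bind
```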
The Keepalived Config
On LB01 (Master), edit /etc/keepalived/keepalived.conf:
vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}

vrrp_instance VI_1 {
    interface eth0
    state MASTER
    virtual_router_id 51
    priority 101
    virtual_ipaddress {
        192.168.1.100
    }
    track_script {
        chk_haproxy
    }
}
On LB02 (Backup), the config is nearly identical: keep the same vrrp_script block, but set state to BACKUP and the priority lower (100).
vrrp_instance VI_1 {
    interface eth0
    state BACKUP
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        192.168.1.100
    }
    track_script {
        chk_haproxy
    }
}
The script probes the haproxy process every 2 seconds. While the check succeeds, weight 2 adds 2 to each node's priority. If HAProxy dies on the Master, that bonus disappears, the Master's effective priority falls below the Backup's, and the Backup claims the VIP within a few seconds. Your users won't even notice.
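The arithmetic behind the takeover is worth spelling out. Keepalived adds the vrrp_script weight to a node's base priority while the check passes; the sketch below simply mirrors that calculation with the numbers from our configs:

```shell
# Effective VRRP priority = base priority + weight while the check script passes
weight=2
master_base=101
backup_base=100

master_healthy=$((master_base + weight))   # 103: Master wins, holds the VIP
backup_healthy=$((backup_base + weight))   # 102
master_failed=$master_base                 # 101: haproxy died, bonus lost

# 101 < 102, so the Backup now advertises the higher priority and takes the VIP
echo "$master_healthy $backup_healthy $master_failed"
```

This is also why the Master's base priority (101) must stay below the Backup's boosted priority (102): otherwise a dead HAProxy would never trigger failover.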
Datatilsynet and Logs
Operating in Norway involves strict adherence to the Personal Data Act (Personopplysningsloven). When you configure option forwardfor and log IPs, you are processing personal data. Ensure your rsyslog configs rotate logs frequently and that you aren't storing this data longer than necessary. We advise clients to keep raw access logs on a separate, encrypted partition.
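How long "necessary" is depends on why you collect the data, but mechanically it comes down to rotation. A minimal logrotate sketch, assuming rsyslog writes HAProxy traffic to /var/log/haproxy.log (the path varies by distro and rsyslog config):

```
/var/log/haproxy.log {
    daily
    rotate 7        # keep one week of history, then it is gone
    compress
    missingok
    notifempty
}
```

Adjust the rotate count to match your documented retention policy, not the other way around.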
The Hardware Reality
Virtualization overhead is real. Many providers oversell their CPU cycles, leading to "steal time" that kills load balancer performance. HAProxy is CPU-bound when handling thousands of concurrent connections.
| Feature | Shared Hosting / OpenVZ | CoolVDS (KVM) |
|---|---|---|
| Kernel Access | Shared (Restricted) | Dedicated (Full Control) |
| Network Stack | Shared Buffer | Isolated |
| Disk I/O | Often SATA Spinners | Enterprise SSD RAID |
For a load balancer, I/O latency isn't the primary bottleneck—network throughput and CPU interrupt handling are. However, if your logs are writing to a slow disk, the whole system blocks. That is why we deployed our infrastructure on CoolVDS. Their use of Enterprise SSDs (a rarity in the current VPS market) ensures that log writes never block the application, even under heavy load.
Final Checks
Before you go live, test the failover. Run a continuous ping to your VIP and shut down the Master load balancer.
# On your local machine
ping 192.168.1.100
If configured correctly, you'll drop a packet or two at most before the Backup starts answering. That is the power of VRRP.
Don't wait for your server to crash during the holiday rush. Building redundancy takes an afternoon; recovering data from a crashed server takes a lifetime. Spin up a KVM instance on CoolVDS today and build an infrastructure that lets you sleep at night.