The 3:00 AM Wake-Up Call
If you have been in the trenches of systems administration long enough, you know the sound. The buzz of the pager or the relentless vibration of the phone at 3:00 AM. Your primary web server just hit 100% CPU, swap is thrashing on your spinning-rust disks, and Apache has given up the ghost. You are offline.
For too many startups in the Oslo tech scene, the architecture strategy is still "buy a bigger server when the old one breaks." Vertical scaling has a hard limit. You can only shove so much RAM into a single chassis before the cost becomes astronomical and the returns diminish.
The solution isn't bigger hardware; it's smarter architecture. In 2012, you don't need a $20,000 F5 Big-IP hardware appliance to handle traffic like a pro. You need HAProxy. It is the Swiss Army knife of TCP/HTTP load balancing, and when paired with the solid I/O performance of a CoolVDS instance, it allows you to scale horizontally for a fraction of the cost.
Pro Tip: Hardware load balancers are inflexible. By virtualizing your load balancer on a Linux VDS, you can snapshot, clone, and migrate your traffic entry point in minutes, not days.
Why HAProxy?
HAProxy (High Availability Proxy) stands as the de-facto standard for open-source load balancing. It is event-driven, meaning it can handle tens of thousands of concurrent connections without eating up your RAM the way Apache's prefork MPM does. It simply moves requests where they need to go, extremely fast.
Here is the architecture we are building today:
- 1x Load Balancer Node: Running HAProxy on CoolVDS (CentOS 6.2).
- 2x Web Nodes: Running Apache/Nginx serving your application.
By placing your servers in a local Norwegian datacenter, specifically one peering at NIX (Norwegian Internet Exchange), you ensure that the extra hop through the load balancer doesn't add perceptible latency for your local users. This is where the choice of hosting provider becomes technical, not just financial. You need low jitter and stable routes.
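Don't take that on faith; measure it. Here is a quick way to check latency and jitter from any client toward your balancer (swap in your own hostname; mtr is a yum install away if it is missing):

# Ten probes: the final min/avg/max/mdev line shows latency and jitter
ping -c 10 lb.example.no
# Per-hop latency and packet loss along the route
mtr --report --report-cycles 10 lb.example.no

Single-digit milliseconds and a low mdev from Norwegian eyeball networks is what you are aiming for.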
Step 1: Installation on CentOS 6
Let's assume you have provisioned a fresh CoolVDS instance. We will use EPEL (Extra Packages for Enterprise Linux) to get HAProxy 1.4. Don't compile from source unless you absolutely need custom patches.
# Import EPEL key
rpm --import https://fedoraproject.org/static/0608B895.txt
# Install the EPEL release
rpm -Uvh http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-5.noarch.rpm
# Install HAProxy
yum install haproxy
# Enable it on boot
chkconfig haproxy on

Step 2: Configuration - The Meat of the Matter
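Before we touch the config file, make sure the install actually took. A quick sanity check (your exact 1.4.x point release will depend on the EPEL mirror):

# Confirm the version EPEL installed
haproxy -v
# Confirm it will start on boot (runlevels 2-5 should read "on")
chkconfig --list haproxy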
The default configuration is useless for our needs. We are going to wipe /etc/haproxy/haproxy.cfg and build a robust config that handles HTTP traffic and checks if your web servers are actually alive.
A critical "War Story" lesson: I once saw a cluster go down because the load balancer kept sending traffic to a web node that was pinging successfully but returning 500 Internal Server Errors. We must use Layer 7 health checks, not just TCP checks.
Here is a production-ready template:
global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    maxconn 4096
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

frontend http_front
    bind *:80
    # ACL to route static content to a separate pool if needed.
    # Uncomment both lines once you have defined a "backend static_servers",
    # otherwise HAProxy will refuse to start on the dangling reference.
    # acl url_static path_beg /images /stylesheets /javascripts
    # use_backend static_servers if url_static
    default_backend web_servers

backend web_servers
    mode http
    balance roundrobin
    # Layer 7 health check: a node must answer HTTP 2xx/3xx, not just accept TCP
    option httpchk GET /
    # The "check" keyword enables health checks
    server web01 10.0.0.2:80 check inter 2000 rise 2 fall 3
    server web02 10.0.0.3:80 check inter 2000 rise 2 fall 3
    # Enable statistics page (Password protect this!)
    stats enable
    stats uri /haproxy?stats
    stats auth admin:supersecretpassword

Understanding the Configuration
- balance roundrobin: This rotates requests sequentially (Server A -> Server B -> Server A). For stateless apps, this is perfect. If you have a session-heavy PHP app (like Magento), you might need balance source to pin each client IP to the same backend server.
- option httpchk: This is the Layer 7 check from the war story above. Without it, "check" only verifies that the TCP port is open, which is exactly how a 500-spewing node stays in the pool.
- check inter 2000: HAProxy checks each backend every 2000ms. If a node fails 3 consecutive checks (fall 3), it is removed from the pool. No more 500 errors for your users.
- stats uri: This gives you a dashboard to see exactly how much bandwidth each node is pushing.
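One more habit worth forming: never start or reload a config you have not linted. HAProxy can validate the file without touching the running process, and a couple of requests through the frontend make a decent smoke test (paths below are the CentOS 6 defaults):

# Syntax-check the config; a non-zero exit means it will not start
haproxy -c -f /etc/haproxy/haproxy.cfg
# Start the balancer and send test requests through it
service haproxy start
curl -I http://localhost/
curl -I http://localhost/

Then watch the session counters on the stats page; with round-robin, consecutive requests should land on different backends.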
Step 3: Sysctl Tuning for High Load
Out of the box, the Linux kernel isn't tuned for the thousands of state transitions a load balancer sees. If you expect heavy traffic, you need to modify /etc/sysctl.conf to allow more open files and handle TIME_WAIT sockets faster. This is often overlooked by budget VPS providers, but on CoolVDS, you have full root control to tune the kernel.
# /etc/sysctl.conf
# Allow reusing sockets in TIME_WAIT state for new connections
net.ipv4.tcp_tw_reuse = 1
# Increase the range of ephemeral ports
net.ipv4.ip_local_port_range = 1024 65023
# Increase max open files (requires limits.conf changes too)
fs.file-max = 100000

Run sysctl -p to apply these changes.
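Note that fs.file-max is only the system-wide ceiling; the per-process limit comes from /etc/security/limits.conf, as hinted above. A minimal sketch for the haproxy user defined in the config:

# /etc/security/limits.conf
haproxy  soft  nofile  65536
haproxy  hard  nofile  65536

(HAProxy 1.4 also raises its own ulimit based on maxconn, but an explicit limit saves you a surprise when you raise maxconn later.)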
Data Privacy and Logging (The Norwegian Context)
Working within Norway, we must respect the Personopplysningsloven (Personal Data Act). When you enable option httplog, HAProxy captures client IP addresses. If you are storing these logs, you are processing personal data. The Datatilsynet (Data Inspectorate) is clear on this: do not store IPs longer than necessary for security purposes.
Ensure your log rotation is aggressive. On a CoolVDS setup, I recommend setting up a cron job to zip and archive logs to a separate secure storage partition and purging them after 30 days unless you have a specific legal requirement to keep them.
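To make that concrete: the global section earlier logs to 127.0.0.1 on facility local0, and HAProxy cannot write log files directly, so rsyslog has to catch those messages and logrotate has to expire them. A sketch, assuming stock CentOS 6 paths and the 30-day retention discussed above:

# /etc/rsyslog.d/haproxy.conf
# Accept syslog over UDP on loopback only
$ModLoad imudp
$UDPServerAddress 127.0.0.1
$UDPServerRun 514
# Send the local0 facility to its own file
local0.*    /var/log/haproxy.log

# /etc/logrotate.d/haproxy
/var/log/haproxy.log {
    daily
    rotate 30
    compress
    missingok
    postrotate
        /sbin/service rsyslog restart > /dev/null 2>&1 || true
    endscript
}

Restart rsyslog (service rsyslog restart) after dropping in the new file.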
Performance Trade-offs: Virtualization Overhead
There is a myth that you need bare metal for load balancing. In 2005, maybe. In 2012, with modern hypervisors, the overhead is negligible if the host node is not oversold.
This is where your choice of provider matters one last time: on an oversold host node, no amount of HAProxy tuning will save you from a noisy neighbour. On a CoolVDS instance that isn't oversold, the hypervisor tax stays negligible and your load balancer behaves like it is on metal.