Scaling Beyond the Single Box: High Availability with HAProxy on KVM
Let's be honest for a second. If you are still relying on DNS round-robin to handle your traffic, you are playing Russian Roulette with your uptime. I've seen it a dozen times: one backend server hangs on a heavy PHP process, DNS keeps sending it traffic, and suddenly your entire infrastructure cascades into oblivion. Your clients scream, your boss panics, and you're stuck in a terminal at 3 AM trying to restart Apache.
It is 2013. We have better tools. We have HAProxy.
In this guide, we aren't just installing a package. We are architecting a failover-capable load balancing layer using HAProxy 1.4 on CentOS 6.4. We will look at why reliable KVM-based virtualization (like the instances we provision at CoolVDS) is critical for this setup, especially when low latency to NIX (Norwegian Internet Exchange) is non-negotiable.
The "War Story": Why Shared Resources Kill Load Balancers
Last month, a client came to me with a Magento store hosted on a generic OpenVZ container from a budget provider in Germany. They were prepping for a massive seasonal sale. During load testing with `ab` (Apache Bench), their load balancer started dropping packets.
The config was fine. The problem? Steal time.
Because OpenVZ shares the kernel, a "noisy neighbor" on the host node was thrashing the CPU. The load balancer couldn't process the SYN packets fast enough. We migrated them to a CoolVDS KVM instance. Why? Because KVM provides hard resource isolation. The steal time dropped to 0.0%, and the load balancer handled 4,000 requests per second without blinking. If you are serious about traffic, you cannot share your kernel.
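If you suspect the same problem on your own VPS, check steal time yourself before blaming your config. A quick sanity check using the standard procps tools that ship with CentOS 6:
[root@lb01 ~]# vmstat 1 5
[root@lb01 ~]# top -bn1 | grep "Cpu(s)"
Watch the last column (`st`) in vmstat, or the `%st` field in top. Anything consistently above a few percent means the hypervisor is handing your CPU cycles to someone else.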
Installing HAProxy on CentOS 6
First, let's get the software. The default repositories are decent, but for the latest stable patches, ensure you are updated.
[root@lb01 ~]# yum install haproxy
[root@lb01 ~]# chkconfig haproxy on
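The CentOS 6 base repo should hand you an HAProxy 1.4.x build; it's worth confirming what you actually got before going further:
[root@lb01 ~]# haproxy -v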
Don't start it yet. The default config is useless for our needs. We need to configure it to handle Layer 7 switching: routing traffic based on the requested domain or path.
Configuration: The Meat and Potatoes
Backup your original config:
[root@lb01 ~]# mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.bak
[root@lb01 ~]# vi /etc/haproxy/haproxy.cfg
Here is a battle-tested configuration block. This setup separates static assets from dynamic PHP requests, optimizing how we use our backend resources.
global
    # Requires rsyslog listening on UDP 514 for the local2 facility
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    # Spread health checks to avoid spiking all backends simultaneously
    spread-checks 5
    stats socket /var/lib/haproxy/stats

defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option                  http-server-close
    option                  forwardfor except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

frontend main_http_in
    bind *:80
    # ACLs for separating static traffic from dynamic requests
    acl url_static path_beg -i /static /images /javascript /stylesheets
    acl url_static path_end -i .jpg .gif .png .css .js
    use_backend static_servers if url_static
    default_backend app_servers

backend static_servers
    balance roundrobin
    server static01 10.0.0.10:80 check
    server static02 10.0.0.11:80 check

backend app_servers
    balance leastconn
    # Sticky sessions are crucial for stateful apps; HAProxy inserts its own SERVERID cookie
    cookie SERVERID insert indirect nocache
    server app01 10.0.0.20:80 check cookie app01
    server app02 10.0.0.21:80 check cookie app02
Understanding `balance leastconn`
In the `app_servers` backend, notice we use `balance leastconn`. For static files, Round Robin is fine because serving a JPG takes a predictable amount of time. But for a PHP or Ruby application, one request might take 50ms (cached) and another might take 3 seconds (complex SQL query). `leastconn` ensures that the server currently stuck processing the heavy SQL query doesn't get hammered with new requests. It is a simple change that drastically smooths out latency.
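Before you start the service, let HAProxy parse the file and complain about anything it doesn't like. Paths here match the config above:
[root@lb01 ~]# haproxy -c -f /etc/haproxy/haproxy.cfg
[root@lb01 ~]# service haproxy start
If the check reports a problem, fix it before starting; a balancer that boots with a broken backend definition is worse than no balancer at all.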
High Availability with Keepalived
A load balancer is great, but if the load balancer itself dies, you go offline. This is where Keepalived comes in, using VRRP (Virtual Router Redundancy Protocol) to float a Virtual IP (VIP) between two CoolVDS instances.
Pro Tip: When configuring VRRP on a VPS, ensure your provider supports multicast or allows multiple MAC addresses on a single switch port. At CoolVDS, our network stack is built to handle VRRP traffic natively, so you don't face the "split-brain" scenarios common with cheaper hosts.
Install keepalived:
[root@lb01 ~]# yum install keepalived
Configure `/etc/keepalived/keepalived.conf` on the MASTER node:
vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}

vrrp_instance VI_1 {
    interface eth0
    state MASTER
    virtual_router_id 51
    priority 101
    virtual_ipaddress {
        192.168.10.50
    }
    track_script {
        chk_haproxy
    }
}
On the BACKUP node, set `state BACKUP` and `priority 100`. Now, 192.168.10.50 is your floating IP. Point your DNS A-record there.
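For clarity, here is what the same `vrrp_instance` block might look like on your second balancer (call it lb02 in this sketch). Only the state and priority change; everything else, including the `vrrp_script` block, `virtual_router_id` and VIP, must match the master:
vrrp_instance VI_1 {
    interface eth0
    state BACKUP
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        192.168.10.50
    }
    track_script {
        chk_haproxy
    }
}
Start keepalived on both nodes (`service keepalived start`), then verify the VIP landed on the master with `ip addr show eth0`. To watch the VRRP advertisements themselves, `tcpdump -i eth0 -n 'ip proto 112'` on either node will show the master announcing itself every second. Note how the `weight 2` in `chk_haproxy` does the real work: if HAProxy dies on the master, its effective priority drops below the healthy backup, and that is exactly what triggers the failover.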
The Importance of Local Latency
Why host this in Norway? It comes down to physics. If your target market is in Oslo or Bergen, hosting in a US datacenter adds 100ms+ of latency to every single TCP handshake. For an SSL negotiation, that round trip happens multiple times.
By placing your HAProxy instances on CoolVDS servers in Norway, you are often less than 5ms away from the NIX. This makes your site feel "instant." Furthermore, with Datatilsynet keeping a close watch on data handling practices, keeping your data on Norwegian soil simplifies your compliance with the Personal Data Act (Personopplysningsloven).
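Don't take latency claims on faith; measure them from where your users actually sit. A rough sketch from a client machine (the hostname is just a placeholder for whatever endpoint you want to test):
$ ping -c 10 lb01.example.com
$ mtr --report --report-cycles 10 lb01.example.com
The mtr report also shows where the round trip is being spent, hop by hop, which is useful ammunition when arguing with a provider about peering.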
Monitoring and Tuning
Once you are live, you need visibility. HAProxy includes a stats page that is incredibly lightweight. Add this to your config:
listen stats *:1936
    stats enable
    stats uri /
    stats hide-version
    stats auth admin:SuperSecretPassword
Now navigate to port 1936. You will see real-time health checks. If a backend server goes red, HAProxy has already pulled it from rotation. No manual intervention required.
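The same endpoint can feed your monitoring scripts: HAProxy returns the raw counters as CSV if you append `;csv` to the stats URI, which is handy for Nagios checks or cron-driven graphs. Credentials here match the config above:
[root@lb01 ~]# curl -s "http://admin:SuperSecretPassword@127.0.0.1:1936/;csv" | head -5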
Kernel Tuning for High Throughput
Don't let the default Linux TCP stack bottleneck you. Add these to `/etc/sysctl.conf` to handle thousands of concurrent connections:
# Allow reuse of sockets in TIME_WAIT state
net.ipv4.tcp_tw_reuse = 1
# Increase the range of ephemeral ports
net.ipv4.ip_local_port_range = 1024 65023
# Maximize the backlog for heavy traffic bursts
net.core.somaxconn = 2048
Apply with `sysctl -p`. This is essential for preventing connection slot exhaustion during DDoS attacks or viral traffic spikes.
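To see whether you are actually approaching those limits, keep an eye on socket states during your load tests. A couple of quick checks:
[root@lb01 ~]# ss -s
[root@lb01 ~]# netstat -ant | grep -c TIME_WAIT
If the TIME_WAIT count climbs into the tens of thousands mid-test, the `tcp_tw_reuse` and port-range settings above are doing real work for you.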
Conclusion
Building a high-availability cluster wasn't easy five years ago, but in 2013, the tools are mature. With HAProxy 1.4 and Keepalived, you can build an infrastructure that rivals the giants.
However, software is only half the battle. You need hardware that doesn't lie to you. You need SSD I/O that doesn't choke when writing logs, and a KVM hypervisor that guarantees your CPU cycles are yours alone. That is exactly what we engineered at CoolVDS.
Don't wait for your next outage to upgrade. Spin up a pair of high-performance KVM instances on CoolVDS today and build a network that stays up, no matter what.