Architecting High-Availability SOA: The HAProxy & Puppet Pattern for 2013
Let’s be honest: the era of the monolithic LAMP stack is ending. If you are still running a single httpd process serving your entire e-commerce platform, you are sitting on a time bomb. But the alternative—Service Oriented Architecture (SOA)—introduces a new nightmare: network latency and partial failure.
I recently consulted for a media streaming startup in Oslo. They broke their monolith into twelve different services. It looked great on the whiteboard. In production? Disaster. A 20ms delay in the User Service, amplified by retries and timeouts across six serialized downstream calls, snowballed into a 5-second page load. Why? Because they treated internal service calls like function calls. They aren't.
The network is not reliable. In Norway, we have excellent connectivity via NIX (Norwegian Internet Exchange), but inside a datacenter, packet loss happens. Today, I'm going to show you how to implement a resilient service layer, an early take on what some are starting to call a "Service Mesh", using HAProxy and Puppet. Netflix is converging on the same idea with its client-side routing libraries.
The Problem: Hardware Load Balancers Are a Bottleneck
Traditionally, you put a big iron F5 or NetScaler in front of your servers. This works for ingress traffic. It fails miserably for east-west traffic (service-to-service). If Service A needs to talk to Service B, routing it out to a hardware load balancer and back adds unnecessary hops and a single point of failure.
The solution is Client-Side Load Balancing. We place a lightweight proxy on every single server (localhost) that manages traffic to downstream dependencies.
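Here is what that looks like from the application's side (a hypothetical sketch; the IPs, port, and URL path are illustrative and match the config we build below):
# Before: the app dials a remote service instance directly (fragile)
curl --max-time 2 http://10.0.0.5:8080/users/42
# After: the app dials the local proxy; failover happens below the app
curl --max-time 2 http://127.0.0.1:8081/users/42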
The Architecture: HAProxy as a Sidecar
We will use HAProxy 1.4 (stable) running on every application node. Your application talks to localhost:port, and HAProxy routes it to the correct backend server. This gives you:
- Intelligent Retries: If Backend Node 1 is down, HAProxy hits Node 2 instantly. Your app never sees the error.
- Connection Reuse: your app keeps cheap, persistent connections to localhost and leaves the real network connections to HAProxy, saving TCP handshake overhead in the application.
- Observability: the HAProxy stats socket is a goldmine; see the example below.
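On the observability point: assuming you add stats socket /var/run/haproxy.sock level admin to the global section (it is not in the sample config below, but HAProxy 1.4 supports it), you can pull live counters with socat:
# Dump per-listener and per-server statistics as CSV
echo "show stat" | socat unix-connect:/var/run/haproxy.sock stdio
# Process-level info: uptime, current connections, limits
echo "show info" | socat unix-connect:/var/run/haproxy.sock stdio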
Implementation Details
First, you need a virtualization platform that gives you honest CPU scheduling and doesn't steal cycles; this architecture relies on microsecond-scale forwarding decisions. We use CoolVDS because they run KVM (Kernel-based Virtual Machine). Unlike OpenVZ containers, a KVM guest runs its own kernel, so the TCP-stack resources you tune below are actually yours.
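Before investing in tuning, it is worth confirming what you are actually running on. One quick check (virt-what ships in the standard CentOS repositories):
# OpenVZ guests expose /proc/vz; it should be absent on KVM
test -d /proc/vz && echo "OpenVZ container" || echo "not OpenVZ"
# virt-what identifies the hypervisor; expect it to print "kvm"
virt-what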
Step 1: Kernel Tuning for High Concurrency
Before installing HAProxy, you must tune the Linux network stack. The defaults in CentOS 6 or Ubuntu 12.04 are too conservative for high-throughput SOA.
# /etc/sysctl.conf
# Allow reusing sockets in TIME_WAIT state for new connections
net.ipv4.tcp_tw_reuse = 1
# Increase the range of ephemeral ports
net.ipv4.ip_local_port_range = 1024 65023
# Raise the accept-queue limit and the NIC packet backlog
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535
# Increase TCP buffer sizes for 10GbE networks (common in modern hosting)
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
Run sysctl -p to apply. If you skip this, you will hit connection limits regardless of your CPU power.
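To verify the new values took effect, and to watch whether TIME_WAIT sockets are still piling up under load:
# Confirm the kernel picked up the new settings
sysctl net.ipv4.tcp_tw_reuse net.core.somaxconn
# Count sockets stuck in TIME_WAIT; this should stay manageable
netstat -ant | awk '$6 == "TIME_WAIT"' | wc -l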
Step 2: The Local HAProxy Configuration
Here is a battle-tested configuration for HAProxy. This assumes we are routing traffic to a "User Profile Service" backend.
# /etc/haproxy/haproxy.cfg
global
    log 127.0.0.1 local0
    maxconn 4096
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    timeout connect 5000
    timeout client 50000
    timeout server 50000

# The "Service Mesh" Listener on Localhost
listen user_service_local
    bind 127.0.0.1:8081
    mode http
    balance roundrobin
    option httpchk GET /health
    # The Backends (Managed dynamically via Puppet/Chef)
    server user_srv_01 10.0.0.5:8080 check inter 2000 rise 2 fall 3
    server user_srv_02 10.0.0.6:8080 check inter 2000 rise 2 fall 3
    server user_srv_03 10.0.0.7:8080 check inter 2000 rise 2 fall 3
Pro Tip: Use the check inter 2000 parameter. This polls each backend every 2 seconds. If a backend fails, HAProxy pulls it from rotation in roughly six seconds (3 failed checks at 2-second intervals). Your application code needs zero logic to handle the server failure.
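You can rehearse a failure instead of waiting for one. A sketch of a failover drill, assuming the admin-level stats socket mentioned earlier is enabled:
# Pull a backend out of rotation, as if it had crashed
echo "disable server user_service_local/user_srv_01" | socat unix-connect:/var/run/haproxy.sock stdio
# Requests keep succeeding; HAProxy routes around the dead node
curl -s -o /dev/null -w "%{http_code}\n" http://127.0.0.1:8081/users/42
# Put it back when the drill is over
echo "enable server user_service_local/user_srv_01" | socat unix-connect:/var/run/haproxy.sock stdio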
Managing Configuration at Scale
In 2013, you cannot manually edit this file on 50 servers. That is madness. You need Puppet.
We define our backends in Hiera or a site manifest, and let Puppet generate the haproxy.cfg file. This ensures that when we add a new node to the "User Service" pool, the load balancer config on all dependent services is updated automatically within the next Puppet run (usually 30 minutes, or triggered via MCollective).
# Puppet Manifest Snippet (assumes the puppetlabs-haproxy module)
haproxy::listen { 'user_service_local':
  ipaddress => '127.0.0.1',
  ports     => '8081',
  options   => {
    'option'  => ['httpchk GET /health'],
    'balance' => 'roundrobin',
  },
}
# Each User Service node exports a member (requires storeconfigs/PuppetDB);
# the listen above collects matching balancermembers automatically
@@haproxy::balancermember { "user_srv_${::hostname}":
  listening_service => 'user_service_local',
  server_names      => $::hostname,
  ipaddresses       => $::ipaddress,
  ports             => '8080',
  options           => 'check inter 2000 rise 2 fall 3',
}
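When a 30-minute convergence window is too slow, kick the dependent nodes immediately. A sketch assuming the mcollective-puppet-agent plugin and a custom "role" fact on your app nodes:
# Trigger an immediate Puppet run on every node carrying the app role
mco puppet runonce -F role=appserver
# Check which nodes have applied the latest catalog
mco puppet status -F role=appserver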
The "CoolVDS" Factor: Consistency is King
Why does infrastructure matter here? Because of I/O Wait and CPU Steal.
If you deploy this architecture on a crowded, oversold VPS provider, your HAProxy process might pause for 50ms while waiting for the physical CPU. In a distributed architecture, latency is additive. A 50ms pause on the proxy layer can trigger timeouts up the stack.
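You do not have to take this on faith; steal time is directly observable. The st column in vmstat shows the share of time the hypervisor withheld your vCPU, and it should sit at zero on a healthy host:
# Sample CPU statistics once per second; watch the "st" (steal) column
vmstat 1 5
# mpstat (from the sysstat package) breaks %steal out per CPU
mpstat -P ALL 1 5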
CoolVDS offers KVM instances where resources are dedicated, and we prioritize low latency to the major Norwegian exchanges. When you are dealing with strict Norwegian data law (Personopplysningsloven), keeping internal traffic on a private LAN in Oslo is not just good performance; it's good compliance hygiene.
Conclusion
Building a resilient service layer doesn't require waiting for future technologies. It requires smart use of the tools we have today: Linux, HAProxy, and robust configuration management. By moving routing logic onto localhost, you eliminate central bottlenecks and gain massive resilience.
Ready to build? Don't let noisy neighbors ruin your architecture. Deploy a KVM instance on CoolVDS today and see what stable latency looks like.