Surviving the Cloud: Why Single-Provider Infrastructure is Dead (And How to Fix It)

If you were awake on August 7th, you probably weren’t happy. That was the day a seemingly minor transformer failure in a US-East data center cascaded into a massive EBS (Elastic Block Store) outage, taking down Reddit, Flipboard, and half the startups in my Twitter feed. I was on call.

It was a brutal wake-up call for the industry: The Cloud is just someone else’s computer, and sometimes that computer loses power.

For too long, we've bought into the myth that putting everything in one "Availability Zone" or even one provider is enough. It isn't. If you are serious about uptime in late 2014, you need a strategy that spans infrastructure providers. You need to own your redundancy.

In this guide, I’m going to show you how to build a battle-hardened, multi-provider failover setup. We will use the newly stable HAProxy 1.5 (finally, native SSL!), Ansible for configuration management, and a high-performance CoolVDS KVM instance in Norway as our secure European anchor.

The Architecture: "The Bunker & The Cloud"

The biggest mistake CTOs make is trying to replicate their entire stack actively across two expensive providers. That doubles your TCO immediately. Instead, I advocate for the "Pilot Light" (Active-Passive) hybrid approach; a minimal watchdog sketch follows the list below.

  • Primary (The Cloud): Your massive auto-scaling groups, likely in Frankfurt or Ireland.
  • Secondary (The Bunker): A high-performance, predictable KVM VPS (like CoolVDS) in Oslo.
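
To make the pilot light concrete, here is a minimal watchdog sketch. It assumes a /health endpoint on your primary and a promote_bunker.sh script of your own that performs the actual cutover (DNS change or floating IP; that part is provider-specific and not shown here). The URL and script path are placeholders.

#!/bin/bash
# Runs on an independent box (or the bunker itself) and watches the primary.
PRIMARY="https://www.example.com/health"   # placeholder: your primary's health URL
FAILS=0

while true; do
    if curl -sf --max-time 5 "$PRIMARY" > /dev/null; then
        FAILS=0
    else
        FAILS=$((FAILS + 1))
    fi

    # Same "fall 3" idea HAProxy uses: three consecutive misses before acting
    if [ "$FAILS" -ge 3 ]; then
        logger -t failover "Primary unreachable, promoting the bunker"
        /usr/local/bin/promote_bunker.sh   # placeholder: your own cutover script
        break
    fi
    sleep 10
done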

Why Norway? Two reasons: Data Sovereignty and Latency. With the Snowden revelations last year, storing sensitive customer data purely on US-owned infrastructure is becoming a legal minefield. The Norwegian Personopplysningsloven (Personal Data Act) and the oversight of Datatilsynet offer a level of privacy protection that US providers simply cannot legally guarantee. Plus, if your market is Northern Europe, the latency to NIX (Norwegian Internet Exchange) is practically zero.

Step 1: The Load Balancer (HAProxy 1.5)

Until June of this year, running HAProxy meant you needed stunnel or nginx in front of it to handle HTTPS. That added latency and complexity. With HAProxy 1.5 stable, we can now terminate SSL directly. This is critical for performance.

But be careful. After the POODLE vulnerability discovered last month (October 2014), you must disable SSLv3. Do not copy-paste old configs from 2013.

Here is a production-ready haproxy.cfg tuned for a CoolVDS NVMe instance. It handles thousands of concurrent connections without breaking a sweat.

global
    log /dev/log    local0
    log /dev/log    local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

    # PERFORMANCE TUNING
    maxconn 4096
    tune.ssl.default-dh-param 2048

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000
    timeout client  50000
    timeout server  50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 503 /etc/haproxy/errors/503.http

frontend www-https
    bind *:443 ssl crt /etc/haproxy/certs/coolvds.pem no-sslv3
    reqadd X-Forwarded-Proto:\ https
    
    # ACL for traffic routing
    acl is_static path_end .jpg .gif .png .css .js
    use_backend static_node if is_static
    default_backend app_cluster

backend app_cluster
    balance roundrobin
    option httpchk HEAD /health HTTP/1.1\r\nHost:www.coolvds.com
    server app01 10.0.0.1:80 check fall 3 rise 2
    server app02 10.0.0.2:80 check fall 3 rise 2

backend static_node
    # Offload static assets to the local NVMe storage for speed
    server local_static 127.0.0.1:8080 check

Pro Tip: Notice the no-sslv3 directive on the bind line? That is mandatory now. If you are auditing your servers today, check for it immediately. The check fall 3 rise 2 marks a node down after three failed health checks and only brings it back after two clean ones, so a flapping node doesn't receive traffic until it is actually stable.
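
Before you reload, it is worth a quick sanity pass: validate the syntax, reload, and confirm SSLv3 is actually refused. The commands below assume the paths from the config above and a Debian-style init script; substitute your own hostname.

# Validate the config file before touching the running process
haproxy -c -f /etc/haproxy/haproxy.cfg

# Reload (init script name/location varies by distro)
service haproxy reload

# This handshake should now FAIL -- if it succeeds, SSLv3 is still enabled
openssl s_client -connect www.coolvds.com:443 -ssl3 < /dev/null

# TLS 1.2 should still negotiate cleanly
openssl s_client -connect www.coolvds.com:443 -tls1_2 < /dev/null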

Step 2: Data Persistence (MySQL & Latency)

Your application servers are disposable. Your data is not. In a split-brain scenario (network partition), you need to know which database is the "Source of Truth."

For a CoolVDS instance acting as a failover, I recommend a standard Master-Slave replication topology, but tuned for the hardware. Most virtual servers suffer from "noisy neighbor" syndrome where I/O wait times spike because another tenant is compiling a kernel.

CoolVDS uses KVM (Kernel-based Virtual Machine) rather than OpenVZ. This matters. With OpenVZ, every container shares the host kernel; with KVM, you run your own kernel with dedicated, isolated RAM. Combined with local SSD storage (still a premium feature at many hosts), we can push InnoDB harder.
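
Not sure what you are actually running on? You can check from inside the guest. A quick sketch, assuming a Debian/Ubuntu box (virt-what is in the standard repos):

# Ask the hypervisor directly -- on a KVM guest this should print "kvm"
apt-get install -y virt-what && virt-what

# Cross-check: guests expose a "hypervisor" CPU flag (non-zero count = virtualized)
grep -c hypervisor /proc/cpuinfo

# And lscpu names the vendor (KVM) on any recent util-linux
lscpu | grep -i hypervisor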

Add this to your /etc/mysql/my.cnf on the slave (CoolVDS) node:

[mysqld]
# Allocating 70% of RAM to buffer pool is standard for dedicated DB nodes
innodb_buffer_pool_size = 4G 
innodb_log_file_size = 256M
innodb_flush_log_at_trx_commit = 2  # Trade slight ACID strictness for massive I/O gain
innodb_flush_method = O_DIRECT

# Replication Stability
server-id = 2        # must differ from the master's server-id
read_only = 1        # guard against stray writes until this node is promoted
relay-log = /var/log/mysql/mysql-relay-bin.log
slave-net-timeout = 60
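
Attaching the slave is standard MySQL replication. The sketch below assumes you have already seeded the slave from a dump, created a repl user on the master, and pulled the binlog coordinates from SHOW MASTER STATUS; every host, credential, and coordinate here is a placeholder.

# Run on the CoolVDS slave. Replace the placeholders with your own
# master address, replication user, and binlog coordinates.
mysql -u root -p <<'SQL'
CHANGE MASTER TO
  MASTER_HOST='203.0.113.10',
  MASTER_USER='repl',
  MASTER_PASSWORD='CHANGE_ME',
  MASTER_LOG_FILE='mysql-bin.000001',
  MASTER_LOG_POS=4;
START SLAVE;
SQL

# Both threads should report "Yes" and Seconds_Behind_Master should trend to 0
mysql -u root -p -e 'SHOW SLAVE STATUS\G' | grep -E 'Slave_(IO|SQL)_Running|Seconds_Behind_Master'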

Step 3: Orchestration with Ansible

Stop SSHing into servers manually. If you have to type apt-get update more than once, you’re doing it wrong. We are seeing a huge shift this year towards Ansible because it’s agentless. You don't need to install a Puppet agent or Chef client; you just need Python and SSH.

Here is a simple playbook to ensure your failover environment is always in sync with your primary environment. Save this as site.yml:

---
- hosts: failover_bunker
  user: root
  vars:
    http_port: 80
    max_clients: 200

  tasks:
  - name: Ensure Nginx is at the latest version
    apt: pkg=nginx state=latest update_cache=yes

  - name: Write Nginx config
    template: src=./templates/nginx.conf.j2 dest=/etc/nginx/nginx.conf
    notify:
    - restart nginx

  - name: Ensure firewall allows HTTP/HTTPS
    ufw: rule=allow port={{ item }} proto=tcp
    with_items:
      - 80
      - 443

  handlers:
  - name: restart nginx
    service: name=nginx state=restarted

Running this playbook (ansible-playbook site.yml) brings your CoolVDS "bunker" in line with the same configuration you push to production, and it finishes in minutes.
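
One thing the playbook takes for granted: a failover_bunker group in your inventory (/etc/ansible/hosts by default). A minimal sketch, with a placeholder address for your CoolVDS instance:

# /etc/ansible/hosts
[failover_bunker]
203.0.113.20 ansible_ssh_user=root

Then confirm connectivity and do a dry run before letting it change anything:

ansible failover_bunker -m ping
ansible-playbook site.yml --check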

Why KVM and Norway Matter in 2014

We are living in an era of mass surveillance and massive instability. The "Safe Harbor" agreement is under heavy scrutiny, and I wouldn't be surprised if it gets challenged in court soon. Hosting your data inside the EU/EEA is no longer just a "nice to have"—it's a risk mitigation strategy.

Furthermore, the performance gap between "Cloud" storage (EBS/S3) and local NVMe/SSD storage is widening. Network attached storage will always have latency. When your database is getting hammered, nothing beats local I/O.
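
Don't take my word for it; run a quick smoke test on both your cloud node and the bunker. It is not a rigorous benchmark (dd with O_DIRECT to skip the page cache, and ioping, which is in most distro repos, for per-request latency), but the gap is usually obvious.

# Sequential write throughput, bypassing the page cache
dd if=/dev/zero of=/var/tmp/iotest bs=1M count=1024 oflag=direct
rm -f /var/tmp/iotest

# Per-request latency on the database volume -- this is the number InnoDB feels
ioping -c 10 /var/lib/mysql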

CoolVDS provides that raw, unadulterated KVM performance. It doesn't try to be an "Elastic Cloud." It tries to be a fast, reliable server that stays online when the giants stumble. In a multi-provider strategy, that stability is exactly what you are paying for.

The Final Checklist

Before you close this tab, run openssl s_client -connect your-server.com:443 -ssl3. If it connects, you are vulnerable. Patch it.

Don't wait for the next power outage to test your redundancy. Spin up a KVM instance on CoolVDS today, sync your repo, and sleep better knowing you aren't reliant on a single transformer in Virginia.