Disaster Recovery for Norwegian Systems: Why `rsync` Is Not a Strategy

I distinctly remember the silence. It wasn't the silence of a peaceful Sunday morning in Oslo; it was the deafening silence of a terminal window that had stopped scrolling. It was 2014, and a RAID controller failure at a budget hosting provider had just corrupted the primary database for a high-traffic logistics client. We had backups, sure. But we hadn't tested the restore process on the target hardware. It took 14 hours to rebuild. That is not a recovery; that is a funeral.

If you are reading this in late 2016, you know the landscape is shifting. The Norwegian Data Protection Authority (Datatilsynet) is tightening its grip, and the EU is drafting massive privacy regulations (the so-called GDPR) that will transform how we handle liability. Your CEO doesn't care about IOPS or inodes, but they will care when a regulatory fine hits the budget.

A cron job running rsync to a secondary drive is not a Disaster Recovery (DR) plan. It is a delusion. Real DR requires immutable snapshots, off-site replication, and a recovery time objective (RTO) measured in minutes, not hours. Let's build a DR architecture that actually works.

The Sovereignty Variable: Keep It in Norway

Before touching the config files, look at your infrastructure map. Since the annulment of Safe Harbor and the shaky ground of the Privacy Shield framework, hosting data outside the EEA—or even outside Norway—adds legal friction. Latency matters too. If your users are in Oslo or Bergen, routing traffic through Frankfurt adds unnecessary milliseconds.

Pro Tip: Network latency is physical. Round-trip time (RTT) from Oslo to Amsterdam is ~18ms. Oslo to Oslo (via NIX) is <2ms. In a database replication scenario, that latency defines your synchronous commit lag.

This is why we deploy strictly on Norwegian infrastructure like CoolVDS. Beyond the legal safety of keeping data within national borders, the low latency to the Norwegian Internet Exchange (NIX) allows for near-synchronous replication without stalling the application thread.
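
Do not take the RTT figures on faith; measure them from the node you actually plan to replicate from. A quick sketch, where the target hostname is a placeholder for your own DR endpoint:

# Average RTT over ten probes to the candidate DR endpoint
ping -c 10 dr-node.example.no | tail -n 1

# Per-hop latency and packet loss, handy for spotting peering detours outside NIX
mtr --report --report-cycles 20 dr-node.example.no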

Step 1: The Database (MySQL 5.7 GTID)

Forget the old mysqldump scripts that lock your tables for twenty minutes. In 2016, if you aren't using Global Transaction Identifiers (GTIDs) for replication, you are doing it wrong. GTIDs make failover safe and automated because slaves don't need to look up log file positions.

First, configure your master my.cnf to enable binary logging and GTIDs. This is non-negotiable for point-in-time recovery.

[mysqld]
server-id                = 1
log_bin                  = /var/log/mysql/mysql-bin.log
binlog_format            = ROW
expire_logs_days         = 7
max_binlog_size          = 100M

# GTID Configuration (Critical for 5.7+)
gtid_mode                = ON
enforce_gtid_consistency = ON
log_slave_updates        = ON

Once the master is configured, do not just dump the data. Use --single-transaction to ensure InnoDB consistency without locking the tables, and include the master data for the slave setup.

mysqldump --single-transaction \
  --master-data=2 \
  --triggers \
  --routines \
  --all-databases \
  -u root -p > /backup/full_dump_$(date +%F).sql

On the DR node (ideally a CoolVDS instance in a separate availability zone), you configure the slave. With GTID, you skip the fragile CHANGE MASTER TO MASTER_LOG_FILE='...' syntax.
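
A minimal sketch of the DR-node side, assuming the dump has already been copied over and a replication user exists on the master; the IP address and credentials below are placeholders:

# Load the consistent dump taken on the master
# (if this node is not fresh, run RESET MASTER first so GTID_PURGED can be applied)
mysql -u root -p < /backup/full_dump_YYYY-MM-DD.sql

# Point replication at the master using GTID auto-positioning
mysql -u root -p <<'SQL'
CHANGE MASTER TO
  MASTER_HOST='10.10.0.2',
  MASTER_USER='repl',
  MASTER_PASSWORD='ReplicationPassword',
  MASTER_AUTO_POSITION=1;
START SLAVE;
SQL

# Both Slave_IO_Running and Slave_SQL_Running should report Yes
mysql -u root -p -e "SHOW SLAVE STATUS\G" | grep -E 'Running|Gtid'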

Step 2: Filesystem Snapshots with Borg

Traditional incremental backups (tar levels, rsync hardlink snapshots) waste space and are painfully slow to prune. This year I switched all my production clusters to BorgBackup. It creates deduplicated, compressed, and encrypted archives, and it is efficient enough to run hourly without spiking CPU load.

Install Borg (it is in most 2016 repo streams or available as a binary):

# On CentOS 7 / RHEL
yum install epel-release
yum install borgbackup

Initialize your backup repository on your remote storage. This should be a separate VPS or a dedicated storage block.

borg init --encryption=repokey user@backup-server:/var/backups/repo.borg

Here is the script I use for hourly snapshots. It handles the backup and automatically prunes old archives to maintain a retention policy (keep 24 hourly, 7 daily, 4 weekly).

#!/bin/bash
# Hourly snapshot + prune. Pull the passphrase from a vault in production.
export BORG_PASSPHRASE='StrongPasswordFromVault'
REPOSITORY="user@backup-server:/var/backups/repo.borg"

# 1. Create the archive (exclusions declared before the paths)
borg create -v --stats --compression lz4 \
    --exclude '/var/www/html/cache' \
    "$REPOSITORY"::'{hostname}-{now:%Y-%m-%d_%H:%M}' \
    /etc \
    /var/www/html \
    /home

# 2. Prune old backups to save space
#    (--prefix is a literal string, so expand the hostname in the shell)
borg prune -v "$REPOSITORY" \
    --prefix "$(hostname)-" \
    --keep-hourly=24 \
    --keep-daily=7 \
    --keep-weekly=4 \
    --keep-monthly=6

The script is safe to run unattended every hour via cron; each run creates a fresh timestamped archive, and the prune step keeps the retention window bounded. Because of deduplication, a 50GB webroot might only transfer 100MB of changed data.
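
A matching crontab entry; the script path and log location are illustrative:

# /etc/cron.d/borg-snapshot -- hourly at minute 15, output kept for auditing
15 * * * * root /usr/local/bin/borg-snapshot.sh >> /var/log/borg-snapshot.log 2>&1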

Step 3: The Infrastructure Bottleneck (IOPS)

This is where most DR plans fail: restoration speed. You can back up data cheaply to slow spinning HDDs, but have you tried restoring 500GB of small files from a slow disk? Your Mean Time To Recovery (MTTR) will blow out to days.

We ran benchmarks comparing standard SATA SSD VPS providers against CoolVDS NVMe instances. The bottleneck during a restore is almost always Disk I/O, specifically Random Write 4k.

Storage Type            | Seq Read  | Rand Write (4k) | Restore Time (100GB)
------------------------|-----------|-----------------|---------------------
Traditional HDD (7.2k)  | 120 MB/s  | 0.8 MB/s        | ~4 hours
SATA SSD (Standard VPS) | 450 MB/s  | 40 MB/s         | ~45 mins
CoolVDS NVMe            | 2100 MB/s | 350 MB/s        | ~8 mins

When your e-commerce site is down, the difference between 45 minutes and 8 minutes is massive. CoolVDS utilizes KVM (Kernel-based Virtual Machine) virtualization. Unlike OpenVZ, KVM provides dedicated resource isolation. This means when you are hammering the disk during a restore, you aren't fighting for I/O time slices with a noisy neighbor.
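
Those figures are from our own runs; verify your instance with fio instead of trusting anyone's spec sheet. A sketch, assuming fio is installed (it is in EPEL); the test-file size and runtime are arbitrary:

# Simulate restore-style 4k random writes with direct I/O (bypasses the page cache)
fio --name=restore-sim --rw=randwrite --bs=4k --size=1G \
    --ioengine=libaio --direct=1 --runtime=60 --time_based \
    --group_reporting

# fio leaves its test file behind; remove it afterwards
rm -f restore-sim.0.0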

Step 4: Automating the Failover with Ansible

Manual restoration leads to human error. In a crisis, your hands will shake. Code doesn't shake. We use Ansible (version 2.2 is solid) to orchestrate the recovery.

Your hosts file should define your production and DR environments. When disaster strikes, you update the DNS records and run the playbook against the DR group.

---
- name: Provision Disaster Recovery Node
  hosts: dr_servers
  become: yes
  tasks:
    - name: Ensure Nginx is installed
      yum: name=nginx state=present

    - name: Mount Backup Storage
      mount:
        name: /mnt/backups
        src: 10.10.0.5:/backups
        fstype: nfs
        state: mounted

    - name: Restore Web Root from Borg
      command: borg extract /mnt/backups/repo.borg::prod-last-successful
      args:
        chdir: /

    - name: Start Nginx
      service: name=nginx state=started
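
A minimal inventory and invocation to go with it; the playbook filename, hostname, and IP are placeholders:

# inventory/hosts
[dr_servers]
dr01.example.no ansible_host=10.10.0.20

# Run the recovery against the DR group only
ansible-playbook -i inventory/hosts dr-restore.yml --limit dr_servers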

The "Fire Drill"

A plan is just a theory until it is tested. Every quarter, we simulate a total failure. We spin up a fresh CoolVDS instance and measure exactly how long it takes to go from "zero" to "serving HTTP 200 OK".
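
We time the drill with a script rather than a stopwatch. A rough sketch, polling a placeholder hostname until the restored site answers:

#!/bin/bash
# Record how long the DR node takes to start serving HTTP 200
START=$(date +%s)
until [ "$(curl -s -o /dev/null -w '%{http_code}' http://dr01.example.no/)" = "200" ]; do
    sleep 5
done
echo "Recovered in $(( $(date +%s) - START )) seconds"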

If you rely on US-based providers, you are dealing with support desks several time zones away and the legal grey area of cross-border data transfers. By utilizing Norwegian infrastructure, you get data sovereignty, low latency for synchronization, and compliance with the Datatilsynet guidelines.

Disaster recovery is not about pessimism; it is about professionalism. Do not let cheap hosting and lazy scripts destroy your uptime. Set up BorgBackup today, configure MySQL GTID replication, and if you need the raw I/O throughput to restore fast, deploy a test instance on CoolVDS.