Disaster Recovery in a Post-Schrems II World: Why Your "Backups" Will Fail You

Let’s be honest. If your disaster recovery (DR) plan consists of a nightly cron job running tar -czf and pushing it to an FTP server, you don't have a plan. You have a placebo. I’ve seen production environments evaporate in seconds—not just from rm -rf / accidents, but from ransomware that encrypts the backups before it touches the live database. In the wake of the OVH datacenter fire last year and the tightening grip of GDPR following the Schrems II ruling, relying on hope is professional suicide.

Disaster recovery is not about data preservation; it is about time to recovery. When your Magento storefront is returning 502 Bad Gateway, nobody cares that you have a backup on a tape drive in a vault. They care that you are losing 5,000 NOK per minute.

The Geometry of Failure: RTO and RPO

Before we touch a single config file, define your metrics. If you cannot answer these two questions, stop reading and go call your stakeholders:

  • RPO (Recovery Point Objective): How much data are you willing to lose? One hour? One transaction?
  • RTO (Recovery Time Objective): How long can you stay offline before the business collapses?

Achieving a near-zero RPO requires synchronous replication, which introduces latency. If your server is in Oslo and your DR site is in Frankfurt, the speed of light is your enemy. This is why regional geo-redundancy matters. Hosting on CoolVDS allows you to keep primary and secondary instances within the Nordic region, minimizing latency over the Norwegian Internet Exchange (NIX) while maintaining data sovereignty.
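
Before you commit to synchronous replication, measure the link. A crude sanity check with ping tells you what you are dealing with (the DR hostname below is a placeholder):

# Round-trip time between primary and DR site (placeholder hostname)
ping -c 20 dr.example.net | tail -1
# Rough rule of thumb: every synchronous commit pays at least one round trip,
# so 20 ms of latency caps a single thread at roughly 50 commits per second.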

Step 1: The Database Layer (MySQL 8.0)

For a mission-critical application, a mysqldump-based restore is far too slow. You need binary log replication. In 2022, we use GTIDs (Global Transaction Identifiers) to make failover less of a nightmare.

Here is the requisite configuration for your my.cnf to ensure durability and replication readiness. Note the sync_binlog setting—it costs I/O performance, which is why we insist on NVMe storage.

[mysqld]
# Basic Identification
server-id                = 1
log_bin                  = /var/log/mysql/mysql-bin.log
binlog_format            = ROW

# GTID for safer failover
gtid_mode                = ON
enforce_gtid_consistency = ON

# Durability Settings (ACID compliant)
innodb_flush_log_at_trx_commit = 1
sync_binlog              = 1

# Network Binding
bind-address             = 0.0.0.0

Pro Tip: Never expose port 3306 to the public internet. Use a VPN tunnel or a private VLAN. CoolVDS offers private networking options that isolate your replication traffic from public packet sniffers.
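
Once the primary is configured, pointing a replica at it is a short exercise. A minimal sketch, assuming the replica has its own server-id, the same GTID settings, and MySQL 8.0.23 or newer (older 8.0 releases use CHANGE MASTER TO instead; the host, user and password are placeholders):

# Run on the DR replica, over the private link from Step 2
mysql -e "
  CHANGE REPLICATION SOURCE TO
    SOURCE_HOST='10.0.0.1',
    SOURCE_USER='repl',
    SOURCE_PASSWORD='***',
    SOURCE_AUTO_POSITION=1;
  START REPLICA;
"

# Both threads should report 'Yes'
mysql -e "SHOW REPLICA STATUS\G" | grep -E 'Replica_(IO|SQL)_Running:'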

Step 2: Secure Transport with WireGuard

Replication traffic between your primary VPS and your DR site must be encrypted. IPSec is bloated and OpenVPN is slow. Since Linux kernel 5.6, WireGuard has shipped in the mainline kernel and is the obvious choice: it is lean, fast, and re-establishes connections instantly, which is exactly what you want under the unstable conditions of a disaster.

Setting up the interface on Ubuntu 22.04 LTS (Jammy Jellyfish):

# /etc/wireguard/wg0.conf
[Interface]
Address = 10.0.0.1/24
SaveConfig = true
PostUp = iptables -A FORWARD -i wg0 -j ACCEPT; iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
PostDown = iptables -D FORWARD -i wg0 -j ACCEPT; iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE
ListenPort = 51820
PrivateKey = [YOUR_PRIVATE_KEY]

[Peer]
PublicKey = [DR_SITE_PUBLIC_KEY]
AllowedIPs = 10.0.0.2/32
Endpoint = dr.coolvds.com:51820

Initiate the tunnel:

wg-quick up wg0
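
Then verify the handshake actually happened, and make the tunnel survive a reboot:

# Timestamp of the last completed handshake per peer (0 means no handshake yet)
wg show wg0 latest-handshakes

# Bring the tunnel up automatically at boot
systemctl enable wg-quick@wg0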

Step 3: Immutable File Backup

Ransomware attackers target backup directories first. To defeat this, use the immutable attribute on Linux filesystems (ext4/xfs) for local snapshots, or use an append-only object storage bucket.

For local protection on your backup server, use chattr. The immutable flag prevents anyone, including root, from modifying or deleting the file until the flag is explicitly removed, and that removal is an action you can alert on.

# Lock the daily backup archive
chattr +i /backup/2022-05-03-full-site.tar.gz

# Verify the attribute
lsattr /backup/2022-05-03-full-site.tar.gz
# Output: ----i---------e---- /backup/2022-05-03-full-site.tar.gz
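
The flag removal only helps if someone notices it. A minimal sketch with auditd (assuming auditd is installed; the key name backup-tamper is arbitrary):

# Watch the backup directory for writes and attribute changes (chattr -i shows up here)
auditctl -w /backup/ -p wa -k backup-tamper

# Review hits and wire this into whatever alerting you already have
ausearch -k backup-tamper --start today

# Persist the rule across reboots
echo "-w /backup/ -p wa -k backup-tamper" > /etc/audit/rules.d/backup-tamper.rules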

Infrastructure as Code: The 15-Minute Recovery

If your server is compromised, you shouldn't be fixing it. You should be burning it down and provisioning a fresh one. This is the cattle vs. pets philosophy. Using Terraform, you can define your CoolVDS infrastructure state.

Here is a basic definition to spin up a recovery node. This assumes you are using a KVM-based provider where resources are guaranteed.

resource "coolvds_instance" "dr_node" {
  hostname     = "dr-oslo-01"
  plan         = "nvme-16gb"
  region       = "no-oslo-1"
  image        = "ubuntu-22.04"
  
  ssh_keys = [
    file("~/.ssh/id_rsa.pub")
  ]

  # Post-provisioning setup
  connection {
    type        = "ssh"
    user        = "root"
    private_key = file("~/.ssh/id_rsa")
    host        = self.ipv4_address
  }

  provisioner "remote-exec" {
    inline = [
      "apt-get update",
      "apt-get install -y nginx mysql-server wireguard"
    ]
  }
}
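
From there, the recovery workflow is the standard Terraform loop. Keep the plan pre-baked so you are not debugging provider credentials mid-incident:

# Validate and preview the recovery node long before you need it
terraform init
terraform plan -out=dr.tfplan

# When the primary is gone, this is the whole runbook
terraform apply dr.tfplan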

The Compliance Trap: Schrems II and Datatilsynet

This is where technical architecture meets legal reality. In 2022, transferring the personal data of Norwegian citizens to US-controlled cloud providers carries significant legal risk, because the Schrems II ruling invalidated the Privacy Shield framework. Datatilsynet, the Norwegian Data Protection Authority, is paying close attention to transfer impact assessments.

By keeping your primary and DR infrastructure within CoolVDS’s ecosystem, you ensure data stays on European soil, governed by GDPR and Norwegian law, not the US CLOUD Act. This isn't just a "nice to have"; for healthcare, finance, and public sector projects, it is a hard requirement.

Why I/O is the Bottleneck of Recovery

I once watched a restoration fail not because the data was missing, but because disk I/O on a cheap VPS provider saturated at 100 MB/s. Restoring a 500 GB database took two hours longer than the RTO allowed.

Storage Type              Avg Read Speed     Time to Restore 500 GB
Standard HDD              ~120 MB/s          ~70 minutes
SATA SSD (generic VPS)    ~500 MB/s          ~17 minutes
CoolVDS NVMe              ~3,000 MB/s        < 3 minutes

When you are down, every second feels like an hour. We use pure NVMe arrays on CoolVDS because sitting in iowait during a crisis is unacceptable.
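
Don't take the table on faith; benchmark the volume you will actually restore onto. A rough sequential-read test with fio (the path and 4 GB size are arbitrary; direct=1 bypasses the page cache so you measure the disk, not RAM):

# Sequential read throughput on the restore target
fio --name=restore-sim --filename=/var/lib/mysql/fio-test \
    --rw=read --bs=1M --size=4G --direct=1 --runtime=60 --group_reporting

# Clean up the test file afterwards
rm /var/lib/mysql/fio-test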

Final Thoughts: Test or Fail

A DR plan that hasn't been tested is just a hypothesis. Schedule a "Game Day." Shut down your primary interface. Run your Ansible playbooks. See if your WireGuard handshake succeeds. If you find yourself manually editing config files to get the site back up, your automation is broken.
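
A Game Day smoke test does not need to be elaborate. A minimal sketch, where the health URL and the 180-second handshake threshold are assumptions you should tune to your own RTO:

#!/usr/bin/env bash
# Game Day smoke test: is the DR path actually alive?
set -euo pipefail

# 1. WireGuard: has the DR peer handshaken within the last 3 minutes?
last=$(wg show wg0 latest-handshakes | awk '{print $2}')
now=$(date +%s)
if (( now - last > 180 )); then
  echo "FAIL: WireGuard handshake is stale" >&2
  exit 1
fi

# 2. HTTP: does the failover site answer? (placeholder URL)
curl -fsS -o /dev/null https://dr.example.net/healthz || { echo "FAIL: DR site is down" >&2; exit 1; }

echo "OK: DR path is up"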

Don't let a hardware failure or a legal audit become your career's tombstone. Build your fortress on infrastructure that respects your data sovereignty and your need for speed.

Ready to harden your infrastructure? Deploy a high-availability NVMe instance in Oslo on CoolVDS today and secure your business continuity.