Disaster Recovery in a Post-GDPR World: A Norwegian CTO’s Playbook

When the Lights Go Out in Oslo: A Pragmatic Approach to Disaster Recovery

It has been six months since May 25th. The GDPR panic has settled into a dull, compliance-induced headache for most of us managing infrastructure in Europe. But while everyone was busy updating privacy policies and cookie banners, a critical aspect of Article 32 often got overlooked: resilience.

Specifically, the "ability to restore the availability and access to personal data in a timely manner in the event of a physical or technical incident."

If your primary data center in Oslo goes dark due to a fiber cut or a power surge, and your backup strategy is a shell script that copies files to a server in the same rack, you aren't just facing downtime. You are facing regulatory scrutiny from Datatilsynet (The Norwegian Data Protection Authority).

As we close out 2018, let's stop treating Disaster Recovery (DR) as a luxury item. Here is the technical reality of building a warm-failover site that keeps latency low and legal teams happy.

The Geometry of Failure: RPO vs. RTO

Before we touch a single config file, define your variables. In a high-availability environment, we track two metrics:

  • RPO (Recovery Point Objective): How much data can you afford to lose? (e.g., "We can lose the last 5 minutes of transactions.")
  • RTO (Recovery Time Objective): How long until the service is back online? (e.g., "We need to be up in 1 hour.")

Achieving RPO=0 (zero data loss) usually requires synchronous replication, which introduces latency. For a server located in Oslo, replicating synchronously to a standby in Frankfurt might add 20-30ms to every write operation. That kills performance for high-traffic Magento or WordPress sites.

The Solution: Asynchronous replication to a geographically close, but physically separate facility. This is why CoolVDS utilizes distinct availability zones. You want your DR site close enough to keep ping under 5ms, but far enough that a grid failure doesn't take out both.
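
Before committing to a secondary location, measure the actual round-trip time from your primary node. A minimal check (dr-server-ip is a placeholder for your candidate DR instance):

# 20 pings, summary only: look at the avg/max round-trip time.
# Rule of thumb: under ~5ms, synchronous replication is on the table;
# at 20-30ms you want asynchronous replication.
ping -c 20 -q dr-server-ip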

The Database: MySQL 5.7 GTID Replication

In 2018, if you aren't using GTID (Global Transaction Identifier) for your MySQL replication, you are doing it wrong. Old-school binary log file positioning is fragile. If a slave server crashes, figuring out exactly where it left off is a nightmare.
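
If you are retrofitting an existing server, check what it is currently running before touching any config (MySQL 5.7 can even step gtid_mode through the _PERMISSIVE values on a live server, but that is a topic for another post):

# Show the current GTID settings on the running server
mysql -u root -p -e "SELECT @@GLOBAL.gtid_mode, @@GLOBAL.enforce_gtid_consistency;"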

Here is a battle-tested configuration for a Master node running on a CoolVDS NVMe instance. We prioritize innodb_flush_log_at_trx_commit=1 for ACID compliance, even though it hits I/O hard. This is where the NVMe storage pays for itself.

Master Configuration (/etc/my.cnf)

[mysqld]
server-id                = 1
log_bin                  = /var/log/mysql/mysql-bin.log
binlog_format            = ROW
gtid_mode                = ON
enforce_gtid_consistency = ON
log_slave_updates        = ON

# Reliability Settings
innodb_flush_log_at_trx_commit = 1
sync_binlog                    = 1
innodb_buffer_pool_size        = 4G  # Adjust based on RAM
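
The config alone is not enough: the slave needs an account to pull the binary log with. A minimal sketch, assuming a dedicated user named repl and a DR node connecting from a 10.0.0.x private network (adjust the host, user name and password to your setup):

# Restart MySQL so the new settings take effect (the service is "mysql" on Debian/Ubuntu)
systemctl restart mysqld

# Create a dedicated replication account
mysql -u root -p <<'SQL'
CREATE USER 'repl'@'10.0.0.%' IDENTIFIED BY 'use-a-real-password-here';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'10.0.0.%';
FLUSH PRIVILEGES;
SQL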

On the Slave (DR) node, the configuration is minimal; we point it at the master with GTID auto-positioning once the data has been seeded (shown below). Note the read_only flag on the slave to prevent accidental writes during normal operations.

Slave Configuration

[mysqld]
server-id                = 2
gtid_mode                = ON
enforce_gtid_consistency = ON
read_only                = 1

To initialize replication without locking the master for hours, use Percona XtraBackup. It allows you to take a hot backup while the database remains online.

# On Master
xtrabackup --backup --target-dir=/data/backups/

# Copy the backup to the Slave
scp -r /data/backups/ user@dr-server-ip:/data/

# On Slave
xtrabackup --prepare --target-dir=/data/backups/
xtrabackup --move-back --target-dir=/data/backups/
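
With the data files in place, point the DR node at the master. The beauty of GTID auto-positioning is that there are no binlog coordinates to copy by hand; you only tell the server which transactions the backup already contains (XtraBackup records that set in xtrabackup_binlog_info). A sketch, with master-server-ip, the repl account and the default /var/lib/mysql datadir as assumptions:

# On Slave: fix ownership and start MySQL (assumes the default datadir)
chown -R mysql:mysql /var/lib/mysql
systemctl start mysqld

# Note the GTID set the backup was taken at
# (look in the backup dir, or in the datadir if the file was moved with the data)
cat /data/backups/xtrabackup_binlog_info

mysql -u root -p <<'SQL'
-- Tell the server which transactions the backup already covers
-- (paste the GTID set from xtrabackup_binlog_info)
RESET MASTER;
SET GLOBAL gtid_purged = '<gtid-set-from-xtrabackup_binlog_info>';

-- Point at the master; auto-positioning replaces file/position coordinates
CHANGE MASTER TO
  MASTER_HOST = 'master-server-ip',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'use-a-real-password-here',
  MASTER_AUTO_POSITION = 1;

START SLAVE;
SQL

# Both threads should say Yes, and the GTID sets should converge
mysql -u root -p -e "SHOW SLAVE STATUS\G" | egrep 'Running|Gtid|Seconds_Behind'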

Filesystem Synchronization: Beyond a Simple cp

Database replication is useless if your user-uploaded images aren't there. For static assets, rsync is still the king of reliability. However, running it manually is a recipe for disaster.

We use lsyncd (Live Syncing Daemon), which watches kernel filesystem events via inotify and triggers rsync only when files change. This creates a "near real-time" mirror.

Pro Tip: Always exclude temporary files and cache directories. Syncing your /var/www/html/var/cache folder is a waste of bandwidth and CPU cycles.

Lsyncd Configuration (/etc/lsyncd.conf)

settings {
    logfile    = "/var/log/lsyncd/lsyncd.log",
    statusFile = "/var/log/lsyncd/lsyncd.status"
}

sync {
    default.rsyncssh,
    source      = "/var/www/html/",
    host        = "dr-server-ip",
    targetdir   = "/var/www/html/",
    excludeFrom = "/etc/lsyncd.exclude",
    rsync = {
        archive  = true,
        compress = true,
        _extra   = { "--bwlimit=5000" }  -- Protect the pipe
    }
}
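
The exclude file is just a list of rsync patterns, one per line. A minimal example for a typical Magento or WordPress document root (the paths are illustrative; adjust them to your own layout):

# /etc/lsyncd.exclude
var/cache/
var/session/
var/tmp/
wp-content/cache/
*.tmp

Enable the daemon with systemctl enable --now lsyncd (the unit name may differ by distribution) and watch the log file until the initial full sync completes.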

The Infrastructure Layer: KVM vs. Containers

There is a lot of noise this year about running everything in containers (Docker). While Docker is fantastic for deployment, for the core data layer, I still trust full virtualization.

CoolVDS is built on KVM (Kernel-based Virtual Machine). Unlike OpenVZ or LXC, KVM provides true hardware isolation. If a "noisy neighbor" on the host node decides to fork-bomb their process list, your kernel remains untouched. In a DR scenario, predictability is the only metric that matters.
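
Not sure what your current provider actually runs underneath your VM? On any systemd-based distro, the guest can usually tell you:

# Reports "kvm" on a KVM guest, "openvz" or "lxc" on container-based platforms
systemd-detect-virt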

The Legal Safety Net (Data Sovereignty)

Why host the DR site in Norway? Why not just dump it to AWS S3 in US-East?

Schrems and Safe Harbor. Even with the Privacy Shield framework currently in place, the legal ground is shaky. The safest interpretation of GDPR for Norwegian companies is to keep data within the EEA (European Economic Area).

By keeping your primary node and your DR node within Norwegian borders (or at least Nordic borders), you bypass the complex legal frameworks required for third-country data transfers. You also benefit from the NIX (Norwegian Internet Exchange) peering, ensuring that failover traffic doesn't route through London just to get back to Oslo.
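
You can sanity-check that claim from your own shell: trace the path to the DR node and look at where the hops land (dr-server-ip is a placeholder):

# Each hop should stay on Norwegian/Nordic networks; a detour via
# London or Amsterdam defeats the purpose of a local DR site
traceroute dr-server-ip
mtr --report --report-cycles 10 dr-server-ip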

The "Red Button" Script

When the disaster happens, your hands will be shaking. Do not rely on typing commands manually. Create a failover script that promotes the slave to master.

#!/bin/bash
# promote_slave.sh
echo "Promoting Slave to Master..."

# 1. Turn off read-only mode
mysql -u root -p -e "SET GLOBAL read_only = OFF;"

# 2. Stop the slave threads
mysql -u root -p -e "STOP SLAVE; RESET SLAVE ALL;"

# 3. Update DNS (Example using a CLI tool for DNS)
# /usr/bin/update-dns-record --domain example.com --ip $THIS_SERVER_IP

echo "Failover Complete. Check logs."
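
One check worth running before the promotion, if you have 30 seconds to spare: make sure the replica has applied everything it received, otherwise you are quietly accepting a larger RPO than you planned for. A sketch (preflight_check.sh is a hypothetical name):

#!/bin/bash
# preflight_check.sh - run on the DR node before promote_slave.sh
mysql -u root -p -e "SHOW SLAVE STATUS\G" \
  | egrep 'Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master|Retrieved_Gtid_Set|Executed_Gtid_Set'
# You want Slave_SQL_Running: Yes, Seconds_Behind_Master: 0, and the retrieved
# GTID set fully contained in the executed set. Anything else means transactions
# are still sitting in the relay log.

Rehearse both scripts against a disposable replica before you ever need them in anger.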

Conclusion

Disaster recovery isn't about if, but when. The hardware will fail. The fiber will be cut. The intern will rm -rf /.

By combining solid 2018-era tools like MySQL GTID, Lsyncd, and KVM virtualization, you can build a robust safety net. But software is only half the battle. You need infrastructure that respects the I/O demands of recovery and the legal demands of the state.

Don't wait for the panic call at 3 AM. Deploy a secondary NVMe instance on CoolVDS today and test your failover script.