Surviving the Storm: Disaster Recovery & Data Sovereignty Post-Schrems II
The comfortable era of "just dump it to AWS S3" is over. Since the CJEU invalidated the EU-US Privacy Shield in the Schrems II ruling of July 2020, CTOs across Europe have been scrambling. If your disaster recovery (DR) strategy involves shipping Norwegian customer data to a US-owned cloud provider—even one with a data center in Frankfurt—you are now navigating a legal minefield. Datatilsynet (the Norwegian Data Protection Authority) has made it clear: reliance on Standard Contractual Clauses (SCCs) requires supplementary measures that many hyperscalers simply cannot guarantee against US surveillance laws like FISA 702.
But legal compliance is boring until you get fined. Let’s talk about the engineering reality. A DR plan is worthless if your Recovery Time Objective (RTO) exceeds your business's tolerance for pain. In 2021, if you are restoring from spinning rust (HDD), you are already failing.
This is a technical deep dive into building a compliant, high-velocity DR architecture hosted within the Nordic region.
The New "3-2-1" Rule: Sovereignty Edition
The traditional backup rule is simple: 3 copies of data, 2 different media, 1 offsite. In the post-2020 landscape, we must append a corollary: The offsite copy must be legally sovereign.
For a Norwegian entity, the safest bet is a secondary site within Norway or the EEA, ideally on infrastructure owned by a Nordic provider. This reduces latency to the Norwegian Internet Exchange (NIX) in Oslo, ensuring that data replication doesn't suffer from jitter, and keeps the lawyers happy.
Defining RPO and RTO
- RPO (Recovery Point Objective): How much data can you lose? (e.g., 5 minutes).
- RTO (Recovery Time Objective): How long until the site is back up? (e.g., 1 hour).
A tight RPO is a storage throughput problem. A tight RTO is a compute and automation problem. We solve the first with NVMe and the second with Ansible.
Phase 1: Database Consistency is King
Filesystem snapshots are great, but if you snapshot a running MySQL database under high load without quiescing it, you are backing up corruption. For a production-grade setup on CoolVDS, we avoid the overhead of `mysqldump` (which locks tables and is slow to restore) and use Percona XtraBackup for hot backups.
Here is a robust script wrapper for MySQL 8.0 that streams a compressed backup to a secure local directory, ready for offsite shipping. Note the use of `xbstream` for parallel processing:
#!/bin/bash
# Timestamped backup directory on fast local storage
TS=$(date +"%F_%H-%M-%S")
BACKUP_DIR="/mnt/nvme_storage/backups/$TS"
LOG_FILE="/var/log/mysql_dr.log"

mkdir -p "$BACKUP_DIR"
echo "Starting XtraBackup at $TS" >> "$LOG_FILE"

# XtraBackup 8.0 syntax: stream a compressed backup to a single file.
# In production, read the password from a config file or the environment;
# a literal on the command line leaks via the process list.
xtrabackup --backup \
    --user=backup_user \
    --password='COMPLEX_PASSWORD_HERE' \
    --stream=xbstream \
    --parallel=4 \
    --compress \
    --compress-threads=4 \
    --target-dir="$BACKUP_DIR" > "$BACKUP_DIR/backup.xbstream"

if [ $? -eq 0 ]; then
    echo "Backup success: $TS" >> "$LOG_FILE"
else
    echo "Backup FAILED: $TS" >> "$LOG_FILE"
    # Trigger PagerDuty or Slack alert here
    exit 1
fi
Pro Tip: Never rely on the exit code alone. Always verify the backup integrity. On CoolVDS NVMe instances, we typically see restore speeds up to 5x faster than standard SSD VPS providers because we don't throttle IOPS on the storage backend. Test your restore speed weekly.
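One cheap integrity check you can automate today is a checksum manifest: record a SHA-256 of the stream at backup time and verify it before and after shipping offsite. The sketch below is illustrative (the `/tmp/dr_demo` path and the placeholder stream stand in for your real backup directory); note that checksums catch bit-rot and transfer corruption, not logical corruption, so the weekly restore test still stands.

```shell
#!/bin/bash
set -euo pipefail

# Illustrative path -- point this at your real backup directory.
BACKUP_DIR="${BACKUP_DIR:-/tmp/dr_demo}"
mkdir -p "$BACKUP_DIR"

# Stand-in stream so the sketch runs end-to-end; in production this is
# the backup.xbstream produced by the XtraBackup wrapper above.
[ -f "$BACKUP_DIR/backup.xbstream" ] || echo "placeholder" > "$BACKUP_DIR/backup.xbstream"

# 1. Record a manifest at backup time.
sha256sum "$BACKUP_DIR/backup.xbstream" > "$BACKUP_DIR/backup.sha256"

# 2. Verify before shipping offsite (and run the same check on the DR side).
sha256sum -c "$BACKUP_DIR/backup.sha256"
```

Run the verification step again on the receiving end: a manifest that only lives next to the file it describes proves nothing once the disk itself fails.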
Phase 2: Secure Offsite Replication with Restic
`rsync` is a classic, but for DR, we need encryption at rest, deduplication, and snapshotting. Restic (v0.12.0 is the current stable choice) is superior here. It encrypts data before it leaves your server. This is a "supplementary measure" that helps satisfy GDPR requirements when transferring data.
Initialize a repository on a secondary CoolVDS storage instance (or any SSH-accessible server in Oslo):
# Use a strong key from your secrets manager, not a literal in shell history
export RESTIC_PASSWORD="s3cr3t_k3y"
restic -r sftp:user@dr-site-oslo.coolvds.com:/srv/restic-repo init
Now, push the XtraBackup stream we created earlier. Restic's deduplication means only the changed blocks of your database backup get transferred, saving massive amounts of bandwidth.
restic -r sftp:user@dr-site-oslo.coolvds.com:/srv/restic-repo backup /mnt/nvme_storage/backups --tag database
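Restic also handles retention, which your RPO and any legal retention obligations should dictate. The policy below (7 daily, 4 weekly, 6 monthly snapshots) and the repo URL are illustrative assumptions; the `RUN_RESTIC` gate exists only so the sketch is safe to execute on a machine that cannot reach the repository — remove it in production.

```shell
#!/bin/bash
set -euo pipefail

# Illustrative repo URL and retention policy -- adjust to your setup.
REPO="sftp:user@dr-site-oslo.coolvds.com:/srv/restic-repo"

# Gate: only touch the repo when explicitly asked (remove in production).
if [ "${RUN_RESTIC:-0}" = "1" ]; then
    # Drop snapshots outside the policy and reclaim the space they used.
    restic -r "$REPO" forget --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --prune
    # Verify repository structure after pruning.
    restic -r "$REPO" check
fi

# Freshness marker: lets monitoring alert if maintenance stops running.
date +%s > /tmp/restic_maintenance.stamp
```

Wire the freshness marker (or restic's own `snapshots --json` output) into your monitoring: a backup job that silently stopped three weeks ago is the classic way DR plans die.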
Phase 3: Infrastructure as Code (IaC) for Rapid Recovery
Having the data is half the battle. If your primary datacenter goes dark, you need a server to restore to. Manually installing Nginx and PHP via SSH while your CEO screams at you is a bad day.
Use Terraform (current v0.14) to define your infrastructure state, and Ansible to configure it. This allows you to spin up a fresh CoolVDS instance and configure it identically to production in minutes.
Here is a snippet of an Ansible playbook `site_restore.yml` that prepares the environment:
---
- name: Disaster Recovery Provisioning
  hosts: dr_servers
  become: yes
  vars:
    mysql_root_pass: "{{ vault_mysql_pass }}"
  tasks:
    - name: Ensure sysctl optimizations for high throughput
      sysctl:
        name: "{{ item.key }}"
        value: "{{ item.value }}"
        state: present
      loop:
        - { key: 'net.core.somaxconn', value: '65535' }
        - { key: 'vm.swappiness', value: '10' }
        - { key: 'fs.file-max', value: '2097152' }

    - name: Install LEMP stack
      apt:
        name:
          - nginx
          - mysql-server
          - php-fpm
          - percona-xtrabackup-80
        state: present
        update_cache: yes

    - name: Stop MySQL for data restore
      service:
        name: mysql
        state: stopped
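Validate the playbook before the day you need it: a syntax check in CI catches YAML and module errors without touching any host. The inventory path below is an assumption; the `RUN_ANSIBLE` gate just keeps the sketch harmless on a machine without Ansible installed.

```shell
#!/bin/bash
set -euo pipefail

# Gate: only invoke Ansible when explicitly asked.
if [ "${RUN_ANSIBLE:-0}" = "1" ]; then
    # Cheap CI check: parses the playbook without contacting hosts.
    ansible-playbook --syntax-check site_restore.yml
    # Real run: prompts for the vault password protecting vault_mysql_pass.
    ansible-playbook -i inventory/dr site_restore.yml --ask-vault-pass
fi

# Freshness marker so you can alert when validation goes stale.
date +%s > /tmp/playbook_validation.stamp
```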
The Hardware Factor: Why "Cloud" Isn't Enough
Virtualization overhead is the enemy of rapid recovery. When you are restoring 500GB of database tables, you are hitting the disk subsystem hard. In a shared "cloud" environment with noisy neighbors, your I/O Wait (%iowait) can skyrocket, turning a 1-hour restore into a 6-hour nightmare.
This is where architecture matters. At CoolVDS, we use KVM (Kernel-based Virtual Machine) which provides stricter isolation than container-based virtualization. More importantly, we map NVMe storage directly to the instance using virtio-scsi drivers. We have seen `fio` benchmarks on our Oslo nodes sustain 4K random write speeds that would choke a standard SATA SSD array.
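Don't take any provider's I/O claims on faith, including ours — measure them. A common `fio` invocation for 4K random writes is sketched below; the size, runtime, and target directory are illustrative, and `--direct=1` bypasses the page cache so you benchmark the disk rather than RAM. It is gated behind `RUN_FIO` because it writes a gigabyte of test data and deliberately saturates the disk.

```shell
#!/bin/bash
set -euo pipefail

# Gate: benchmarking hammers the disk, so opt in explicitly.
if [ "${RUN_FIO:-0}" = "1" ]; then
    fio --name=randwrite-4k \
        --directory=/mnt/nvme_storage \
        --rw=randwrite --bs=4k --size=1G \
        --ioengine=libaio --iodepth=32 --numjobs=4 \
        --direct=1 --runtime=60 --time_based \
        --group_reporting
fi

# Record when the benchmark was last run (or last skipped).
date +%s > /tmp/fio_benchmark.stamp
```

Run this against a candidate DR host before committing to it: the IOPS figure it reports, divided into your dataset size, is a first-order estimate of your real restore window.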
Validating the Plan
A DR plan that isn't tested is just a wish. You must simulate a failure. Every quarter, run this drill:
- Spin up a fresh instance on CoolVDS using your Ansible playbooks.
- Pull the latest Restic snapshot from your backup repository.
- Decompress and restore the XtraBackup stream.
- Point your `/etc/hosts` to the new IP and load the site.
If you encounter permission errors, missing PHP extensions, or database connection timeouts, fix the Ansible playbook, not the server. Then destroy the instance.
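The quarterly drill above can be scripted end to end. This is a sketch, not a turnkey restore: the repo URL and paths match the examples earlier in the article but are assumptions about your layout, it assumes a single timestamped backup directory in the snapshot, and it needs restic, xbstream, xtrabackup, and qpress (for `--decompress`) installed — hence the `RUN_DRILL` gate.

```shell
#!/bin/bash
set -euo pipefail

# Illustrative repo URL and working paths -- match them to your layout.
REPO="sftp:user@dr-site-oslo.coolvds.com:/srv/restic-repo"
WORK="/mnt/nvme_storage/restore"

# Gate: the real drill needs restic, xbstream, xtrabackup and qpress.
if [ "${RUN_DRILL:-0}" = "1" ]; then
    mkdir -p "$WORK/stream" "$WORK/datadir"

    # 1. Pull the latest snapshot; restic recreates the full source paths.
    restic -r "$REPO" restore latest --target "$WORK/stream"

    # 2. Unpack the xbstream archive (assumes one timestamped dir).
    xbstream -x -C "$WORK/datadir" \
        < "$WORK"/stream/mnt/nvme_storage/backups/*/backup.xbstream

    # 3. Decompress the .qp files, then apply the redo log.
    xtrabackup --decompress --target-dir="$WORK/datadir"
    xtrabackup --prepare --target-dir="$WORK/datadir"

    # 4. Copy into an empty datadir and hand ownership to MySQL.
    xtrabackup --copy-back --target-dir="$WORK/datadir" --datadir=/var/lib/mysql
    chown -R mysql:mysql /var/lib/mysql
    systemctl start mysql
fi

# Freshness marker: alert when the quarterly drill is overdue.
date +%s > /tmp/dr_drill.stamp
```

Time the whole script with `time` and compare the result against your stated RTO; if the gap is uncomfortable, that is your capacity-planning signal, not a footnote.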
Conclusion
The post-Schrems II world demands that we stop treating data sovereignty as an afterthought. By combining legally compliant infrastructure in Norway with robust, open-source tools like Restic and Ansible, you can build a disaster recovery plan that satisfies both the Datatilsynet and your users' need for speed.
Don't wait for a fiber cut or a legal summons to test your strategy. Spin up a high-performance NVMe instance on CoolVDS today and see how fast your recovery scripts actually run.