Surviving the Inevitable: A DevOps Guide to Disaster Recovery in Norway
Let’s be honest. If your Disaster Recovery (DR) plan is just a cron job that tars up the web root and ships it to an FTP server, you don't have a plan. You have a prayer. I've spent the last decade watching servers fail in spectacular ways—RAID controllers melting, power supplies popping like firecrackers, and the classic rm -rf / run on the wrong terminal window. Hardware fails. Software bugs out. Humans make mistakes.
In 2018, with GDPR in full swing since May, losing data isn't just an operational annoyance; it's a legal nightmare. Especially here in Norway, where the Datatilsynet doesn't take kindly to negligence. If you are running mission-critical infrastructure, you need more than just backups. You need resilience.
The "Bus Factor" and Local Sovereignty
Why do we obsess over location? Latency is physics. Light takes time to travel. If your users are in Oslo and your failover server is in a budget datacenter in Texas, your actual recovery time will blow right past any sane recovery time objective (RTO), and the latency during the transition will kill your user experience. But it's also about compliance. Under the current interpretation of GDPR, keeping your data within the EEA (European Economic Area) is the safest bet to avoid legal headaches.
We choose CoolVDS for our primary infrastructure not just because of the NVMe storage (which makes restoring 500GB databases actually feasible in under an hour), but because the data stays in Norway. The power grid here is stable—thanks to hydroelectricity—but network partitions happen. Your DR plan needs to account for the link between Oslo and the rest of the world going dark.
The 3-2-1 Rule: Updated for 2018
You know the classic rule: 3 copies of data, 2 different media, 1 offsite. In a VPS environment, "media" is abstract. Here is the modern DevOps translation:
- Production: Your live CoolVDS instance (NVMe powered).
- Local Replica: A hot-standby or snapshot on a separate physical host within the same datacenter (for fast failover).
- Offsite Archive: Cold storage in a different geographical location (e.g., Bergen or Stockholm), far enough away that it sits in a failure domain separate from the primary.
Pro Tip: Never rely on the hosting provider's snapshots as your only backup. If the provider's control plane goes down, you lose access to the snapshots too. Always have an external dump.
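For that external dump, a minimal sketch looks like the following. It assumes key-based SSH access to a storage box outside your provider; the hostname and paths are placeholders for whatever you actually run.

#!/bin/bash
# Nightly off-provider database dump, piped straight over SSH.
# Credentials are read from ~/.my.cnf; host and paths are placeholders.
set -euo pipefail
DATE=$(date +%F)
# --single-transaction gives a consistent InnoDB snapshot without locking tables
mysqldump --single-transaction --routines --triggers --all-databases \
  | gzip \
  | ssh backup@backup.example.no "cat > /srv/dr/mysql-$DATE.sql.gz"

Wire that into cron and you have at least one copy that survives the provider's control plane going dark.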
Database Replication: The Heartbeat of DR
File backups are easy. Database consistency is hard. If you are running a high-traffic Magento store or a SaaS backend, restoring from a SQL dump taken at 3:00 AM means you lose all data from 3:01 AM to the moment of the crash. That's unacceptable.
For MySQL 5.7 or the new MySQL 8.0 (which is gaining traction this year), you should be using Master-Slave replication. This allows you to promote a slave to master almost instantly.
Configuration: my.cnf Essentials
Don't stick with defaults. To ensure data integrity during a crash, you need GTID (Global Transaction Identifiers). This makes failover operations much cleaner than the old binary log file/position method.
[mysqld]
# Unique ID for the server (1 for master, 2 for slave)
server-id = 1
# Enable Binary Logging
log_bin = /var/lib/mysql/mysql-bin
binlog_format = ROW
# GTID for safer replication
gtid_mode = ON
enforce_gtid_consistency = ON
# Crash safety
sync_binlog = 1
innodb_flush_log_at_trx_commit = 1

The sync_binlog = 1 and innodb_flush_log_at_trx_commit = 1 settings are crucial. They force a disk sync on every transaction. Yes, this adds I/O overhead. This is exactly why we insist on CoolVDS NVMe instances. On spinning rust (HDD), these settings destroy performance. On NVMe, the latency penalty is negligible, but the data safety is absolute.
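With GTID enabled on both ends, pointing a slave at the master no longer involves copying binlog file names and positions around. Roughly, the two moments that matter look like this; the host, user, and password are placeholders for your own setup:

# On the slave: attach to the master using GTID auto-positioning
mysql -u root -p -e "CHANGE MASTER TO
  MASTER_HOST='10.0.0.10',
  MASTER_USER='repl',
  MASTER_PASSWORD='CHANGE_ME',
  MASTER_AUTO_POSITION=1;
  START SLAVE;"
# During a failover: stop replication and make the promoted slave writable
mysql -u root -p -e "STOP SLAVE; RESET SLAVE ALL; SET GLOBAL read_only = OFF;"

Repoint your application or load balancer at the promoted box and you are serving traffic again while you rebuild the old master at your leisure.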
Automating the Filesystem State
Replication handles the database, but what about your /etc/nginx/ configs, your SSL certificates, or your application code? In 2018, if you aren't using configuration management like Ansible or SaltStack, you are doing it wrong. However, for the raw data, rsync is still king.
Here is a battle-tested wrapper script I use to push backups over SSH to a remote storage box. It authenticates with SSH keys and rotates out old backups.
#!/bin/bash
# Simple Rotating Backup Script
# Date: 2018-10-15
set -euo pipefail
SOURCE_DIR="/var/www/html/"
BACKUP_DIR="/mnt/backups/daily"
REMOTE_HOST="user@backup.coolvds-storage.no"
REMOTE_DIR="/home/backup/incoming/"
DATE=$(date +%F)
# Create local archive
tar -czf "$BACKUP_DIR/site-$DATE.tar.gz" "$SOURCE_DIR"
# Sync to remote DR site
rsync -avz -e "ssh -p 22" --progress \
  "$BACKUP_DIR/site-$DATE.tar.gz" \
  "$REMOTE_HOST:$REMOTE_DIR"
# Retention: delete local backups older than 7 days
find "$BACKUP_DIR" -type f -mtime +7 -name "*.gz" -delete

This is simple, but it works. For more complex setups, look into BorgBackup. It performs deduplication, which saves massive amounts of space and bandwidth, essentially allowing you to keep daily backups for years without breaking the bank.
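If you want to see what that looks like with Borg, here is a rough sketch against the same kind of remote box. The repository path and host are placeholders, and Borg has to be installed on the remote end as well.

# One-time: create an encrypted, deduplicating repository on the remote box
borg init --encryption=repokey backup@backup.example.no:/srv/dr/borg
# Nightly: archive the web root; unchanged chunks are never sent twice
borg create --stats --compression lz4 \
  backup@backup.example.no:/srv/dr/borg::site-$(date +%F) \
  /var/www/html
# Retention is handled by Borg itself instead of a find one-liner
borg prune --keep-daily=7 --keep-weekly=4 --keep-monthly=12 \
  backup@backup.example.no:/srv/dr/borg

Set BORG_PASSPHRASE in the cron environment so the nightly run does not hang waiting for input.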
Testing: The Step Everyone Skips
A backup is Schrödinger's data: it exists in a state of both valid and corrupted until you actually try to restore it. I schedule a "Fire Drill" every quarter. We spin up a fresh KVM instance on CoolVDS, completely isolated from production, and attempt to rebuild the infrastructure solely from our offsite backups.
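The drill itself is mostly orchestration, but its core is a restore-and-verify loop. A stripped-down sketch of what runs on the throwaway instance looks like this; the host, paths, and the smoke-test table are placeholders for whatever your stack actually uses.

#!/bin/bash
# Fire drill: restore the newest offsite dump onto a clean instance and verify it.
set -euo pipefail
# Pull the newest dump from the offsite box
LATEST=$(ssh backup@backup.example.no "ls -1t /srv/dr/*.sql.gz | head -n 1")
scp "backup@backup.example.no:$LATEST" /tmp/restore.sql.gz
# Check archive integrity before spending an hour on the import
gunzip -t /tmp/restore.sql.gz
# Restore, then run a smoke test: recent rows should exist
gunzip -c /tmp/restore.sql.gz | mysql
mysql -e "SELECT COUNT(*) AS orders_last_24h FROM shop.orders WHERE created_at > NOW() - INTERVAL 1 DAY;"

If that count comes back zero, the drill failed, and you have a full quarter to figure out why before it matters.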
The "Time-to-Hello-World" Benchmark
| Storage Type | Restore Size | Time to Restore |
|---|---|---|
| Standard SATA HDD | 50 GB | ~45 Minutes |
| Enterprise SSD | 50 GB | ~12 Minutes |
| CoolVDS NVMe | 50 GB | ~4 Minutes |
When your site is down, every minute costs money. The difference between 45 minutes and 4 minutes is often the difference between a frustrated user and a lost customer. NVMe isn't a luxury anymore; for DR, it's a requirement.
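Those numbers come from our own drills; dataset shape and compression change them a lot, so measure your own. Timing the extract and the import is enough to get a realistic figure (the file names here follow the earlier script and are otherwise placeholders):

# Time a full extract of a real backup onto the instance's local disk
mkdir -p /tmp/restore-test
time tar -xzf /mnt/backups/daily/site-$(date +%F).tar.gz -C /tmp/restore-test
# For databases, time the import itself; that is usually the long pole
time sh -c 'gunzip -c /tmp/restore.sql.gz | mysql'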
The Verdict
Disaster recovery in 2018 is about balancing the paranoid need for data safety with the pragmatic need for performance. We can't predict when a kernel panic will hit or when a fiber line will get cut by an excavator in Oslo. But we can prepare.
By leveraging local Norwegian infrastructure, enforcing strict database consistency with GTID, and utilizing the raw I/O throughput of NVMe, you turn a potential catastrophe into a minor log entry. Don't wait for the crash to realize your backup script failed three months ago.
Ready to build a resilient stack? Deploy a high-availability KVM instance on CoolVDS today and get the I/O performance your DR plan demands.