Disaster Recovery: The "When, Not If" Protocol for Norwegian Infrastructure
There is a distinct sound a server rack makes right before the power supply unit (PSU) capacitors explode. If you've spent enough time in data centers, you know it. It's a high-pitched whine, followed by a pop, followed by the terrifying silence of fans spinning down. I heard that sound in 2014, and it cost my previous employer three days of downtime because we confused "backups" with "disaster recovery."
They are not the same.
A backup is a static copy of data. Disaster Recovery (DR) is the process of restoring function. If you have a backup on a tape drive in a vault but no server to load it onto, you don't have a DR plan. You have a souvenir.
For Norwegian businesses answerable to Datatilsynet and operating under GDPR (which came into full force last May), the stakes are higher. You cannot just dump your failover onto a cheap US bucket and hope Privacy Shield holds up forever. You need local, sovereign, low-latency redundancy.
This guide ignores the fluff. We are going to architect a hot-warm failover setup between two geographically separated zones using tools available right now in early 2019.
The Architecture: Primary and Standby
We assume a standard LEMP stack (Linux, Nginx, MySQL/MariaDB, PHP). Your Primary Node is in Oslo (Zone A). Your Standby Node needs to be far enough away to survive a local power grid failure, but close enough to maintain low replication lag. A secondary facility in Norway or a nearby Nordic neighbor is ideal to keep latency under 15ms.
The Goal:
- RPO (Recovery Point Objective): < 1 second of data loss
- RTO (Recovery Time Objective): < 5 minutes of downtime
1. The Database: Master-Slave Replication
The filesystem is easy. The database is where DR plans die. We will use standard asynchronous Master-Slave replication. Galera Cluster is fantastic for multi-master setups, but it adds complexity that can turn brittle during a network partition. For robust DR, simple replication often beats complex clustering.
On your Primary (Master) server, edit your /etc/mysql/my.cnf (or /etc/my.cnf.d/server.cnf on CentOS 7). We need to enable binary logging and assign a unique server ID.
[mysqld]
bind-address = 0.0.0.0
server-id = 1
log_bin = /var/log/mysql/mysql-bin.log
binlog_do_db = production_app
# Reliability settings for InnoDB
innodb_flush_log_at_trx_commit = 1
sync_binlog = 1
Note: Setting sync_binlog=1 is heavier on disk I/O. This is where hardware matters. On spinning rust (HDD), this kills performance. On CoolVDS NVMe instances, the latency penalty is negligible because we aren't waiting for a physical drive head to seek.
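Before moving on, restart MySQL and confirm that binary logging is actually active; you will also want the current coordinates later when pointing the Slave at the Master. A quick check (this sketch assumes the Debian/Ubuntu service name mysql; on CentOS 7 with MariaDB it is typically mariadb):
systemctl restart mysql
mysql -u root -p -e "SHOW MASTER STATUS;"
# Expect a row listing the current binlog file (e.g. mysql-bin.000001) and position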
On the Standby (Slave) server, set a different ID:
[mysqld]
bind-address = 0.0.0.0
server-id = 2
relay-log = /var/log/mysql/mysql-relay-bin.log
read_only = 1
Security Critical: Do not expose port 3306 to the public internet. Use a VPN tunnel (OpenVPN is the standard here) or strict iptables rules limiting access only to the specific IP of the other server.
# On Master, allow Slave IP
iptables -A INPUT -p tcp -s 192.168.20.50 --dport 3306 -j ACCEPT
# Create the replication user
mysql -u root -p -e "CREATE USER 'replicator'@'192.168.20.50' IDENTIFIED BY 'SuperSecretStrongPassword'; GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'192.168.20.50'; FLUSH PRIVILEGES;"
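A small but painful detail: a rule added with iptables like that does not survive a reboot on its own. On Debian/Ubuntu, one common way to persist it is the iptables-persistent package (an assumption; use whatever persistence mechanism your distribution ships):
apt-get install iptables-persistent
# Save the current ruleset so it is restored at boot
netfilter-persistent save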
Start the replication. If you have existing data, use mysqldump --master-data=2 to capture the binary log coordinates; a seeding sketch follows the statements below. If it's a fresh setup:
-- On Slave
CHANGE MASTER TO
MASTER_HOST='192.168.10.25',
MASTER_USER='replicator',
MASTER_PASSWORD='SuperSecretStrongPassword',
MASTER_LOG_FILE='mysql-bin.000001',
MASTER_LOG_POS=154;
START SLAVE;
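For an existing dataset, the Slave has to be seeded before replication can start. A rough sketch of that step, assuming InnoDB tables, the production_app database from the config above, and root SSH access between the nodes:
# On the Master: dump with the binlog coordinates embedded as a comment
mysqldump -u root -p --master-data=2 --single-transaction production_app > production_app.sql

# The coordinates for CHANGE MASTER TO are recorded near the top of the dump
head -n 40 production_app.sql | grep "CHANGE MASTER TO"

# Copy to the Standby, create the database there, and import
scp production_app.sql root@192.168.20.50:/root/
mysql -u root -p -e "CREATE DATABASE IF NOT EXISTS production_app;"   # on the Standby
mysql -u root -p production_app < /root/production_app.sql            # on the Standby
Then run the CHANGE MASTER TO statement using the file and position taken from the dump instead of the fresh-setup values above.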
Pro Tip: Monitor Seconds_Behind_Master on the slave. If this creeps up, your standby server is too weak or the network latency is too high. This is why we advise against using "budget" VPS providers for the DR node. CPU Steal time on oversold hosts will cause replication lag. CoolVDS guarantees resources via KVM, ensuring the slave keeps pace.
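If you want that watched automatically, a crude cron-driven check on the Slave is enough. A minimal sketch (assumptions: credentials in /root/.my.cnf, mailutils installed, and a 30-second threshold you should tune):
#!/bin/bash
# /usr/local/bin/check-replication-lag.sh - run every minute from root's crontab on the Slave
LAG=$(mysql -e "SHOW SLAVE STATUS\G" | awk '/Seconds_Behind_Master/ {print $2}')

if [ -z "$LAG" ] || [ "$LAG" = "NULL" ]; then
    echo "Replication is not running on $(hostname)" | mail -s "DR ALERT: replication stopped" ops@yourdomain.no
elif [ "$LAG" -gt 30 ]; then
    echo "Replication lag is ${LAG}s on $(hostname)" | mail -s "DR WARNING: replication lag" ops@yourdomain.no
fi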
2. The Filesystem: Keeping Assets in Sync
Databases are structured; uploaded files (images, PDFs) are not. For a 2019-era Linux stack, lsyncd (Live Syncing Daemon) is the most robust tool. It watches the kernel file system events (inotify) and triggers rsync instantly.
Install it on the Master:
apt-get install lsyncd
Configure /etc/lsyncd/lsyncd.conf.lua to push changes to the Standby server immediately.
settings {
    logfile    = "/var/log/lsyncd/lsyncd.log",
    statusFile = "/var/log/lsyncd/lsyncd.status"
}

sync {
    default.rsyncssh,
    source    = "/var/www/html/uploads",
    host      = "192.168.20.50",
    targetdir = "/var/www/html/uploads",
    rsync = {
        archive  = true,
        compress = true,
        _extra   = { "--omit-dir-times" }
    },
    ssh = {
        port = 22
    }
}
You will need SSH key authentication set up between the root users for this to work seamlessly. While NFS or GlusterFS are options for shared storage, they introduce a single point of failure or massive complexity. lsyncd is simple, decoupled, and fails safe.
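Setting that up takes two minutes. A sketch of the key exchange and the service start (assuming the Master pushes as root to the Standby IP used earlier; the unit name can vary by distribution):
# On the Master: a passphrase-less key, since lsyncd runs unattended
ssh-keygen -t ed25519 -f /root/.ssh/id_ed25519 -N ""
ssh-copy-id -i /root/.ssh/id_ed25519.pub root@192.168.20.50

# Start lsyncd and keep it running across reboots
systemctl enable lsyncd
systemctl start lsyncd
tail -f /var/log/lsyncd/lsyncd.log   # watch the initial full sync complete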
3. The Switch: DNS Failover
When the Master burns down, how do users find the Slave?
In 2019, we don't rely on manual IP updates. DNS propagation takes too long. You should be using a DNS provider with API access (like Cloudflare or AWS Route53) and a low TTL (Time To Live). Set your A record TTL to 60 seconds.
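The update itself is one API call. Here is a hedged example against a Cloudflare-style v4 endpoint (the zone ID, record ID, credentials and the Standby's public IP are all placeholders; adapt the call to your provider's API):
# Repoint app.yourdomain.no at the Standby's public IP
curl -X PUT "https://api.cloudflare.com/client/v4/zones/ZONE_ID/dns_records/RECORD_ID" \
     -H "X-Auth-Email: ops@yourdomain.no" \
     -H "X-Auth-Key: YOUR_API_KEY" \
     -H "Content-Type: application/json" \
     --data '{"type":"A","name":"app.yourdomain.no","content":"203.0.113.50","ttl":60}'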
The failover script logic (running on a third monitor node or the slave itself) looks like this:
- Ping Master.
- If Master is down for > 3 checks (30 seconds), trigger Failover.
- Action 1: On the Slave, run STOP SLAVE; RESET SLAVE ALL; to detach it from the dead Master, then clear read_only so it can accept writes.
- Action 2: Use the DNS API to point app.yourdomain.no at the Slave's IP.
- Action 3: Send an alert to the Ops team via Slack/SMS.
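Stitched together, a minimal monitor sketch might look like the following (assumptions: it runs on the Standby itself, the DNS call from the section above is wrapped in a hypothetical update-dns-to-standby.sh helper, and alerting goes out via mail; treat it as a starting point, not a finished tool):
#!/bin/bash
# /usr/local/bin/dr-failover-monitor.sh - run on the Standby or a third monitor node
MASTER_IP="192.168.10.25"
FAILURES=0

while true; do
    if ping -c 1 -W 2 "$MASTER_IP" > /dev/null 2>&1; then
        FAILURES=0
    else
        FAILURES=$((FAILURES + 1))
    fi

    if [ "$FAILURES" -ge 3 ]; then
        # Action 1: detach from the dead Master and accept writes
        mysql -e "STOP SLAVE; RESET SLAVE ALL; SET GLOBAL read_only = 0;"

        # Action 2: repoint DNS at this node (wraps the curl call shown above)
        /usr/local/bin/update-dns-to-standby.sh

        # Action 3: wake up the Ops team
        echo "Failover executed: Master $MASTER_IP unreachable for 3 checks" \
            | mail -s "DR FAILOVER EXECUTED" ops@yourdomain.no
        exit 0
    fi

    sleep 10
done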
The Norwegian Context: Latency and Law
Why bother building this yourself on a VPS? Why not use a Managed Cloud Database?
1. Latency: If your customers are in Oslo, routing traffic to a data center in Frankfurt or Ireland adds 20-40ms per round trip. For a database-heavy application making 50 sequential queries per page load, that alone is 1-2 seconds of added delay. Hosting on CoolVDS servers located physically in Norway keeps that ping in the single digits via NIX (the Norwegian Internet Exchange).
2. Compliance: GDPR Article 32 requires "the ability to restore the availability and access to personal data in a timely manner." Furthermore, keeping data within Norwegian borders simplifies legal jurisdiction. While the EU-US Privacy Shield is currently in effect, legal experts are already wary of its longevity. Data sovereignty is the only future-proof strategy.
War Games: Testing the Plan
A DR plan you haven't tested is a hypothesis. Schedule a maintenance window this Sunday.
- Block port 80/443 on your Master.
- Watch your monitoring scripts trigger.
- Verify the DNS updates.
- Check that the read_only flag was actually cleared on the Slave's database.
- Load the site.
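To simulate the outage without literally pulling power, blocking inbound traffic on the Master is usually enough. A minimal sketch (run it from the console, not over SSH, so you can undo it):
# On the Master: make the box look dead to the monitor and to users
iptables -I INPUT -p tcp --dport 80 -j DROP
iptables -I INPUT -p tcp --dport 443 -j DROP
iptables -I INPUT -p tcp --dport 3306 -j DROP
iptables -I INPUT -p icmp -j DROP

# ...work through the checklist above, then remove the test rules
iptables -D INPUT -p tcp --dport 80 -j DROP
iptables -D INPUT -p tcp --dport 443 -j DROP
iptables -D INPUT -p tcp --dport 3306 -j DROP
iptables -D INPUT -p icmp -j DROP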
If it works, you have resilience. If it fails, you have a to-do list.
Infrastructure is about mitigating chaos. We cannot stop the power outage, but we can control the outcome. Building this architecture requires hardware that doesn't choke on I/O during the frantic re-syncing of data. That is why we built CoolVDS on pure KVM and NVMe stacks—because when the alarm bells ring at 3 AM, you want raw performance, not excuses.
Is your failover site ready? Deploy a high-performance standby node on CoolVDS today and start syncing before the capacitors pop.