The "It Won't Happen to Me" Fallacy in Northern Europe
March 2021. The OVH datacenter fire in Strasbourg. A wake-up call that smelled like burnt silicon and lost revenue. If you were a CTO relying on backups stored in the same physical facility as your production servers, you didn't sleep for a week. In 2024, if your Disaster Recovery (DR) plan is simply "we run a nightly cron job," you are negligent.
For Norwegian businesses, the stakes are higher. We aren't just battling entropy; we are navigating the minefield of Schrems II and GDPR. You cannot simply dump your encrypted snapshots into an AWS S3 bucket in Virginia and hope Datatilsynet looks the other way. Data sovereignty is now a technical requirement, not just a legal one.
We are going to dismantle the fluff surrounding DR. We will look at RTO/RPO calculations, configure real-time PostgreSQL replication, and discuss why the underlying hardware of your VPS provider—specifically NVMe throughput—dictates your recovery speed.
RTO and RPO: The Mathematics of Failure
Stop talking about "uptime." Start defining two numbers:
- RPO (Recovery Point Objective): How much data are you willing to lose? Measured in time.
- RTO (Recovery Time Objective): How long until the service is back online? Also measured in time.
Pro Tip: If your CEO demands Zero RPO and Zero RTO, ask for an infinite budget. Realistically, on a standardized infrastructure like CoolVDS, you can achieve an RPO of < 1 second and an RTO of < 15 minutes without buying a second datacenter. It requires architecture, not magic.
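Back-of-envelope math is enough to keep that conversation honest. The figures below are illustrative assumptions (a 50GB dataset, ~300 MB/s of effective restore throughput, a few minutes of provisioning and cutover), not benchmarks:

#!/bin/bash
# Rough RTO estimate for a full restore (all figures are assumptions; adjust for your stack)
DATASET_GB=50        # size of the data you need back online
THROUGHPUT_MBS=300   # effective restore throughput in MB/s
OVERHEAD_MIN=5       # provisioning, DNS cutover, index warm-up, human reaction time

COPY_MIN=$(( (DATASET_GB * 1024) / THROUGHPUT_MBS / 60 ))   # integer math, rough by design
echo "Raw copy time:  ~${COPY_MIN} min"
echo "Estimated RTO:  ~$((COPY_MIN + OVERHEAD_MIN)) min"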
The Database Layer: PostgreSQL Streaming Replication
Dumping a database with pg_dump is fine for development. For production, it's useless for low-RPO scenarios. By the time you restore a 50GB dump, your customers have moved to a competitor. We use Write-Ahead Log (WAL) streaming. This replicates changes to a standby node in real-time.
Here is a production-hardened configuration for PostgreSQL 16 (current as of early 2024). This assumes you have a primary server and a standby server (preferably hosted in a separate availability zone or distinct physical host, which CoolVDS guarantees for enterprise tiers).
1. Primary Configuration (postgresql.conf)
# /etc/postgresql/16/main/postgresql.conf
# Minimal requirement for replication
wal_level = replica
# How many standby servers might connect?
max_wal_senders = 5
# Keep enough WAL segments so the replica doesn't fall too far behind
wal_keep_size = 512MB
# Timeout for disconnected replicas
wal_sender_timeout = 60s
# Security: only listen on the private interface (who may replicate is controlled in pg_hba.conf)
listen_addresses = '10.10.1.5' # Internal IP of Primary on CoolVDS Private Network
2. Authentication (pg_hba.conf)
Never open replication ports to the public internet. Use a VPN (WireGuard) or a private LAN.
# /etc/postgresql/16/main/pg_hba.conf
# TYPE DATABASE USER ADDRESS METHOD
host replication rep_user 10.10.1.6/32 scram-sha-256
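The rep_user role referenced above does not exist by default; create it on the primary before the standby tries to connect. A minimal sketch (the password is a placeholder, and note that wal_level and listen_addresses only take effect after a restart, not a reload):

# Create the replication role referenced in pg_hba.conf (password is a placeholder)
sudo -u postgres psql -c "CREATE ROLE rep_user WITH REPLICATION LOGIN PASSWORD 'change-me';"
# wal_level and listen_addresses require a full restart
sudo systemctl restart postgresql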
3. The Standby Signal
On the standby server, you don't run initdb. Stop PostgreSQL, make sure the data directory is empty, and clone a base backup straight from the primary.
sudo -u postgres pg_basebackup -h 10.10.1.5 -D /var/lib/postgresql/16/main -U rep_user -P -v -R -X stream
This command does the heavy lifting. The -R flag automatically generates the standby.signal file and the connection settings. If your network latency is low (e.g., staying within the NIX traffic exchange in Oslo), the standby catches up with the primary almost instantly once the base backup completes.
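Don't assume the stream is healthy; check it. A quick sanity check, assuming the setup above (run the first command on the primary, the second on the standby); the lag figure is what your real-world RPO hangs on:

# On the primary: is the standby connected, and how many bytes behind is it?
sudo -u postgres psql -x -c "SELECT client_addr, state, \
  pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes \
  FROM pg_stat_replication;"

# On the standby: confirm it is running in recovery mode (i.e. acting as a replica)
sudo -u postgres psql -t -c "SELECT pg_is_in_recovery();"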
The Filesystem Layer: ZFS Send/Recv
rsync is slow. It walks the filesystem, checks metadata, and eats CPU cycles. If you have millions of small files (common in Magento or WordPress installs), rsync will choke your I/O.
At CoolVDS, we utilize KVM virtualization on top of high-performance storage. If you run your workload on a filesystem that supports snapshots (like ZFS or Btrfs), you can replicate entire datasets incrementally.
Here is how a battle-hardened sysadmin moves data. No file checks. Just block-level transfer.
#!/bin/bash
# Replicate ZFS snapshots to a DR site. First run: seed the DR node with a full zfs send.
set -euo pipefail

DATE=$(date +%Y%m%d_%H%M%S)
DATASET="zroot/data"
REMOTE_DATASET="zroot/backup_data"
SNAPSHOT_NAME="${DATASET}@backup_${DATE}"
REMOTE_HOST="root@dr-node.coolvds.no"

# 1. Take a snapshot of the live data (instantaneous)
zfs snapshot "$SNAPSHOT_NAME"

# 2. Find the newest snapshot already on the DR node (-H: no header, -r: this dataset only)
LATEST_REMOTE=$(ssh "$REMOTE_HOST" \
  "zfs list -H -t snapshot -o name -s creation -r $REMOTE_DATASET | tail -1 | cut -d@ -f2")

# 3. Send the snapshot stream
#    -i: incremental (sends only the differences since the last common snapshot)
zfs send -i "@$LATEST_REMOTE" "$SNAPSHOT_NAME" | ssh "$REMOTE_HOST" zfs recv -F "$REMOTE_DATASET"

# 4. Clean up the old local snapshot to save NVMe space
zfs destroy "${DATASET}@${LATEST_REMOTE}"
Why this matters: Recovery speed is defined by Disk I/O. If you use a budget provider that throttles your IOPS, a zfs recv can take hours. CoolVDS NVMe instances are unthrottled, meaning you restore at the speed of the bus, not an artificial software limit.
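Don't take that on faith; measure the volume you will actually restore onto. A quick sequential-write test with fio (the mountpoint /zroot/backup_data and the 4G test size are assumptions; point it at whatever dataset receives the zfs recv):

# Sequential write throughput on the DR volume (1MiB blocks, fsync at the end for honest numbers)
fio --name=dr-restore-test --filename=/zroot/backup_data/fio.test \
    --rw=write --bs=1M --size=4G --numjobs=1 --end_fsync=1 --group_reporting
rm /zroot/backup_data/fio.test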
Secure Transport: WireGuard Configuration
Replicating data between servers requires encryption. SSH tunnels are fine for periodic transfers, but for continuous streaming the overhead adds up. WireGuard runs in the kernel, is fast, and is modern: it has shipped in mainline Linux since version 5.6 and remains the standard for secure point-to-point links.
# /etc/wireguard/wg0.conf (On the DR Node)
[Interface]
Address = 10.200.0.2/24
ListenPort = 51820
PrivateKey =
# Connection to Primary Node
[Peer]
PublicKey =
AllowedIPs = 10.200.0.1/32
Endpoint = primary.coolvds.com:51820
PersistentKeepalive = 25
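The key fields above are blank on purpose: generate your own. A minimal sketch of the key exchange and bring-up (run on each node, then swap public keys):

# Generate a keypair; paste the private key into [Interface], hand the public key to the peer
wg genkey | tee /etc/wireguard/privatekey | wg pubkey > /etc/wireguard/publickey
chmod 600 /etc/wireguard/privatekey

# Bring the tunnel up now and at every boot
systemctl enable --now wg-quick@wg0

# Verify the handshake and transfer counters
wg show wg0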
With WireGuard, your replication traffic is invisible to the public internet. This is a critical compliance step for GDPR. Data in transit must be encrypted.
The "CoolVDS" Advantage in Compliance
Technical architecture is useless if the legal foundation is rotten. Many "cloud" providers are essentially resellers of US-based infrastructure. When you deploy on CoolVDS:
- Data Sovereignty: Your data physically resides in Norway. It falls under Norwegian jurisdiction.
- Latency: Replicating a database from Oslo to Frankfurt adds ~15-20ms of latency. Replicating within the Norwegian network adds < 2ms. This allows for synchronous replication (Zero RPO) without killing application performance (see the sketch after this list).
- Hardware Isolation: We don't use containers for VPS. We use KVM. If a neighbor gets DDoS'd, your kernel doesn't panic.
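Turning synchronous replication on is a two-line change on the primary, assuming the standby identifies itself with application_name=standby1 in its primary_conninfo (a name chosen here purely for illustration):

# /etc/postgresql/16/main/postgresql.conf (primary)
# COMMIT only returns once the named standby has flushed the WAL to its own disk
synchronous_commit = on
synchronous_standby_names = 'standby1'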
The Drill
A DR plan that hasn't been tested is a hallucination. Once a quarter, you must perform a "Fire Drill."
- Spin up a fresh instance on CoolVDS.
- Point your pg_basebackup or ZFS stream to it.
- Promote the database to Primary (see the sketch after this list).
- Switch your DNS A record (TTL should be 300s or lower).
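Promotion itself is a one-liner on the standby; afterwards pg_is_in_recovery() should flip to false. A minimal sketch for the Debian/Ubuntu packaging used above:

# On the standby you are promoting
sudo -u postgres pg_ctlcluster 16 main promote
# or, from psql: SELECT pg_promote();

# Confirm the node now accepts writes ('f' means it has left recovery)
sudo -u postgres psql -t -c "SELECT pg_is_in_recovery();"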
If this process takes more than 30 minutes, your architecture is too complex. Simplify it. Remove the abstraction layers. Use raw Linux, fast NVMe storage, and solid networking principles.
Disaster is inevitable. Data loss is a choice. Choose your infrastructure wisely.