Zero-Downtime Database Migration: A Survival Guide for European Infrastructure Teams
It is February 2021. The dust from the Schrems II ruling hasn't settled; it is choking us. If you handle personal data of European citizens, relying on the EU-US Privacy Shield is no longer a strategy—it is a liability. I have spent the last three months helping Oslo-based fintechs repatriate their data from AWS and Azure back to sovereign Norwegian infrastructure. It is not just about compliance; it is about performance.
The problem? You cannot simply turn off a production database for six hours to run scp. In the world of high-frequency transactions, downtime is death.
This guide documents the exact replication-based migration strategy I use to move terabyte-scale databases to local NVMe infrastructure with less than 5 seconds of cutover time.
The "Dump and Restore" Trap
Junior admins love mysqldump. It is safe, text-based, and easy to read. It is also completely useless for datasets larger than 50GB if you care about TTR (Time To Recovery): restoring a logical dump means re-parsing and re-executing every statement and rebuilding every index, which is CPU-intensive and slow.
I recently watched a dev team try to migrate a 300GB Magento database to a "budget" VPS provider using a logical dump. The import took 14 hours. Why? Because their provider capped disk IOPS. During the import, the disk queue length hit 150+. The CPU spent 90% of its time in iowait.
Pro Tip: Never perform a logical dump for migration if you can avoid it. Use physical backups (binary copies of data files). For MySQL, use Percona XtraBackup. For PostgreSQL, use pg_basebackup.
The Architecture: Master-Slave Across The WAN
The only robust way to migrate without downtime is to treat your new server (the CoolVDS instance in Oslo) as a read-replica of your current production master (likely in Frankfurt or Virginia).
Step 1: The Secure Tunnel
Do not expose port 3306 to the public internet. Just don't. We use WireGuard for this. It is faster than OpenVPN and has been in the Linux kernel since last year (2020). It handles the latency between the US and Norway surprisingly well.
# On the CoolVDS instance (Target)
wg-quick up wg0
# Latency check to US East
ping -c 4 10.0.0.1
# Result: 98ms. Acceptable for async replication.
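For reference, the tunnel config on the target is only a few lines. This is a minimal sketch; the keys, addresses, and endpoint below are placeholders you generate yourself with wg genkey / wg pubkey:
# /etc/wireguard/wg0.conf on the TARGET (all keys and IPs are placeholders)
[Interface]
PrivateKey = <target-private-key>
Address = 10.0.0.2/24
ListenPort = 51820

[Peer]
PublicKey = <source-public-key>
Endpoint = source.example.com:51820
AllowedIPs = 10.0.0.1/32
PersistentKeepalive = 25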
Step 2: Preparing the Target (CoolVDS)
Before we start writing data, we need to tune the target kernel for write-heavy throughput. Most Linux distros ship with defaults tuned for general-purpose workloads, not database ingestion.
On your CoolVDS instance, adjust the sysctl.conf to handle the incoming flood of data:
# /etc/sysctl.conf
vm.swappiness = 1
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
Apply with sysctl -p.
Step 3: The Initial Sync (Percona XtraBackup)
We will stream the backup directly from the source to the target over SSH/WireGuard without creating a local file first. This saves disk space and time. Here is the command I use for MySQL 8.0:
# Run this from the SOURCE server
xtrabackup --backup --stream=xbstream --parallel=4 \
| ssh user@coolvds-oslo-ip "xbstream -x -C /var/lib/mysql/data/"
Note the --parallel=4 flag. This is where hardware matters. If you are migrating to a legacy VPS with SATA SSDs, parallel threads will choke the I/O controller. On CoolVDS, where the underlying storage is enterprise NVMe, we can crank this up to saturate the network link before we saturate the disk.
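One thing the stream does not do for you: the raw backup must be prepared (crash recovery applied to the copied InnoDB files) before MySQL can start on it. A minimal sketch, assuming the same target directory as above:
# Run this on the TARGET once the stream has finished
xtrabackup --prepare --target-dir=/var/lib/mysql/data/
chown -R mysql:mysql /var/lib/mysql/data/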
Step 4: Configure Replication
Once the data is restored and permissions fixed, configure the CoolVDS instance as a replica. Ensure GTID (Global Transaction ID) is enabled on both sides—it makes failover trivial compared to log file positioning.
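If GTID is not already on, both source and target need these lines in my.cnf. Note that flipping gtid_mode on a live production master has its own multi-step procedure, so check your current settings before touching anything:
[mysqld]
gtid_mode = ON
enforce_gtid_consistency = ON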
Add this to your my.cnf on the target before starting the replication to speed up the catch-up phase:
[mysqld]
# TEMPORARY SETTINGS FOR IMPORT/CATCH-UP
innodb_flush_log_at_trx_commit = 2
sync_binlog = 0
innodb_buffer_pool_size = 6G # Assume 8GB RAM VPS
innodb_log_file_size = 1G
innodb_write_io_threads = 8
Warning: Change sync_binlog and innodb_flush_log_at_trx_commit back to 1 after the migration is complete to ensure ACID compliance.
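With the backup prepared and GTID enabled, pointing the replica at the source is a single statement. A sketch for the target; the host is the source's WireGuard address, and the credentials are placeholders for a replication user you have created on the source:
# Run on the TARGET (CoolVDS)
mysql -u root -p <<'SQL'
CHANGE MASTER TO
  MASTER_HOST = '10.0.0.1',       -- source, reachable over the WireGuard tunnel
  MASTER_USER = 'repl',           -- replication user on the source (placeholder)
  MASTER_PASSWORD = 'CHANGE_ME',
  MASTER_AUTO_POSITION = 1;       -- GTID auto-positioning, no binlog file/offset juggling
START SLAVE;
SQL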
The PostgreSQL Approach
For the Postgres fans (and I know there are many of you in the Norwegian dev scene), the logic is identical, but the tools differ. Use pg_basebackup with the -R flag to automatically create the standby.signal file.
pg_basebackup -h source_host -D /var/lib/postgresql/13/main \
-U replicator -P -v -R -X stream -C -S migration_slot
This command creates a replication slot, streams the WAL files, and configures the CoolVDS instance to follow the master immediately.
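At cutover time, the standby is simply promoted to a read-write primary. A sketch, assuming you can run psql as the postgres system user on the target:
# On the TARGET: check replay lag before cutover
sudo -u postgres psql -c "SELECT now() - pg_last_xact_replay_timestamp() AS replay_lag;"
# On the TARGET: promote the standby to a read-write primary (PostgreSQL 12+)
sudo -u postgres psql -c "SELECT pg_promote();"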
The Cutover: Managing the DNS Switch
Once your replication lag is zero (check Seconds_Behind_Master), you are ready.
- Lower TTL: Set your DNS TTL to 60 seconds roughly 24 hours before migration.
- Read-Only Mode: Set the source DB to read-only with SET GLOBAL super_read_only = 1;. This prevents data divergence.
- Wait: Watch the CoolVDS replica execute the final transactions. It should take milliseconds.
- Promote: Stop the slave on CoolVDS and reset master: STOP SLAVE; RESET MASTER; (see the condensed command sequence after this list).
- Switch: Update your application connection strings or DNS to point to the CoolVDS IP.
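Condensed into commands (credentials are whatever you use; run each block where indicated):
# On the SOURCE: freeze writes
mysql -u root -p -e "SET GLOBAL super_read_only = 1;"

# On the TARGET: confirm the replica has fully caught up (expect Seconds_Behind_Master: 0)
mysql -u root -p -e "SHOW SLAVE STATUS\G" | grep Seconds_Behind_Master

# On the TARGET: promote
mysql -u root -p -e "STOP SLAVE; RESET MASTER;"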
Why Infrastructure Choice Dictates Success
In 2021, database performance is rarely CPU-bound; it is I/O-bound. When a database is catching up on replication, it is doing a massive volume of small random writes.
| Storage Type | Random Write IOPS | Replication Lag Recovery |
|---|---|---|
| Standard SATA SSD (Budget VPS) | ~5,000 | Slow (High Risk) |
| CoolVDS NVMe | ~80,000+ | Instant |
I recently audited a client who tried to save 50 NOK/month by hosting on a shared platform with "SSD caching." During the catch-up phase, their replication lag increased instead of decreasing because the disk couldn't write the binary logs fast enough. They had to abort.
CoolVDS instances provide consistent NVMe performance that isolates you from "noisy neighbors." When you are syncing a 10 million row table, you need that isolation.
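Whatever provider you end up on, measure the disk yourself before you commit to a cutover date. A quick fio run gives a ballpark 4k random-write figure; the parameters here are a reasonable starting point, not a rigorous benchmark, and it should be run on the data volume before you restore anything onto it:
# 4k random writes, direct I/O, 60 seconds, on the target's data volume
fio --name=randwrite --directory=/var/lib/mysql --rw=randwrite --bs=4k \
    --size=1G --numjobs=4 --iodepth=32 --ioengine=libaio --direct=1 \
    --runtime=60 --time_based --group_reporting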
Legal & Latency: The Nordic Advantage
By moving your database to CoolVDS in Norway, you kill two birds with one stone. First, you satisfy Datatilsynet requirements by keeping data within the EEA and out of US jurisdiction (a massive win post-Schrems II). Second, you slash latency. If your customers are in Oslo, Bergen, or Trondheim, routing packets to Frankfurt adds 20-30ms of round-trip time. Locally peering via NIX (Norwegian Internet Exchange) brings that down to <2ms.
Don't let legacy infrastructure or legal ambiguity hold your data hostage. Spin up a high-performance NVMe instance, set up your WireGuard tunnel, and bring your data home.