Database Migration War Games: Strategies for Zero Downtime in 2021
I hate database migrations. You should too. If you don't feel a knot in your stomach when you type DROP TABLE, you haven't been in ops long enough. I've seen "seamless" migrations turn into 14-hour downtime nightmares because someone forgot to check the max_allowed_packet size or didn't account for network latency between the old data center and the new one.
In 2021, with the demands for uptime higher than ever, a "maintenance window" is just a polite way of saying you lost revenue. Whether you are moving from a legacy bare metal server to a modern VPS, or escaping a restrictive cloud provider, the goal is the same: Zero. Detectable. Downtime.
This isn't a theory post. This is how we actually move terabytes of data across the wire, focusing on the Norwegian infrastructure landscape where latency to NIX (Norwegian Internet Exchange) dictates your success.
The Pre-Flight Check: Know Your I/O Limits
Most migrations fail before the first byte is transferred. They fail because the target environment can't write as fast as the source is sending, or the source crashes under the read load. Before you even think about rsync, look at your disk queues.
On your current server, run this:
iostat -x 1 10
If your %util is consistently above 80%, you are in the danger zone. Taking a snapshot or running a dump could lock the database entirely.
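If you'd rather have the danger zone flagged for you than eyeball columns, here is a rough one-liner, assuming your data disks show up as sd*, nvme* or vd* devices:
iostat -x 1 10 | awk '/^(sd|nvme|vd)/ { if ($NF+0 > 80) print "WARNING:", $1, "at", $NF "% util" }'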
Pro Tip: Never migrate to a host using spinning disks (HDD) or shared SATA SSDs for the database layer. The IOPS (Input/Output Operations Per Second) will choke during the restore process. This is why we standardize on NVMe storage at CoolVDS. When you are replaying binary logs to catch up, you need every ounce of write speed you can get.
The Strategy: Replication, Not Dumps
The amateur moves a database by stopping the service, running mysqldump, transferring the file, and restoring it. That's fine for a 50MB blog. It's suicide for a 50GB e-commerce store.
The professional strategy for 2021 is Live Replication.
Scenario: Migrating MySQL 8.0 to CoolVDS
We will set up the new CoolVDS instance as a replica (slave) of your current live server (master). Once they are in sync, we promote the slave.
Step 1: Configure the Master
Edit your my.cnf (usually in /etc/mysql/) on the source server. You need GTID (Global Transaction ID) enabled for sanity.
[mysqld]
server-id = 1
log_bin = /var/log/mysql/mysql-bin.log
binlog_format = ROW
gtid_mode = ON
enforce_gtid_consistency = ON
log_slave_updates = ON
Restart MySQL to apply the changes. Yes, that means a brief interruption now, but it's the only downtime you'll have until the switchover.
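While you have a config file open: the CoolVDS replica needs its own my.cnf before it can join. A minimal sketch (the values are illustrative; the only hard rule is that server-id must differ from the master's):
[mysqld]
# Anything that isn't the master's server-id = 1
server-id = 2
relay_log = /var/log/mysql/mysql-relay-bin.log
log_bin = /var/log/mysql/mysql-bin.log
gtid_mode = ON
enforce_gtid_consistency = ON
# Keep the replica read-only until the cutover
read_only = ON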
Step 2: Create a Replication User
Don't use root. Create a dedicated user with strict IP limits. If your CoolVDS IP is 185.x.x.x:
CREATE USER 'repl_user'@'185.x.x.x' IDENTIFIED WITH mysql_native_password BY 'Complex_Password_2021!';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'185.x.x.x';
FLUSH PRIVILEGES;
Step 3: The Initial Sync (Percona XtraBackup)
Don't use mysqldump for large datasets; it's too slow. Use Percona XtraBackup. It copies the physical data files without locking the database.
xtrabackup --backup --target-dir=/data/backups/ --user=root --password=YOURPASS
Transfer this to your CoolVDS instance using rsync. Since we are dealing with Norwegian infrastructure, if both servers are connected to NIX, you should see transfers saturating a 1Gbps or even 10Gbps port.
rsync -avzP /data/backups/ user@coolvds-instance:/data/backups/
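On the CoolVDS side the backup still has to be prepared, dropped into an empty data directory, and attached to the master. Roughly, and assuming the same paths and credentials as above:
# Apply the redo log so the copied datafiles are consistent
xtrabackup --prepare --target-dir=/data/backups/
# Restore into an empty datadir and hand ownership back to mysql
xtrabackup --copy-back --target-dir=/data/backups/ --datadir=/var/lib/mysql
chown -R mysql:mysql /var/lib/mysql
systemctl start mysql
Then, from the mysql client on the replica, point it at the source and let GTID auto-positioning handle the bookkeeping:
CHANGE MASTER TO MASTER_HOST='source_ip', MASTER_USER='repl_user', MASTER_PASSWORD='Complex_Password_2021!', MASTER_AUTO_POSITION=1;
START SLAVE;
SHOW SLAVE STATUS\G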
The Network Factor: Why Latency Kills Consistency
In distributed systems, the speed of light is a hard limit. If your app server is in Oslo but your database is in Frankfurt, you are adding roughly 20-30ms to every query round trip. For a Magento page generating 150 queries, that's 3 to 4.5 seconds of pure network lag. Unacceptable.
When migrating, keep the compute close to the data. If your customers are in Norway, your VPS needs to be in Norway. It’s simple physics.
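Don't guess at the latency, measure it from your app server before you commit to a location. A quick check (the IP is just the placeholder from earlier):
# Round-trip times: look at the avg in the summary line
ping -c 20 185.x.x.x | tail -1
# Per-hop picture, useful for spotting a bad route to NIX
mtr --report --report-cycles=20 185.x.x.x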
PostgreSQL Specifics: The `pg_basebackup` Route
For the Postgres crowd (v12 or v13), the process is similar but the tools differ. We use streaming replication.
On your new CoolVDS instance (the standby), run:
pg_basebackup -h source_ip -D /var/lib/postgresql/13/main -U repl_user -P -X stream -R
The -R flag does the heavy lifting: it writes the standby.signal file for you and appends the primary connection settings to postgresql.auto.conf. No manual config hacking required.
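One thing pg_basebackup does assume is that the primary is already willing to stream. On the source, something along these lines (values are illustrative; repl_user and the IP mirror the MySQL example):
-- In psql on the source:
CREATE ROLE repl_user WITH REPLICATION LOGIN PASSWORD 'Complex_Password_2021!';
# postgresql.conf on the source (these are the defaults on 12/13, but verify)
wal_level = replica
max_wal_senders = 10
# pg_hba.conf on the source: let the CoolVDS standby connect for replication
host    replication    repl_user    185.x.x.x/32    md5
Start Postgres on the standby and it begins streaming. At cutover, promotion is a single command on the CoolVDS instance: SELECT pg_promote(); (or pg_ctl promote).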
The Cutover: Managing the Switch
Once your Seconds_Behind_Master is 0 (check this in MySQL with SHOW SLAVE STATUS\G), you are ready.
- Lower TTLs: 24 hours before migration, set your DNS TTL to 300 seconds.
- Stop Writes: Put your application in maintenance mode.
- Verify Sync: Ensure the Master and CoolVDS Slave have identical GTID sets.
- Promote Slave: Run STOP SLAVE; RESET SLAVE ALL; on the CoolVDS replica (see the snippet after this list).
- Point App: Update your application config to point to the new CoolVDS IP.
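For the verify and promote steps, this is roughly what it looks like in the mysql client; the read_only flip assumes you kept the replica read-only as sketched earlier:
-- Run on both the master and the CoolVDS replica; the sets should be identical
SELECT @@GLOBAL.gtid_executed;
-- Then, on the CoolVDS replica only:
STOP SLAVE;
RESET SLAVE ALL;
SET GLOBAL read_only = OFF;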
The Legal Elephant in the Room: Schrems II and GDPR
It's 2021. We can't talk about data without talking about compliance. Since the Schrems II ruling last year, transferring personal data to US-owned clouds has become a legal minefield for European companies. The Norwegian Data Protection Authority (Datatilsynet) is watching.
Migrating your database to a Norwegian-owned provider like CoolVDS isn't just a technical upgrade; it's a compliance shield. Your data stays on soil governed by EEA law, out of reach of the US CLOUD Act. For a CTO, that peace of mind is worth more than the raw IOPS (though we have plenty of those too).
Why Hardware Matters
Remember the earlier warning about shared SATA SSDs? The same "noisy neighbor" problem can ruin a migration. In a shared hosting environment, if another user decides to mine crypto or compile a kernel, your database restore speed drops.
We use KVM virtualization. This isn't just a buzzword. It means strict resource isolation. Your RAM is yours. Your CPU cycles are yours. When you are importing 50 million rows, you need that guarantee.
Final Thoughts
Migrations are risky, but staying on legacy infrastructure is riskier. By using replication instead of dump-and-restore, and ensuring your target hardware uses NVMe storage to handle the write flood, you mitigate 99% of the disaster scenarios.
Don't let slow I/O kill your SEO or your uptime. If you need a target environment that can handle the load, deploy a test instance on CoolVDS. Spin it up, benchmark it with sysbench, and see the difference.
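A minimal sysbench run, assuming sysbench 1.0+ and a throwaway sbtest database and user created just for the test:
sysbench oltp_read_write --db-driver=mysql --mysql-host=127.0.0.1 --mysql-db=sbtest --mysql-user=sbtest --mysql-password=sbtest --tables=10 --table-size=1000000 prepare
sysbench oltp_read_write --db-driver=mysql --mysql-host=127.0.0.1 --mysql-db=sbtest --mysql-user=sbtest --mysql-password=sbtest --tables=10 --table-size=1000000 --threads=16 --time=300 run
sysbench oltp_read_write --db-driver=mysql --mysql-host=127.0.0.1 --mysql-db=sbtest --mysql-user=sbtest --mysql-password=sbtest --tables=10 --table-size=1000000 cleanup
Watch queries per second and the 95th percentile latency. That's where NVMe versus shared SATA shows its hand.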