Console Login

Zero-Downtime Database Migration: A DevOps Survival Guide for 2021

Surviving the Migration: Strategies for High-Stakes Data Moves

I have seen grown men cry over corrupted InnoDB tablespaces. I have watched startups burn months of runway because a botched migration resulted in 48 hours of downtime. Moving a database is not like moving a static website; it is akin to performing a heart transplant while the patient is running a marathon. If you are reading this, you are likely planning a move, and you are likely nervous. Good. Paranoia is a virtue in systems administration.

In the Norwegian market, the stakes are compounded by strict compliance requirements. Since the Schrems II ruling last year (July 2020), simply dumping your data into a US-owned cloud bucket is a legal minefield. You need data sovereignty, you need low latency to NIX (Norwegian Internet Exchange), and you need raw I/O performance. This guide covers how to migrate without losing data, sleep, or your job.

The Hardware Bottleneck: Why NVMe Matters

Before touching a single config file, look at your target infrastructure. The number one cause of failed migrations in 2021 isn't bad SQL; it's I/O saturation. During a restore, your disk is writing at maximum capacity. If you are on standard SATA SSDs (or heaven forbid, spinning rust), your import might take days.

We benchmark our CoolVDS instances using NVMe storage specifically for this reason. When you are replaying binary logs or importing a 50GB dump, IOPS (Input/Output Operations Per Second) are your currency. If you run out, your application hangs.

Pro Tip: Before migrating, benchmark your target disk. If you can't sustain at least 10k random write IOPS, you are going to have a bad time with high-traffic databases.

Here is a quick way to test your disk latency on Linux before you start:

fio --name=randwrite --ioengine=libaio --iodepth=1 --rw=randwrite --bs=4k --direct=1 --size=512M --numjobs=2 --runtime=240 --group_reporting

Strategy 1: The Logical Dump (For Databases < 20GB)

If your database is small, keep it simple. Tools like mysqldump or pg_dump are reliable. However, the default settings are often too slow for production environments. You need to ensure consistency without locking the tables for too long.

For MySQL 8.0, use the single-transaction flag to ensure InnoDB consistency without table locks:

mysqldump -u root -p --single-transaction --quick --compress --hex-blob target_db | gzip > backup.sql.gz

For PostgreSQL 13, utilize directory format for parallel restores, which leverages multiple CPU cores:

pg_dump -j 4 -F d -f out_dir target_db

Strategy 2: Replication & Cutover (The Zero-Downtime Method)

For anything larger than 20GB, or for applications where downtime costs money, you cannot rely on dump/restore. You need replication. The goal is to set up your CoolVDS instance as a slave (replica) of your current production server, let it catch up, and then switch over.

Step 1: Configure the Source (Master)

You need binary logging enabled. Open your my.cnf (usually in /etc/mysql/) and ensure these settings are present. We are using GTID (Global Transaction Identifiers) because traditional file-position replication is fragile and obsolete.

[mysqld]
server-id = 1
log_bin = /var/log/mysql/mysql-bin.log
binlog_format = ROW
expire_logs_days = 7
max_binlog_size = 100M
# GTID settings for MySQL 5.7/8.0
gtid_mode = ON
enforce_gtid_consistency = ON

Don't forget to restart MySQL after changing these. Then, create a dedicated replication user. Do not use root.

CREATE USER 'repl_user'@'%' IDENTIFIED WITH mysql_native_password BY 'YourStrongPassword123!';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';
FLUSH PRIVILEGES;

Step 2: The Initial Sync

Use Percona XtraBackup for the initial data copy. It allows you to take a hot backup without locking the database. Once the data is on your CoolVDS instance, configure it as a slave.

Step 3: Configuring the Replica (CoolVDS)

On the destination server, once the data is restored, configure the link to the master. Note that in MySQL 8, the syntax uses GET_MASTER_PUBLIC_KEY if you aren't using SSL certificates for the connection (though you should be).

CHANGE MASTER TO
    MASTER_HOST='192.168.1.50',  -- IP of your old server
    MASTER_USER='repl_user',
    MASTER_PASSWORD='YourStrongPassword123!',
    MASTER_AUTO_POSITION=1;

START SLAVE;

Monitor the status. You are looking for Seconds_Behind_Master to hit 0. If it stays high, your new server lacks the CPU or I/O juice to catch up.

SHOW SLAVE STATUS\G

Performance Tuning for the New Environment

A common mistake is migrating data but forgetting to tune the configuration for the new hardware. If you moved from a 4GB RAM instance to a 32GB CoolVDS NVMe instance, your old my.cnf is crippling your performance.

Adjust your InnoDB buffer pool. This should be roughly 70-80% of your available RAM on a dedicated database server.

[mysqld]
# Example for a 32GB RAM Server
innodb_buffer_pool_size = 24G
innodb_log_file_size = 2G
innodb_flush_log_at_trx_commit = 1 # Set to 2 only if you can tolerate 1s data loss
innodb_flush_method = O_DIRECT
innodb_io_capacity = 2000 # Increase this for NVMe storage! Default is too low.

The innodb_io_capacity is often overlooked. On standard VPS hosting, 200 is fine. On our NVMe storage, you can push this much higher to utilize the full throughput of the drive.

The Compliance Check: GDPR & Datatilsynet

Technical execution is only half the battle. In Norway, we operate under the watchful eye of Datatilsynet. The Schrems II judgment invalidated the Privacy Shield framework, meaning data transfers to US providers are under heavy scrutiny.

When you migrate to CoolVDS, ensure your data resides in our Oslo data center. This is not just about latency (which is often sub-5ms within Norway); it is about legal safety. Verify your firewall rules to strictly limit access.

ufw allow from 10.0.0.5 to any port 3306 proto tcp

Final Cutover Checklist

Action Details Criticality
TTL Reduction Lower DNS TTL to 300s 24 hours prior. High
Read-Only Mode Set old DB to read-only to prevent split-brain. Critical
Sanity Check Compare row counts on Master vs Slave. High

Migration is stress testing for your infrastructure. If your current provider throttles your CPU during the dump process, or if your I/O wait spikes to 40%, it is time to move. Don't let slow hardware kill your SEO or your uptime.

Ready to validate your migration plan? Deploy a high-performance NVMe instance on CoolVDS today and test your restore speeds. You might be surprised at what real hardware can do.