The Art of the Zero-Downtime Database Migration
There is a specific kind of silence that falls over an office when a database migration goes wrong. I experienced it back in 2015 during a botched transfer of a Magento backend. The dump took too long, the lock times exceeded the maintenance window, and we lost transactional data. It was a nightmare.
It is August 2018. We have tools now that make `mysqldump > data.sql` look like a stone-age relic. With GDPR enforcement kicking in this past May, losing data or exposing it during transit isn't just an operational failure; it is a legal liability attracting the gaze of Datatilsynet.
Whether you are moving from a legacy bare-metal server in Oslo to a modern cloud setup, or scaling out to a managed VPS provider, the objective remains the same: Zero Downtime. Or at least, downtime so short the users attribute it to a network blip.
The Strategy: Replication, Not Just Restoration
The biggest mistake junior admins make is treating a migration as a copy-paste job. You cannot simply stop the service, zip the files, move them, and unzip. For a 500GB+ database, that implies hours of downtime.
The only professional way to handle this in a production environment is establishing a temporary Master-Slave replication link. The new server (CoolVDS instance) acts as a slave, catching up to the master in real-time. Once the lag is zero, you promote the slave.
Step 1: The Non-Blocking Snapshot
Forget `mysqldump`. By default it locks tables, and even with `--single-transaction` the single-threaded restore takes far too long at this scale. For MySQL/MariaDB, the industry standard right now is Percona XtraBackup. It performs hot backups of InnoDB tables without locking the database.
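If XtraBackup isn't installed yet, it ships from the Percona repositories. A minimal sketch, assuming a Debian/Ubuntu box with the Percona apt repository already configured (package names differ on CentOS/RHEL):
# Assumes the Percona apt repository is already set up on this host
apt-get update
apt-get install -y percona-xtrabackup-24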
Here is how we stream a backup directly from the old server to the new CoolVDS host using `xbstream`. This avoids writing to the local disk (which is likely already I/O constrained).
# On the SOURCE server (create the target directory first: mkdir -p /var/lib/mysql_temp on the TARGET)
innobackupex --user=root --password=YOUR_PASS --stream=xbstream --parallel=4 /tmp | \
ssh root@185.x.x.x "xbstream -x -C /var/lib/mysql_temp/"
Pro Tip: If your network link between the data centers is weak, throw `gzip` into the pipe. However, if you are migrating to a CoolVDS instance, our peering at NIX (Norwegian Internet Exchange) usually saturates the link before CPU becomes the bottleneck, so compression might actually slow you down.
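If you do need to squeeze the transfer over a thin link, the compressed variant looks something like the sketch below. Plain gzip works, but pigz (parallel gzip) is less likely to become the new bottleneck; the host and paths are the same placeholders as above, and pigz must be installed on both ends.
# Compressed variant for slow links (requires pigz on both servers)
innobackupex --user=root --password=YOUR_PASS --stream=xbstream --parallel=4 /tmp | \
pigz -c | \
ssh root@185.x.x.x "pigz -dc | xbstream -x -C /var/lib/mysql_temp/"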
Step 2: Preparing the Data
Once the data lands on the target, it is in a "raw" state. You need to apply the transaction logs to make it consistent. This is where disk I/O becomes the limiting factor. On standard SATA SSDs, this can take hours. On NVMe storage (which is why I prefer CoolVDS for DB workloads), it flies.
# On the TARGET server
innobackupex --apply-log --use-memory=4G /var/lib/mysql_temp/
Once prepared, make sure MySQL is stopped on the target and the real data directory is empty, then fix ownership and move the data into place:
chown -R mysql:mysql /var/lib/mysql_temp
rsync -avP /var/lib/mysql_temp/ /var/lib/mysql/
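With the files in place, start MySQL on the target and confirm InnoDB comes up cleanly before touching replication. The service name and log path vary by distribution; this sketch assumes a systemd host logging to /var/log/mysql/error.log.
# Start MySQL on the TARGET and verify a clean InnoDB startup
systemctl start mysql
tail -n 50 /var/log/mysql/error.log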
Configuring the Replication Link
Check the `xtrabackup_binlog_info` file on the target machine. It contains the exact binary log coordinates where the snapshot ended. You need these to tell the slave where to start reading.
cat /var/lib/mysql/xtrabackup_binlog_info
# Output: mysql-bin.000456 1285093
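Before wiring up the slave, two prerequisites are easy to forget: the master needs binary logging enabled and a dedicated replication user, and the two servers need distinct server_id values. A rough sketch, run against the master (credentials and the '%' host wildcard are placeholders; lock it down to the slave's IP in practice):
# On the SOURCE (master): verify binlog and server_id, create a replication user
mysql -uroot -p -e "SHOW VARIABLES LIKE 'log_bin'; SHOW VARIABLES LIKE 'server_id';"
mysql -uroot -p -e "CREATE USER 'repl_user'@'%' IDENTIFIED BY 'secure_password'; GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';"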
Now, configure the slave:
CHANGE MASTER TO
MASTER_HOST='10.x.x.x',
MASTER_USER='repl_user',
MASTER_PASSWORD='secure_password',
MASTER_LOG_FILE='mysql-bin.000456',
MASTER_LOG_POS=1285093;
Start the slave and watch the magic happen:
START SLAVE;
SHOW SLAVE STATUS\G
Look for `Seconds_Behind_Master`. It should start high and decrease. If it isn't decreasing, your new server's disk I/O cannot keep up with the write rate of the master. This is a common issue with budget VPS providers who oversell storage throughput.
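Rather than re-running the status query by hand, a small watch loop makes the catch-up easy to monitor. A throwaway sketch; adjust credentials to your setup.
# Poll replication lag and SQL thread health every 10 seconds
while true; do
  mysql -uroot -p"YOUR_PASS" -e "SHOW SLAVE STATUS\G" | grep -E "Seconds_Behind_Master|Slave_SQL_Running"
  sleep 10
done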
The Hardware Reality: NVMe vs. SATA
In 2018, we are seeing a pivot. SATA SSDs are fine for web servers, but for databases, the protocol overhead prevents high concurrency. NVMe connects directly via the PCIe bus. When you are replaying binary logs to catch up a slave, you are essentially hammering the disk with random writes.
| Metric | Standard VPS (SATA SSD) | CoolVDS (NVMe) |
|---|---|---|
| Random IOPS | ~5,000 - 10,000 | ~300,000+ |
| Latency | ~0.5ms | ~0.02ms |
| Replication Lag | Often spikes under load | Near instantaneous |
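Raw numbers vary between providers, hardware generations, and workloads, so measure your own instance rather than trusting a table. A short fio random-write run gives you a real figure; the parameters here are illustrative, not a formal benchmark, and the test writes a 1G scratch file in the current directory.
# 60-second 4K random-write test
fio --name=randwrite --rw=randwrite --bs=4k --size=1G --direct=1 \
    --ioengine=libaio --iodepth=32 --runtime=60 --time_based --group_reporting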
Tuning for Norway: Latency and Compliance
Latency is physics. If your users are in Oslo and your server is in Frankfurt, you are adding 20-30ms of round trip time (RTT) to every query that isn't cached. For a Magento or WordPress site firing 50 sequential, uncached queries per page load, that adds up to roughly 1.5 seconds of pure network wait time.
Furthermore, GDPR Article 44 restricts data transfers outside the EEA. Hosting locally in Norway or within the EU isn't just about speed; it's about not waking up to a fine. We ensure all CoolVDS infrastructure complies strictly with Norwegian data sovereignty requirements.
Optimizing `my.cnf` for the New Hardware
Do not copy your old `my.cnf` blindly. You have better hardware now. Adjust your InnoDB buffer pool to use 70-80% of available RAM, and ensure your log file size is adequate to prevent aggressive checkpointing.
[mysqld]
# Use 70-80% of RAM (12G assumes a ~16 GB instance)
innodb_buffer_pool_size = 12G
# Critical for write-heavy workloads
innodb_log_file_size = 2G
# Ensure durability (ACID compliance)
innodb_flush_log_at_trx_commit = 1
# Optimize for SSD/NVMe
innodb_io_capacity = 2000
innodb_io_capacity_max = 4000
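If you are unsure what 70-80% of RAM works out to on a given instance, a one-liner against /proc/meminfo gives a starting point; rounding and headroom for the OS and connections are your judgment call.
# Suggest a buffer pool size of roughly 75% of total RAM
awk '/MemTotal/ { printf "innodb_buffer_pool_size = %dG\n", ($2 / 1024 / 1024) * 0.75 }' /proc/meminfo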
The Cutover
Once `Seconds_Behind_Master` is 0, you are ready.
- Put your application in maintenance mode (or read-only).
- Verify the slave has processed the final transaction.
- Stop the slave and clear its replication configuration so it can act as a standalone master (see the command sketch after this list).
- Update your application config to point to the new CoolVDS IP.
- Start the application.
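The mechanical steps above boil down to a handful of commands. A sketch, assuming the old master can be flipped read-only for the cutover window and the new server may have read_only set in its own config:
# 1. Freeze writes on the OLD master
mysql -h 10.x.x.x -uroot -p -e "SET GLOBAL read_only = ON;"
# 2. On the NEW server: confirm everything is applied, then drop the slave config
mysql -uroot -p -e "SHOW SLAVE STATUS\G" | grep -E "Seconds_Behind_Master|Exec_Master_Log_Pos"
mysql -uroot -p -e "STOP SLAVE; RESET SLAVE ALL;"
# 3. Make sure the new server accepts writes
mysql -uroot -p -e "SET GLOBAL read_only = OFF;"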
Total downtime? About 30 to 60 seconds. No data lost. No corrupted tables. Just a faster, more responsive database.
Don't let legacy infrastructure throttle your growth. If you are ready to test how NVMe handles your workload, deploy a high-performance instance on CoolVDS today.