Zero-Downtime Database Migrations: A Survival Guide for Norwegian DevOps

I still remember the silence. It wasn't a peaceful silence; it was the deafening quiet of a Slack channel at 3:15 AM when the primary database failed to come back up after a "routine" migration. We were moving a 500GB dataset from a legacy dedicated server in Frankfurt to a modern cloud instance. The rsync took too long. The consistency check failed. The rollback scripts were... optimistic.

That night taught me a lesson that is etched into every architecture diagram I draw: Maintenance windows are a myth. In today's always-on economy, you don't get 4 hours to move data. You get milliseconds to switch traffic.

If you are operating in the Norwegian market today, late 2021, the stakes are even higher. It's not just about uptime; it's about Data Sovereignty. Since the Schrems II ruling last year, moving customer data across borders—especially to US-owned providers—is a legal minefield. Migrating to local, Norwegian infrastructure isn't just a performance upgrade; it's a compliance necessity.

The Enemy is Latency, Not Bandwidth

Most junior admins think migration speed is defined by their uplink port. "I have a 1Gbps line, so 1TB takes 2.5 hours." Wrong. Database migration speed is defined by I/O Wait and Network Latency.

When you are replaying binary logs or WAL files from a source in Oslo to a target in Bergen (or from a foreign cloud to CoolVDS in Norway), the round-trip time (RTT) kills your throughput. This is why raw scp or rsync is insufficient for live databases. You need a replication stream.
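Before committing to a plan, measure the actual path. The snippet below is a quick sanity check rather than a benchmark suite; the hostname is a placeholder, and iperf3 must be running in server mode (iperf3 -s) on the target.

# Round-trip time from the source to the target
ping -c 20 target-coolvds-ip | tail -n 1

# Single-stream TCP throughput over 30 seconds (requires iperf3 -s on the target)
iperf3 -c target-coolvds-ip -t 30

If the iperf3 number sits far below your port speed, latency and window size are the bottleneck; the TCP tuning section further down addresses exactly that.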

The Strategy: Replication, Not "Lift and Shift"

The only professional way to migrate a live database is to make the target server a replica of the source. You sync the bulk of the data while the source is live, let the replica catch up, and then perform an atomic cutover.

Scenario A: MySQL / MariaDB (The GTID Method)

If you are still using `mysqldump` and stopping your server, stop reading and go fix your resume. For a live migration on MySQL 8.0, we use Percona XtraBackup or a raw snapshot combined with GTID (Global Transaction Identifiers).

First, ensure your target CoolVDS instance is tuned for write-heavy workloads. During the catch-up phase, the disk will be hammered. Standard SSDs often choke here. This is why we rely on the NVMe storage tiers at CoolVDS—high IOPS are mandatory when the slave thread is trying to process a backlog of 50,000 queries.
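As a rough starting point, the target's my.cnf can be opened up for the catch-up phase. These values are illustrative, sized for a 16 GB instance; adjust them to your own hardware and revert the durability relaxation before go-live.

[mysqld]
innodb_buffer_pool_size = 12G          # roughly 70-75% of RAM on a dedicated DB node
innodb_io_capacity = 10000             # let InnoDB actually use the NVMe headroom
innodb_io_capacity_max = 20000
innodb_flush_method = O_DIRECT
innodb_flush_log_at_trx_commit = 2     # acceptable on a replica during catch-up only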

Step 1: Configure the Source (my.cnf)

[mysqld]
server-id = 1
log_bin = /var/log/mysql/mysql-bin.log
binlog_format = ROW
gtid_mode = ON
enforce_gtid_consistency = ON
# Security for public networks
require_secure_transport = ON
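While you are on the source, also create the replication account that the CHANGE MASTER statement in Step 3 will authenticate with. A minimal sketch; the username and password are the same placeholders used later in this article, and the '%' host should be narrowed to the target's IP in production.

CREATE USER 'repl_user'@'%' IDENTIFIED BY 'secure_password' REQUIRE SSL;
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';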

Step 2: Initialize the Replica

Use `xtrabackup` to stream the data. InnoDB tables stay fully readable and writable throughout; only a brief instance-level backup lock is taken near the end of the copy.

xtrabackup --backup --stream=xbstream --extra-lsndir=/tmp --target-dir=./ | \
ssh user@target-coolvds-ip "xbstream -x -C /var/lib/mysql/"
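Before MySQL on the target can start from those files, the backup must be prepared (the standard xtrabackup apply-log phase). Paths assume the stream target shown above.

# On the target: apply the redo log so the datadir is crash-consistent
xtrabackup --prepare --target-dir=/var/lib/mysql/
chown -R mysql:mysql /var/lib/mysql
systemctl start mysql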

Step 3: Start Replication

Once the data is restored on the CoolVDS instance, configure it to follow the source:

CHANGE MASTER TO
  MASTER_HOST='source_ip',
  MASTER_USER='repl_user',
  MASTER_PASSWORD='secure_password',
  MASTER_AUTO_POSITION=1,
  MASTER_SSL=1;
START SLAVE;

Pro Tip: Monitor Seconds_Behind_Master. If it stays above 0, your target disk is too slow. On CoolVDS NVMe instances, I rarely see it stay above zero for more than a moment, even under heavy load.
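A low-tech way to watch the catch-up from the target's shell; this assumes local credentials are available in ~/.my.cnf.

watch -n 5 'mysql -e "SHOW SLAVE STATUS\G" | grep -E "Seconds_Behind_Master|Slave_(IO|SQL)_Running|Last_SQL_Error"'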

Scenario B: PostgreSQL (Logical Replication)

PostgreSQL 13 makes this incredibly robust with Logical Replication. Unlike physical replication, this allows you to migrate between major versions (e.g., pg 11 to pg 13) with minimal friction.

On the Source (Publisher):

# postgresql.conf
wal_level = logical

-- SQL execution
CREATE PUBLICATION migration_pub FOR ALL TABLES;
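Two details the snippet above hides: changing wal_level requires a PostgreSQL restart, and the subscription needs a role on the source with the REPLICATION attribute plus a matching pg_hba.conf entry. A sketch, with the role name and target address as placeholders:

-- SQL on the source
CREATE ROLE migrator WITH LOGIN REPLICATION PASSWORD 'secret';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO migrator;  -- needed for the initial table copy

# pg_hba.conf on the source (reload afterwards)
hostssl  prod_db  migrator  <target_ip>/32  scram-sha-256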

On the Target (Subscriber - Your CoolVDS Node):

CREATE SUBSCRIPTION migration_sub
CONNECTION 'host=source_ip port=5432 dbname=prod_db user=migrator password=secret sslmode=verify-full'
PUBLICATION migration_pub;
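One caveat logical replication does not solve for you: it streams rows, not DDL, so the schema must already exist on the CoolVDS node before you create the subscription. A sketch, assuming the same hosts as above and that the roles shown have sufficient privileges:

# Copy the schema only, then create the subscription
pg_dump --schema-only -h source_ip -U migrator prod_db | psql -h target_ip -U postgres prod_db

# After CREATE SUBSCRIPTION: state 'r' (ready) means the initial copy has caught up
psql -U postgres prod_db -c "SELECT srrelid::regclass, srsubstate FROM pg_subscription_rel;"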

The Hardware Reality Check

Why do migrations fail at the 99% mark? Usually, it's resource exhaustion. When you switch traffic, the cold buffer pool on the new server causes a massive spike in disk reads. If your hosting provider oversubscribes their storage (noisy neighbors), your database stalls. The application times out. You roll back.
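One mitigation for that cold-cache spike, sketched for the MySQL scenario: walk the hottest tables on the target shortly before cutover so their pages are already in the buffer pool. The table names are placeholders; PostgreSQL users can look at the pg_prewarm extension instead.

# Pre-warm the InnoDB buffer pool on the target before switching traffic
mysql prod_db -e "SELECT COUNT(*) FROM orders FORCE INDEX (PRIMARY);"
mysql prod_db -e "SELECT COUNT(*) FROM order_items FORCE INDEX (PRIMARY);"
mysql prod_db -e "SELECT COUNT(*) FROM sessions;"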

Here is a comparison of import times for a 200GB SQL dump I ran last week:

Storage Type              | Import Time | Avg IOPS  | Result
Standard SATA HDD         | 4h 12m      | ~120      | Unacceptable
Enterprise SSD (Shared)   | 1h 45m      | ~2,500    | Risky
CoolVDS NVMe              | 0h 28m      | ~15,000+  | Production Ready

Network Tuning for the Nordic Region

If you are migrating data into Norway from mainland Europe, you are likely routing through Denmark or Sweden. While latency is generally good (15-25ms), packet loss can occur during high-throughput streams.

Before starting the transfer, tune the TCP stack on your Linux kernel to fill the pipe. Add this to /etc/sysctl.conf:

# Increase TCP window size for high latency/throughput links
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# Enable BBR congestion control (available in kernel 4.9+)
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr

Activating BBR (Bottleneck Bandwidth and Round-trip propagation time) is essential. I've seen it double the throughput on cross-border transfers.
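Apply and verify before you start the stream; the modprobe line is only needed if tcp_bbr is built as a module on your kernel.

sudo modprobe tcp_bbr
sudo sysctl -p
sysctl net.ipv4.tcp_congestion_control   # should report: net.ipv4.tcp_congestion_control = bbr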

The Cutover (The Scary Part)

Once replication lag is zero, you are ready.

  1. Lower TTL: Set your DNS TTL to 60 seconds at least 24 hours in advance, so cached records expire before the cutover.
  2. Block Writes: Set the source DB to read-only. SET GLOBAL read_only = ON;
  3. Verify LSN: Ensure the target Log Sequence Number matches the source.
  4. Switch App Config: Point your app to the CoolVDS IP.
  5. Promote Target: Stop the slave/subscriber process and allow writes.

This process takes roughly 30 seconds. If you are using a floating IP or a load balancer (like HAProxy), it can be instant.
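For the MySQL scenario, the command side of steps 2, 3 and 5 condenses to something like the following; hosts are placeholders, and in the PostgreSQL case promotion is simply disabling or dropping the subscription on the target (ALTER SUBSCRIPTION migration_sub DISABLE). This is a sketch, not a substitute for a rehearsed runbook.

# Step 2: freeze writes on the source (super_read_only also blocks SUPER accounts)
mysql -h source_ip -e "SET GLOBAL read_only = ON; SET GLOBAL super_read_only = ON;"

# Step 3: confirm the replica has drained the last transactions
mysql -h target_ip -e "SHOW SLAVE STATUS\G" | grep -E "Seconds_Behind_Master|Executed_Gtid_Set"

# Step 5: promote the CoolVDS node and open it for writes
mysql -h target_ip -e "STOP SLAVE; RESET SLAVE ALL; SET GLOBAL read_only = OFF;"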

Final Thoughts

Database migration is surgery. You don't perform surgery with a dull knife, and you don't migrate critical data on budget hardware. The cost of downtime—reputation damage, lost sales, panicked stakeholders—far outweighs the cost of proper infrastructure.

By keeping your data within Norway, you satisfy the Datatilsynet requirements. By using CoolVDS NVMe instances, you satisfy your hunger for raw I/O performance.

Don't let storage bottlenecks compromise your migration. Spin up a high-performance NVMe instance in Oslo today and test your replication lag in real-time.