Disaster Recovery in 2014: Why Your "Backup Strategy" Is Likely a Ticking Time Bomb

I still wake up in a cold sweat thinking about "Black Friday" 2012. Not the shopping holiday. The Friday night a legacy dedicated server's RAID controller decided to silently corrupt data while reporting "Health Status: OK." By the time we noticed the I/O errors in dmesg, the filesystem was essentially Swiss cheese. We lost 48 hours of transaction data. Why? Because our "backup" was a nightly job that saved to a secondary partition on the same physical disk array.

Amateur hour. I learned the hard way so you don't have to.

In the systems administration world, uptime is vanity, but recovery is sanity. With the recent explosion of cloud technologies and the shifting sands of virtualization (KVM vs. OpenVZ), relying on your hosting provider's "guarantee" is not a strategy. It's negligence.

The 3-2-1 Rule: Non-Negotiable Physics

You have heard it before, but you are probably not doing it. The 3-2-1 rule is the only religion I subscribe to in the data center:

  • 3 copies of your data.
  • 2 different media types (e.g., Live SSD RAID + Rotation HDD Backup).
  • 1 copy offsite (If your server is in Oslo, your backup should be in Bergen or Frankfurt).

At CoolVDS, we utilize enterprise-grade RAID-10 arrays with battery-backed cache units. This covers hardware redundancy. But it does not protect you from rm -rf /var/www or a SQL injection attack dropping your tables. You need an external lifeline.

Automating the "Lifeline" with Bash and Rsync

Stop using FTP scripts. They are insecure and fragile. In 2014, if you aren't using SSH keys and rsync, you are doing it wrong. We need a script that dumps the database, archives the web root, and pushes it to a remote storage box, preferably one connected via the Norwegian Internet Exchange (NIX) for low-latency transfer.
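
If the SSH key part is new to you, here is a minimal sketch of the one-time setup. The helper name make_backup_key, the key path, and the key comment are my own placeholders; the user and host are the ones used in the sync step of this article. It assumes OpenSSH's ssh-keygen and ssh-copy-id are installed.

```shell
#!/bin/bash
# Sketch: create a dedicated, passphrase-less key pair for unattended backups.
# make_backup_key and the key path are illustrative names, not a fixed convention.
make_backup_key() {
    local keyfile="$1"
    # -N '' sets an empty passphrase so cron can use the key without prompting
    ssh-keygen -q -t rsa -b 4096 -N '' -C 'nightly-backup' -f "$keyfile"
}

# Usage (run once on the production server):
#   make_backup_key ~/.ssh/backup_rsa
#   ssh-copy-id -i ~/.ssh/backup_rsa.pub backupuser@backup.coolvds-offsite.no
#   rsync -az -e "ssh -i ~/.ssh/backup_rsa" /backup/daily \
#       backupuser@backup.coolvds-offsite.no:/home/backupuser/backups/
```

A key without a passphrase is only as safe as the file holding it, so keep it readable by the backup job's user alone and use it for nothing else.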

1. The Database Dump

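Do not just copy the raw /var/lib/mysql folder while the server is running. That all but guarantees an inconsistent copy you cannot restore. Use mysqldump with the --single-transaction flag to get a consistent snapshot without locking InnoDB tables during the backup. Note that the flag only helps with transactional engines; MyISAM tables can still change mid-dump.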

#!/bin/bash
# Without pipefail, $? below would only report gzip's exit status
set -o pipefail

# Configuration
DB_USER="root"
DB_PASS="sUp3rS3cr3tP4ss"
BACKUP_DIR="/backup/daily"
DATE=$(date +"%Y-%m-%d")

# Ensure backup directory exists
mkdir -p "$BACKUP_DIR"

# Dump all databases
# --single-transaction: Critical for InnoDB consistency without locking
# --routines: Don't forget stored procedures!
echo "Starting MySQL dump..."
mysqldump -u"$DB_USER" -p"$DB_PASS" --all-databases --single-transaction --routines --triggers | gzip > "$BACKUP_DIR/db_dump_$DATE.sql.gz"

# Check status (covers mysqldump as well as gzip thanks to pipefail)
if [ $? -eq 0 ]; then
    echo "Database backup successful."
else
    echo "Database backup FAILED!"
    exit 1
fi
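
An exit code of zero tells you the commands ran, not that the file on disk is usable. As a rough extra check, the sketch below tests the gzip archive and looks for the "-- Dump completed" comment that mysqldump writes at the end of a finished dump. verify_dump is a hypothetical helper name, and the check assumes comments were not turned off with --skip-comments.

```shell
#!/bin/bash
# Sketch: sanity-check a compressed dump. verify_dump is a hypothetical helper.
verify_dump() {
    local dump="$1"
    # gzip -t reads the whole archive and fails on truncation or corruption
    gzip -t "$dump" 2>/dev/null || { echo "corrupt archive: $dump"; return 1; }
    # A finished mysqldump run ends with a "-- Dump completed" comment line
    # (assumption: comments were not disabled with --skip-comments)
    if gunzip -c "$dump" | tail -n 5 | grep -q -e '-- Dump completed'; then
        echo "dump looks complete: $dump"
    else
        echo "dump ended early: $dump"
        return 1
    fi
}

# Usage: verify_dump "/backup/daily/db_dump_$(date +%Y-%m-%d).sql.gz" || exit 1
```

This is cheap enough to run every night, but it is no substitute for an actual restore test.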

2. The Filesystem Sync

Incremental backups are key. You don't want to transfer 50GB of static images every night if only three changed. This is where rsync shines.

# Remote server details (Set up SSH keys first!)
REMOTE_USER="backupuser"
REMOTE_HOST="backup.coolvds-offsite.no"
REMOTE_DIR="/home/backupuser/backups/"

# Sync command
# -a: Archive mode (preserves permissions, times, owners)
# -v: Verbose
# -z: Compress during transfer (saves bandwidth)
# --delete: Removes files on destination that no longer exist on source (Careful with this!)

rsync -avz --delete "$BACKUP_DIR" /var/www/html "$REMOTE_USER@$REMOTE_HOST:$REMOTE_DIR"

Pro Tip: Always exclude the cache folders. There is no point in backing up var/cache or tmp directories. Use --exclude 'cache/' in your rsync command to save bandwidth and storage costs.
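
Because --delete mirrors removals as well as additions, it is worth previewing a run before trusting it in cron. Here is a minimal sketch using rsync's dry-run flag together with the cache exclusion; preview_sync is a hypothetical wrapper name, and the paths in the usage line are the ones from this article.

```shell
#!/bin/bash
# Sketch: preview a mirror run. preview_sync is a hypothetical wrapper.
preview_sync() {
    local src="$1" dest="$2"
    # -n (--dry-run) reports planned transfers and deletions without changing anything
    rsync -avzn --delete --exclude 'cache/' "$src" "$dest"
}

# Usage:
#   preview_sync /var/www/html backupuser@backup.coolvds-offsite.no:/home/backupuser/backups/
```

Lines starting with "deleting" in the output are files that the real run would remove from the destination. If that list surprises you, fix the source path before dropping the -n.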

Data Sovereignty: The Norwegian Advantage

Since the Snowden leaks last year, data sovereignty has moved from a legal buzzword to a critical business requirement. If you are hosting sensitive customer data, relying on US-based cloud giants puts you under the jurisdiction of the Patriot Act.

This is where local topology matters. Norway's Personopplysningsloven (Personal Data Act) and the oversight of Datatilsynet offer a robust privacy framework.

By hosting on CoolVDS, your data resides physically in Oslo. We peer directly at NIX. This offers two distinct advantages:

  1. Latency: Your offsite backup transfer to another Norwegian endpoint stays within the country's fiber ring, which keeps round-trip times low and backup windows short.
  2. Legal Clarity: Your data never crosses borders, keeping your compliance officer happy.

Testing: The Fire Drill

A backup you haven't restored is just a rumor. I dedicate the first Monday of every month to "Recovery Drills." I spin up a fresh KVM instance on CoolVDS (takes about 55 seconds), and attempt to restore the service from the previous night's backup scripts.

Here is a snippet of a restoration check script I run on the fresh node:

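#!/bin/bash
set -o pipefail

# Credentials are assumed to live in ~/.my.cnf ([client] section),
# so mysql does not prompt for a password on every table

# Unpack database
gunzip < db_dump_2014-09-24.sql.gz | mysql || exit 1

# Verify table integrity
# This lists every base table as `db`.`table` and runs CHECK TABLE on each
QUERY="SELECT CONCAT('\`', table_schema, '\`.\`', table_name, '\`') FROM information_schema.tables WHERE table_type = 'BASE TABLE' AND table_schema NOT IN ('information_schema', 'performance_schema')"
mysql -N -e "$QUERY" | while read -r table; do
    mysql -e "CHECK TABLE $table;"
done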

If you see OK across the board, you sleep easy. If you see Corrupt, you thank the gods you found it on a test node and not during a production outage.
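
CHECK TABLE only tells you about the restored copy. To catch an archive that was damaged in transit or at rest, one option is to write a checksum manifest next to the dumps on the source and verify it on the test node. The sketch below assumes GNU coreutils' sha256sum; the function names and the SHA256SUMS filename are my own placeholders.

```shell
#!/bin/bash
# Sketch: record and verify checksums for the dump files.
# write_manifest / check_manifest are illustrative names.
write_manifest() {
    local dir="$1"
    # One SHA-256 line per dump, with paths relative to the backup directory
    ( cd "$dir" && sha256sum -- *.sql.gz > SHA256SUMS )
}

check_manifest() {
    local dir="$1"
    # --quiet prints only files that fail; exit status is non-zero on any mismatch
    ( cd "$dir" && sha256sum --check --quiet SHA256SUMS )
}

# Usage: write_manifest /backup/daily   (on the source, before rsync)
#        check_manifest /path/to/restored/daily   (on the test node)
```

Since the manifest travels with the dumps, a mismatch on the far side points at the transfer or the storage, not at MySQL.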

The Hardware Reality Check

Software redundancy is vital, but hardware performance dictates recovery speed (RTO - Recovery Time Objective). Many VPS providers oversell their nodes, packing hundreds of OpenVZ containers onto a single spinning HDD array. When everyone tries to back up at 3:00 AM, the I/O wait shoots up, and your site crawls.

Feature          Budget Hosting (OpenVZ)     CoolVDS (KVM)
---------------  --------------------------  ----------------------------
Isolation        Shared Kernel (Insecure)    Full Hardware Virtualization
Storage          SATA HDD (Shared)           Pure SSD RAID-10
Swap Usage       Often unavailable           Dedicated Swap Partition
Noisy Neighbors  High Impact                 Minimal Impact

We use KVM virtualization exclusively. This means your RAM is your RAM, and your kernel is your kernel. Combined with pure SSD storage, restoration tasks that used to take hours on spinning rust now take minutes.
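
"Hours" versus "minutes" is something you can measure on your own data during a drill. Here is a small sketch that times one restore step with bash's built-in SECONDS counter; timed_step is a hypothetical helper, and the restore command in the usage line assumes MySQL credentials come from ~/.my.cnf.

```shell
#!/bin/bash
# Sketch: time one restore step. timed_step is a hypothetical helper.
timed_step() {
    local label="$1"; shift
    local start=$SECONDS          # bash counter of whole seconds since shell start
    "$@"
    local status=$?
    echo "$label: $((SECONDS - start))s (exit $status)"
    return $status
}

# Usage during a drill:
#   timed_step "db restore" sh -c 'gunzip < db_dump_2014-09-24.sql.gz | mysql'
```

Log the numbers from each drill and you have a measured RTO instead of a guessed one.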

Final Thoughts

Disaster recovery isn't a product you buy; it's a process you enforce. But having the right infrastructure underneath that process makes all the difference. Don't wait for the smoke to appear.

Secure your infrastructure on a platform that respects data integrity and raw performance. Deploy your Disaster Recovery site on a CoolVDS SSD instance today.