Disaster Recovery in 2019: If You Can't Script It, You Can't Survive It

I recently watched a senior sysadmin weep. Not figuratively. He was literally crying in a server room (well, on a Zoom call, given the times) because rm -rf /var/lib/mysql had been run in the wrong terminal window. He thought he had backups. What he actually had were six-month-old snapshots and a corruption-riddled dump file that hadn't been tested since 2017.

If your disaster recovery (DR) plan is a PDF document stored on the same file server that just caught fire, you don't have a plan. You have a suicide note.

In the Norwegian hosting market, where we pride ourselves on stability and strict adherence to GDPR, losing customer data isn't just an operational failure; it's a legal catastrophe involving Datatilsynet. Whether you are running a Magento cluster or a simple WordPress stack, the principles of survival remain the same: Redundancy, Automation, and Verification.

The Myth of the "Snapshot"

Let's clear the air immediately. A VPS snapshot is not a backup. I repeat: A snapshot is not a backup.

Snapshots are excellent for short-term rollback points before you run apt-get dist-upgrade or mess with your iptables. But if the underlying storage array fails, or if filesystem corruption occurs at the block level, that snapshot is just as dead as your production data. At CoolVDS, we use KVM virtualization which allows for robust snapshots, but we will be the first to tell you: move your data off-site.

| Feature | Snapshot | Offsite Backup |
|---|---|---|
| Creation Speed | Seconds (copy-on-write) | Minutes/hours (network dependent) |
| Data Consistency | Crash-consistent (usually) | Application-consistent (if scripted right) |
| Protection Scope | OS/config errors | Hardware failure, data center loss |
| Storage Location | Same infrastructure | Remote (e.g., separate CoolVDS instance) |

Step 1: Database Consistency is King

Copying database files while the server is running is a recipe for corruption. You need a consistent dump. For MySQL/MariaDB (still the kings of the web in 2019), use --single-transaction so InnoDB tables are dumped from a consistent snapshot without locking them and taking your site down during the backup. (Note: this only guarantees consistency for InnoDB; MyISAM tables still require locks.)

Here is the bare minimum command you should be running:

mysqldump -u root -p --all-databases --single-transaction --quick --lock-tables=false > full_backup.sql

However, putting the password directly on the command line (e.g. -pMySecret) is a security risk: it is visible to every local user via ps aux. The interactive -p prompt, meanwhile, won't work from cron. Use a .my.cnf credentials file instead.
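A minimal .my.cnf looks like the sketch below. The user and password values are placeholders; in production you would write this to /root/.my.cnf (here it goes to a temp file for illustration) and lock it down to mode 600 so only root can read it:

```shell
# Sketch: create a client credentials file (install as /root/.my.cnf in production)
CNF="$(mktemp)"
cat > "$CNF" <<'EOF'
[client]
user=root
password=YourStrongPasswordHere
EOF
chmod 600 "$CNF"   # credentials must not be world-readable
ls -l "$CNF"
```

mysqldump and friends then pick it up via --defaults-extra-file=/root/.my.cnf, keeping the password out of the process list entirely.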

Pro Tip: Always compress your text-based SQL dumps. The compression ratio on text is massive, often reducing a 10GB database to 500MB, saving you bandwidth costs and transfer time.
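That claim is easy to verify yourself. Repetitive SQL text such as INSERT statements compresses extremely well; exact ratios vary with your data, but a quick local demo shows the effect:

```shell
# Generate 5,000 SQL-like lines, then compare raw vs gzipped size
RAW="$(mktemp)"
for i in $(seq 1 5000); do
    echo "INSERT INTO orders VALUES ($i, 'pending', '2019-03-01');"
done > "$RAW"
gzip -k "$RAW"            # -k keeps the original so we can compare
ls -l "$RAW" "$RAW.gz"    # the .gz file is a small fraction of the raw file
```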

Here is a robust Bash function to handle the dump and compression:


# Function to perform a consistent MySQL backup
backup_mysql() {
    local DATE BACKUP_DIR FILE_NAME
    DATE=$(date +%F_%H-%M-%S)
    BACKUP_DIR="/var/backups/sql"
    FILE_NAME="db_dump_$DATE.sql.gz"

    mkdir -p "$BACKUP_DIR"

    # Using --defaults-extra-file to hide credentials
    # Ensure /root/.my.cnf exists with a [client] section
    mysqldump --defaults-extra-file=/root/.my.cnf \
        --all-databases \
        --single-transaction \
        --quick \
        --events \
        --routines \
        --triggers | gzip > "$BACKUP_DIR/$FILE_NAME"

    # $? alone would only reflect gzip; check mysqldump's exit code via PIPESTATUS
    if [ "${PIPESTATUS[0]}" -eq 0 ]; then
        echo "[SUCCESS] Database backup created: $FILE_NAME"
        return 0
    else
        echo "[ERROR] Database backup failed!"
        return 1
    fi
}
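A backup function that only runs when you remember to run it is no backup at all, so schedule it from cron. The script path below is an assumption; adjust it to wherever you save the function. The sketch writes the entry to a temp file for illustration (in production it would live at /etc/cron.d/mysql-backup):

```shell
# Sketch of a cron drop-in: nightly run at 02:15 (assumed script path)
CRON_FILE="$(mktemp)"   # production target: /etc/cron.d/mysql-backup
echo '15 2 * * * root /usr/local/bin/backup_mysql.sh >> /var/log/mysql_backup.log 2>&1' > "$CRON_FILE"
cat "$CRON_FILE"
```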

Step 2: The Filesystem & Incremental Transfers

Once your database is safe, you need your static files (images, configs, uploads). Transferring the whole server every night is inefficient and eats up I/O. This is where rsync shines.

If you aren't using hard links for versioning, you are wasting space. Tools like rsnapshot automate this, but understanding the raw command is vital.

rsync -avz --delete /var/www/html/ remote_user@backup_server:/backups/current/

The --delete flag is dangerous. It mirrors the source perfectly, meaning if you delete a file on production, it vanishes from the backup. This is why we need rotation.
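Hard-link rotation is what rsnapshot (and rsync's --link-dest option) does under the hood: files unchanged since the previous snapshot are hard links to it rather than fresh copies, so each nightly "full" snapshot costs almost no extra disk. A local sketch with cp -al shows the mechanism:

```shell
# Two "snapshots" of the same tree: the second is hard-linked, not copied
SRC="$(mktemp -d)"; SNAPS="$(mktemp -d)"
echo "hello" > "$SRC/index.html"
cp -a  "$SRC" "$SNAPS/daily.0"             # first snapshot: full copy
cp -al "$SNAPS/daily.0" "$SNAPS/daily.1"   # second snapshot: hard links only
# Same inode number => both snapshots share the same physical data blocks
stat -c %i "$SNAPS/daily.0/index.html" "$SNAPS/daily.1/index.html"
```

Over the network, rsync --link-dest=../daily.0 against the new target directory achieves the same thing, which is exactly how rsnapshot builds its daily.0, daily.1, ... hierarchy.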

Step 3: The "Battle-Hardened" Backup Script

Here is a complete, production-grade backup script. It dumps the database, archives the webroot, encrypts the archive (because GDPR requires encryption at rest), and ships it to a remote storage server. This assumes you have SSH keys set up between your CoolVDS production instance and your backup node.


#!/bin/bash

# CONFIGURATION
PROJECT_NAME="magento_production"
BACKUP_ROOT="/backup/staging"
REMOTE_HOST="user@192.168.1.50"
REMOTE_PATH="/mnt/storage/backups/$PROJECT_NAME"
RETENTION_DAYS=7
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
LOG_FILE="/var/log/backup_script.log"

# LOGGING FUNCTION
log() {
    echo "[$(date +'%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}

log "Starting Disaster Recovery Backup for $PROJECT_NAME"

# 1. PREPARE STAGING
mkdir -p "$BACKUP_ROOT/$TIMESTAMP"

# 2. DATABASE DUMP
log "Dumping Database..."
mysqldump --defaults-extra-file=/root/.my.cnf --single-transaction --all-databases | gzip > "$BACKUP_ROOT/$TIMESTAMP/db.sql.gz"

if [ "${PIPESTATUS[0]}" -ne 0 ]; then
    log "CRITICAL: Database dump failed. Aborting."
    exit 1
fi

# 3. FILESYSTEM ARCHIVE
log "Archiving Filesystem..."
tar -czf "$BACKUP_ROOT/$TIMESTAMP/files.tar.gz" -C /var/www/html .

# 4. ENCRYPTION (GPG)
# We use a symmetric cipher for simplicity here, but public key is better
log "Encrypting Archive..."
tar -cf - -C "$BACKUP_ROOT" "$TIMESTAMP" | gpg --batch --symmetric --cipher-algo AES256 \
    --passphrase-file /root/.backup_pass -o "$BACKUP_ROOT/$PROJECT_NAME-$TIMESTAMP.tar.gpg"

# 5. OFFSITE TRANSFER
log "Shipping to Remote Storage..."
rsync -av --remove-source-files "$BACKUP_ROOT/$PROJECT_NAME-$TIMESTAMP.tar.gpg" "$REMOTE_HOST:$REMOTE_PATH/"

if [ $? -eq 0 ]; then
    log "Transfer Successful."
    # CLEANUP LOCAL STAGING
    rm -rf "${BACKUP_ROOT:?}/$TIMESTAMP"
    # 6. ENFORCE RETENTION ON THE REMOTE SIDE
    ssh "$REMOTE_HOST" "find '$REMOTE_PATH' -name '*.tar.gpg' -mtime +$RETENTION_DAYS -delete"
else
    log "CRITICAL: Rsync failed. Local backup kept at $BACKUP_ROOT."
    exit 1
fi

log "Backup Complete."

Step 4: Infrastructure as Code (IaC)

Having the data is only half the battle. If your server melts, how fast can you provision a new one? In 2019, manually installing packages is obsolete. We use Ansible to define the state of our servers.

Imagine your CoolVDS instance goes dark. You spin up a new NVMe instance in the control panel (taking about 55 seconds), and then you run this Ansible playbook against the fresh IP:


---
- hosts: disaster_recovery
  become: yes
  vars:
    # In production, pull this from ansible-vault rather than plaintext
    mysql_root_pass: "super_secure_password"

  tasks:
    - name: Install LEMP Stack Requirements
      apt:
        name: ['nginx', 'mysql-server', 'php-fpm', 'php-mysql', 'unzip']
        state: present
        update_cache: yes

    - name: Configure Nginx VHost
      template:
        src: templates/vhost.j2
        dest: /etc/nginx/sites-available/default
      notify: restart_nginx

    - name: Secure MySQL Installation
      mysql_user:
        name: root
        password: "{{ mysql_root_pass }}"
        host_all: yes
        priv: '*.*:ALL,GRANT'
        # Fresh Debian/Ubuntu installs authenticate root via the socket
        login_unix_socket: /var/run/mysqld/mysqld.sock

    - name: Create Web Root
      file:
        path: /var/www/html/public
        state: directory
        owner: www-data
        group: www-data
        mode: '0755'

  handlers:
    - name: restart_nginx
      service:
        name: nginx
        state: restarted

The Norwegian Context: Latency and Law

Why does geography matter in DR? Two reasons: Latency and Legality.

If your primary audience is in Oslo or Bergen, restoring a 500GB backup from a server in Virginia, USA, is going to be painfully slow. You are limited by the speed of light and trans-Atlantic fiber congestion. Restoring from a secondary CoolVDS location within Scandinavia lets you saturate your available bandwidth, getting your business back online in minutes, not hours.

Furthermore, under GDPR (Article 32), you are required to have the ability to restore the availability and access to personal data in a timely manner. If you are storing backups outside the EEA, you are entering a legal minefield regarding data transfer mechanisms. Keep it local. Keep it safe.

Verifying the Integrity

A backup you haven't restored is just a file taking up space. You must automate the verification. Once a week, your script should fetch the latest backup, restore it to a temporary Docker container or a staging VPS, and check if the database responds.

mysqlcheck --defaults-extra-file=/root/.my.cnf --all-databases --check-upgrade --auto-repair
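Short of a full restore, you can at least catch truncated or corrupt dumps cheaply: gzip archives can self-verify, and mysqldump ends every successful dump with a "-- Dump completed" footer. The sketch below generates a toy dump so it runs anywhere; point it at your real db.sql.gz in practice:

```shell
# Verify a gzipped dump: archive integrity plus the mysqldump completion footer
DUMP="$(mktemp --suffix=.sql.gz)"
printf 'CREATE TABLE t (id INT);\n-- Dump completed on 2019-03-01  2:15:00\n' | gzip > "$DUMP"
gunzip -t "$DUMP" && echo "gzip integrity OK"
zcat "$DUMP" | tail -n 1 | grep -q '^-- Dump completed' && echo "dump footer OK"
```

A dump that fails either check should page someone immediately, not sit quietly in your backup directory until restore day.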

Don't wait for a crisis to find out your GPG key expired or your SQL dump was truncated.

Final Thoughts

Hardware fails. Software has bugs. Humans make typos. This is the reality of our industry. The difference between a minor hiccup and a business-ending event is the quality of your scripts and the reliability of your infrastructure partner.

CoolVDS offers the raw NVMe performance required to perform these backups quickly without locking your I/O, and the stability to ensure we are there when you need to restore. But the scripts? Those are up to you. Start writing.

Ready to test your DR plan? Deploy a sandbox instance on CoolVDS today and see how fast you can break—and fix—your stack.