Disaster Recovery in 2019: If You Can't Script It, You Can't Survive It
I recently watched a Senior SysAdmin weep. Not figuratively. He was literally crying in a server room (well, a Zoom call, given the times) because an rm -rf /var/lib/mysql was run in the wrong terminal window. He thought he had backups. What he actually had were snapshots from six months ago and a corruption-riddled dump file that hadn't been tested since 2017.
If your disaster recovery (DR) plan is a PDF document stored on the same file server that just caught fire, you don't have a plan. You have a suicide note.
In the Norwegian hosting market, where we pride ourselves on stability and strict adherence to GDPR, losing customer data isn't just an operational failure; it's a legal catastrophe involving Datatilsynet. Whether you are running a Magento cluster or a simple WordPress stack, the principles of survival remain the same: Redundancy, Automation, and Verification.
The Myth of the "Snapshot"
Let's clear the air immediately. A VPS snapshot is not a backup. I repeat: A snapshot is not a backup.
Snapshots are excellent for short-term rollback points before you run apt-get dist-upgrade or mess with your iptables. But if the underlying storage array fails, or if filesystem corruption occurs at the block level, that snapshot is just as dead as your production data. At CoolVDS, we use KVM virtualization which allows for robust snapshots, but we will be the first to tell you: move your data off-site.
| Feature | Snapshot | Offsite Backup |
|---|---|---|
| Creation Speed | Seconds (Copy-on-Write) | Minutes/Hours (Network dependent) |
| Data Consistency | Crash-consistent (usually) | Application-consistent (if scripted right) |
| Protection Scope | OS/Config errors | Hardware failure, Data Center loss |
| Storage Location | Same Infrastructure | Remote (e.g., separate CoolVDS instance) |
Step 1: Database Consistency is King
Copying files while the database server is running is a recipe for corruption. You need a consistent dump. For MySQL/MariaDB (still the kings of the web in 2019), you must use --single-transaction, which takes a consistent snapshot of your InnoDB tables without locking them, so your site stays up during the backup. (Note: this only protects InnoDB; legacy MyISAM tables still require locks.)
Here is the bare minimum command you should be running:
mysqldump -u root -p --all-databases --single-transaction --quick --lock-tables=false > full_backup.sql
However, typing the password inline (as in -pYourPassword) is a security risk, since it is visible to every user via ps aux, and the interactive prompt above cannot be used from cron. Instead, use a .my.cnf file.
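A minimal /root/.my.cnf looks like this (the password shown is obviously a placeholder; the file must be readable only by root, e.g. chmod 600):

```ini
[client]
user=root
password=YourActualPassword
```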
Pro Tip: Always compress your text-based SQL dumps. The compression ratio on text is massive, often reducing a 10GB database to 500MB, saving you bandwidth costs and transfer time.
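You can see the effect yourself in seconds. This is purely illustrative: it generates a dump-like text file of repetitive INSERT statements and compares sizes.

```shell
# Generate a fake SQL dump: repetitive text, like real dumps
RAW=$(mktemp)
for i in $(seq 1 5000); do
    echo "INSERT INTO orders VALUES ($i, 'pending', '2019-06-01');" >> "$RAW"
done

# Compress a copy and compare byte counts
gzip -c "$RAW" > "$RAW.gz"
RAW_SIZE=$(wc -c < "$RAW")
GZ_SIZE=$(wc -c < "$RAW.gz")
echo "raw: $RAW_SIZE bytes, gzipped: $GZ_SIZE bytes"
```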
Here is a robust Bash function to handle the dump and compression:
# Function to perform a consistent MySQL backup
backup_mysql() {
    local DATE
    DATE=$(date +%F_%H-%M-%S)
    local BACKUP_DIR="/var/backups/sql"
    local FILE_NAME="db_dump_$DATE.sql.gz"
    mkdir -p "$BACKUP_DIR"
    # Using --defaults-extra-file to hide credentials.
    # Ensure /root/.my.cnf exists with a [client] section.
    mysqldump --defaults-extra-file=/root/.my.cnf \
        --all-databases \
        --single-transaction \
        --quick \
        --events \
        --routines \
        --triggers | gzip > "$BACKUP_DIR/$FILE_NAME"
    # $? alone would report gzip's exit status; PIPESTATUS[0] checks mysqldump itself
    if [ "${PIPESTATUS[0]}" -eq 0 ]; then
        echo "[SUCCESS] Database backup created: $FILE_NAME"
        return 0
    else
        echo "[ERROR] Database backup failed!"
        return 1
    fi
}
Step 2: The Filesystem & Incremental Transfers
Once your database is safe, you need your static files (images, configs, uploads). Transferring the whole server every night is inefficient and eats up I/O. This is where rsync shines.
If you aren't using hard links for versioning, you are wasting space. Tools like rsnapshot automate this, but understanding the raw command is vital.
rsync -avz --delete /var/www/html/ remote_user@backup_server:/backups/current/
The --delete flag is dangerous. It mirrors the source perfectly, meaning if you delete a file on production, it vanishes from the backup. This is why we need rotation.
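The trick rsnapshot uses can be sketched by hand with --link-dest: each run gets its own dated directory, and files unchanged since the previous run are hard-linked instead of copied, so they cost no extra disk space. The paths below are local stand-ins for illustration; in production the destination would be your remote backup node.

```shell
SRC=$(mktemp -d)          # stand-in for /var/www/html
DEST=$(mktemp -d)         # stand-in for the backup volume
echo "unchanged content" > "$SRC/index.html"

# Day 1: full copy
rsync -a "$SRC/" "$DEST/2019-06-01/"

# Day 2: files identical to day 1 become hard links, not second copies
rsync -a --delete --link-dest="$DEST/2019-06-01" "$SRC/" "$DEST/2019-06-02/"

# Same inode number on both days proves there is only one physical copy
stat -c %i "$DEST/2019-06-01/index.html" "$DEST/2019-06-02/index.html"
```

Deleting a file from production now only removes it from future dated directories; yesterday's copy survives, which is exactly the protection a bare --delete mirror lacks.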
Step 3: The "Battle-Hardened" Backup Script
Here is a complete, production-grade backup script. It dumps the database, archives the webroot, encrypts the archive (because GDPR requires encryption at rest), and ships it to a remote storage server. This assumes you have SSH keys set up between your CoolVDS production instance and your backup node.
#!/bin/bash
# CONFIGURATION
PROJECT_NAME="magento_production"
BACKUP_ROOT="/backup/staging"
REMOTE_DEST="user@192.168.1.50:/mnt/storage/backups/$PROJECT_NAME"
RETENTION_DAYS=7
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
LOG_FILE="/var/log/backup_script.log"

# LOGGING FUNCTION
log() {
    echo "[$(date +'%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}

log "Starting Disaster Recovery Backup for $PROJECT_NAME"

# 1. PREPARE STAGING
mkdir -p "$BACKUP_ROOT/$TIMESTAMP"

# 2. DATABASE DUMP
log "Dumping Database..."
mysqldump --defaults-extra-file=/root/.my.cnf --single-transaction --all-databases | gzip > "$BACKUP_ROOT/$TIMESTAMP/db.sql.gz"
if [ "${PIPESTATUS[0]}" -ne 0 ]; then
    log "CRITICAL: Database dump failed. Aborting."
    exit 1
fi

# 3. FILESYSTEM ARCHIVE
log "Archiving Filesystem..."
tar -czf "$BACKUP_ROOT/$TIMESTAMP/files.tar.gz" -C /var/www/html .

# 4. ENCRYPTION (GPG)
# We use a symmetric cipher for simplicity here, but public key is better.
# --batch and --pinentry-mode loopback are required for --passphrase-file
# to work non-interactively with GnuPG 2.1+.
log "Encrypting Archive..."
tar -cf - -C "$BACKUP_ROOT" "$TIMESTAMP" | gpg --symmetric --cipher-algo AES256 --batch --pinentry-mode loopback --passphrase-file /root/.backup_pass -o "$BACKUP_ROOT/$PROJECT_NAME-$TIMESTAMP.tar.gpg"

# 5. OFFSITE TRANSFER
log "Shipping to Remote Storage..."
rsync -av --remove-source-files "$BACKUP_ROOT/$PROJECT_NAME-$TIMESTAMP.tar.gpg" "$REMOTE_DEST"
if [ $? -eq 0 ]; then
    log "Transfer Successful."
    # CLEANUP LOCAL STAGING
    rm -rf "$BACKUP_ROOT/$TIMESTAMP"
else
    log "CRITICAL: Rsync failed. Local backup kept at $BACKUP_ROOT."
    exit 1
fi
log "Backup Complete."
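One gap worth closing: RETENTION_DAYS is defined in the script above but never enforced, so the backup node will eventually fill up. A find-based pruning step fixes that. The demo below runs against a throwaway directory so you can see the behavior safely:

```shell
STORE=$(mktemp -d)                  # stand-in for the remote backup directory
RETENTION_DAYS=7

touch -d "10 days ago" "$STORE/old.tar.gpg"   # GNU touch: backdate the mtime
touch "$STORE/fresh.tar.gpg"

# Delete archives whose mtime is older than the retention window
find "$STORE" -name '*.tar.gpg' -mtime +"$RETENTION_DAYS" -delete
ls "$STORE"
```

In production you would run the same find over SSH against the real path from the script, e.g. ssh user@192.168.1.50 "find /mnt/storage/backups/magento_production -name '*.tar.gpg' -mtime +7 -delete" (assuming your backup user is allowed to run remote commands).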
Step 4: Infrastructure as Code (IaC)
Having the data is only half the battle. If your server melts, how fast can you provision a new one? In 2019, manually installing packages is obsolete. We use Ansible to define the state of our servers.
Imagine your CoolVDS instance goes dark. You spin up a new NVMe instance in the control panel (taking about 55 seconds), and then you run this Ansible playbook against the fresh IP:
---
- hosts: disaster_recovery
  become: yes
  vars:
    mysql_root_pass: "super_secure_password"
  tasks:
    - name: Install LEMP Stack Requirements
      apt:
        name: ['nginx', 'mysql-server', 'php-fpm', 'php-mysql', 'unzip']
        state: present
        update_cache: yes
    - name: Configure Nginx VHost
      template:
        src: templates/vhost.j2
        dest: /etc/nginx/sites-available/default
      notify: restart_nginx
    - name: Secure MySQL Installation
      mysql_user:
        name: root
        password: "{{ mysql_root_pass }}"
        host_all: yes
        priv: '*.*:ALL,GRANT'
    - name: Create Web Root
      file:
        path: /var/www/html/public
        state: directory
        owner: www-data
        group: www-data
        mode: 0755
  handlers:
    - name: restart_nginx
      service:
        name: nginx
        state: restarted
The Norwegian Context: Latency and Law
Why does geography matter in DR? Two reasons: Latency and Legality.
If your primary audience is in Oslo or Bergen, restoring a 500GB backup from a server in Virginia, USA is going to be painfully slow. You are limited by the speed of light and trans-Atlantic fiber congestion. Restoring from a secondary CoolVDS location within Scandinavia lets you saturate your local bandwidth, getting your business back online in minutes, not hours.
Furthermore, under GDPR (Article 32), you are required to have the ability to restore the availability and access to personal data in a timely manner. If you are storing backups outside the EEA, you are entering a legal minefield regarding data transfer mechanisms. Keep it local. Keep it safe.
Verifying the Integrity
A backup you haven't restored is just a file taking up space. You must automate the verification. Once a week, your script should fetch the latest backup, restore it to a temporary Docker container or a staging VPS, and check if the database responds.
mysqlcheck --defaults-extra-file=/root/.my.cnf --all-databases --check-upgrade --auto-repair
Don't wait for a crisis to find out your GPG key expired or your SQL dump was truncated.
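A full restore drill is the gold standard, but a cheap first line of defense catches the truncated-dump failure mode directly: gzip stores a CRC at the end of the stream, so gunzip -t fails immediately on an incomplete file. A sketch, using a deliberately truncated dump:

```shell
# Build a tiny gzipped "dump", then truncate a copy of it
DUMP=$(mktemp --suffix=.sql.gz)
printf 'CREATE TABLE t (id INT);\n' | gzip > "$DUMP"
head -c 10 "$DUMP" > "$DUMP.broken"   # simulate a cut-off transfer

# -t tests integrity without extracting anything
gunzip -t "$DUMP" && echo "intact dump: OK"
gunzip -t "$DUMP.broken" 2>/dev/null || echo "broken dump: caught"
```

Run this check on the backup node right after every transfer; it costs seconds and would have saved our weeping SysAdmin from discovering the truncation two years too late.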
Final Thoughts
Hardware fails. Software has bugs. Humans make typos. This is the reality of our industry. The difference between a minor hiccup and a business-ending event is the quality of your scripts and the reliability of your infrastructure partner.
CoolVDS offers the raw NVMe performance required to perform these backups quickly without locking your I/O, and the stability to ensure we are there when you need to restore. But the scripts? Those are up to you. Start writing.
Ready to test your DR plan? Deploy a sandbox instance on CoolVDS today and see how fast you can break—and fix—your stack.