Automated Backups or Die Trying: A SysAdmin’s Guide to Data Survival
There are two types of system administrators: those who have lost data, and those who will. I learned this the hard way in 2008, staring at a corrupted filesystem on a Friday night while a client screamed about their missing e-commerce orders. RAID 10 saved the hardware uptime, but it didn't save the data when the file table got corrupted. It was a painful lesson.
If you are managing a VPS in Norway, relying on your host's snapshots is not a strategy; it's a gamble. A true disaster recovery plan requires automated, offsite, and verified backups. In this guide, we will architect a bulletproof backup solution using standard Linux tools available on CentOS 5 and Ubuntu 10.04 LTS.
The "RAID is Not Backup" Mantra
Let's clear the air immediately. Redundant Arrays of Independent Disks (RAID) provide redundancy against disk failure. They do not protect against:
- Human error (rm -rf /var/www/).
- File system corruption.
- Malicious intrusions or SQL injection attacks.
- Catastrophic data center failure (fire, flood).
For our clients in Oslo and the broader Nordic region, data integrity isn't just a technical requirement; it's a legal one. Under the Personal Data Act (Personopplysningsloven), you have a responsibility to secure user data. If you lose customer data because you relied solely on a local RAID array, the Datatilsynet will not be lenient.
Step 1: The Database Dump (MySQL)
You cannot simply copy the raw /var/lib/mysql directory while the server is running. Doing so results in inconsistent data. You need a logical dump. We use mysqldump with the --single-transaction flag to avoid locking InnoDB tables for too long, which keeps your site responsive.
Create a script at /root/scripts/db_backup.sh:
#!/bin/bash
# Configuration
DB_USER="root"
DB_PASS="StrongPassword123!"
BACKUP_DIR="/backup/mysql"
DATE=$(date +%F_%H-%M)

# Ensure backup directory exists
mkdir -p "$BACKUP_DIR"

# Get list of databases (skip the header row and information_schema)
DATABASES=$(mysql -u "$DB_USER" -p"$DB_PASS" -e "SHOW DATABASES;" | grep -Ev "(Database|information_schema)")

for db in $DATABASES; do
    echo "Dumping database: $db"
    mysqldump -u "$DB_USER" -p"$DB_PASS" --single-transaction --quick --lock-tables=false "$db" | gzip > "$BACKUP_DIR/$db-$DATE.sql.gz"
done

# Delete backups older than 7 days
find "$BACKUP_DIR" -name "*.gz" -type f -mtime +7 -exec rm {} \;
This script iterates through all your databases, dumps them safely, compresses them to save space, and rotates old files. Efficiency is key here. On CoolVDS's high-performance SAS storage arrays, this operation completes rapidly, minimizing I/O wait times.
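A dump you cannot decompress is not a backup. Here is a minimal verification sketch (the function name and the hard-coded directory are my additions, chosen to match the script above) that tests every archive with gunzip -t:

```shell
#!/bin/bash
# Verify that every compressed dump is a readable gzip archive.
verify_backups() {
    local dir="$1"
    local status=0
    for f in "$dir"/*.gz; do
        [ -e "$f" ] || continue            # directory empty: nothing to check
        if gunzip -t "$f" 2>/dev/null; then
            echo "OK:      $f"
        else
            echo "CORRUPT: $f"
            status=1
        fi
    done
    return $status
}

verify_backups /backup/mysql
```

To go one step further, restore a dump into a scratch database now and then (zcat /backup/mysql/yourdb-<date>.sql.gz | mysql yourdb_test). A backup you have never restored is a hypothesis, not a plan.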
Step 2: File System Synchronization (Rsync)
For static files (images, configuration, code), rsync is the gold standard. It only transfers the differences between files, saving massive amounts of bandwidth—crucial if you are pushing data from a VPS in Oslo to a secondary storage server in Germany or the UK.
We need to push these backups offsite. Assuming you have a remote storage server set up with SSH keys for passwordless login:
#!/bin/bash
SOURCE_DIR="/var/www/html"
REMOTE_USER="backupuser"
REMOTE_HOST="backup.coolvds.net"
REMOTE_DIR="/home/backupuser/backups/"

rsync -avz --delete -e "ssh -p 22" "$SOURCE_DIR" "$REMOTE_USER@$REMOTE_HOST:$REMOTE_DIR"
Flags explained:
- -a: Archive mode (preserves permissions, owners, and groups).
- -v: Verbose, so you can see what happened in the logs.
- -z: Compress file data during the transfer.
- --delete: Delete files on the remote side if they were deleted at the source (maintains an exact mirror).
Step 3: Encryption for Compliance
If you are handling sensitive personal data, sending it over the wire unencrypted is negligence. While SSH provides transport encryption, the files sit unencrypted on the remote server. For true peace of mind, tar and encrypt the archive before transfer using GPG.
tar -czf - /backup/mysql | gpg --encrypt --recipient admin@yourdomain.com -o /tmp/backup_encrypted.tar.gpg
Pro Tip: Always keep your GPG private key on a local workstation, never on the server itself. If the server is compromised, the attacker can steal the data, but they cannot read your encrypted backups.
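A backup you cannot decrypt is as dead as one you never made, so rehearse the full cycle on a scratch machine. The sketch below is self-contained: it generates a disposable key purely for the drill (GnuPG 2.1+ syntax; substitute your real recipient address and keyring in production):

```shell
#!/bin/bash
# Self-contained encrypt/decrypt drill with a throwaway keyring.
export GNUPGHOME=$(mktemp -d)
chmod 700 "$GNUPGHOME"
gpg --batch --pinentry-mode loopback --passphrase '' \
    --quick-generate-key admin@yourdomain.com default default never

work=$(mktemp -d)
mkdir -p "$work/backup/mysql"
echo "-- dump --" > "$work/backup/mysql/shop.sql"

# Encrypt, as in the article (relative paths keep tar quiet and portable)
tar -C "$work" -czf - backup/mysql | \
    gpg --batch --trust-model always --encrypt \
        -r admin@yourdomain.com -o "$work/backup_encrypted.tar.gpg"

# Decrypt and unpack into a separate restore directory
mkdir "$work/restore"
gpg --batch --decrypt "$work/backup_encrypted.tar.gpg" 2>/dev/null | \
    tar -xzf - -C "$work/restore"

cat "$work/restore/backup/mysql/shop.sql"
```

In real use the decrypt step runs on the workstation holding the private key, and you would extract into a staging directory rather than straight over live data.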
Step 4: Automation with Cron
A script that isn't run is useless. Add your jobs to the root crontab.
# Edit crontab
crontab -e
Add the following lines to run database backups every 6 hours and file syncing every night at 3 AM (when traffic is lowest):
0 */6 * * * /bin/bash /root/scripts/db_backup.sh >> /var/log/db_backup.log 2>&1
0 3 * * * /bin/bash /root/scripts/file_sync.sh >> /var/log/file_sync.log 2>&1
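One failure mode cron will not warn you about: a slow backup overlapping the next scheduled run, doubling I/O load at the worst possible moment. flock, shipped with util-linux, serializes runs. Here is a sketch of a guard you could place at the top of either script (the lock path is my choice, not from the scripts above):

```shell
#!/bin/bash
# Skip this run if a previous backup still holds the lock.
LOCK="/tmp/db_backup.lock"

exec 9>"$LOCK"
if ! flock -n 9; then
    echo "Previous backup still running; skipping this run." >&2
    exit 1
fi

# ... the rest of the backup script executes under the lock ...
echo "Lock acquired; proceeding."
```

Alternatively, keep the scripts unchanged and wrap them in the crontab itself: 0 */6 * * * flock -n /tmp/db_backup.lock /bin/bash /root/scripts/db_backup.sh >> /var/log/db_backup.log 2>&1.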
Why Infrastructure Matters
You can write the best scripts in the world, but if the underlying virtualization layer is unstable, your backup process can stall. I've seen OpenVZ containers hang during heavy I/O operations because the host node was oversold.
This is where the architecture of your provider becomes critical. At CoolVDS, we utilize KVM (Kernel-based Virtual Machine) virtualization. Unlike container-based solutions, KVM provides true hardware isolation. When your backup script hits the disk hard to compress gigabytes of logs, you aren't fighting for CPU cycles with the noisy neighbor next door. Our storage backends utilize enterprise-grade 15k RPM SAS drives and early-adoption SSD caching layers to ensure that high I/O operations don't degrade your web server's latency.
Monitoring Your Backups
Silence is not golden; it's suspicious. If your cron job fails, you need to know. A simple addition to your script can email you upon failure:
# This check must come immediately after the backup command:
# $? holds the exit status of the most recent command only.
if [ $? -ne 0 ]; then
    echo "Backup failed on $(hostname) at $(date)" | mail -s "CRITICAL: Backup Failed" admin@yourdomain.com
fi
Conclusion
In the world of systems administration, paranoia is a virtue. Data loss is an eventuality, not a possibility. By implementing these scripts today, you are purchasing insurance for your future self. Whether you are running a high-traffic Magento store or a critical email server, the principles remain the same: Consistency, Redundancy, and Automation.
Don't wait for the crash to test your recovery plan. Deploy a test instance on CoolVDS today, set up these scripts, and verify your restore process. Your data—and your sleep schedule—depend on it.