Sleep at Night: The Art of Automated Server Backups with Rsync and Cron

There are two types of system administrators: those who have lost data, and those who will. I learned this the hard way back in '04 when a RAID controller failed silently on a mail server, corrupting weeks of data. No amount of apology emails brings back a client's database. If you are manually tar-ing directories and FTP-ing them to your desktop, you are playing Russian Roulette with your infrastructure.

In the Norwegian hosting market, where we pride ourselves on stability and adherence to the Personopplysningsloven (Personal Data Act), relying on hope is not a strategy. Here is how to build a bulletproof, automated backup routine using standard tools available on any proper Linux distribution today, like CentOS 5 or Debian Lenny.

The Golden Rule: 3-2-1

Before we touch a single config file, understand the architecture. You need three copies of your data, on two different media types, with at least one copy off-site. For a Virtual Dedicated Server (VDS), this usually translates to:

  1. Live Data: The data running on your high-speed SAS or SSD production disks.
  2. Local Backup: A snapshot or compressed archive on a separate partition or secondary virtual disk.
  3. Remote Backup: Data pushed to a completely different physical location—perhaps a backup server in a different data center or an Amazon S3 bucket (if you trust the cloud).

The Toolchain: Rsync is King

Forget complex enterprise suites that cost more than your server. `rsync` is robust, bandwidth-efficient, and installed everywhere. It only transfers the deltas (changes), which is critical when you are pushing gigabytes of data over the wire.

Here is a battle-hardened script snippet for a nightly backup rotation. It uses hard links to save space, giving you "snapshots" without duplicating unchanged files.

#!/bin/bash
# /usr/local/sbin/daily_backup.sh
# Nightly rotation: hard-linked snapshots via rsync --link-dest.

SOURCE="/var/www/html/"   # trailing slash: sync the contents, not the directory itself
DEST="/backup/snapshots"
DATE=$(date +%F)

# Rotate: hard-link unchanged files against the previous snapshot if one exists
if [ -d "$DEST/current" ]; then
    rsync -a --delete --link-dest="$DEST/current" "$SOURCE" "$DEST/$DATE"
else
    rsync -a "$SOURCE" "$DEST/$DATE"
fi

# Update the "current" pointer; -sfn replaces the old symlink in place
ln -sfn "$DEST/$DATE" "$DEST/current"

# Remove snapshots older than 7 days; -mindepth 1 keeps find away from $DEST itself
find "$DEST" -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} \;

Handling the Database: MySQL is Tricky

You cannot simply copy `/var/lib/mysql` while the server is running. You will end up with corrupted tables. You need a consistent dump.

If you are running MyISAM tables, the dump must lock them for the duration. If you are using InnoDB (which you should be for data integrity), use the `--single-transaction` flag to get a consistent dump without locking your application out of its tables.

mysqldump -u root -p[password] --all-databases --single-transaction --quick > /backup/mysql_full_$(date +%F).sql

Pro Tip: Do not store your MySQL password in the script. Create a `~/.my.cnf` file with restricted permissions (chmod 600) containing your credentials. `mysqldump` will read it automatically.
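The credentials file is a standard MySQL client config; both `mysql` and `mysqldump` read the `[client]` section automatically (the password shown is obviously a placeholder):

```ini
# ~/.my.cnf -- chmod 600 this file
[client]
user=root
password=YourSecretHere
```

With this in place you can drop the `-p` flag from the dump command entirely.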

The Local Angle: Latency and Legality

Why does geography matter? Speed and law. If your off-site backup is in Texas, your restore times (and with them your Recovery Time Objective, RTO) will suffer due to latency. But more importantly, Datatilsynet (The Norwegian Data Protection Authority) keeps a close eye on where personal data flows. Under current regulations, keeping your backup data within the EEA (European Economic Area) simplifies compliance significantly.

At CoolVDS, we see this constantly. Clients deploy our KVM-based instances in Oslo because the latency to the Norwegian Internet Exchange (NIX) is practically zero. When you pipe your off-site backups to another Norwegian facility, you get gigabit speeds and zero legal headaches regarding cross-border data transfer.

Automation: Cron it or Lose it

A script that you have to run manually is worthless. Add it to your crontab: edit it with `crontab -e` and schedule it during low-traffic hours, typically 03:00 CET.

0 3 * * * /bin/bash /usr/local/sbin/daily_backup.sh >> /var/log/backup.log 2>&1

Always log the output. If the backup fails silently, you won't know until you need it, and that is a conversation you don't want to have with your boss.
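One way to make failures loud is a thin wrapper that records every run and fires an alert on a non-zero exit. A sketch, assuming a working `mail` command is on the box; the recipient address is a placeholder:

```shell
#!/bin/bash
# Wrapper: run a backup command, log the outcome, alert on failure.
LOG="/var/log/backup.log"

run_backup() {
    if "$@" >> "$LOG" 2>&1; then
        echo "$(date +%F) backup OK" >> "$LOG"
        return 0
    else
        local rc=$?
        echo "$(date +%F) backup FAILED (exit $rc)" >> "$LOG"
        # mail is a placeholder -- swap in whatever alerting you actually trust
        echo "Backup failed with exit $rc" | mail -s "BACKUP FAILED on $(hostname)" admin@example.com
        return $rc
    fi
}
```

Point the cron entry at the wrapper (`run_backup /usr/local/sbin/daily_backup.sh`) instead of the bare script.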

Infrastructure Reality Check

Software can only do so much. If the underlying host node is overcommitted, your I/O wait times during backups will skyrocket, causing your web server to hang. This is the "noisy neighbor" effect common with budget VPS providers using older OpenVZ containers.

This is why we architect CoolVDS differently. We utilize hardware virtualization (KVM) and high-performance RAID 10 disk arrays. Even when your neighbors are hammering their disk I/O, your backup process has guaranteed resources. We don't oversell our storage throughput because we know that when 03:00 AM hits, everyone's cron jobs fire at once.

Verify or Die

Finally, run a restore drill. Delete a non-critical file and recover it from your backup. If you can't do it in five minutes, your backup strategy is flawed. Automation is great, but verification is mandatory.
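With a dated-snapshot layout, a single-file restore is just a copy out of the right directory. A minimal drill sketch; the snapshot path in the example is a placeholder:

```shell
#!/bin/bash
# Restore one file from a snapshot and verify the copy byte-for-byte.
# Usage: restore_file SNAPSHOT_DIR RELATIVE_PATH TARGET_DIR
restore_file() {
    local snap="$1" rel="$2" target="$3"
    mkdir -p "$target/$(dirname "$rel")"
    # -p preserves mode/timestamps; cmp -s confirms the restored file matches
    cp -p "$snap/$rel" "$target/$rel" && cmp -s "$snap/$rel" "$target/$rel"
}

# Example (placeholder paths):
# restore_file /backup/snapshots/2009-06-01 index.html /var/www/html
```

Time yourself while you do it; that number is your real recovery capability, not whatever the backup log claims.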

Don't wait for the inevitable hardware failure. Audit your `/etc/fstab`, check your cron logs, and ensure your hosting platform is as serious about data integrity as you are. If you need a sandbox to test your recovery scripts, spin up a CoolVDS instance. It takes less than a minute, and it might just save your career.