Disaster Recovery for Ops: Surviving Hardware Failure in the Norwegian Hosting Market
Let’s be honest with ourselves. Hardware fails. It doesn't matter if you are running on expensive enterprise gear or a budget box; eventually, a disk controller will panic, a power supply unit will blow, or a fiber cut in Oslo will isolate your rack. If your strategy is hope, you are already down.
In the Norwegian hosting market, we have the benefit of stable power grids and the Norwegian Internet Exchange (NIX) providing excellent local peering. However, reliance on a single point of failure—even a reliable one—is negligence. As of early 2013, the toolkit for Disaster Recovery (DR) has matured significantly. We aren't just talking about tape backups anymore. We are talking about hot spares, asynchronous replication, and rapid failover.
This guide cuts through the marketing fluff. We will look at how to configure a robust DR setup using tools available right now: MySQL 5.5 Master-Slave replication, rsync for file consistency, and why the virtualization technology you choose (KVM vs. OpenVZ) dictates your recovery speed.
The Data Layer: MySQL 5.5 Master-Slave Replication
Your database is your single most critical asset. If you lose your web nodes, you lose traffic. If you lose your database, you lose the business. In 2013, the standard for high availability without breaking the bank is asynchronous Master-Slave replication.
Many sysadmins rely on nightly mysqldump. That is not disaster recovery; that is archiving. If your server dies at 4:00 PM and your last dump was at 3:00 AM, you have lost 13 hours of data. That is unacceptable.
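To make the arithmetic concrete, here is a minimal sketch that computes the data-loss window for two backup intervals. The helper names to_minutes and data_loss_minutes are hypothetical, and the sketch assumes both times fall on the same day in 24-hour HH:MM form.

```shell
#!/bin/bash
# Convert HH:MM to minutes since midnight.
# The 10# prefix forces base 10 so "08" and "09" are not parsed as octal.
to_minutes() {
    local h=${1%%:*} m=${1##*:}
    echo $(( 10#$h * 60 + 10#$m ))
}

# data_loss_minutes LAST_COPY FAILURE: minutes of data written after the last copy.
data_loss_minutes() {
    echo $(( $(to_minutes "$2") - $(to_minutes "$1") ))
}

data_loss_minutes 03:00 16:00   # nightly dump at 3:00 AM, failure at 4:00 PM
data_loss_minutes 15:55 16:00   # copy taken five minutes before the failure
```

The first call prints 780 (13 hours) and the second prints 5, which is the gap replication and frequent syncs are meant to close.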
Here is how to set up a Slave node on a secondary CoolVDS instance. We assume you are running CentOS 6 or Debian Squeeze.
1. Configure the Master
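Edit your /etc/mysql/my.cnf (or /etc/my.cnf on CentOS) to enable binary logging. The server ID must be unique among all servers in the replication topology. Be aware that with statement-based logging, the MySQL 5.5 default, binlog_do_db filters on the default database selected with USE, so statements that modify production_db while another database is selected are not logged.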
[mysqld]
server-id = 1
log_bin = /var/log/mysql/mysql-bin.log
binlog_do_db = production_db
innodb_flush_log_at_trx_commit = 1
sync_binlog = 1
Pro Tip: Setting sync_binlog = 1 is crucial for durability. It forces MySQL to synchronize the binary log to disk after every commit. This adds an I/O penalty, but on CoolVDS instances equipped with high-performance SSD storage the added latency is far smaller than on rotational SAS drives.
2. Create the Replication User
Log into the MySQL shell on the Master:
CREATE USER 'repl_user'@'10.%.%.%' IDENTIFIED BY 'StrongPassword2013!';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'10.%.%.%';
FLUSH PRIVILEGES;
Note the host restriction: '10.%.%.%' limits the account to the 10.0.0.0/8 private range. Never grant replication to '%' (any host). Ideally, use a private backend network and iptables so that only your slave's IP can reach the MySQL port.
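A firewall restriction along these lines could look like the sketch below. It only prints the iptables commands so they can be reviewed before being applied. The function name replication_firewall_rules and the slave address 10.0.0.2 are assumptions for illustration; 3306 is the default MySQL port.

```shell
#!/bin/bash
# Hypothetical values: replace with your slave's address and your MySQL port.
SLAVE_IP="10.0.0.2"
MYSQL_PORT=3306

# Print (do not apply) rules that allow MySQL traffic from loopback and from
# the slave, then drop every other connection to the MySQL port.
# Order matters: the ACCEPT rules must come before the DROP rule.
replication_firewall_rules() {
    local slave_ip=$1 port=$2
    echo "iptables -A INPUT -i lo -j ACCEPT"
    echo "iptables -A INPUT -p tcp -s ${slave_ip} --dport ${port} -j ACCEPT"
    echo "iptables -A INPUT -p tcp --dport ${port} -j DROP"
}

replication_firewall_rules "$SLAVE_IP" "$MYSQL_PORT"
```

To apply the rules, pipe the output to sh as root. If web nodes connect to this database over the network, they need their own ACCEPT rules ahead of the DROP. Rules added this way do not survive a reboot unless you save them with your distribution's mechanism.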
3. Configure the Slave
On your secondary CoolVDS server (preferably in a different datacenter zone if available), edit my.cnf:
[mysqld]
server-id = 2
relay-log = /var/log/mysql/mysql-relay-bin.log
read_only = 1
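Setting read_only = 1 blocks writes from ordinary accounts, which protects the slave from accidental changes that would make it diverge from the master. Accounts with the SUPER privilege can still write, so do not let your application connect as root. These settings alone do not start replication: after restarting MySQL, load a consistent snapshot of the master onto the slave, then point it at the master with CHANGE MASTER TO and run START SLAVE.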
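Once my.cnf is in place, the slave still has to be pointed at the master and checked. The following is a minimal sketch, not a complete procedure. The helper names change_master_sql and slave_healthy are hypothetical, and the host 10.0.0.1, log file mysql-bin.000001, and position 107 are placeholders: use the coordinates that SHOW MASTER STATUS reports on your master at the moment the snapshot is taken.

```shell
#!/bin/bash
# Build the statements that attach a slave to its master.
# Arguments: master host, replication user, password, binlog file, binlog position.
change_master_sql() {
    local host=$1 user=$2 pass=$3 log_file=$4 log_pos=$5
    cat <<EOF
CHANGE MASTER TO
  MASTER_HOST='${host}',
  MASTER_USER='${user}',
  MASTER_PASSWORD='${pass}',
  MASTER_LOG_FILE='${log_file}',
  MASTER_LOG_POS=${log_pos};
START SLAVE;
EOF
}

# Read the output of SHOW SLAVE STATUS\G on stdin. Succeed only if both
# replication threads are running and the lag is a number that does not
# exceed the given limit in seconds (a stopped slave reports NULL).
slave_healthy() {
    local max_lag=$1
    awk -v max="$max_lag" '
        $1 == "Slave_IO_Running:"      { io  = $2 }
        $1 == "Slave_SQL_Running:"     { sql = $2 }
        $1 == "Seconds_Behind_Master:" { lag = $2 }
        END {
            ok = (io == "Yes") && (sql == "Yes") && (lag ~ /^[0-9]+$/) && ((lag + 0) <= (max + 0))
            exit !ok
        }
    '
}

# Example usage on the slave (placeholder values):
#   change_master_sql 10.0.0.1 repl_user 'StrongPassword2013!' mysql-bin.000001 107 | mysql -u root -p
#   mysql -u root -p -e 'SHOW SLAVE STATUS\G' | slave_healthy 60 || echo "replication unhealthy"
```

Checking Seconds_Behind_Master alongside the two thread states matters because a slave whose SQL thread has stopped is silently falling behind, which defeats the purpose of a hot spare.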
Filesystem Synchronization: The Power of Rsync
Database replication handles the structured data, but what about user uploads, configuration files, and application assets? rsync remains the undisputed king of efficient file transfer.
Do not use FTP scripts. They are insecure and slow. Rsync over SSH is the standard. Here is a battle-tested bash script to sync your web root to your DR node. Put this in /usr/local/bin/sync_dr.sh and chmod it to 700.
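#!/bin/bash
# Sync the web root to the DR node over SSH and log the outcome.
SOURCE_DIR="/var/www/html/"
REMOTE_USER="backupuser"
REMOTE_HOST="192.168.1.50" # Your Secondary CoolVDS IP
REMOTE_DIR="/var/www/html/"
LOG_FILE="/var/log/dr_sync.log"
# Exclude temporary files and git directories.
# An array passes each pattern as its own argument; quotes stored inside a
# plain string would reach rsync literally and the patterns would never match.
EXCLUDES=(--exclude '.git' --exclude 'cache/*' --exclude 'tmp/*')
echo "Starting sync at $(date)" >> "$LOG_FILE"
# --delete mirrors removals: files deleted on the primary are deleted on the DR node too.
rsync -avz --delete -e "ssh -p 22" "${EXCLUDES[@]}" "$SOURCE_DIR" "$REMOTE_USER@$REMOTE_HOST:$REMOTE_DIR" >> "$LOG_FILE" 2>&1
RC=$?
if [ "$RC" -eq 0 ]; then
    echo "Sync successful at $(date)" >> "$LOG_FILE"
else
    # Record the failure in the log as well as mailing it.
    echo "Sync FAILED (rsync exit code $RC) at $(date)" | tee -a "$LOG_FILE" | mail -s "DR Sync FAILED" ops@yourdomain.no
fi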
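Add this to your crontab to run every 5 or 15 minutes, depending on your tolerance for data loss (RPO, Recovery Point Objective). Because cron cannot answer a password prompt, set up key-based SSH authentication for backupuser first.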
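At short intervals a slow sync can still be running when cron starts the next one. One way to guard against overlapping runs is a lock, sketched below with flock from util-linux. The function name run_locked, the wrapper file name, and the lock file path are assumptions for illustration.

```shell
#!/bin/bash
# Run a command under an exclusive, non-blocking lock.
# If another run holds the lock, skip this run instead of queueing behind it.
run_locked() {
    local lock_file=$1
    shift
    (
        flock -n 9 || { echo "previous run still active, skipping"; exit 0; }
        "$@"
    ) 9>"$lock_file"
}

# Example wrapper body (hypothetical paths):
#   run_locked /var/lock/dr_sync.lock /usr/local/bin/sync_dr.sh
#
# Example crontab entry calling a wrapper saved under a hypothetical name:
#   */5 * * * * /usr/local/bin/sync_dr_locked.sh
```

Skipping rather than waiting keeps a backlog of queued jobs from building up behind one long transfer; the next scheduled run picks up whatever changed in the meantime.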
The Hypervisor Matters: KVM vs. OpenVZ
In the VPS market, there is a dirty secret: overselling. Many budget providers use OpenVZ (container-based virtualization). In an OpenVZ environment, you share the kernel with every other customer on the host node. If a neighbor exhausts shared kernel resources or triggers a kernel panic, for example under a DDoS attack, every container on that node can stall or go down, and your DR plan fails with it.
This is why CoolVDS standardizes on KVM (Kernel-based Virtual Machine). KVM provides true hardware virtualization.
| Feature | OpenVZ | KVM (CoolVDS) |
|---|---|---|
| Kernel | Shared | Dedicated |
| Isolation | Low | High |
| Resource Guarantee | Burst/Shared | Hard Limits |
| Swap | Virtual (vSwap) | Real Partition |
For a Disaster Recovery node, you need predictability. You cannot afford to wait for a shared kernel to free up resources when you are trying to restore service. With KVM, each guest runs its own kernel and has its RAM and virtual CPUs assigned at the hypervisor level, so a neighbor's workload cannot exhaust your kernel's resources.
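If you are unsure what an existing VPS runs on, a rough check from inside the guest is possible. The sketch below uses a common heuristic, not an official API: OpenVZ containers typically expose /proc/vz but not /proc/bc, which appears only on the host node. The function name in_openvz_container and its optional root-prefix argument (used for testing) are assumptions.

```shell
#!/bin/bash
# Heuristic only: /proc/vz without /proc/bc suggests an OpenVZ container.
# Optional argument: an alternate root directory, useful for testing.
in_openvz_container() {
    local root=${1:-}
    [ -d "${root}/proc/vz" ] && [ ! -d "${root}/proc/bc" ]
}

if in_openvz_container; then
    echo "OpenVZ container markers found: the kernel is shared with the host"
else
    echo "No OpenVZ container markers found"
fi
```

A negative result does not prove the guest is KVM; it only means the OpenVZ markers are absent, so confirm the virtualization type with your provider.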
Compliance and the "Personopplysningsloven"
Operating in Norway means adhering to strict privacy laws, specifically the Personal Data Act (Personopplysningsloven) of 2000. Data sovereignty is not just a buzzword; it is a legal requirement for many Norwegian businesses.
When you replicate data, you must ensure your secondary location complies with the same standards as the primary. Hosting your DR node with CoolVDS keeps your data within Norwegian jurisdiction (or the EEA), which helps satisfy the requirements of Datatilsynet (the Norwegian Data Protection Authority). Moving data to US-based clouds can introduce complex legal liabilities under the US-EU Safe Harbor framework.
Monitoring Your Pulse
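A DR plan is useless if you don't know when to trigger it. You need external monitoring, run from a host outside the infrastructure it watches. Tools like Nagios are industry standard for a reason. Here is a simple Nagios service definition to check HTTP latency. It raises a warning when the response time exceeds 0.5 seconds and goes critical above 2 seconds, at which point you should be alerted immediately. It assumes your check_http command definition passes $ARG1$ through to the plugin; some distribution packages define check_http without arguments.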
define service{
use generic-service
host_name production-web-01
service_description HTTP_Latency
check_command check_http!-w 0.5 -c 2.0
notifications_enabled 1
}
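The same thresholds can be tried by hand before wiring them into Nagios. The sketch below maps a response time to the Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL). The function name classify_latency and the example URL are assumptions; the defaults of 0.5 and 2.0 seconds mirror the service definition above.

```shell
#!/bin/bash
# classify_latency SECONDS [WARN] [CRIT]
# Returns 0 (OK), 1 (WARNING) or 2 (CRITICAL), following the Nagios plugin
# exit code convention. awk handles the comparison because bash arithmetic
# is integer-only.
classify_latency() {
    local seconds=$1 warn=${2:-0.5} crit=${3:-2.0}
    awk -v t="$seconds" -v w="$warn" -v c="$crit" 'BEGIN {
        if (t + 0 > c + 0) exit 2
        if (t + 0 > w + 0) exit 1
        exit 0
    }'
}

# Example: time one request with curl and classify it (placeholder URL).
#   t=$(curl -o /dev/null -s -w '%{time_total}' http://www.example.no/)
#   classify_latency "$t"; echo "status=$?"
```

Running this from a machine outside your primary datacenter gives a quick sanity check that the thresholds are realistic for your normal response times before they start paging anyone.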
Conclusion
Disaster recovery isn't about buying the most expensive hardware; it's about architecture. By leveraging MySQL replication, robust bash scripting, and the isolation guarantees of KVM, you can build a resilience tier that rivals enterprise setups.
The hardware underlying your VPS matters. Slow I/O kills recovery times. At CoolVDS, we utilize high-performance SSD arrays that provide the IOPS a slave needs to catch up on replication lag quickly. Don't let your backup plan be the bottleneck.
Ready to harden your infrastructure? Deploy a KVM-based SSD instance on CoolVDS today and secure your data against the unexpected.