Cloud Storage vs. Iron RAID: Optimizing I/O for High-Load Systems
Let's be honest: the word "Cloud" was thrown around in marketing meetings so much during 2010 that it has lost almost all technical meaning. For those of us actually managing root shells and staring at top output at 3 AM, "Cloud" usually just means "someone else's computer": specifically, a virtualized slice of a SAN (Storage Area Network) that you have zero control over.
If you are running a static brochure site, this doesn't matter. But if you are scaling a Magento installation or a heavy MySQL backend, disk I/O is the silent killer. I have seen decent quad-core nodes brought to their knees not by CPU load, but by "I/O Wait" (wa) spiking because the hosting provider oversold their storage backend.
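You can catch this in the act before it takes you down. A quick check, assuming the sysstat package (which provides iostat) is installed:
yum install -y sysstat
# Sample extended disk statistics every 5 seconds
iostat -x 5
# Watch the await column (average ms per I/O) and %util. Sustained %util
# near 100 with high await means the disk, not the CPU, is your bottleneck.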
Today, we are going to look at the state of storage in 2011, how to benchmark your actual throughput, and how to configure a Red Hat/CentOS system to handle the load without melting down.
The Latency Lie: Local RAID vs. Network Storage
Most "Cloud" providers (including the big players like Amazon EC2) rely heavily on network-attached storage. When your application writes to disk, that data traverses the network to a storage filer. This adds latency. In a high-transaction database, adding 2-3ms of latency per write can reduce your transactions per second (TPS) by 50% or more.
This is why, at CoolVDS, we still believe in the power of local RAID-10 for performance-critical instances. Physics is physics. A SAS 15k RPM drive array directly attached to the hypervisor bus will almost always smoke a congested SAN in random 4K read/write operations.
War Story: The "Stuck" Web Shop
Last month, a client migrating away from a US-based cloud giant came to us. Their checkout process was taking 8 seconds. The PHP memory limit was fine. The CPU was idle. The culprit? Their database was waiting 400ms on every disk commit because of "noisy neighbors" on the shared storage volume.
We moved them to a CoolVDS KVM instance with dedicated disk allocation. The checkout time dropped to 1.2 seconds. No code changes. Just better I/O.
Benchmarking Your Current I/O
Don't take a provider's word for it. Test it. While hdparm is good for raw reads, it doesn't represent real-world server loads. In 2011, the gold standard for testing is Bonnie++.
If you are on CentOS 5 or 6, install it from the EPEL repository:
yum install -y bonnie++
# Run a test in /tmp. With -r 2048 (your RAM size in MB), bonnie++ writes a
# dataset twice that size, so make sure roughly 4 GB is free.
bonnie++ -d /tmp -r 2048 -u root
Pay close attention to the Random Seeks column. If you are seeing under 200/sec, your database is going to suffer under load. On our high-performance setups, we aim for numbers significantly higher by leveraging caching controllers and high-speed SAS/SSD hybrid arrays.
Tuning Linux for Virtualized Storage
Out of the box, Linux assumes it is running on physical hardware with a spinning platter. In a virtualized environment (Xen, KVM, or VMware), we need to change how the kernel talks to the "disk."
1. The I/O Scheduler
The default scheduler in CentOS is usually CFQ (Completely Fair Queuing). This is great for a desktop, but terrible for a server where the hypervisor is already handling the ordering of requests. You are essentially double-queueing.
Switch to the deadline or noop scheduler. noop is often best for virtualized guests as it simply passes requests to the hypervisor in a FIFO (First-In-First-Out) manner, reducing CPU overhead.
Check your current scheduler:
cat /sys/block/sda/queue/scheduler
# Output: noop deadline [cfq]   (the brackets mark the active scheduler)
Change it on the fly:
echo noop > /sys/block/sda/queue/scheduler
To make it permanent, edit /boot/grub/grub.conf (menu.lst is just a symlink to it on Red Hat systems) and append elevator=noop to your kernel line.
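The kernel line ends up looking something like this (the kernel version and root device are illustrative; keep your existing parameters and just append the flag):
kernel /vmlinuz-2.6.32-71.el6.x86_64 ro root=/dev/sda1 quiet elevator=noop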
2. Filesystem Mount Options
Every time you read a file, Linux updates the "access time" (atime). For a web server reading thousands of PHP files and images a second, this is thousands of unnecessary write operations. Disable it.
Edit your /etc/fstab:
# Before
/dev/sda1 / ext3 defaults 1 1
# After
/dev/sda1 / ext3 defaults,noatime,nodiratime 1 1
Remount with: mount -o remount /
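Then confirm the options actually took effect:
mount | grep ' / '
# Expected output along the lines of:
# /dev/sda1 on / type ext3 (rw,noatime,nodiratime)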
The Norwegian Context: Data Sovereignty
Beyond raw speed, we need to talk about location. With the increasing scrutiny from Datatilsynet (The Norwegian Data Protection Authority) regarding the export of personal data, hosting your customer database outside the EEA (European Economic Area) is becoming a legal headache.
While the US "Safe Harbor" framework currently allows data transfer, many Norwegian CIOs are understandably nervous about relying on self-certification from US vendors. Hosting on CoolVDS in our Oslo datacenter ensures your data never crosses the border. You get the benefit of millisecond latency to the NIX (Norwegian Internet Exchange) and full compliance with the Personal Data Act (Personopplysningsloven).
Pro Tip: If you are serving primarily Nordic customers, the round-trip time (RTT) from Oslo to a server in Virginia (US-East) is roughly 90-110ms. From Oslo to our datacenter, it is <5ms. That 100ms difference happens on every TCP handshake, every image load, and every AJAX request. It adds up fast.
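Don't take our word for the numbers either; measure from your own desk (the hostnames below are placeholders for any US-East and Oslo hosts you can reach):
ping -c 5 us-east.example.com
ping -c 5 oslo.example.com
# Compare the avg figure in the closing rtt min/avg/max/mdev summary line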
Configuring MySQL for Stability
Finally, if you are running MySQL 5.1 or 5.5, the default configuration is rarely optimized for virtual environments. One specific flag controls how strictly InnoDB flushes data to the disk.
Inside /etc/my.cnf:
[mysqld]
# Default is 1: flush the log to physical disk at every commit (safest, slow).
# Set to 2: write to the OS cache at each commit; flush to disk once per second.
innodb_flush_log_at_trx_commit = 2
# Ensure you have enough RAM allocated to the pool
innodb_buffer_pool_size = 512M
Warning: Setting innodb_flush_log_at_trx_commit = 2 means you might lose 1 second of transactions if the OS crashes, but the performance gain on virtual disks is often 300% or more. For most web applications, this is an acceptable trade-off.
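A small sketch for checking and applying the flush setting without a restart, assuming root access to the MySQL server:
# Check the current value
mysql -u root -p -e "SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';"
# The setting is dynamic in 5.1/5.5, so you can apply it on the fly:
mysql -u root -p -e "SET GLOBAL innodb_flush_log_at_trx_commit = 2;"
# innodb_buffer_pool_size is NOT dynamic; changing it requires a mysqld restart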
The Verdict
Cloud storage is flexible, but it isn't magic. It follows the laws of physics and network congestion. If you are tired of unpredictable I/O wait times and sluggish databases, you need a provider that understands the hardware layer.
At CoolVDS, we don't treat storage as a black box. We use enterprise-grade RAID arrays and strictly limit the number of tenants per node to guarantee your throughput. Don't let slow I/O kill your SEO rankings.
Ready to test real performance? Deploy a CentOS 6 instance on CoolVDS today and run your own Bonnie++ benchmark.