Container vs. Hypervisor: Locking Down LXC and OpenVZ
Let’s be honest: the virtualization overhead is killing your I/O. I see it every day. You have a perfectly good RAID-10 SSD array, but by the time your request traverses the hypervisor, the device driver, and the guest kernel, your database latency has spiked. This is why everyone in the DevOps scene from Oslo to Berlin is talking about Linux Containers (LXC) right now. Even that new project, "Docker," released just this month at PyCon, is essentially putting a friendly face on LXC.
But here is the cold, hard reality that most hosting providers won't tell you while selling you cheap "Cloud VPS" instances: Containers share the host kernel.
If you are running a high-traffic Magento store or handling sensitive customer data under the Norwegian Personal Data Act (Personopplysningsloven), a shared kernel is a massive attack surface. If a hacker triggers a kernel panic inside a container, the whole node goes down. If they find a kernel exploit, they aren't just root in the container—they could potentially be root on the host.
I’m not saying don't use containers. I’m saying you need to architect for the risk. Here is how we harden containerized environments at CoolVDS.
1. The Root Privilege Escalation Trap
In a standard LXC or OpenVZ setup, root (UID 0) inside the container is often mapped to root on the host. This is terrifying. If I can escape the chroot (which is not a security feature!), I own the box.
We are seeing promising work in the mainline Linux kernel regarding User Namespaces (introduced experimentally in 3.8), but most production servers are running CentOS 6 (kernel 2.6.32) or Ubuntu 12.04 LTS (3.2). You likely don't have stable user namespaces yet.
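For reference, if you are experimenting on a 3.8+ kernel with a user-namespace-aware LXC build, the goal is to map container root onto an unprivileged UID range on the host. A minimal sketch (the lxc.id_map key, the 100000 offset and the range size are illustrative; check your LXC version's documentation):
# /var/lib/lxc/my-container/config (experimental: kernel >= 3.8 + user-namespace-aware LXC)
# Map UIDs/GIDs 0-65535 inside the container to 100000-165535 on the host,
# so "root" inside is an unprivileged user outside
lxc.id_map = u 0 100000 65536
lxc.id_map = g 0 100000 65536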
The Fix: Capabilities Dropping
Since we can't fully rely on UID mapping in stable kernels yet, we must strip capabilities. Do not give your container CAP_SYS_ADMIN unless absolutely necessary. In your LXC config, you need to be explicit.
# /var/lib/lxc/my-container/config
# Drop dangerous capabilities
# (add sys_admin to this list as well if your workload can run without it)
lxc.cap.drop = sys_module sys_rawio mac_admin sys_time
# Deny access to all devices by default
lxc.cgroup.devices.deny = a
# Allow only the specific devices the container actually needs
# /dev/null and /dev/zero
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
# /dev/random and /dev/urandom
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 1:9 rwm
# consoles and pseudo-terminals (/dev/tty, /dev/console, /dev/ptmx, /dev/pts/*)
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 5:2 rwm
lxc.cgroup.devices.allow = c 136:* rwm
This whitelist approach ensures that even if an attacker compromises the container, they cannot mount filesystems or load kernel modules to hide a rootkit.
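A quick sanity check after restarting the container: read the capability bounding set from inside it. The hex mask can be decoded into names with capsh from the libcap package, assuming it is installed in the container:
# Inside the container: print the capability bounding set as a hex mask
grep CapBnd /proc/self/status
# Decode the mask into capability names (substitute the value printed above)
capsh --decode=0000001fff9ecfff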
2. Resource Exhaustion and the "Noisy Neighbor"
In Norway, we prize stability. But in a multi-tenant environment, one runaway PHP process can starve your neighbors. This is where cgroups (Control Groups) come in. Many sysadmins ignore this, assuming the hypervisor handles it. In containers, you are the hypervisor.
We see this often with MySQL databases. Without limits, the cache devours all available RAM. You need to set hard limits on memory and swap usage to prevent OOM killers from taking down innocent services.
# Set memory limit to 4GB
lxc.cgroup.memory.limit_in_bytes = 4G
# Set memory + swap limit to 6GB
lxc.cgroup.memory.memsw.limit_in_bytes = 6G
# CPUShares - prioritize critical containers (1024 is default)
lxc.cgroup.cpu.shares = 2048
Pro Tip: Never disable swap entirely for a container. We have seen Java stacks (JBoss, Tomcat) behave erratically on containers that report zero swap, sometimes failing right at startup. Give them a small buffer, but cap it strictly.
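Note that the memsw limit only exists if your kernel has swap accounting enabled (some distributions require the swapaccount=1 boot parameter). You can also read or tighten these limits on a running container without a restart via lxc-cgroup; a sketch, assuming the container is named my-container:
# Read the current memory limit of the running container
lxc-cgroup -n my-container memory.limit_in_bytes
# Tighten it on the fly (applies immediately, but is not persisted to the config file)
lxc-cgroup -n my-container memory.limit_in_bytes 2G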
3. Network Isolation with iptables
Bridge networking is standard, but it allows ARP spoofing if not configured correctly. You don't want Container A sniffing traffic destined for Container B, especially if Container B is processing credit cards.
At CoolVDS, we use strict ebtables and iptables rules on the host node. You should manually verify that your bridge does not allow promiscuous mode for guests.
# On the HOST node:
# Bridged frames only traverse the FORWARD chain if bridge netfilter is enabled
sysctl -w net.bridge.bridge-nf-call-iptables=1
# Block direct traffic between containers that have no business talking to each other
iptables -A FORWARD -s 192.168.1.10 -d 192.168.1.20 -j DROP
# Prevent IP spoofing: drop anything entering via veth10 that is not the container's own IP
iptables -A FORWARD -i br0 -m physdev --physdev-in veth10 ! -s 192.168.1.10 -j DROP
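Because those FORWARD rules work at layer 3, pair them with ebtables to stop ARP games at layer 2. A sketch, assuming container A owns 192.168.1.10 behind host-side interface veth10 (adjust names and addresses to your bridge):
# On the HOST node:
# Drop ARP packets from veth10 that claim any IP other than the container's own
ebtables -A FORWARD -i veth10 -p ARP --arp-ip-src ! 192.168.1.10 -j DROP
# Drop IPv4 frames from veth10 with a forged source address
ebtables -A FORWARD -i veth10 -p IPv4 --ip-source ! 192.168.1.10 -j DROP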
4. The Ultimate Security: Hardware Virtualization (KVM)
Sometimes, hardening isn't enough. If you are dealing with Datatilsynet requirements or strict corporate compliance, shared kernels are a liability. This is why we still heavily advocate for KVM (Kernel-based Virtual Machine).
Unlike OpenVZ or LXC, KVM provides full hardware virtualization. Each VPS gets its own kernel. If a user crashes their kernel, it affects only them. The isolation is enforced by the CPU's hardware virtualization extensions (Intel VT-x or AMD-V), not just software namespaces.
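Before you trust a node to run KVM guests, verify that the CPU actually exposes the virtualization extensions and that the kernel modules are loaded:
# A count of 0 means no hardware virtualization support (vmx = Intel VT-x, svm = AMD-V)
egrep -c '(vmx|svm)' /proc/cpuinfo
# Confirm the KVM modules are loaded on the host
lsmod | grep kvm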
| Feature | LXC/OpenVZ (Container) | CoolVDS KVM (Hypervisor) |
|---|---|---|
| Kernel | Shared with Host | Dedicated |
| Isolation | Process/Namespace | Hardware (VT-x) |
| Performance | Near Native | ~2-5% Overhead |
| Security | Moderate (needs hardening) | High (complete separation) |
Storage Performance: The SSD Revolution
Whether you choose containers or KVM, the bottleneck in 2013 is almost always storage I/O. Traditional spinning rust (HDD) cannot handle the random I/O patterns of virtualization. We are seeing average I/O wait times (await) drop from 200 ms to under 1 ms just by switching to enterprise SSDs.
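You can measure this yourself with iostat from the sysstat package; the await column reports the average time (in milliseconds) requests spend queued plus being serviced:
# Extended per-device statistics, refreshed every second
iostat -x 1
# Watch the "await" and "%util" columns for your data volume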
We are closely watching the emerging NVMe specifications, which promise to bypass the SATA bottleneck entirely, but until that hardware becomes commodity, we utilize high-end PCIe-attached flash storage and RAID-10 SSD arrays to ensure that your database isn't waiting on the disk.
If you are running a database on a VPS, ensure you are using the deadline or noop I/O scheduler inside your guest, rather than cfq, to let the host's high-speed storage handle the optimization.
# Check your scheduler
cat /sys/block/sda/queue/scheduler
# Output: [noop] deadline cfq
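Switching schedulers is a one-liner at runtime but does not survive a reboot; to make it permanent, add elevator=noop (or elevator=deadline) to the kernel command line in your bootloader configuration. A sketch, assuming the data disk is sda:
# Switch the active scheduler immediately (lost on reboot)
echo noop > /sys/block/sda/queue/scheduler
# Persist it by adding this to the kernel command line in your bootloader config:
#   elevator=noop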
Conclusion
The speed of containers is seductive, but in a hostile environment, isolation is sanity. If you need raw speed for stateless web workers, hardened LXC is brilliant. But for your core database or sensitive data handling, don't cut corners.
At CoolVDS, we offer both. Our infrastructure is hosted right here in Norway, in datacenters with direct peering to NIX (Norwegian Internet Exchange), ensuring minimal latency to Oslo. We use KVM by default for maximum security, backed by pure SSD storage to negate the virtualization overhead.
Don't let a noisy neighbor or a kernel panic take your business offline. Deploy a secure, KVM-backed instance on CoolVDS today and experience the stability of dedicated resources.