Securing Linux Containers: Hardening LXC and OpenVZ in Production Environments
Let’s be honest: we all love the raw speed of container-based virtualization. Whether you are spinning up OpenVZ slices or experimenting with the new LXC (Linux Containers) tools that are finally maturing in Ubuntu 12.04 LTS, the lack of hypervisor overhead is addictive. No emulated BIOS, no hardware instruction translation—just raw syscalls.
But there is a dark side to this density. In a shared kernel environment, you are one zero-day local root exploit away from a total host compromise. I have spent the last week debugging a legacy OpenVZ node where a single client's fork bomb managed to starve the entire host of PIDs because the beancounters weren't tuned correctly.
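On OpenVZ, taming a fork bomb like that comes down to the numproc beancounter; a typical fix looks like this (container ID 101 is just a placeholder):
# Cap the container at 400 processes (barrier:limit) and persist it to the config
vzctl set 101 --numproc 400:400 --save
# Watch the held and failcnt columns to confirm the limit is actually biting
grep numproc /proc/user_beancounters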
If you are deploying containers in 2013, you cannot treat them like Virtual Machines. They are essentially fancy chroot environments with a seatbelt. Here is how to tighten that seatbelt until it actually offers protection, and why we at CoolVDS architect our infrastructure the way we do.
The Shared Kernel Dilemma
The fundamental difference between a container (LXC/OpenVZ) and a hypervisor (KVM/Xen) is the kernel. In LXC, /proc/sys is a window into the host's soul. If a containerized process manages to mount the host's filesystem or manipulate kernel modules, it's game over.
In Norway, where the Personopplysningsloven (Personal Data Act) places strict liability on data controllers, relying on standard, out-of-the-box container configurations is negligence. If you are hosting customer data in Oslo, you need to ensure isolation is absolute.
1. Resource Metering with Cgroups
The first line of defense isn't preventing breakouts; it's preventing Denial of Service (DoS). Without Control Groups (cgroups), a single container can consume all CPU cycles or memory, crashing the host. While OpenVZ uses ubc (User Beancounters), LXC relies on the mainline Linux cgroups subsystem.
You must verify that your kernel has cgroups mounted and active. Do not rely on defaults.
# Check cgroup mounts
mount -t cgroup
# Output should look something like this:
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu type cgroup (rw,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,relatime,memory)
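If that command prints nothing, the controllers are not mounted. On Ubuntu 12.04 the cgroup-lite package normally handles this at boot; the manual equivalent looks roughly like this (note that the memory.memsw.* files only appear if the kernel was booted with swapaccount=1):
# Mount the memory and cpu controllers by hand, one hierarchy per directory
mkdir -p /sys/fs/cgroup/memory /sys/fs/cgroup/cpu
mount -t cgroup -o memory cgroup /sys/fs/cgroup/memory
mount -t cgroup -o cpu,cpuacct cgroup /sys/fs/cgroup/cpu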
If you are managing raw LXC containers, you define these limits in your configuration file. Never deploy a container without a memory ceiling. It protects the host kernel from OOM (Out of Memory) killer rampages.
# /var/lib/lxc/my-container/config
# Hard Memory Limit (512MB)
lxc.cgroup.memory.limit_in_bytes = 536870912
# Memory + Swap Limit (1GB)
lxc.cgroup.memory.memsw.limit_in_bytes = 1073741824
# CPU Shares (Low priority)
lxc.cgroup.cpu.shares = 512
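Once the container is running you can read these values back, or tighten them on the fly, with lxc-cgroup instead of editing the config and restarting:
# Read the current memory ceiling of a running container
lxc-cgroup -n my-container memory.limit_in_bytes
# Drop it to 256MB (268435456 bytes) at runtime
lxc-cgroup -n my-container memory.limit_in_bytes 268435456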
2. Dropping Capabilities
By default, root inside a container retains too many Linux Capabilities. Does your web server really need CAP_SYS_MODULE (loading kernel modules) or CAP_SYS_TIME (changing the system clock)? Absolutely not.
If an attacker gains root in your container, these capabilities are the keys they use to escape. We strip these aggressively. In your LXC config, explicitly drop everything you don't need.
# Drop dangerous capabilities
lxc.cap.drop = sys_module
lxc.cap.drop = mac_admin
lxc.cap.drop = mac_override
lxc.cap.drop = sys_time
lxc.cap.drop = sys_boot
This ensures that even if someone manages to run a binary as root inside the container, the kernel will reject their attempt to load a malicious module to hide their tracks.
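To confirm the drops actually took effect, inspect the capability bounding set of the container's init process (the hex mask below is purely illustrative and will differ on your kernel):
# From inside the container: dump the bounding set of PID 1
grep CapBnd /proc/1/status
# Decode a mask into capability names on the host (capsh ships in libcap2-bin)
capsh --decode=0000001fff7fefff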
3. Network Isolation: The veth Bridge
Never share the host's network namespace (`lxc.network.type = none`); doing so lets a container sniff traffic on the host interface. Always use a virtual ethernet pair (`veth`) bridged to a separate interface.
At CoolVDS, when we provision a VPS, we use a bridged setup that isolates customer traffic at Layer 2. Here is a standard safe configuration for Debian Wheezy hosts:
# /etc/network/interfaces on the Host
auto br0
iface br0 inet static
    address 192.168.10.1
    netmask 255.255.255.0
    bridge_ports eth0
    bridge_stp off
    bridge_fd 0
    bridge_maxwait 0
Then, bind the container to this bridge:
# Container Config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
lxc.network.hwaddr = 00:16:3e:xx:xx:xx
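Once the container is up, confirm that its veth leg actually landed on the bridge:
# The container's vethXXXX device should be listed as a br0 member
brctl show br0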
Pro Tip: use ebtables (or arptables) on the br0 bridge to police ARP between containers; iptables only sees IP traffic, so it cannot stop ARP spoofing on its own. A simple ruleset that locks each container's MAC address to its IP address stops "noisy neighbors" from stealing IP traffic.
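A minimal sketch of such a lock, assuming the container sits behind a host-side device called vethC1 with the address 192.168.10.101 (both names are placeholders):
# Drop frames arriving from this veth with any MAC other than the assigned one
ebtables -A FORWARD -i vethC1 -s ! 00:16:3e:aa:bb:01 -j DROP
# Drop ARP packets claiming a source IP other than the assigned one
ebtables -A FORWARD -i vethC1 -p ARP --arp-ip-src ! 192.168.10.101 -j DROP
# Drop IPv4 traffic spoofing a different source address
ebtables -A FORWARD -i vethC1 -p IPv4 --ip-src ! 192.168.10.101 -j DROP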
4. The AppArmor Safety Net
If you are on Ubuntu, AppArmor is your best friend. It provides Mandatory Access Control (MAC) that sits above standard Unix permissions. Even if the user is root, AppArmor can forbid file access.
LXC ships with a default profile, but you should ensure it is enforcing. Check the status:
sudo apparmor_status | grep lxc
If the lxc profiles are listed under enforce mode and nothing lxc-related sits in complain mode, you are in good shape. If you are writing custom profiles, remember to deny mounting behavior:
# /etc/apparmor.d/lxc/lxc-default
profile lxc-container-default flags=(attach_disconnected,mediate_deleted) {
  # Deny mounting of filesystems from inside the container
  deny mount,
  deny remount,
  deny umount,
  # Allow general file access...
  file,
  # ...but block writes to kernel tunables and sysfs, even for root
  deny /proc/sys/** w,
  deny /sys/** w,
}
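After editing, reload AppArmor so the profile is re-parsed; on newer LXC releases (0.9 and later) you can also pin the profile per container rather than relying on the default:
# Reload all AppArmor profiles so the changes take effect
sudo service apparmor reload
# In /var/lib/lxc/my-container/config (LXC 0.9+):
lxc.aa_profile = lxc-container-default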
Why CoolVDS Chooses KVM (Usually)
We love the efficiency of containers. However, for true multi-tenant security in 2013, Hardware Virtualization (KVM) is the superior choice for mission-critical workloads.
Why? Because KVM (Kernel-based Virtual Machine) gives every CoolVDS instance its own independent kernel. If a process crashes the kernel in your VPS, only your VPS reboots. The host remains stable. More importantly, memory pages are strictly isolated by hardware virtualization extensions (Intel VT-x or AMD-V).
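If you build your own KVM hosts, it takes ten seconds to confirm those extensions are present and enabled in the BIOS:
# A non-zero count means the CPU advertises VT-x (vmx) or AMD-V (svm)
egrep -c '(vmx|svm)' /proc/cpuinfo
# On Ubuntu, the cpu-checker package gives a definitive yes/no
sudo kvm-ok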
However, many of our clients use our robust KVM instances to run their own LXC or OpenVZ clusters internally. This is the "Matryoshka doll" approach: robust hardware isolation on the outside (CoolVDS), and high-efficiency containerization on the inside for your applications.
Conclusion: Trust No One (Not Even Root)
If you are running a high-traffic e-commerce site targeting the Nordic market, latency matters. Our datacenter peering at NIX (Norwegian Internet Exchange) ensures your packets hit Oslo ISPs in under 5ms. But speed is nothing without uptime.
If you must run bare-metal containers, strip capabilities, enforce cgroup limits, and keep AppArmor in enforce mode. If you want to sleep at night, wrap those containers in a KVM slice.
Ready to build a fortress? Deploy a high-performance KVM SSD VPS with CoolVDS today and get full root access to your own dedicated kernel.