
Container Security in 2014: Why Shared Kernels Keep Me Up at Night (And How to Fix It)

Locking Down Linux Containers: LXC, Docker, and the Myth of Isolation

It is July 2014, and the hype train has officially left the station. Docker 1.0 dropped last month, and suddenly every developer I know wants to ship containers to production. They scream about "write once, run anywhere" and sub-second boot times. Meanwhile, those of us responsible for keeping servers online and secure in Oslo are looking at the architecture and breaking into a cold sweat.

Here is the brutal truth: Containers are not Virtual Machines.

When you spin up a standard VPS on a cheap host, you are often getting an OpenVZ container. When you run Docker, you are using cgroups and namespaces. In both cases, you are sharing the host kernel. If a rogue process triggers a kernel panic inside a container, the whole host goes down. If a syscall has an exploitable vulnerability, every container on the box is exposed to it, and the isolation you thought you had turns out to be a false sense of security. I have seen production environments melt because a single container exhausted the host's file descriptors, bringing down every other client on the box.
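To make the file descriptor point concrete, here is a minimal sketch of a check you could run on the host. It assumes the usual Linux layout of /proc/sys/fs/file-nr (allocated handles, unused handles, system-wide maximum). The function name fd_usage_percent and the sample numbers are mine, purely for illustration.

```shell
# Report what percentage of the kernel-wide file handle limit is in use.
# Input: the three fields of /proc/sys/fs/file-nr (allocated, unused, maximum).
fd_usage_percent() {
    set -- $1            # split "allocated unused max" into $1 $2 $3
    echo $(( ($1 - $2) * 100 / $3 ))
}

# Sample values for illustration only.
fd_usage_percent "9216 0 100000"
```

On a real host you would pass the live values instead, for example fd_usage_percent "$(cat /proc/sys/fs/file-nr)", and alert well before the number approaches 100. Because the limit is kernel-wide, the figure covers every container on the box at once.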

The "Root" of the Problem

By default, the root user inside a Docker container is the same as the root user on the host machine. Pause and let that sink in. If an attacker manages to break out of the container (escape the cgroup/namespace jail), they have root access to your bare metal. This is not theoretical. We have seen proof-of-concept exploits regarding /sys file system manipulation.

If you are deploying critical applications—especially here in Norway where we have to answer to Datatilsynet (The Data Inspectorate) and the Personal Data Act—you cannot rely on default settings. You need defense in depth.

1. Drop Capabilities Like They Are Hot

The Linux kernel divides root privileges into distinct units called "capabilities." A web server does not need to load kernel modules or change the system time. Yet, by default, many containers grant these permissions.

When launching a container, you must explicitly drop everything you do not need. With Docker, use the --cap-drop and --cap-add flags; note that they require Docker 1.2 or newer, as 1.0 does not have them. If you are using raw LXC on Ubuntu 14.04, set lxc.cap.drop in the container's config.

# Docker example: running an Nginx container with minimal privileges
sudo docker run -d --name secure-web \
  --cap-drop=ALL \
  --cap-add=NET_BIND_SERVICE \
  --cap-add=SETUID \
  --cap-add=SETGID \
  nginx

This command drops all root capabilities and adds back only the three required to bind port 80 and manage user IDs. If an attacker compromises Nginx, they cannot use SYS_MODULE to insert a rootkit into the kernel.
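Trust, but verify. The kernel exposes each process's effective capabilities as a hex bitmask on the CapEff line of /proc/<pid>/status, and each capability is one bit (NET_BIND_SERVICE is 10, SETUID is 7, SETGID is 6, SYS_MODULE is 16). The sketch below decodes such a mask. The helper name has_cap is my own, and the sample mask is simply what those three capabilities add up to, not output captured from a real container.

```shell
# Return success if capability number $2 is set in the hex mask $1
# (the format used by the CapEff line in /proc/<pid>/status).
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# 00000000000004c0 = NET_BIND_SERVICE (bit 10) + SETUID (bit 7) + SETGID (bit 6)
mask=00000000000004c0
has_cap "$mask" 10 && echo "NET_BIND_SERVICE: present"
has_cap "$mask" 16 || echo "SYS_MODULE: dropped"
```

To check a live container, read the mask from inside it with grep CapEff /proc/1/status and feed the value to the function. If bit 16 is still set, your drop did not take effect.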

2. Network Isolation with iptables

Another nightmare is the default bridge networking. By default, containers can talk to each other. If your frontend container gets compromised, it can start port scanning your database container on the private bridge network.

We need to tighten the iptables rules on the host. Do not rely on the daemon to do this for you correctly every time.

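# Block inter-container communication on the docker0 bridge.
# Docker of this era has no DOCKER-USER chain, so the rules go into FORWARD.
iptables -I FORWARD -i docker0 -o docker0 -j DROP

# Allow a specific link (if not using --link)
# Assuming 172.17.0.2 is DB and 172.17.0.3 is Web
# Use -I so these land above the DROP; an appended (-A) rule would never match.
iptables -I FORWARD -i docker0 -o docker0 -s 172.17.0.3 -d 172.17.0.2 -p tcp --dport 3306 -j ACCEPT
# Let the database's replies back through as well
iptables -I FORWARD -i docker0 -o docker0 -s 172.17.0.2 -d 172.17.0.3 -p tcp --sport 3306 -m state --state ESTABLISHED -j ACCEPT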

At CoolVDS, we configure our upstream routers to drop spoofed traffic instantly. But inside your VPS, you are the captain. If you are running multiple services on one host, strict firewalling is not optional.

3. The Storage I/O Trap

Containers are fantastic at CPU sharing, but disk I/O is where the "noisy neighbor" effect destroys performance. In a shared kernel environment, one container writing massive logs can saturate the SATA bus for everyone else. This is why "cheap" VPS providers (often overselling OpenVZ) suffer from random latency spikes.

We mitigate this at CoolVDS by using KVM (Kernel-based Virtual Machine) for our instances. Unlike containers, KVM provides hardware-level virtualization. Your OS has its own kernel. If you want to run Docker, run it inside a CoolVDS KVM instance. This gives you the flexibility of containers for your app deployment, but the hard security boundary of a hypervisor protecting your data.

Pro Tip: If you are running high-transaction databases (MySQL/PostgreSQL) inside a container, map the data volume to the host's SSD directly to bypass the union filesystem overhead. The performance penalty of AUFS or DeviceMapper loops can be significant.

# Mapping host directory to container for raw disk speed
sudo docker run -d -v /opt/mysql/data:/var/lib/mysql -p 3306:3306 mysql:5.5
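It is worth confirming that the data directory really sits on the host filesystem and not on the union layer. A rough sketch, assuming the standard /proc/mounts format (device, mount point, type, options): find the longest mount point that contains the path and print its filesystem type. The function name fstype_of and the sample mounts table are illustrative assumptions, not output from a real container.

```shell
# Print the filesystem type backing a path, using a mounts table in
# /proc/mounts format (fields: device, mount point, type, options, ...).
fstype_of() {
    # $1 = absolute path, $2 = mounts table (defaults to /proc/mounts)
    awk -v p="$1" '
        BEGIN { best = -1 }
        {
            mp = $2
            # Match the root mount or a whole-component prefix of the path.
            if (mp == "/" || p == mp || index(p, mp "/") == 1) {
                # Longest mount point wins; on ties the later entry shadows the earlier.
                if (length(mp) >= best) { best = length(mp); t = $3 }
            }
        }
        END { print t }
    ' "${2:-/proc/mounts}"
}

# Illustrative mounts table: AUFS root plus a host-mapped data volume.
sample="$(mktemp)"
printf '%s\n' 'none / aufs rw 0 0' '/dev/sda1 /var/lib/mysql ext4 rw 0 0' > "$sample"
fstype_of /var/lib/mysql/ibdata1 "$sample"
fstype_of /etc/passwd "$sample"
rm -f "$sample"
```

Run inside the container without the second argument, fstype_of /var/lib/mysql reads the live /proc/mounts. If it reports aufs or a device-mapper-backed root rather than the host's filesystem, the volume mapping is not doing what you think.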

4. Mandatory Access Control (AppArmor/SELinux)

If you are on Ubuntu 14.04 LTS (Trusty Tahr), AppArmor is your best friend. Docker ships with a default AppArmor profile that is decent, but for high-security environments, you should enforce it strictly.

Check if the profile is loaded:

sudo apparmor_status | grep docker

If you are on CentOS 6 or 7, you are dealing with SELinux. I know the temptation to run setenforce 0 is strong when things break, but don't do it. SELinux type enforcement assigns one label to the container process (svirt_lxc_net_t) and another to the file system content it is allowed to use (svirt_sandbox_file_t). This prevents a compromised container process from touching host files that carry any other label, even if it is running as root.
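When auditing a host, it helps to pull the type out of a process's SELinux context so you can spot anything running outside the container domain. A context string has the form user:role:type:level, so the type is the third colon-separated field. This is only a sketch: the helper name selinux_type is mine, and the sample context, including its category numbers, is made up for illustration.

```shell
# Print the type field of an SELinux context (user:role:type:level).
selinux_type() {
    echo "$1" | cut -d: -f3
}

# Illustrative context for a confined container process.
selinux_type "system_u:system_r:svirt_lxc_net_t:s0:c186,c641"
```

On the host, ps -eZ lists the context of every process. A container process whose type is not svirt_lxc_net_t deserves a closer look.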

The CoolVDS Advantage: Why We Chose KVM

We built CoolVDS to serve the Nordic market's need for stability and data integrity. While container-based hosting (like OpenVZ) is cheaper to run, the security isolation just isn't there yet for 2014's threat landscape.

We use KVM to ensure that your memory and CPU instructions are hardware-isolated. Furthermore, our infrastructure in Oslo connects directly to the NIX (Norwegian Internet Exchange), ensuring that even if you have heavy security filtering, your latency remains negligible.

Final Checklist for Production:

  • Update Kernel: Ensure you are on at least Linux 3.13 (Ubuntu 14.04 default) for stable cgroups support.
  • Memory Limits: Always run with -m to prevent a memory leak in one app crashing the server.
  • SSHD: Do not run sshd inside containers. SSH into the host (CoolVDS instance), then use nsenter to debug. docker exec does the same job, but it requires Docker 1.3 or newer.
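The kernel item on that checklist is easy to automate in a provisioning script. Below is a minimal sketch that compares the major and minor numbers of a release string against a required version. The function name kernel_at_least is hypothetical, and the third argument exists only so the check can be exercised against a sample string instead of the running kernel.

```shell
# Succeed if the kernel release is at least major.minor.
# $1 = required major, $2 = required minor, $3 = release string (defaults to uname -r)
kernel_at_least() {
    release="${3:-$(uname -r)}"
    major="${release%%.*}"          # text before the first dot
    rest="${release#*.}"            # text after the first dot
    minor="${rest%%[!0-9]*}"        # leading digits of the remainder
    [ "$major" -gt "$1" ] || { [ "$major" -eq "$1" ] && [ "$minor" -ge "$2" ]; }
}

# Sample release string as shipped with Ubuntu 14.04.
kernel_at_least 3 13 "3.13.0-24-generic" && echo "kernel OK"
```

Called as kernel_at_least 3 13 with no third argument, it checks the running kernel, so a deploy script can bail out early on a host that is too old.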

Security is a process, not a product. But starting with the right architecture makes that process a lot less painful.

Ready to build a fortress? Deploy a KVM-based Linux VPS with pure SSD storage on CoolVDS today.