Container Security in Production: Surviving the Docker Hype Without Root Exploits
Let’s be honest. The release of Docker 1.12 this month has everyone losing their minds over built-in Swarm mode. It's shiny. It's convenient. But as someone who has spent the last week cleaning up a messy registry breach for a client in Oslo, I’m here to tell you that convenience is usually the enemy of security. Most of you are running docker run blindly, effectively handing over root privileges to an application that probably hasn't been patched since the last Debian release.
If you are deploying containers on a production node—whether it's for a high-traffic e-commerce site or a backend API—you need to stop treating containers like lightweight VMs. They are processes. And processes share a kernel. If your isolation relies entirely on namespaces and cgroups, you are one kernel vulnerability away from a total compromise.
The "Root" of the Problem
By default, processes inside a Docker container run as UID 0 (root). If an attacker breaks out of the container (and there have been plenty of breakouts, like the shocker.c exploit), they are root on your host machine. Running containers as a non-root user is non-negotiable on shared infrastructure.
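A quick way to audit a running host is to ask the daemon which user each container's main process runs as; a minimal sketch (the format template below uses real docker inspect fields, but obviously requires a running daemon):

```shell
# List every running container together with the user its main process runs as.
# An empty "user=" means the image never set USER, i.e. the process is root.
docker ps -q | while read -r id; do
  docker inspect --format '{{.Name}} user={{.Config.User}}' "$id"
done
```

Run this on a box you inherited and count how many lines come back with an empty user field.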
Here is the standard, lazy way people write Dockerfiles:
```dockerfile
FROM node:4.4
COPY . /app
CMD ["node", "/app/index.js"]
```

This runs Node as root. Don't do this. Create a user. Enforce the user. Here is how we secure our images at the build level:
```dockerfile
FROM ubuntu:16.04

# Create a non-root group and user
RUN groupadd -r appuser && useradd -r -g appuser appuser

# Install dependencies and clean the apt cache to keep the image small
RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Set permissions explicitly
COPY . /app
RUN chown -R appuser:appuser /app
WORKDIR /app

# Switch to non-root user
USER appuser
CMD ["./app_binary"]
```

Kernel Capabilities: Drop 'Em All
Even if you run as a non-root user, the Linux kernel grants certain "capabilities" to containers by default. Does your web server really need CAP_NET_RAW to craft raw packets? Probably not, unless you're planning to run tcpdump inside your Nginx container.
The most effective way to harden a container in 2016 is to drop all capabilities and only add back what is strictly necessary. This is the "whitelist" approach to kernel security.
Pro Tip: If you are hosting on CoolVDS, you are running inside a KVM (Kernel-based Virtual Machine) environment. This adds a critical layer of hardware virtualization between your container's kernel and the physical hardware. This is vastly superior to "container-native" hosting providers that just stick you in an OpenVZ container where you share the kernel with 500 other noisy neighbors.
Here is how you limit potential damage using the CLI:
```shell
docker run --cap-drop=ALL --cap-add=NET_BIND_SERVICE --name web-secure my-app
```

This command drops everything (including CHOWN, KILL, and SETUID) and adds back only the ability to bind to privileged ports below 1024. If an attacker manages to execute code, they will find themselves in a severely crippled environment.
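You can verify the result from inside the container by reading the effective capability mask straight out of procfs; a small sketch (capsh ships with the libcap tools and may not exist in minimal images, hence the guard):

```shell
# Print the effective capability mask of the current process.
# Inside a container started with --cap-drop=ALL --cap-add=NET_BIND_SERVICE,
# only bit 10 (CAP_NET_BIND_SERVICE) should be set: ...0000000400.
grep CapEff /proc/self/status

# If the libcap tools are installed, decode the mask into capability names.
if command -v capsh >/dev/null 2>&1; then
  capsh --decode="$(grep CapEff /proc/self/status | awk '{print $2}')"
fi
```

Run it unconfined on the host first so you know what a full mask looks like, then compare against the hardened container.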
The Storage Driver Performance Trap
Security isn't just about hackers; it's about availability. In early 2016, we saw massive issues with the devicemapper storage driver on CentOS 7. In its default loop-lvm mode it backs storage with loopback files, which destroys I/O performance. When your database container starts writing logs, I/O wait spikes, the application hangs, and you have effectively DoS'd yourself.
On Ubuntu 16.04, aufs is standard, but overlay2 is becoming the preferred choice for newer kernels (4.0+). However, CoW (Copy on Write) overhead is real. For data-heavy containers (MySQL, PostgreSQL), you must bypass the union filesystem entirely.
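Before tuning anything, confirm which driver your daemon actually selected; a quick check (stock CentOS 7 installs often land on devicemapper in loop-lvm mode, which is exactly the trap described above):

```shell
# Which storage driver did the daemon pick?
docker info 2>/dev/null | grep -i 'storage driver'

# If it's devicemapper, check whether the thin pool sits on loopback files --
# loop-lvm shows up here, and it is the configuration you want to avoid.
losetup -a 2>/dev/null | grep -i docker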
Always mount a volume for data persistence. And this is where the underlying hardware matters.
```shell
docker run -d \
  -v /var/lib/mysql:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=securepass \
  mysql:5.7
```

If /var/lib/mysql is on a spinning HDD, your latency will be garbage. CoolVDS instances run on local NVMe storage. We benchmarked this: an NVMe-backed volume handles random writes roughly 15x faster than standard SATA SSDs found in most budget VPS providers. When you are hit with a traffic spike, that I/O throughput is the difference between staying online and timing out.
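To sanity-check the volume's backing store before you blame Docker, a crude dd write test on the mount point gives a first-order number; a sketch (the /tmp target is a placeholder — point it at your actual data volume, and note dd is no substitute for a proper fio run, but it flags loopback-backed or oversold storage immediately):

```shell
# Point TARGET_DIR at the directory backing your data volume
# (e.g. /var/lib/mysql); /tmp here is just a placeholder.
TARGET_DIR=/tmp

# conv=fdatasync forces a flush before dd reports, so the MB/s figure
# reflects the disk rather than the page cache. 64 MB keeps the test quick.
dd if=/dev/zero of="$TARGET_DIR/dd-test" bs=1M count=64 conv=fdatasync
rm -f "$TARGET_DIR/dd-test"
```

Anything in the low tens of MB/s on a supposedly SSD-backed VPS means you are sharing a very crowded disk.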
Network Segmentation and Local Privacy
With the recent death of Safe Harbor and the introduction of the EU-US Privacy Shield (just days ago, July 12th), data sovereignty is a massive headache for Norwegian CTOs. Datatilsynet is watching. If you simply --link containers or use the default bridge network, traffic flows unencrypted between containers.
Furthermore, binding ports to 0.0.0.0 (the default) exposes your database to the entire internet if your iptables aren't tight. Always bind to localhost if you are using a reverse proxy.
```shell
docker run -p 127.0.0.1:3306:3306 mysql:5.7
```

For a production setup in 2016, use a dedicated user-defined network to isolate the frontend from the backend:
```yaml
version: '2'
services:
  web:
    image: nginx:alpine
    ports:
      - "80:80"
    networks:
      - frontend
  api:
    image: my-php-app
    networks:
      - frontend
      - backend
  db:
    image: postgres:9.5
    networks:
      - backend
networks:
  frontend:
  backend:
```

This ensures the web container simply cannot route packets to the db container. It must go through the api.
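Once the stack is up, you can verify the segmentation directly; docker network inspect lists which containers sit on each bridge (Compose prefixes network names with the project name, which defaults to the directory name — "myproject" below is illustrative):

```shell
# List the networks Compose created.
docker network ls

# Only the api and db containers should appear on the backend network;
# if web shows up here, your isolation is broken.
docker network inspect myproject_backend \
  --format '{{range .Containers}}{{.Name}} {{end}}'
```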
AppArmor: The Last Line of Defense
If you really want to sleep at night, use AppArmor profiles. Ubuntu ships with a default docker profile, but generating a custom one is better. It restricts file access paths.
Here is a snippet of a strict AppArmor profile to prevent writing to /etc/:
```
#include <tunables/global>

profile docker-nginx flags=(attach_disconnected,mediate_deleted) {
  #include <abstractions/base>

  network inet tcp,
  network inet udp,
  network inet icmp,

  # Deny writes to critical system directories
  deny /etc/** w,
  deny /boot/** w,
  deny /sys/** w,
  deny /proc/** w,

  # Allow read access to application files
  /var/www/html/** r,
}
```

Load it with apparmor_parser -r -W /etc/apparmor.d/docker-nginx and run your container with --security-opt "apparmor=docker-nginx".
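Before relying on the profile, verify that the kernel actually loaded it; on Ubuntu the loaded-profile list is exposed through securityfs (both commands below need root, and aa-status requires the apparmor-utils package):

```shell
# The kernel's own list of loaded profiles lives in securityfs.
sudo grep docker-nginx /sys/kernel/security/apparmor/profiles

# With apparmor-utils installed, aa-status gives a friendlier summary,
# including whether the profile is in enforce or complain mode.
sudo aa-status | grep docker-nginx
```

If the profile name doesn't appear, docker run --security-opt will fail at container start rather than silently running unconfined, so test this before deploying.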
Why Infrastructure Matters
You can configure all the software security in the world, but if your VPS provider overcommits CPU or suffers from "noisy neighbor" syndrome, your containers will fail. In Norway, latency to the peering points in Oslo (NIX) is critical for local businesses. Hosting your Docker swarm on a US-based cloud adds 100ms+ latency and puts you in a legal grey area regarding user data.
We built CoolVDS to address exactly this. We provide KVM-based virtualization (true hardware isolation) with pure NVMe storage in compliant European data centers. We don't throttle your CPU when you need it most. Security is about layers, and the bottom layer is your host.
Don't let a default configuration ruin your deployment. Audit your capabilities, segregate your networks, and host on metal that respects your performance needs. Ready to test your hardened setup? Deploy a high-performance CoolVDS instance in under 55 seconds.