Hardening Containers: A Survival Guide for the Paranoid Sysadmin
It is October 2018. Meltdown and Spectre are still haunting our dreams, GDPR has the C-suite sweating bullets since May, and yet I still see developers deploying Docker containers running as root. It is absolute madness.
Let’s get one thing straight: Containers are not Virtual Machines. They are processes lying to themselves about how much of the computer they own. They share a kernel. If that kernel goes down—or gets exploited—your entire host is compromised. In a shared hosting environment, that’s bad. In a production environment handling Norwegian citizen data, that’s a call from Datatilsynet (The Norwegian Data Protection Authority) you do not want to answer.
I have spent the last six months migrating legacy monoliths to Kubernetes 1.11 and 1.12 clusters. Here is what breaks, what works, and how to secure your stack without destroying performance.
1. The "Root" of All Evil
By default, processes inside a Docker container run as root. If an attacker breaks out of the container (via a Dirty COW-style kernel exploit or a similar runtime vulnerability), they are root on your host node. Game over.
You need to force your developers to strip privileges. It is not optional.
The Fix: User Namespaces and Dockerfile Hygiene
Stop writing Dockerfiles that end with CMD [...] running as root. Create a dedicated user.
FROM alpine:3.8
# Create a group and user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
# Tell Docker to switch to this user
USER appuser
WORKDIR /home/appuser
# --chown (Docker 17.09+) keeps the app files out of root's ownership
COPY --chown=appuser:appgroup . .
CMD ["./my-binary"]
If you are running a legacy application that demands root (binding to port 80, for example), use Linux capabilities instead of full privileges. Do not use --privileged. Ever. That flag basically hands over the keys to the castle.
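The USER directive covers the easy case, but you can also have the Docker daemon remap container root to an unprivileged host UID via user namespaces. A minimal /etc/docker/daemon.json sketch (restart the daemon afterwards; be aware that remapping is incompatible with --net=host and complicates volume ownership):

```json
{
  "userns-remap": "default"
}
```

With "default", Docker creates a dockremap user and maps container UID 0 onto a subordinate UID range from /etc/subuid, so even a breakout lands on the host as an unprivileged user.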
2. Kernel Capabilities: Drop 'Em Like It's Hot
The Linux kernel breaks down root privileges into distinct units called "capabilities". A web server needs to bind to a network port (NET_BIND_SERVICE), but it definitely does not need to audit system logs (AUDIT_WRITE) or reboot the server (SYS_BOOT).
When we deploy sensitive workloads, we default to dropping ALL capabilities and adding back only what is strictly necessary. Here is how you run a container securely:
docker run -d \
--cap-drop=ALL \
--cap-add=NET_BIND_SERVICE \
--read-only \
--tmpfs /run \
--tmpfs /tmp \
nginx:1.15-alpine
Notice the --read-only flag? That makes the container's root filesystem immutable. If an attacker gets in, they can't write a backdoor script to /bin. They are stuck in a read-only jail.
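Don't take my word for it; poke at the jail yourself. A quick smoke test, assuming a local Docker daemon (the image tag is just illustrative):

```shell
# A write to the root filesystem should be rejected
docker run --rm --read-only --tmpfs /tmp alpine:3.8 \
    sh -c 'touch /bin/backdoor 2>/dev/null && echo "writable!" || echo "blocked: read-only rootfs"'

# Writes to the tmpfs mount still work, so apps keep their scratch space
docker run --rm --read-only --tmpfs /tmp alpine:3.8 \
    sh -c 'touch /tmp/scratch && echo "tmpfs writable"'
```

The tmpfs mounts are the escape valve: most daemons only need a handful of writable paths (/tmp, /run, a log directory), and you can enumerate them explicitly.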
3. Isolation Matters: KVM vs. Shared Kernel
This is where infrastructure choice becomes a security feature. Many "cloud" providers sell you container-based VPS instances (OpenVZ or LXC, typically) and call it a day. The problem? You are sharing a kernel with their other customers. If a neighboring container is noisy or compromised, your latency spikes, or worse, the exploit bleeds over.
This is why we architect differently at CoolVDS. We use KVM (Kernel-based Virtual Machine). Every CoolVDS instance runs its own isolated kernel.
Pro Tip: If you are running Docker, run it inside a KVM VPS. The overhead is negligible in 2018 thanks to hardware virtualization extensions (VT-x), but the security boundary is solid. If your container crashes the kernel inside a CoolVDS instance, only your instance reboots. Your neighbors are unaffected, and more importantly, they can't access your memory space.
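You can check what you are actually running on from inside the guest. A quick sketch (systemd-detect-virt ships with systemd; the CPU flags only appear if the hypervisor exposes the virtualization extensions to you):

```shell
# Which hypervisor (if any) are we running under?
# Prints "kvm" on a KVM guest, "none" on bare metal
systemd-detect-virt || true

# Does the CPU expose hardware virtualization extensions (VT-x/AMD-V)?
grep -c -E 'vmx|svm' /proc/cpuinfo || echo "no VT-x/AMD-V exposed"
```

If the flags are exposed, you can even run nested KVM for testing; if you see "lxc" or "openvz" instead of "kvm", you know exactly what isolation you are (not) getting.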
4. Network Security & GDPR Compliance
With GDPR now fully enforceable, where your data physically sits matters. If you are serving Norwegian users, routing traffic through Frankfurt or Amsterdam adds unnecessary latency and legal complexity.
You want your packets staying within the NIX (Norwegian Internet Exchange) as much as possible. Low latency isn't just about speed; it's about reducing the attack surface of data in transit. We position our infrastructure to ensure minimal hops to major Nordic ISPs.
Firewalling at the Host Level
Docker manipulates iptables rules dynamically. This can sometimes bypass your UFW or Firewalld rules if you aren't careful. I always explicitly bind ports to the internal interface if public access isn't required.
# Don't do this (binds to 0.0.0.0 by default)
docker run -p 3306:3306 mysql
# Do this (binds only to localhost/private IP)
docker run -p 127.0.0.1:3306:3306 mysql
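When a port genuinely must be published but not to the whole world, the supported hook is the DOCKER-USER iptables chain (Docker 17.06+), which Docker evaluates before its own forwarding rules and never flushes. A sketch, assuming eth0 is your public interface and 203.0.113.0/24 is your admin network (both placeholders):

```shell
# Drop NEW connections to containers arriving on eth0 unless they come
# from the admin network; ESTABLISHED return traffic is unaffected
iptables -I DOCKER-USER -i eth0 ! -s 203.0.113.0/24 \
    -m conntrack --ctstate NEW -j DROP
```

Rules you put in DOCKER-USER survive daemon restarts, unlike anything you bolt onto the FORWARD chain directly.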
5. Kubernetes Pod Security Policies (PSP)
If you have graduated to Kubernetes, you need to enable Pod Security Policies. They are still in beta as of Kubernetes 1.12 (apiVersion policy/v1beta1), but mature enough for production use. A PSP lets you control the security-sensitive fields of the pod specs admitted to your cluster.
Here is a restrictive policy we apply to all non-system namespaces:
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false
  # Prevents escalation to root
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
    - ALL
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'MustRunAs'
    ranges:
      # Forbid adding the root group
      - min: 1
        max: 65535
  fsGroup:
    rule: 'MustRunAs'
    ranges:
      - min: 1
        max: 65535
Applying this prevents any developer from accidentally (or maliciously) deploying a privileged pod that mounts the host filesystem.
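One gotcha the docs bury: a PSP does nothing until something is authorized to use it via RBAC, and with the PodSecurityPolicy admission controller enabled but no usable policy, pod creation is rejected outright. A sketch of the binding, assuming the policy is named restricted and you want it to cover every service account in the default namespace (role and binding names are illustrative):

```shell
# The API server must run with:
#   --enable-admission-plugins=...,PodSecurityPolicy

# Grant "use" on the restricted PSP within the namespace
kubectl create role psp:restricted \
    --verb=use \
    --resource=podsecuritypolicies \
    --resource-name=restricted \
    --namespace=default

# Bind it to all service accounts in that namespace
kubectl create rolebinding default:psp:restricted \
    --role=psp:restricted \
    --group=system:serviceaccounts:default \
    --namespace=default
```

Test this in a scratch cluster before flipping the admission controller on in production; forgetting the binding is a reliable way to block every deployment at once.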
6. The Storage Bottleneck
Security scanning is I/O-hungry. If you are running tools like Clair or Anchore to scan your images for vulnerabilities, you are churning through disk reads. On traditional spinning rust (HDDs) or even cheap SATA SSDs, that contention kills performance for your live apps.
This is a hardware problem, not a software one. We deploy NVMe storage across our CoolVDS fleet specifically to handle high-IOPS operations like continuous security scanning and database transaction logs. Don't let your security tools become your performance bottleneck.
Final Thoughts
Security is a process, not a product. But the foundation matters. You can have the best iptables rules in the world, but if you are running on a noisy, oversold shared platform, you are building on sand.
If you need a sandbox to test these hardening techniques, spin up a KVM instance. It’s the closest you will get to bare metal isolation without the dedicated server price tag. And for the love of code, stop running as root.
Ready to lock down your infrastructure? Deploy a CoolVDS NVMe instance in Oslo today and get the isolation your compliance officer demands.