The "It Works on My Machine" Security Nightmare
Let’s get one thing straight: Containers are not Virtual Machines.
If you are treating a Docker container like a lightweight VM, you are creating a ticking time bomb. I’ve audited infrastructure for startups in Oslo where the developers mounted the host’s root filesystem into a container just to "fix permissions." That is not a fix; that is an invitation for a privilege escalation attack. With the CVE-2019-5736 runc vulnerability still fresh in our collective memory, we need to stop pretending default configurations are safe.
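To be concrete, the anti-pattern usually looks something like this (the paths and UID are illustrative); anything that escalates inside that container can now rewrite arbitrary files on the host:

# The anti-pattern: the entire host filesystem handed to a throwaway container
docker run --rm -it -v /:/host alpine \
  chown -R 1000:1000 /host/var/www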
In 2020, security isn't just about firewalls; it's about reducing the blast radius. If you are deploying to production without stripping Linux capabilities, you might as well hand over your SSH keys.
1. The Root Problem (Literally)
By default, Docker containers run as root. If an attacker compromises the process inside the container, they hold root within it; namespaces and cgroups theoretically keep them contained. But kernel vulnerabilities exist. If they break out, they are root on your host server. Game over.
The fix is boring but mandatory: create a user.
FROM alpine:3.11
# Create a group and user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
# Copy the binary in and hand ownership to the unprivileged user
COPY --chown=appuser:appgroup my-secure-app ./my-secure-app
# Tell Docker to use this user
USER appuser
# Now run your app
ENTRYPOINT ["./my-secure-app"]

This simple change in your Dockerfile mitigates a massive class of breakout exploits. However, sometimes you need to bind to privileged ports (under 1024). In that case, don't run as root; use setcap or, better yet, use a reverse proxy.
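If the app itself really must listen on port 80 or 443, a minimal sketch of the setcap route looks like this. It has to run before the USER instruction switches away from root, and the binary path and Alpine package name are assumptions for this example:

# Still root at this point in the build; grant only the bind-to-low-ports capability
RUN apk add --no-cache libcap && \
    setcap 'cap_net_bind_service=+ep' /my-secure-app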
2. Drop Capabilities Like They’re Hot
Linux capabilities break down the binary "root vs non-root" dichotomy into granular permissions. Does your Nginx container need to audit system logs? No. Does your Node.js app need to modify kernel modules? Absolutely not.
Docker gives you a default whitelist, but it's too generous. I prefer the "deny all, allow some" approach. Here is how you should be starting your containers in production:
docker run -d --name secure-app \
  --cap-drop=ALL \
  --cap-add=NET_BIND_SERVICE \
  --read-only \
  --tmpfs /tmp \
  my-image:latest

Let's break that down:
--cap-drop=ALL: Strips all kernel privileges.
--cap-add=NET_BIND_SERVICE: Adds back only the ability to bind a privileged port.
--read-only: Makes the container filesystem immutable. If an attacker gets a shell, they can't write a backdoor script to disk.
--tmpfs /tmp: Gives the app a place to write temporary files (in RAM), disappearing on restart.
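You can verify the result from the host. The hex value below is what cap_net_bind_service alone looks like; substitute whatever your container actually reports, and note this assumes a basic utility like cat exists in the image:

# Effective capability mask of the container's PID 1
docker exec secure-app cat /proc/1/status | grep CapEff

# Decode the mask with the libcap tools (capsh); with the flags above it should
# decode to cap_net_bind_service and nothing else
capsh --decode=0000000000000400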
3. The "Noisy Neighbor" and Resource Limits
Security isn't just about hackers; it's about availability. A memory leak in one container shouldn't crash the entire host. In a shared environment, this is critical.
We saw this recently with a client running a Magento stack. A Redis container went rogue, consumed 100% of the host RAM, and triggered the OOM (Out of Memory) Killer, which ironically killed the SSH daemon instead of Redis. We had to reboot via the console.
Always set limits in your docker-compose.yml:
version: '3.7'
services:
  web:
    image: nginx:alpine
    deploy:
      resources:
        limits:
          cpus: '0.50'
          memory: 512M
        reservations:
          cpus: '0.25'
          memory: 256M

One caveat: plain docker-compose ignores the deploy block unless you run it with the --compatibility flag (or deploy the file as a Swarm stack), so confirm the limits are actually being enforced.

Pro Tip: If you are hosting on CoolVDS, our KVM virtualization layer enforces hard limits at the kernel level. Unlike OpenVZ or LXC providers common in the budget market, a CoolVDS NVMe instance ensures that your neighbors' high load never steals your CPU cycles. We isolate the noise so your containers run on predictable resources.
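Not everything runs under Compose. If you launch containers directly with docker run, the same ceilings are available as runtime flags; a rough, illustrative equivalent of the compose block above (with a PID limit thrown in against fork bombs):

# Hard ceiling of half a CPU and 512 MB of RAM; swap capped at the same value
docker run -d --name web \
  --cpus="0.50" \
  --memory="512m" \
  --memory-swap="512m" \
  --pids-limit=200 \
  nginx:alpine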
4. Network Isolation and the Local Context
In Norway, data sovereignty is becoming a massive legal headache with GDPR. You need to ensure your database traffic isn't accidentally routing through a relay outside the EEA. Docker’s default bridge network allows containers to talk to each other by IP. That’s sloppy.
Create user-defined networks to isolate tiers.
# Create networks
docker network create --driver bridge frontend
docker network create --driver bridge backend
# Connect containers: the web server needs both networks,
# the database stays on the backend network only
docker network connect frontend web_server
docker network connect backend web_server
docker network connect backend db_server
# Nothing on the frontend network can reach db_server directly;
# only the web tier bridges the two.

Better yet, bind your exposed ports strictly to your internal interface or localhost if you are using a reverse proxy on the host.
ports:
  - "127.0.0.1:8080:80"

Remember that Docker manages its own iptables rules for published ports, so host-level UFW rules often don't cover them. Binding to 127.0.0.1 ensures that even if your firewall fails during a reload, your container ports aren't suddenly naked to the public internet.
5. Storage I/O Security
Logging in containers can be brutal on disk I/O. If you are logging to json-file (the Docker default), and you get DDoS'd, your disk writes can saturate the controller, effectively locking the server.
You need to limit log sizes. Add this to your daemon.json:
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}

On standard HDD VPS hosting, high log churn causes "I/O Wait" to spike, killing application performance. This is why we standardized on NVMe storage at CoolVDS. The queue depth on NVMe is deep enough to handle massive concurrent log writes without blocking the CPU, keeping your latency to the NIX (Norwegian Internet Exchange) low.
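Two details people trip over: the daemon only re-reads daemon.json on restart, and the new log options apply only to containers created afterwards. A quick sanity check:

# Restart the daemon so /etc/docker/daemon.json takes effect
sudo systemctl restart docker

# Confirm the active logging driver
docker info --format '{{.LoggingDriver}}'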
Conclusion: Defense in Depth
There is no single switch for security. It is a layering of user permissions, capability dropping, and network segmentation. Furthermore, the foundation matters. Running secure containers on an oversold, insecure host is futile.
If you are handling sensitive data subject to Datatilsynet auditing, you need a host that respects isolation. CoolVDS provides the raw KVM isolation and NVMe performance required to run hardened container stacks without the performance penalty.
Stop guessing. Audit your stack today.