Stop Running Containers as Root: A Survival Guide for 2021
I audited a Kubernetes cluster for a FinTech startup in Oslo last week. The architecture was decent, the code was clean, but the deployment manifests were a disaster. 90% of their pods were running with UID 0. In a post-Schrems II world, where the Norwegian Data Protection Authority (Datatilsynet) is scrutinizing data transfers and security posture more than ever, this isn't just sloppy—it is negligence.
Containers are not magic boxes. They are processes. If that process runs as root and escapes the container runtime—a scenario famously demonstrated by the runC vulnerability (CVE-2019-5736)—the attacker owns the host. If you are on a cheap VPS provider using shared-kernel virtualization (like OpenVZ), they own the neighbors too.
We are going to fix that. Today. No fluff, just the configurations you need to survive.
1. The Non-Negotiable: Run as a Non-Root User
By default, Docker containers run as root. This is convenient for development but catastrophic for production. If an attacker exploits a remote code execution (RCE) vulnerability in your application, they inherit those root privileges.
You must enforce a non-privileged user inside your Dockerfile. This one change creates a simple, effective barrier.
The Wrong Way
FROM node:14-alpine
WORKDIR /app
COPY . .
CMD ["node", "index.js"]
The Battle-Hardened Way
Here, we create a specific user and group, ensuring the process has zero administrative rights over the container filesystem.
FROM node:14-alpine
# Create a group and user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
WORKDIR /app
# Ownership must be explicit
COPY --chown=appuser:appgroup . .
# Switch to non-root user
USER appuser
CMD ["node", "index.js"]
2. Dropping Linux Capabilities
Dropping root is not the whole story: the container runtime still grants the container a default set of Linux capabilities. Does your Nginx container need to change the system time? No. Does it need to load kernel modules? Absolutely not.
The concept of "Least Privilege" applies to kernel calls too. We use the securityContext in Kubernetes or --cap-drop in Docker to strip these away.
Below is a standard hardening configuration for a web service deployed on Kubernetes (v1.19+). We drop ALL capabilities and add back only what is necessary (usually NET_BIND_SERVICE if binding to low ports, though standard practice is binding high ports internally).
apiVersion: v1
kind: Pod
metadata:
  name: secured-nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.19-alpine
    securityContext:
      allowPrivilegeEscalation: false
      runAsNonRoot: true
      runAsUser: 101
      capabilities:
        drop:
        - ALL
        add:
        - NET_BIND_SERVICE
      readOnlyRootFilesystem: true
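Once the pod is running, you can confirm the capability set from the outside. The Cap* lines in /proc/1/status are hex bitmasks; capsh --decode will translate them if you want the full list:
$ kubectl exec secured-nginx -- grep Cap /proc/1/status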
Pro Tip: When using readOnlyRootFilesystem: true, applications that need to write temp files (like Nginx caching or logs) will crash. Map an emptyDir volume to /tmp and /var/cache/nginx to solve this without compromising the root filesystem.
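A minimal sketch of that mapping, slotted into the Pod spec above (the volume names are illustrative):
    volumeMounts:
    - name: tmp
      mountPath: /tmp
    - name: nginx-cache
      mountPath: /var/cache/nginx
  volumes:
  - name: tmp
    emptyDir: {}
  - name: nginx-cache
    emptyDir: {}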
3. The Infrastructure Layer: Why "Shared Kernel" is a Security Lie
This is where your choice of hosting provider becomes a security decision. Many budget VPS providers in Europe still use container-based virtualization (LXC/OpenVZ). In these environments, your "server" is just a container on a massive host. You share the kernel with every other customer on that physical machine.
If a neighbor triggers a kernel panic, you go down. If a neighbor exploits a kernel vulnerability, they can theoretically access your memory space.
For serious workloads, Hardware Virtualization (KVM) is the minimum standard. KVM provides a strict isolation layer where your OS has its own kernel.
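Not sure what your current provider runs underneath? From inside the guest, systemd-detect-virt tells you whether you are on hardware virtualization or a shared-kernel container; an answer of openvz or lxc means you are sharing a kernel with your neighbors (the output below is just an example):
$ systemd-detect-virt
kvm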
| Feature | Container VPS (OpenVZ/LXC) | CoolVDS (KVM + NVMe) |
|---|---|---|
| Kernel Isolation | Shared (High Risk) | Dedicated (High Security) |
| Neighbor Impact | Noisy neighbors affect CPU/RAM | Strict resource allocation |
| Docker Support | Often requires hacks/patches | Native, standard Linux behavior |
At CoolVDS, we don't mess around with shared kernels. Every instance is a KVM slice backed by NVMe storage. When you run docker run on CoolVDS, you are interacting with your own kernel, not ours.
4. Supply Chain: Scanning Before You Ship
You locked down the runtime, but what about the image itself? Pulling latest from Docker Hub is asking for trouble. Vulnerabilities in base OS layers (Alpine, Debian) are discovered daily.
In 2021, you should be integrating a scanner into your CI/CD pipeline. Trivy by Aqua Security is one of the fastest and simplest tools for the job. It installs in seconds and runs locally.
# Install Trivy (v0.15.0 - Current Stable)
$ apt-get install wget apt-transport-https gnupg lsb-release
$ wget -qO - https://aquasecurity.github.io/trivy-repo/deb/public.key | apt-key add -
$ echo deb https://aquasecurity.github.io/trivy-repo/deb $(lsb_release -sc) main | tee -a /etc/apt/sources.list.d/trivy.list
$ apt-get update && apt-get install trivy
# Scan your image
$ trivy image python:3.4-alpine
If the scan returns CRITICAL CVEs, the build fails. Simple.
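In practice, that gate is a single flag: Trivy returns a non-zero exit code when it finds issues at or above the severity you specify, which is enough to fail most CI jobs (a sketch; swap in your own image reference):
# Fail the build if any CRITICAL vulnerability is found
$ trivy image --exit-code 1 --severity CRITICAL registry.example.com/myapp:1.0.0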
5. The Local Angle: Latency and Sovereignty
Security isn't just about hackers; it's about lawyers. Since the CJEU invalidated the Privacy Shield framework last year (Schrems II), storing customer data on US-owned clouds has become a legal minefield for Norwegian companies. Datatilsynet has made it clear: relying on Standard Contractual Clauses (SCCs) without supplementary measures is risky.
Hosting locally in Norway solves two problems:
- Compliance: Data stays within the EEA, on infrastructure owned by a Norwegian entity (like CoolVDS), significantly simplifying your GDPR documentation.
- Performance: If your users are in Oslo or Bergen, routing traffic through Frankfurt is inefficient. CoolVDS peers directly at NIX (Norwegian Internet Exchange), offering single-digit millisecond latency to local users.
Final Thoughts
Container security is a layered defense. You strip privileges in the Dockerfile, you drop capabilities in the orchestration, and you ensure the underlying infrastructure offers true hardware isolation.
Don't build a fortress on a swamp. Ensure your foundation is solid.
Need a KVM environment that respects your data sovereignty? Deploy a CoolVDS NVMe instance in Oslo today and lock it down the right way.