Container Security in 2023: Stop Running as Root or Face the Consequences
If you are still typing docker run without a second thought in June 2023, you are not deploying infrastructure; you are deploying liabilities. I have spent the last decade cleaning up after developers who treat containers like lightweight virtual machines. They aren't. They are processes on a shared kernel, and if that kernel is exposed, your entire infrastructure is compromised. The days of trusting the default configuration are over.
We saw it with Log4Shell. We saw it with the runC vulnerability. The attack surface is massive. This guide isn't about high-level theory; it's about the specific, dirty commands and configurations you need to apply right now to stop a breakout. We will cover image provenance, runtime confinement, and why your physical hosting location—specifically here in Norway—is a security feature you can't patch with software.
1. The Supply Chain: Trust No One, Scan Everything
Most Docker images on Docker Hub are bloated and riddled with vulnerabilities. Using the latest tag is negligence. A production-grade pipeline requires deterministic builds and rigorous scanning before an image ever touches a runtime environment.
In 2023, tools like Trivy or Grype are mandatory. If you aren't gating your CI/CD with a vulnerability scanner, you are flying blind. Here is how a standard scan should look in your pipeline:
trivy image --severity HIGH,CRITICAL --exit-code 1 my-app:v1.0.2
This command fails the build if critical CVEs are found. But scanning isn't enough. You need to reduce the attack surface by using minimal base images. Forget ubuntu:22.04. Use Distroless or Alpine, though Distroless is superior for production because it lacks a shell entirely. If an attacker gets in, they have no /bin/sh to execute commands.
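You can prove that last point to yourself in ten seconds. This is just a sanity check against the public Distroless base image; it should fail with an exec error because no shell binary exists in the image:
docker run --rm gcr.io/distroless/static-debian11 /bin/sh -c "id"
# expected: exec failure, "/bin/sh": no such file or directory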
The "Golden" Dockerfile
Here is a reference implementation for a secure, multi-stage build for a Go application. This pattern strips the image down to the bare necessities and enforces a non-root user.
# Build Stage
FROM golang:1.20-alpine AS builder
WORKDIR /app
COPY . .
# Distroless "static" ships no libc, so the binary must be statically linked
RUN CGO_ENABLED=0 go build -o main .

# Final Stage
FROM gcr.io/distroless/static-debian11
# Never run as root. 65532 is the "nonroot" user shipped with Distroless.
USER 65532:65532
COPY --from=builder /app/main /
# Pair this image with --read-only at runtime; nothing in it needs a writable filesystem
CMD ["/main"]
2. Runtime Confinement: Lock the Door
By default, Docker allows a container to do almost anything the kernel allows. You need to strip these capabilities. The --cap-drop=ALL flag is your friend. Only add back what is strictly necessary (e.g., NET_BIND_SERVICE if you need to bind port 80, though you should really bind high ports and use a reverse proxy).
Pro Tip: Never mount the Docker socket (/var/run/docker.sock) inside a container unless you absolutely know what you are doing. It gives the container root access to the host system. It is essentially sudo without a password.
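A quick way to audit a host for this sin is to list every running container's mounts and grep for the socket. This is a rough sketch, not a substitute for a proper policy engine:
# Flag any running container that has the Docker socket bind-mounted
docker ps -q | xargs docker inspect --format '{{.Name}}: {{range .Mounts}}{{.Source}} {{end}}' | grep docker.sock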
Furthermore, enforce read-only filesystems. If an attacker exploits your application, they will try to download a payload or modify a config file. If the filesystem is read-only, they hit a wall. Here is how you run a hardened Nginx container on a CoolVDS NVMe instance:
docker run -d \
--name secure-nginx \
--read-only \
--tmpfs /var/cache/nginx \
--tmpfs /var/run \
--cap-drop=ALL \
--cap-add=CHOWN \
--cap-add=SETGID \
--cap-add=SETUID \
--cap-add=NET_BIND_SERVICE \
-p 8080:80 \
nginx:1.24-alpine
Notice the usage of --tmpfs. Nginx needs to write temporary files, but we don't want those persisting or allowing execution. Mapping them to memory satisfies the application without compromising the disk.
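Verify the lockdown before you walk away. The Alpine-based Nginx image still ships a shell, so exec in and try to write somewhere outside the tmpfs mounts; the kernel should refuse:
docker exec secure-nginx touch /etc/nginx/pwned.conf
# expected: "touch: /etc/nginx/pwned.conf: Read-only file system"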
3. Kernel Isolation & The Neighbor Problem
Containers share the host kernel. If you are on a cheap, oversold VPS provider, "noisy neighbors" aren't just an annoyance; they are a side-channel attack vector. CPU stealing and cache timing attacks are real risks in high-density environments.
This is where the infrastructure choice becomes a security decision. At CoolVDS, we use KVM (Kernel-based Virtual Machine) for our instances. KVM provides hardware-assisted virtualization. Your kernel is your kernel. It is distinct from the host and distinct from other tenants. Running a hardened container inside a KVM slice is the gold standard for multi-tenant security.
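If you are unsure what your current provider actually gives you, a quick sanity check from inside the guest (on systemd-based distros) tells you whether you are on real hardware virtualization or a shared-kernel container:
systemd-detect-virt
# "kvm" means hardware-assisted isolation; "lxc" or "openvz" means you share a kernel with strangers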
4. Network Defense: The Firewall is Dead, Long Live the Firewall
In a microservices architecture, "allow all" is a disaster waiting to happen. If your database container can talk to the internet, you have failed. In Kubernetes, NetworkPolicies are not optional. They are the only thing stopping lateral movement.
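The sane baseline is a default-deny policy per namespace, then explicit allows for each service that genuinely needs to talk. Here is a minimal sketch (the namespace name is an assumption, and it only has teeth if your CNI plugin, e.g. Calico or Cilium, actually enforces NetworkPolicies):
kubectl apply -n production -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}   # every pod in the namespace
  policyTypes:
    - Ingress
    - Egress
EOF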
If you are using Docker Compose, do not expose ports to the host (0.0.0.0) unless necessary. Bind strictly to localhost or a specific internal interface.
version: '3.8'
services:
  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    secrets:
      - db_password
    volumes:
      - db_data:/var/lib/postgresql/data
    # No ports exposed to the host. Only reachable on the 'backend' network.
    networks:
      - backend
    deploy:
      resources:
        limits:
          cpus: '0.50'
          memory: 512M
  app:
    image: my-secure-app:v1.0.2 # pin a specific tag, never 'latest'
    networks:
      - backend
      - frontend
    ports:
      # Bind only to localhost if using a reverse proxy on the host
      - "127.0.0.1:8080:8080"

volumes:
  db_data:

secrets:
  db_password:
    file: ./db_password.txt # placeholder path to the local secret file

networks:
  frontend:
  backend:
    internal: true # No internet access for this network
5. The Norwegian Context: Data Sovereignty
We are in June 2023. The legal landscape regarding data transfer between Europe and the US is still a minefield following the Schrems II ruling. While we wait for a new adequacy decision, the reality is stark: if you host personal data of Norwegian citizens on US-owned cloud providers (even in their EU regions), you are operating in a grey area that the Datatilsynet (Norwegian Data Protection Authority) is watching closely.
Security is not just about hackers; it's about compliance. Hosting on CoolVDS means your data resides physically in Oslo. It means you are protected by Norwegian privacy laws, which are among the strictest in the world. When you run containers here, you eliminate the risk of extraterritorial data subpoenas that plague the hyperscalers.
Advanced Configuration: Seccomp Profiles
For the truly paranoid (which you should be), Seccomp (Secure Computing Mode) acts as a firewall for system calls. It restricts what kernel calls a container can make. Docker applies a default profile, but it is often too permissive.
Here is a snippet of a custom JSON profile that takes an allowlist approach: the default action rejects every system call, and only the calls your application actually needs are permitted, across the 64-bit and 32-bit call tables. You pass this to Docker via --security-opt seccomp=profile.json.
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": [
    "SCMP_ARCH_X86_64",
    "SCMP_ARCH_X86",
    "SCMP_ARCH_X32"
  ],
  "syscalls": [
    {
      "names": [
        "accept4",
        "bind",
        "clone",
        "execve",
        "write",
        "read"
        // ... strictly limit to what the app needs
      ],
      "action": "SCMP_ACT_ALLOW",
      "args": []
    }
  ]
}
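Attaching it is a single flag. Bear in mind the allowlist above is deliberately truncated; a real service needs considerably more syscalls, and running the app under strace in staging is the usual way to find out which ones. A sketch, assuming the profile is saved as profile.json:
docker run -d --security-opt seccomp=profile.json my-app:v1.0.2
# any syscall not on the allowlist is rejected by the kernel (SCMP_ACT_ERRNO)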
Conclusion: Performance Meets Security
Security usually comes with a performance tax. Encryption takes CPU cycles. Seccomp filters add overhead. This is why underlying hardware matters. You cannot afford to run heavy security layers on legacy spinning rust or oversold CPUs.
CoolVDS instances are built on NVMe storage and high-frequency processors. This allows you to enable aggressive security auditing (like auditd logging) and complex ingress filtering without killing your request latency. Security is a process, not a product, but starting with the right foundation makes the process a hell of a lot easier.
Don't let your infrastructure be the weak link. Deploy a hardened KVM instance on CoolVDS today and keep your data in Norway where it belongs.