Your Containers Are Leaking: Hardening Docker & Kubernetes for Production in 2023
I recently audited a deployment for a mid-sized fintech based here in Oslo. They were proud of their CI/CD pipeline and their shiny new microservices architecture. Yet, within ten minutes of gaining access to a low-privilege pod, I had root access to the underlying node. Why? Because they treated containers like lightweight VMs. They aren't. They are just processes lying to the kernel about who they are.
If you are running default Docker or Kubernetes configurations in production, you are playing Russian Roulette with your data. And given the current stance of Datatilsynet (The Norwegian Data Protection Authority) on data breaches, that is a gamble you cannot afford to lose.
This isn't a theoretical lecture. This is a field manual on how to lock down your infrastructure before the inevitable scan hits your IP.
1. The Root of All Evil: Running as UID 0
The most common sin I see in Dockerfiles is the absence of a USER instruction. By default, containers run as root. If an attacker breaks out of the container (via a kernel exploit like Dirty Pipe, which terrified us all last year), they are root on your host. Game over.
You must force your processes to run as an unprivileged user. It is not optional.
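Before refactoring anything, check what you are actually running today. A quick audit, run from a shell inside any suspect container (via `docker exec` or `kubectl exec`; the idea, not a specific tool, is the point here):

```shell
# Run inside the container, e.g.:
#   kubectl exec -it <pod> -- sh    or    docker exec -it <container> sh
uid=$(id -u)
if [ "$uid" -eq 0 ]; then
  echo "WARNING: running as root (UID 0)"
else
  echo "OK: running as unprivileged UID $uid"
fi
```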
The Fix
Create a specific user in your Dockerfile. Do not rely on the host's users.
```dockerfile
# The Wrong Way: no USER instruction, everything runs as root
FROM ubuntu:22.04
COPY app /app
CMD ["/app/start.sh"]
```

```dockerfile
# The Right Way: create and switch to an unprivileged user
FROM alpine:3.18
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
WORKDIR /home/appuser
COPY --chown=appuser:appgroup app /home/appuser/app
CMD ["./app"]
```
Pro Tip: When moving to unprivileged containers, you will likely hit permission errors writing to logs or temp files. Do not chmod 777 everything. Instead, map volumes to the correct UID/GID on the host system. This requires precision, not brute force.
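As a sketch of that precision: assume the `appuser` created above landed on UID 100 / GID 101 (verify with `docker run --rm my-app id` — these numbers are an assumption, not a guarantee). The privileged host-side steps are shown as comments because they require root:

```shell
# Assumed UID/GID of appuser in the image -- verify with: docker run --rm my-app id
APP_UID=100
APP_GID=101
LOG_DIR=$(mktemp -d)   # stand-in for a real host path such as /srv/myapp/logs

# As root on the host: hand the directory to the container's UID -- not 777
# chown "$APP_UID:$APP_GID" "$LOG_DIR" && chmod 750 "$LOG_DIR"

# Then mount it where appuser expects to write:
# docker run -d -v "$LOG_DIR":/home/appuser/logs my-app
echo "prepared $LOG_DIR for UID $APP_UID, GID $APP_GID"
```

The container's user can now write logs, and nobody else on the host can read them.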
2. Immutable Infrastructure: Read-Only Root Filesystems
If an attacker manages to inject a shell into your web application, their first move is usually to download a payload (a crypto miner or a reverse shell script) and execute it. If your filesystem is read-only, that wget fails the instant it tries to write the payload to disk. They simply can't stage malware on your container.
In Docker, this is a simple flag:
```shell
docker run --read-only -v /tmp_volume:/tmp my-app
```
In Kubernetes (v1.27), we define this in the security context. This creates a massive headache for attackers trying to establish persistence.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-nginx
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.24-alpine
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 101
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
            add:
            - NET_BIND_SERVICE
        volumeMounts:
        - mountPath: /var/cache/nginx
          name: cache-volume
        - mountPath: /var/run
          name: run-volume
      volumes:
      - name: cache-volume
        emptyDir: {}
      - name: run-volume
        emptyDir: {}
```
Notice the capabilities section? We dropped ALL and only added back NET_BIND_SERVICE. Most web apps don't need SYS_ADMIN or NET_RAW. Strip them.
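You can verify what actually survived the drop by reading the kernel's capability bitmask. Shown here against the current shell for illustration; inside the pod it is the same file read for PID 1:

```shell
# Effective capabilities of the current process, as a hex bitmask.
# Inside the hardened pod, run the equivalent:
#   kubectl exec deploy/secure-nginx -- grep CapEff /proc/1/status
cap_mask=$(awk '/^CapEff:/ {print $2}' /proc/self/status)
echo "CapEff mask: $cap_mask"
# Decode on any box with libcap tools: capsh --decode="$cap_mask"
```

A fully-dropped container with only NET_BIND_SERVICE added back should decode to exactly that one capability.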
3. Supply Chain Security: Trust Nothing
In 2023, the attack vector isn't always your code; it's the libraries you import. NPM, PyPI, and even Docker Hub images are littered with vulnerabilities. You cannot deploy an image without scanning it first.
I use Trivy in every pipeline. It's fast, comprehensive, and integrates easily. However, scanning large images is I/O intensive. If your build server is running on spinning rust (HDD) or cheap, throttled cloud storage, your CI/CD pipeline will crawl.
This is where infrastructure choice matters. On CoolVDS, we use enterprise-grade NVMe storage. When I run a Trivy scan against a 2GB image on a CoolVDS instance, it finishes in seconds because the disk throughput isn't being strangled. On budget VPS providers, I've seen this timeout.
Implementing the Scan
Add this step to your CI pipeline before pushing to your registry:
```shell
trivy image --severity HIGH,CRITICAL --exit-code 1 my-app:latest
```
If this command returns a non-zero exit code, the build fails. No vulnerabilities make it to production.
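Wired into a pipeline, that gate is a single job. A hypothetical GitLab CI version (job name, stage, and image reference are illustrative; adapt them to your own pipeline):

```yaml
# Hypothetical GitLab CI job -- adjust stage and image tag to your pipeline
container_scan:
  stage: test
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]
  script:
    - trivy image --severity HIGH,CRITICAL --exit-code 1 "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
```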
4. Network Policies: The Firewall Inside the Cluster
By default, in Kubernetes, every pod can talk to every other pod. If your frontend is compromised, the attacker has a direct line to your database pod. This flat network topology is a disaster waiting to happen.
You need a NetworkPolicy that denies all traffic by default, then whitelists only what is necessary. It is the "Zero Trust" model applied to pod-to-pod communication.
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
```
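One gotcha: the default-deny above blocks Egress too, so the frontend also needs a matching egress rule (plus DNS access to kube-dns) before it can initiate anything. Once that is in place, prove the whitelist with a smoke test — the deployment names here are hypothetical, substitute your own:

```shell
# Should succeed: frontend is whitelisted on TCP 8080
kubectl exec -n production deploy/frontend -- wget -qO- -T 3 http://backend:8080/
# Should time out: no policy allows this path
kubectl exec -n production deploy/worker -- wget -qO- -T 3 http://backend:8080/
```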
5. The Norway Factor: Latency and Legality
Technical security controls are useless if you fail legal compliance. Under Schrems II and GDPR, transferring data outside the EEA is fraught with risk. Many US-owned cloud providers claim compliance, but the CLOUD Act complicates things.
Hosting your container infrastructure on a Norwegian provider like CoolVDS simplifies this matrix. Your data stays in Oslo. You aren't routing traffic through Frankfurt or London unless you want to.
Furthermore, latency matters. If you are serving the Nordic market, why round-trip your packets to the US?
Performance Comparison: Image Pull Times
| Metric | Standard HDD VPS | CoolVDS (NVMe) |
|---|---|---|
| Docker Pull (500MB) | 12.4s | 2.1s |
| Trivy Scan Duration | 45s | 8s |
| Database IOPS | ~400 | ~15,000+ |
Conclusion
Container security is an ongoing war. The vulnerabilities change every week, but the principles of least privilege, isolation, and immutable infrastructure remain constant.
Don't let your infrastructure be the weak link. You can write the best NetworkPolicy in the world, but if your underlying host is unstable or legally compromised, it won't matter. For my critical workloads, I need raw NVMe performance for fast builds and the legal safety of Norwegian data residency. That's why I deploy on CoolVDS.
Stop guessing. Secure your stack. Spin up a hardened CoolVDS instance today and see the I/O difference yourself.