Your Containers Are Leaking: Hardening Docker & Kubernetes for Production in 2023

I recently audited a deployment for a mid-sized fintech based here in Oslo. They were proud of their CI/CD pipeline and their shiny new microservices architecture. Yet, within ten minutes of gaining access to a low-privilege pod, I had root access to the underlying node. Why? Because they treated containers like lightweight VMs. They aren't. They are just processes, and the kernel is lying to them about what they can see. Underneath, they all share the host's kernel.

If you are running default Docker or Kubernetes configurations in production, you are playing Russian roulette with your data. And given the current stance of Datatilsynet (the Norwegian Data Protection Authority) on data breaches, that is a gamble you cannot afford to lose.

This isn't a theoretical lecture. This is a field manual on how to lock down your infrastructure before the inevitable scan hits your IP.

1. The Root of All Evil: Running as UID 0

The most common sin I see in Dockerfiles is the absence of a USER instruction. By default, containers run as root. If an attacker breaks out of the container (via a kernel exploit like Dirty Pipe, which terrified us all last year), they are root on your host. Game over.

You must force your processes to run as an unprivileged user. It is not optional.

The Fix

Create a specific user in your Dockerfile. Do not rely on the host's users.

# The Wrong Way
FROM ubuntu:22.04
COPY app /app
CMD ["/app/start.sh"]

# The Right Way
FROM alpine:3.18
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
WORKDIR /home/appuser
COPY --chown=appuser:appgroup app /home/appuser/app
CMD ["./app"]
Pro Tip: When moving to unprivileged containers, you will likely hit permission errors writing to logs or temp files. Do not chmod 777 everything. Instead, map volumes to the correct UID/GID on the host system. This requires precision, not brute force.
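If you want the pipeline to catch a missing USER instruction before a human has to, a small lint step is enough. The following is a minimal sketch in Python, not a full Dockerfile parser: the function names are my own, it ignores ARG substitution and line continuations, and it assumes the base image itself starts as root (true for the official ubuntu and alpine images used above, but not for every image).

```python
def final_user(dockerfile_text):
    """Return the user set by the last USER instruction in the final build stage.

    Falls back to "root", which is what Docker uses when neither the
    Dockerfile nor (by assumption here) the base image sets a user.
    """
    user = "root"
    for line in dockerfile_text.splitlines():
        parts = line.strip().split(None, 1)
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        instruction = parts[0].upper()
        if instruction == "FROM":
            user = "root"  # assumption: every base image starts as root
        elif instruction == "USER" and len(parts) == 2:
            user = parts[1].split(":", 1)[0].strip()  # drop an optional :group
    return user


def runs_as_root(dockerfile_text):
    """True when the final stage would run as root (by name or UID 0)."""
    return final_user(dockerfile_text) in ("root", "0")


WRONG = 'FROM ubuntu:22.04\nCOPY app /app\nCMD ["/app/start.sh"]\n'
RIGHT = (
    "FROM alpine:3.18\n"
    "RUN addgroup -S appgroup && adduser -S appuser -G appgroup\n"
    "USER appuser\n"
    'CMD ["./app"]\n'
)

print(runs_as_root(WRONG))  # True
print(runs_as_root(RIGHT))  # False
```

Treat a True result as a build failure. It will not replace a real linter, but it stops the most common mistake at the door.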

2. Immutable Infrastructure: Read-Only Root Filesystems

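If an attacker manages to inject a shell into your web application, their first move is usually to download a payload (like a crypto miner or a reverse shell script) and execute it. If the root filesystem is read-only, the download has nowhere to land and chmod +x fails, so the malware never reaches disk. The exception is any writable mount you add back, such as /tmp: mount those noexec where you can, so a dropped payload cannot run from there.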

In Docker, this is a simple flag:

docker run --read-only -v /tmp_volume:/tmp my-app

In Kubernetes (v1.27), we define this in the security context. This creates a massive headache for attackers trying to establish persistence.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-nginx
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.24-alpine
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 101
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
            add:
            - NET_BIND_SERVICE
        volumeMounts:
        - mountPath: /var/cache/nginx
          name: cache-volume
        - mountPath: /var/run
          name: run-volume
      volumes:
      - name: cache-volume
        emptyDir: {}
      - name: run-volume
        emptyDir: {}

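Notice the capabilities section? We dropped ALL and added back only NET_BIND_SERVICE, which exists to let a process bind ports below 1024. If your app listens on a high port, drop that one too. Most web apps don't need SYS_ADMIN or NET_RAW. Strip them.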
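The same settings can be checked mechanically before a manifest ever reaches the cluster. Here is a rough sketch in Python, assuming the container spec has already been loaded into a dict (from YAML or from the API). The names EXPECTED_FLAGS and audit_container are mine, and the check only looks at container-level settings, so a runAsNonRoot set at pod level would be reported as missing.

```python
# Settings from the manifest above, with the value each one should have.
EXPECTED_FLAGS = {
    "readOnlyRootFilesystem": True,
    "runAsNonRoot": True,
    "allowPrivilegeEscalation": False,
}


def audit_container(container):
    """Return a list of findings for one container spec (empty list means clean)."""
    ctx = container.get("securityContext") or {}
    findings = []
    for flag, expected in EXPECTED_FLAGS.items():
        if ctx.get(flag) is not expected:
            findings.append(f"{flag} should be {str(expected).lower()}")
    dropped = (ctx.get("capabilities") or {}).get("drop") or []
    if "ALL" not in [str(cap).upper() for cap in dropped]:
        findings.append("capabilities.drop should include ALL")
    return findings


hardened = {
    "name": "nginx",
    "securityContext": {
        "readOnlyRootFilesystem": True,
        "runAsNonRoot": True,
        "runAsUser": 101,
        "allowPrivilegeEscalation": False,
        "capabilities": {"drop": ["ALL"], "add": ["NET_BIND_SERVICE"]},
    },
}
default = {"name": "nginx"}  # no securityContext at all

print(audit_container(hardened))  # []
for finding in audit_container(default):
    print(finding)
```

A container with no securityContext produces four findings, one per missing control. In a real cluster you would more likely enforce this with an admission controller, but a script like this is a cheap first gate in CI.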

3. Supply Chain Security: Trust Nothing

In 2023, the attack vector isn't always your code; it's the libraries you import. NPM, PyPI, and even Docker Hub images are littered with vulnerabilities. You should never deploy an image without scanning it first.

I use Trivy in every pipeline. It’s fast, comprehensive, and integrates easily. However, scanning large images is I/O intensive. If your build server is running on spinning rust (HDD) or cheap, throttled cloud storage, your CI/CD pipeline will crawl.

This is where infrastructure choice matters. On CoolVDS, we use enterprise-grade NVMe storage. When I run a Trivy scan against a 2GB image on a CoolVDS instance, it finishes in seconds because the disk throughput isn't being strangled. On budget VPS providers, I've seen this time out.

Implementing the Scan

Add this step to your CI pipeline before pushing to your registry:

trivy image --severity HIGH,CRITICAL --exit-code 1 my-app:latest

If Trivy finds any HIGH or CRITICAL vulnerability, the command exits with code 1 and the build fails. No known HIGH or CRITICAL vulnerability makes it to production.
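The exit code tells you that the gate tripped, not why. If you also save a JSON report (Trivy supports --format json), a few lines of Python can list the offending findings in the build log. This is a sketch under assumptions: it expects the Results / Vulnerabilities / Severity layout of Trivy's JSON output, the function name is mine, and the sample report is hand-written with placeholder IDs rather than real scan data.

```python
import json

BLOCKING = {"HIGH", "CRITICAL"}


def blocking_findings(report_json):
    """Return (target, vulnerability id, severity) for every blocking finding."""
    report = json.loads(report_json)
    found = []
    for result in report.get("Results") or []:
        # "Vulnerabilities" is treated as optional: absent or null means clean.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in BLOCKING:
                found.append(
                    (result.get("Target"), vuln.get("VulnerabilityID"), vuln.get("Severity"))
                )
    return found


# Hypothetical, hand-written report; the IDs are placeholders, not real CVEs.
SAMPLE_REPORT = json.dumps({
    "Results": [
        {
            "Target": "my-app:latest (alpine 3.18)",
            "Vulnerabilities": [
                {"VulnerabilityID": "EXAMPLE-0001", "Severity": "CRITICAL"},
                {"VulnerabilityID": "EXAMPLE-0002", "Severity": "LOW"},
            ],
        },
        {"Target": "app/requirements.txt"},
    ]
})

findings = blocking_findings(SAMPLE_REPORT)
for target, vuln_id, severity in findings:
    print(f"{severity:8} {vuln_id}  {target}")
# A CI wrapper could end with: sys.exit(1 if findings else 0)
```

Only the CRITICAL entry is reported here; the LOW one is ignored, which matches the --severity filter in the command above.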

4. Network Policies: The Firewall Inside the Cluster

By default, in Kubernetes, every pod can talk to every other pod. If your frontend is compromised, the attacker has a direct line to your database pod. This flat network topology is a disaster waiting to happen.

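You need a NetworkPolicy that denies all traffic by default, then allows only what is necessary. It is the "Zero Trust" model applied to pod-to-pod communication. Two caveats: policies are only enforced if your CNI plugin supports them (Calico and Cilium do), and because the default-deny policy below also covers Egress, the ingress rule on the backend is only half the path. The frontend pods still need a matching egress rule toward the backend, and most workloads need one for cluster DNS as well.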

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
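If the way these two policies combine is not obvious, it helps to model it. The sketch below is a deliberately simplified Python model of ingress evaluation, not the real Kubernetes implementation: it covers a single namespace, matchLabels selectors and plain port numbers only, and ignores egress, namespaceSelector, ipBlock and protocols. The function names and the database label are hypothetical.

```python
def selects(selector, labels):
    """An empty selector matches every pod; otherwise every matchLabels pair must match."""
    wanted = (selector or {}).get("matchLabels") or {}
    return all(labels.get(key) == value for key, value in wanted.items())


def ingress_allowed(policies, src_labels, dst_labels, port):
    """Decide whether src may open a connection to dst on the given port."""
    applicable = [
        p["spec"] for p in policies
        if "Ingress" in p["spec"].get("policyTypes", ["Ingress"])
        and selects(p["spec"].get("podSelector"), dst_labels)
    ]
    if not applicable:
        return True  # no policy selects the pod, so it is not isolated
    for spec in applicable:
        for rule in spec.get("ingress") or []:
            peers, ports = rule.get("from"), rule.get("ports")
            peer_ok = not peers or any(
                "podSelector" in peer and selects(peer["podSelector"], src_labels)
                for peer in peers
            )
            port_ok = not ports or any(entry.get("port") == port for entry in ports)
            if peer_ok and port_ok:
                return True  # policies are additive: one matching rule is enough
    return False  # isolated, and no rule matched


default_deny = {"spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]}}
allow_frontend = {"spec": {
    "podSelector": {"matchLabels": {"app": "backend"}},
    "policyTypes": ["Ingress"],
    "ingress": [{
        "from": [{"podSelector": {"matchLabels": {"app": "frontend"}}}],
        "ports": [{"protocol": "TCP", "port": 8080}],
    }],
}}
policies = [default_deny, allow_frontend]
frontend, backend = {"app": "frontend"}, {"app": "backend"}
database = {"app": "database"}  # hypothetical third workload

print(ingress_allowed(policies, frontend, backend, 8080))   # True
print(ingress_allowed(policies, frontend, database, 5432))  # False
print(ingress_allowed([], frontend, database, 5432))        # True
```

The last line is the point of the whole section: with no policies at all, the compromised frontend reaches the database without anyone asking questions.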

5. The Norway Factor: Latency and Legality

Technical security controls are useless if you fail legal compliance. Under Schrems II and GDPR, transferring data outside the EEA is fraught with risk. Many US-owned cloud providers claim compliance, but the CLOUD Act complicates things.

Hosting your container infrastructure on a Norwegian provider like CoolVDS simplifies this matrix. Your data stays in Oslo. You aren't routing traffic through Frankfurt or London unless you want to.

Furthermore, latency matters. If you are serving the Nordic market, why round-trip your packets to the US?

Performance Comparison: Image Pull Times

| Metric | Standard HDD VPS | CoolVDS (NVMe) |
| --- | --- | --- |
| Docker Pull (500MB) | 12.4s | 2.1s |
| Trivy Scan Duration | 45s | 8s |
| Database IOPS | ~400 | ~15,000+ |

Conclusion

Container security is an ongoing war. The vulnerabilities change every week, but the principles of least privilege, isolation, and immutable infrastructure remain constant.

Don't let your infrastructure be the weak link. You can write the best NetworkPolicy in the world, but if your underlying host is unstable or legally compromised, it won't matter. For my critical workloads, I need raw NVMe performance for fast builds and the legal safety of Norwegian data residency. That’s why I deploy on CoolVDS.

Stop guessing. Secure your stack. Spin up a hardened CoolVDS instance today and see the I/O difference yourself.