The Container "Isolation" Lie

Let's rip the band-aid off: Containers are not real virtualization. They are processes masquerading as isolated units, sharing the same kernel as the host. If you are deploying Docker containers in production with default settings in 2019, you aren't just opening a door for attackers; you are holding it open and serving them coffee.

I recently audited a setup for a client in Oslo, a fintech startup moving from bare metal to Kubernetes. They were proud of their CI/CD velocity. Then I looked at their definitions. Every single pod was running as root. They had mounted the host's Docker socket, /var/run/docker.sock, into their pods to "make building easier." I demonstrated how a simple shell injection in their Node.js app could let me wipe their entire cluster, including the persistent volumes storing customer data. The silence in the room was deafening.

Speed means nothing if your infrastructure is compromised. In Norway, where the Datatilsynet (Data Protection Authority) does not mess around with GDPR breaches, security is not optional. It is survival.

1. The Root Problem (Literally)

By default, the main process inside a Docker container runs as PID 1 with UID 0 (root). If an attacker takes over the application, they are root inside your container. If they then exploit a kernel vulnerability (like Dirty COW, CVE-2016-5195), they are root on the host node.

You must enforce non-root execution at the image build level. Do not rely on runtime flags alone.

Correct Dockerfile Pattern

FROM alpine:3.9

# Create a group and user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup

# Install dependencies
RUN apk add --no-cache python3

# Switch to the unprivileged user for all later instructions and at runtime
USER appuser

WORKDIR /home/appuser
COPY . .

CMD ["python3", "app.py"]

This is basic, yet 80% of the images I see on Docker Hub ignore it. When you run this, even if the application is compromised, the attacker finds themselves trapped as appuser with limited permissions.
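
One way to keep that pattern from regressing is a small check in CI that reads the Dockerfile and reports which user the final stage ends up with. The following is a rough sketch, not a full Dockerfile parser: `dockerfile_user` and `runs_as_root` are hypothetical helper names, and the script assumes every base image defaults to root, so it treats each `FROM` as a reset.

```shell
#!/bin/sh
# Sketch: report the user a Dockerfile's final stage will run as.
# Assumption: a base image never sets its own non-root USER.

dockerfile_user() {
  # Remember the last USER directive; a new FROM starts a fresh stage
  # and resets the user to the default (root).
  awk '
    toupper($1) == "FROM" { user = "" }
    toupper($1) == "USER" { user = $2 }
    END { print (user == "" ? "root" : user) }
  ' "$1"
}

runs_as_root() {
  # Succeeds (exit 0) when the final user is root by name or by UID 0.
  case "$(dockerfile_user "$1")" in
    root|0|root:*|0:*) return 0 ;;
    *) return 1 ;;
  esac
}
```

In a pipeline this could be called as `if runs_as_root Dockerfile; then echo "image runs as root" >&2; exit 1; fi` before the build step.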

2. Immutable Infrastructure & Read-Only Filesystems

Containers should be ephemeral. If you are patching a running container, you are doing it wrong. Rebuild the image. To enforce this, mount the root filesystem as read-only. This prevents attackers from downloading malicious scripts or modifying binaries.

Here is how you enforce that via `docker run`:

docker run --read-only \
  --tmpfs /run \
  --tmpfs /tmp \
  -v /my/data:/data:rw \
  my-secure-image

We use `tmpfs` for temporary directories because the application might crash if it can't write to `/tmp`, but the rest of the OS remains frozen.
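
To confirm the flag actually took effect, you can probe for writability from inside the container. This is a minimal sketch: `is_writable_dir` is a hypothetical helper, not a Docker feature, and it simply tries to create and then remove a probe file.

```shell
#!/bin/sh
# Sketch: succeed if a file can be created in the given directory.

is_writable_dir() {
  probe="$1/.rw-probe.$$"
  # The subshell keeps a failed redirection from aborting the caller.
  if ( : > "$probe" ) 2>/dev/null; then
    rm -f "$probe"
    return 0
  fi
  return 1
}
```

Inside a container started with the flags above, `is_writable_dir /` should fail while `is_writable_dir /tmp` should succeed. `/data` should also be writable, assuming the container user has permission on the mounted directory.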

3. Kubernetes PodSecurityPolicies (PSP)

If you are orchestrating with Kubernetes (and by now, in mid-2019, most serious shops are moving to v1.14+), you cannot trust developers to write secure YAMLs. You need to enforce it at the cluster level.

PodSecurityPolicies are currently the gold standard for admission control. They prevent pods from starting if they violate your security profile.

Pro Tip: Do not just apply a restrictive PSP blindly. You will break your CNI plugins (like Calico or Flannel) and system controllers. Create a specific PSP for business logic workloads.

Here is a strict PSP that denies privilege escalation:

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-psp
spec:
  privileged: false
  # Stop processes from gaining more privileges than their parent (e.g. via setuid binaries)
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
    - ALL
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'persistentVolumeClaim'
    - 'secret'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'MustRunAs'
    ranges:
      - min: 1
        max: 65535
  fsGroup:
    rule: 'MustRunAs'
    ranges:
      - min: 1
        max: 65535
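
Creating the policy is only half the job. The PodSecurityPolicy admission controller must be enabled, and a policy applies only to pods whose creating user or service account is authorized to `use` it through RBAC. The sketch below wires that up for a single namespace. The `authorize_psp` function name and the namespace and service account passed to it are hypothetical placeholders, and the `KUBECTL` variable exists only so the commands can be previewed with `KUBECTL=echo`.

```shell
#!/bin/sh
# Sketch: grant one service account permission to use one PSP.
KUBECTL="${KUBECTL:-kubectl}"

authorize_psp() {
  psp="$1"; ns="$2"; sa="$3"
  # ClusterRole that allows the "use" verb on this single policy
  $KUBECTL create clusterrole "use-$psp" \
    --verb=use --resource=podsecuritypolicy --resource-name="$psp" &&
  # Bind it inside one namespace only, so system workloads are unaffected
  $KUBECTL create rolebinding "use-$psp" --namespace="$ns" \
    --clusterrole="use-$psp" --serviceaccount="$ns:$sa" &&
  # Ask the API server whether the grant is effective
  $KUBECTL auth can-i use "podsecuritypolicy/$psp" \
    --namespace="$ns" --as="system:serviceaccount:$ns:$sa"
}
```

For example, `authorize_psp restricted-psp payments default` would bind the policy above to the `default` service account in a namespace called `payments`. Scoping the binding to a namespace is what keeps the restrictive policy away from CNI plugins and system controllers.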

4. Limiting Kernel Capabilities

Linux capabilities break down the "all-or-nothing" power of root into smaller privileges. A web server does not need `NET_ADMIN` (network configuration) or `SYS_MODULE` (loading kernel modules). Drop everything and add back only what is strictly necessary.

In your docker-compose service definition (the `docker run` equivalents are `--cap-drop=ALL --cap-add=NET_BIND_SERVICE`):

cap_drop:
  - ALL
cap_add:
  - NET_BIND_SERVICE

This ensures that even if an escalation occurs, the attacker cannot modify network tables or load a rootkit.
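
You can verify the result from inside the container by decoding the capability mask the kernel reports in `/proc/self/status`. The sketch below is a hedged helper, not an official tool: `current_cap_bnd` and `has_cap` are hypothetical names, and the bit numbers are the ones defined in the kernel's `linux/capability.h`.

```shell
#!/bin/sh
# Sketch: test individual capability bits in a hex mask.

# Bit numbers from <linux/capability.h>
CAP_NET_BIND_SERVICE=10
CAP_NET_ADMIN=12
CAP_SYS_MODULE=16

# Bounding-set mask of the current process (Linux only).
current_cap_bnd() {
  awk '/^CapBnd:/ { print $2 }' /proc/self/status
}

# has_cap MASK BIT: succeeds if capability bit BIT is set in hex MASK.
has_cap() {
  [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}
```

With everything dropped and only `NET_BIND_SERVICE` added back, the bounding set is expected to be `0000000000000400` (bit 10 only), so `has_cap "$(current_cap_bnd)" "$CAP_SYS_MODULE"` should fail.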

5. The Infrastructure Layer: Why CoolVDS Matters

Software limits like cgroups and namespaces are robust, but bugs happen. Hardware isolation is the final backstop. This is where your choice of hosting becomes a security architecture decision.

Many budget providers use container-based virtualization (like OpenVZ/LXC) for their VPS offerings. This is dangerous for high-security workloads because you are sharing the kernel with the provider's other customers. A kernel panic in their container affects you. A kernel exploit in your neighbor's instance could theoretically expose your memory.

| Feature | Container VPS (OpenVZ) | CoolVDS (KVM) |
| --- | --- | --- |
| Kernel Isolation | Shared | Dedicated |
| Memory Privacy | Software Restricted | Hardware Virtualized |
| Custom Kernel | No | Yes (Install SELinux/Grsecurity) |
| Disk I/O | Often Shared/Noisy | Dedicated NVMe |

At CoolVDS, we exclusively use KVM (Kernel-based Virtual Machine) virtualization. Each VPS Norway instance runs its own independent kernel. Even if your container runtime is compromised, the attacker is trapped inside a VM sandbox, not on the bare metal host. Combined with our local NVMe storage, you get the I/O throughput needed for database-heavy microservices without the "noisy neighbor" security risks.
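
If you are unsure what a given VPS is actually built on, `systemd-detect-virt` prints the detected technology on systemd-based distributions. The sketch below sorts its output into the two categories discussed here. The `isolation_class` helper and its category labels are hypothetical, and the list of values is a partial selection rather than everything the tool can report.

```shell
#!/bin/sh
# Sketch: classify a systemd-detect-virt result by isolation type.

isolation_class() {
  case "$1" in
    kvm|qemu|vmware|microsoft|oracle|xen)
      echo "hardware-vm" ;;      # guest runs its own kernel
    openvz|lxc|lxc-libvirt|systemd-nspawn|docker)
      echo "shared-kernel" ;;    # guest shares the host kernel
    none)
      echo "bare-metal" ;;
    *)
      echo "unknown" ;;
  esac
}
```

On a cluster node you would run something like `isolation_class "$(systemd-detect-virt --vm)"`. The `--vm` flag restricts detection to hardware virtualization, so a container engine on the node does not mask the answer.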

Network & Latency Considerations

Security is also about availability (the 'A' in the CIA triad). DDoS attacks are rampant in Europe right now. Running your cluster on a provider with weak upstream connectivity is a risk. We peer directly at NIX (Norwegian Internet Exchange), so local traffic stays local: low latency for your Oslo users, and an easier story when data residency requirements apply.

6. Continuous Scanning

Finally, static analysis is mandatory. You cannot deploy black boxes. Use tools like Clair or Anchore Engine to scan your images for CVEs before they hit the registry. Integrate this into your Jenkins or GitLab CI pipelines.

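# Example Anchore CLI check
anchore-cli image add myapp:latest
anchore-cli image wait myapp:latest
# List every known vulnerability (OS and non-OS packages)
anchore-cli image vuln myapp:latest all
# Evaluate against the active policy; a failing result exits non-zero and stops the pipeline
anchore-cli evaluate check myapp:latest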

If the scan returns High severity vulnerabilities (like the recent runc vulnerability CVE-2019-5736), the build fails. No exceptions.
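
If your scanner only produces a text report, the gate itself can be a few lines of shell. This sketch assumes a plain-text report with one finding per line and the severity written as a standalone word (`High` or `Critical`). That layout is an assumption for illustration, not the documented output format of any particular scanner, and `has_blocking_vulns` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) if the report on stdin contains a blocking finding.

has_blocking_vulns() {
  # -w matches whole words only, so "Highway" does not count as "High"
  grep -Eqw 'High|Critical'
}
```

A CI step could then pipe the report in and abort, for example `if anchore-cli image vuln myapp:latest all | has_blocking_vulns; then exit 1; fi`.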

Summary

Container security in 2019 requires a defense-in-depth strategy:

Build Secure: Non-root users, minimal base images (Alpine).
Run Secure: Read-only filesystems, dropped capabilities, PSPs.
Host Secure: Strong isolation via KVM on CoolVDS.

Don't let a misconfigured YAML file be the reason you have to explain a data breach to the Datatilsynet. Secure your infrastructure from the bottom up.

Need a hardened environment for your Kubernetes cluster? Deploy a KVM-based, NVMe-powered instance on CoolVDS today and sleep better tonight.



