Serverless on Metal: Implementing FaaS Patterns Without the Hyperscaler Tax
Let’s be honest for a moment. The promise of "Serverless"—infinite scaling, zero management, and paying only for what you use—often collapses under the weight of reality. For a startup in Silicon Valley, AWS Lambda might be the default. But for a CTO in Oslo dealing with Datatilsynet audits, strict GDPR requirements following Schrems II, and the latency penalty of routing traffic through Frankfurt or Dublin, the calculation changes.
I recently audited a Norwegian fintech setup relying heavily on Azure Functions. They were bleeding money on execution time for tasks that were essentially idle waits, and their cold start times for users in Trondheim were averaging 400ms. Unacceptable.
Serverless is an architectural pattern, not a vendor product. You don't need a hyperscaler to build event-driven systems. In fact, running a framework like OpenFaaS or Knative on high-performance KVM instances (like those we provision at CoolVDS) often yields better price-to-performance ratios and keeps your data strictly under Norwegian jurisdiction.
The Architecture: Queue-Based Load Leveling
One of the most robust serverless patterns is Queue-Based Load Leveling. Instead of your web server handling heavy processing (PDF generation, image resizing) synchronously, it pushes a job to a queue. A function then picks it up.
On a managed cloud, this gets expensive quickly. On a self-hosted VPS, it's efficient. Here is how we implement this using NATS and OpenFaaS on a standard Linux node.
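Here is a minimal sketch of the pattern using the nats-py client. The subject name, payload format, and a local NATS server on the default port are my assumptions; note that core NATS is fire-and-forget, so for durable queues in production you would reach for JetStream.

```python
import asyncio
import nats  # pip install nats-py


async def main():
    nc = await nats.connect("nats://127.0.0.1:4222")  # assumed local NATS server

    # Worker side: subscribe and process jobs at a controlled pace,
    # leveling the load instead of absorbing it synchronously.
    async def worker(msg):
        print(f"processing {msg.data.decode()} from {msg.subject}")

    # The "workers" queue group makes NATS load-balance across subscribers.
    await nc.subscribe("jobs.pdf", queue="workers", cb=worker)

    # Web tier side: publish the job and return to the client immediately.
    await nc.publish("jobs.pdf", b'{"invoice_id": 42}')

    await asyncio.sleep(1)  # let the demo worker fire before draining
    await nc.drain()


asyncio.run(main())
```

The web process never blocks on the heavy work; it hands the job off in microseconds and the worker pool drains the queue as fast as the CPU allows.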
1. The Infrastructure Layer
We avoid containers-on-containers limitations by using KVM virtualization. Docker needs real kernel access for namespaces and cgroups; if your VPS provider uses OpenVZ, you will hit a wall with cgroups. At CoolVDS, we strictly use KVM to ensure your Docker daemon has the isolation it needs.
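Not sure what your current provider runs? One quick check:

```bash
systemd-detect-virt   # prints "kvm" on a KVM guest, "openvz" on OpenVZ
```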
2. The Function Definition
Let's look at a pragmatic stack.yml for OpenFaaS. This defines a function that handles image processing, a common high-CPU task that kills shared hosting environments.
version: 1.0
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080

functions:
  img-processor:
    lang: python3-http
    handler: ./img-processor
    image: registry.coolvds.com/img-processor:latest
    labels:
      com.openfaas.scale.min: "1"
      com.openfaas.scale.max: "15"
    environment:
      write_debug: true
      read_timeout: 10s
      write_timeout: 10s
Note the com.openfaas.scale.max label. We cap this to avoid inflicting the noisy neighbor effect on ourselves. If you are on a CoolVDS NVMe 4GB instance, you know exactly how many concurrent Python processes your CPU can handle before context switching degrades performance. You control the hardware constraints, not an opaque algorithm.
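For reference, the python3-http template expects a handler module exposing handle(event, context). A minimal sketch of the image processor might look like this; Pillow and the thumbnail logic are illustrative assumptions, not part of the stack file above:

```python
# img-processor/handler.py
import io

from PIL import Image  # pip install Pillow (add to the function's requirements.txt)


def handle(event, context):
    # event.body carries the raw uploaded image; shrink it to a thumbnail.
    img = Image.open(io.BytesIO(event.body))
    img.thumbnail((512, 512))

    out = io.BytesIO()
    img.save(out, format="JPEG")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "image/jpeg"},
        "body": out.getvalue(),
    }
```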
Handling the "Cold Start" on Self-Hosted Hardware
The biggest enemy of serverless is the cold start—the time it takes to spin up a container to handle a request. In a hyperscaler environment, you are at the mercy of their placement algorithms. On your own VPS, you can optimize this aggressively.
The bottleneck is almost always Disk I/O. Pulling the image and extracting layers onto the overlay filesystem hits the disk hard. This is where the hardware difference becomes undeniable. Spinning rust (HDD) or even standard SATA SSDs will introduce 200-500ms of lag here.
Pro Tip: Using NVMe storage changes the physics of cold starts. On our internal benchmarks, an OpenFaaS function cold start dropped from 1.2s on standard SSD to 0.3s on NVMe. If you are building low-latency APIs, storage speed is your new CPU speed.
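You can measure this on your own node with a crude probe. A sketch: it assumes the gateway on 127.0.0.1:8080, a deployed function named img-processor, and that the function has just been deployed or temporarily has com.openfaas.scale.min set to 0 so the first call is actually cold.

```python
import time

import requests  # pip install requests

URL = "http://127.0.0.1:8080/function/img-processor"  # assumed gateway + function

for i in range(5):
    t0 = time.perf_counter()
    requests.post(URL, data=b"ping", timeout=60).raise_for_status()
    print(f"call {i}: {(time.perf_counter() - t0) * 1000:.0f} ms")

# The first call pays for image extraction and process boot; the rest should
# be an order of magnitude faster. Run it on SATA SSD and NVMe and compare.
```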
Optimizing the Container Runtime
Don't just use the default Docker settings. For a high-density FaaS setup, you need to tune the daemon to handle rapid container churn.
{
  "storage-driver": "overlay2",
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  },
  "default-ulimits": {
    "nofile": {
      "Name": "nofile",
      "Hard": 64000,
      "Soft": 64000
    }
  }
}
Place this in /etc/docker/daemon.json and restart the daemon (systemctl restart docker). The ulimits are critical: a serverless architecture spawns hundreds of sockets and file descriptors, and the default Linux limit of 1024 is a recipe for a "Too many open files" crash during a traffic spike.
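To confirm the limits actually propagated into your function containers, Python's standard library can report them from inside the process; run this in any container on the node (the resource module is POSIX-only):

```python
# Run inside a function container to confirm the daemon's ulimits applied.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-file limit: soft={soft}, hard={hard}")  # expect 64000/64000
```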
The Gateway Pattern: Nginx as the Guard Dog
You should never expose the function gateway (like OpenFaaS or Kubeless) directly to the internet. You need a reverse proxy to handle SSL termination, rate limiting, and timeout handling. Nginx is the industry standard here.
A common mistake I see is leaving the default timeouts in place. Functions are often long-running. If your PDF generator takes 90 seconds, Nginx will kill the connection at 60 seconds (the default proxy_read_timeout), leaving your client with a 504 Gateway Timeout while the server burns CPU finishing the task.
server {
    listen 443 ssl http2;
    server_name functions.your-domain.no;

    # SSL config omitted for brevity

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;

        # Critical for FaaS
        proxy_read_timeout 300s;
        proxy_connect_timeout 300s;
        proxy_send_timeout 300s;

        # Buffer settings for large payloads
        client_max_body_size 50M;
        proxy_buffers 8 16k;
        proxy_buffer_size 32k;
    }
}
Data Sovereignty and Latency
For Norwegian businesses, the US CLOUD Act is a looming shadow. Storing temporary processing data in US-owned data centers (even those located in Europe) creates a compliance grey area. By hosting your FaaS infrastructure on a Norwegian provider like CoolVDS, you ensure data residency.
Furthermore, physics is undefeated.
- Oslo to Frankfurt: ~25-30ms
- Oslo to Oslo (NIX): ~1-3ms
If your architecture involves chaining functions (Function A calls Function B), that latency compounds: a chain of five calls over a ~27ms path burns roughly 135ms on the wire alone, versus under 15ms locally. Running everything inside a single high-performance VPS or a private cluster in Oslo eliminates those network hops entirely.
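To make the compounding concrete, here is a hypothetical sketch of a synchronous chain through the OpenFaaS gateway; the function names and gateway address are assumptions. Every invoke() call pays one full round trip, which is microseconds on localhost and tens of milliseconds across borders:

```python
import time

import requests  # pip install requests

GATEWAY = "http://127.0.0.1:8080"  # assumed local OpenFaaS gateway


def invoke(fn: str, payload: bytes) -> bytes:
    """Synchronously invoke a function through the gateway."""
    resp = requests.post(f"{GATEWAY}/function/{fn}", data=payload, timeout=300)
    resp.raise_for_status()
    return resp.content


t0 = time.perf_counter()
resized = invoke("img-resize", open("photo.jpg", "rb").read())  # hop 1
final = invoke("img-watermark", resized)                        # hop 2
print(f"chain took {(time.perf_counter() - t0) * 1000:.1f} ms for 2 hops")
```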
Implementation Strategy
If you are ready to move away from unpredictable monthly cloud bills, here is your roadmap:
- Provision the Metal: Start with a CoolVDS NVMe instance. I recommend at least 4 vCPUs if you plan to run Kubernetes/K3s. For lighter loads, standard Docker Swarm works.
- Install the Platform: arkade is a great tool for installing OpenFaaS quickly on Kubernetes.
- Secure the Edge: Configure ufw to block all incoming traffic except port 443 and your SSH port.
- Test the Limits: Use hey or Apache Bench to flood your functions. Watch htop. If you see I/O wait (wa) spiking, you need faster storage. The exact commands are sketched after this list.
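Condensed into commands, this is roughly the shape of it. A sketch: adjust the SSH port, duration, and target URL to your setup, and note that arkade assumes a working Kubernetes context.

```bash
# 2. Install the platform (arkade fetches and configures the OpenFaaS chart)
arkade install openfaas

# 3. Secure the edge: default-deny, then allow TLS and SSH only
sudo ufw default deny incoming
sudo ufw allow 443/tcp
sudo ufw allow 22/tcp   # or your custom SSH port
sudo ufw enable

# 4. Flood a function for 30s with 50 workers; watch 'wa' in htop meanwhile
hey -z 30s -c 50 -m POST https://functions.your-domain.no/function/img-processor
```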
Serverless isn't magic. It's just someone else's server, or in this case, your own efficiently managed one. By stripping away the managed service markup, you gain control, compliance, and raw performance.
Ready to build a compliant, low-latency FaaS cluster? Deploy a CoolVDS NVMe instance today and stop paying for cold starts.