Serverless on Metal: Implementing FaaS Patterns Without the Hyperscaler Tax
Let’s be honest for a moment. The promise of "Serverless"—infinite scaling, zero management, and paying only for what you use—often collapses under the weight of reality. For a startup in Silicon Valley, AWS Lambda might be the default. But for a CTO in Oslo dealing with Datatilsynet audits, strict GDPR requirements following Schrems II, and the latency penalty of routing traffic through Frankfurt or Dublin, the calculation changes.
I recently audited a Norwegian fintech setup relying heavily on Azure Functions. They were bleeding money on execution time for tasks that were essentially idle waits, and their cold start times for users in Trondheim were averaging 400ms. Unacceptable.
Serverless is an architectural pattern, not a vendor product. You don't need a hyperscaler to build event-driven systems. In fact, running a framework like OpenFaaS or Knative on high-performance KVM instances (like those we provision at CoolVDS) often yields better price-to-performance ratios and keeps your data strictly under Norwegian jurisdiction.
The Architecture: Queue-Based Load Leveling
One of the most robust serverless patterns is Queue-Based Load Leveling. Instead of your web server handling heavy processing (PDF generation, image resizing) synchronously, it pushes a job to a queue. A function then picks it up.
On a managed cloud, this gets expensive quickly. On a self-hosted VPS, it's efficient. Here is how we implement this using NATS and OpenFaaS on a standard Linux node.
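Here is a minimal sketch of the pattern using the nats-py client. The subject name, payload format, and a local NATS server on the default port are my assumptions; note that core NATS is fire-and-forget, so for durable queues in production you would reach for JetStream.

```python
import asyncio
import nats  # pip install nats-py


async def main():
    nc = await nats.connect("nats://127.0.0.1:4222")  # assumed local NATS server

    # Worker side: subscribe and process jobs at a controlled pace,
    # leveling the load instead of absorbing it synchronously.
    async def worker(msg):
        print(f"processing {msg.data.decode()} from {msg.subject}")

    # The "workers" queue group makes NATS load-balance across subscribers.
    await nc.subscribe("jobs.pdf", queue="workers", cb=worker)

    # Web tier side: publish the job and return to the client immediately.
    await nc.publish("jobs.pdf", b'{"invoice_id": 42}')

    await asyncio.sleep(1)  # let the demo worker fire before draining
    await nc.drain()


asyncio.run(main())
```

The web process never blocks on the heavy work; it hands the job off in microseconds and the worker pool drains the queue as fast as the CPU allows.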
1. The Infrastructure Layer
We avoid containers-on-containers limitations by using KVM virtualization. Docker needs real kernel access for namespaces and cgroups; if your VPS provider uses OpenVZ, you will hit a wall with cgroups. At CoolVDS, we strictly use KVM to ensure your Docker daemon has the isolation it needs.
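Not sure what your current provider runs? One quick check:

```bash
systemd-detect-virt   # prints "kvm" on a KVM guest, "openvz" on OpenVZ
```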
2. The Function Definition
Let's look at a pragmatic stack.yml for OpenFaaS. This defines a function that handles image processing, a common high-CPU task that kills shared hosting environments.
version: 1.0
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080

functions:
  img-processor:
    lang: python3-http
    handler: ./img-processor
    image: registry.coolvds.com/img-processor:latest
    labels:
      com.openfaas.scale.min: "1"
      com.openfaas.scale.max: "15"
    environment:
      write_debug: true
      read_timeout: 10s
      write_timeout: 10s
Note the com.openfaas.scale.max label. We cap this to avoid inflicting the noisy neighbor effect on ourselves. If you are on a CoolVDS NVMe 4GB instance, you know exactly how many concurrent Python processes your CPU can handle before context switching degrades performance. You control the hardware constraints, not an opaque algorithm.
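For reference, the python3-http template expects a handler module exposing handle(event, context). A minimal sketch of the image processor might look like this; Pillow and the thumbnail logic are illustrative assumptions, not part of the stack file above:

```python
# img-processor/handler.py
import io

from PIL import Image  # pip install Pillow (add to the function's requirements.txt)


def handle(event, context):
    # event.body carries the raw uploaded image; shrink it to a thumbnail.
    img = Image.open(io.BytesIO(event.body))
    img.thumbnail((512, 512))

    out = io.BytesIO()
    img.save(out, format="JPEG")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "image/jpeg"},
        "body": out.getvalue(),
    }
```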
Handling the "Cold Start" on Self-Hosted Hardware
The biggest enemy of serverless is the cold start—the time it takes to spin up a container to handle a request. In a hyperscaler environment, you are at the mercy of their placement algorithms. On your own VPS, you can optimize this aggressively.
The bottleneck is almost always Disk I/O. Pulling the image and extracting layers onto the overlay filesystem hits the disk hard. This is where the hardware difference becomes undeniable. Spinning rust (HDD) or even standard SATA SSDs will introduce 200-500ms of lag here.
Pro Tip: Using NVMe storage changes the physics of cold starts. On our internal benchmarks, an OpenFaaS function cold start dropped from 1.2s on standard SSD to 0.3s on NVMe. If you are building low-latency APIs, storage speed is your new CPU speed.
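You can measure this on your own node with a crude probe. A sketch: it assumes the gateway on 127.0.0.1:8080, a deployed function named img-processor, and that the function has just been deployed or temporarily has com.openfaas.scale.min set to 0 so the first call is actually cold.

```python
import time

import requests  # pip install requests

URL = "http://127.0.0.1:8080/function/img-processor"  # assumed gateway + function

for i in range(5):
    t0 = time.perf_counter()
    requests.post(URL, data=b"ping", timeout=60).raise_for_status()
    print(f"call {i}: {(time.perf_counter() - t0) * 1000:.0f} ms")

# The first call pays for image extraction and process boot; the rest should
# be an order of magnitude faster. Run it on SATA SSD and NVMe and compare.
```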
Optimizing the Container Runtime
Don't just use the default Docker settings. For a high-density FaaS setup, you need to tune the daemon to handle rapid container churn.
{
  "storage-driver": "overlay2",
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  },
  "default-ulimits": {
    "nofile": {
      "Name": "nofile",
      "Hard": 64000,
      "Soft": 64000
    }
  }
}
Place this in /etc/docker/daemon.json and restart the daemon (systemctl restart docker). The ulimits are critical: a serverless architecture spawns hundreds of sockets and file descriptors, and the default Linux limit of 1024 is a recipe for a "Too many open files" crash during a traffic spike.
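To confirm the limits actually propagated into your function containers, Python's standard library can report them from inside the process; run this in any container on the node (the resource module is POSIX-only):

```python
# Run inside a function container to confirm the daemon's ulimits applied.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-file limit: soft={soft}, hard={hard}")  # expect 64000/64000
```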
The Gateway Pattern: Nginx as the Guard Dog
You should never expose the function gateway (like OpenFaaS or Kubeless) directly to the internet. You need a reverse proxy to handle SSL termination, rate limiting, and timeout handling. Nginx is the industry standard here.
A common mistake I see is leaving the default timeouts in place. Functions are often long-running. If your PDF generator takes 90 seconds, Nginx will kill the connection at 60 seconds (the default proxy_read_timeout), leaving your client with a 504 Gateway Timeout while the server burns CPU finishing the task.
server {
    listen 443 ssl http2;
    server_name functions.your-domain.no;

    # SSL config omitted for brevity

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;

        # Critical for FaaS
        proxy_read_timeout 300s;
        proxy_connect_timeout 300s;
        proxy_send_timeout 300s;

        # Buffer settings for large payloads
        client_max_body_size 50M;
        proxy_buffers 8 16k;
        proxy_buffer_size 32k;
    }
}
Data Sovereignty and Latency
For Norwegian businesses, the US CLOUD Act is a looming shadow. Storing temporary processing data in US-owned data centers (even those located in Europe) creates a compliance grey area. By hosting your FaaS infrastructure on a Norwegian provider like CoolVDS, you ensure data residency.
Furthermore, physics is undefeated.
- Oslo to Frankfurt: ~25-30ms
- Oslo to Oslo (NIX): ~1-3ms
If your architecture involves chaining functions (Function A calls Function B), that latency compounds: a chain of five calls over a ~27ms path burns roughly 135ms on the wire alone, versus under 15ms locally. Running everything inside a single high-performance VPS or a private cluster in Oslo eliminates those network hops entirely.
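To make the compounding concrete, here is a hypothetical sketch of a synchronous chain through the OpenFaaS gateway; the function names and gateway address are assumptions. Every invoke() call pays one full round trip, which is microseconds on localhost and tens of milliseconds across borders:

```python
import time

import requests  # pip install requests

GATEWAY = "http://127.0.0.1:8080"  # assumed local OpenFaaS gateway


def invoke(fn: str, payload: bytes) -> bytes:
    """Synchronously invoke a function through the gateway."""
    resp = requests.post(f"{GATEWAY}/function/{fn}", data=payload, timeout=300)
    resp.raise_for_status()
    return resp.content


t0 = time.perf_counter()
resized = invoke("img-resize", open("photo.jpg", "rb").read())  # hop 1
final = invoke("img-watermark", resized)                        # hop 2
print(f"chain took {(time.perf_counter() - t0) * 1000:.1f} ms for 2 hops")
```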
Implementation Strategy
If you are ready to move away from unpredictable monthly cloud bills, here is your roadmap:
- Provision the Metal: Start with a CoolVDS NVMe instance. I recommend at least 4 vCPUs if you plan to run Kubernetes/K3s. For lighter loads, standard Docker Swarm works.
- Install the Platform: arkade is a great tool for installing OpenFaaS quickly on Kubernetes.
- Secure the Edge: Configure ufw to block all incoming traffic except port 443 and your SSH port.
- Test the Limits: Use hey or Apache Bench to flood your functions. Watch htop. If you see I/O wait (wa) spiking, you need faster storage. The exact commands are sketched after this list.
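Condensed into commands, this is roughly the shape of it. A sketch: adjust the SSH port, duration, and target URL to your setup, and note that arkade assumes a working Kubernetes context.

```bash
# 2. Install the platform (arkade fetches and configures the OpenFaaS chart)
arkade install openfaas

# 3. Secure the edge: default-deny, then allow TLS and SSH only
sudo ufw default deny incoming
sudo ufw allow 443/tcp
sudo ufw allow 22/tcp   # or your custom SSH port
sudo ufw enable

# 4. Flood a function for 30s with 50 workers; watch 'wa' in htop meanwhile
hey -z 30s -c 50 -m POST https://functions.your-domain.no/function/img-processor
```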
Serverless isn't magic. It's just someone else's server, or in this case, your own efficiently managed one. By stripping away the managed service markup, you gain control, compliance, and raw performance.
Ready to build a compliant, low-latency FaaS cluster? Deploy a CoolVDS NVMe instance today and stop paying for cold starts.