Serverless Patterns Without the Cloud Bill: Building FaaS Architectures on Bare-Metal VPS
There is a dangerous misconception spreading through CTO circles in Oslo right now: that "Serverless" is synonymous with AWS Lambda or Azure Functions. It is not. Serverless is an architectural pattern, not a billing model. And for many of us operating under strict GDPR mandates or managing predictable budgets, handing the keys to a US hyperscaler is simply not an option.
I recently audited a media processing startup here in Scandinavia. They went "all-in" on public cloud functions. Their bill was manageable in development. Then they hit production traffic. The combination of API Gateway request fees, NAT Gateway charges, and execution time costs exploded. Their infrastructure bill exceeded their payroll.
The solution wasn't to abandon the event-driven architecture—that part was sound. The solution was to repatriate the compute. By deploying a Function-as-a-Service (FaaS) framework on high-performance KVM instances, we cut their monthly spend by 70% while dropping latency to local Norwegian users by 15ms.
Today, we'll walk through the pragmatic architecture approach: implementing serverless patterns on your own terms, using OpenFaaS, Docker, and NVMe-backed infrastructure.
The "Hybrid-FaaS" Architecture
Pure serverless applications are rare. Most systems in 2019 are hybrids. You have a monolith (likely a Rails or Django app) handling the core CRUD operations, and you need to offload heavy, asynchronous tasks—image resizing, PDF generation, or data enrichment—to ephemeral workers.
Running these on a standard VPS using OpenFaaS gives you the best of both worlds: the developer experience of serverless (deploy code, not servers) with the cost predictability of a fixed monthly VPS.
The Stack:
- Orchestrator: Docker Swarm (simpler than K8s for small-to-medium clusters).
- FaaS Framework: OpenFaaS.
- Message Queue: NATS (embedded in OpenFaaS) or RabbitMQ for external triggers.
- Infrastructure: CoolVDS NVMe Instances (CentOS 7 or Ubuntu 18.04).
Implementation: The "Fan-Out" Pattern
Let's look at a common scenario: A user uploads a high-resolution image, and we need to generate thumbnails, extract EXIF data, and upload it to object storage. Doing this synchronously in your main web thread is a death sentence for performance.
Instead, we use the Fan-Out pattern. The monolith accepts the upload and pushes a message. The FaaS cluster picks it up.
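On the monolith side, that "push" can be as simple as an HTTP POST to the gateway's asynchronous route, which queues the payload via NATS and returns immediately. Here is a minimal sketch in Python; the gateway address and the img-resize function name (defined later in this article) are assumptions:
# producer.py -- runs inside the monolith's upload endpoint
import requests

GATEWAY = "http://gateway:8080"  # assumption: service name on the Swarm overlay network

def enqueue_resize(image_bytes):
    # POSTing to /async-function/ queues the job via NATS and returns
    # 202 Accepted without waiting for the resize to finish
    resp = requests.post(
        GATEWAY + "/async-function/img-resize",
        data=image_bytes,
        headers={"Content-Type": "application/octet-stream"},
    )
    resp.raise_for_status()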
1. Infrastructure Setup
First, we initialize a Swarm cluster. Why Swarm? Because for a team of five developers, Kubernetes v1.14 is often overkill. Swarm is baked into Docker Engine 18.09, and it just works.
# On the Manager Node (CoolVDS Instance 1)
$ docker swarm init --advertise-addr 192.168.10.2
# On Worker Nodes (CoolVDS Instances 2 & 3)
$ docker swarm join --token SWMTKN-1-49nj1... 192.168.10.2:2377
Next, we deploy OpenFaaS. This pulls the gateway, the queue worker, and Prometheus for metrics.
$ git clone https://github.com/openfaas/faas
$ cd faas && ./deploy_stack.sh
Check that your gateway is responding. If you are serving traffic from Oslo, verify the latency. On a CoolVDS instance in our local zone, internal networking latency between nodes should be sub-millisecond.
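A quick way to check both, responsiveness and latency, is a timed probe; a rough sketch, assuming the default port mapping from deploy_stack.sh and the gateway's unauthenticated health endpoint:
# probe.py -- crude gateway latency check
import time

import requests

start = time.monotonic()
resp = requests.get("http://127.0.0.1:8080/healthz", timeout=5)
elapsed_ms = (time.monotonic() - start) * 1000
print(f"gateway status={resp.status_code} latency={elapsed_ms:.1f}ms")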
2. The Function Code
Here is a Python 3 handler for the image resizing. Note that we don't manage the HTTP server; the OpenFaaS watchdog handles that and simply hands our function the request body.
# handler.py
import io

from PIL import Image


def handle(req):
    """Handle a request to the function.

    Args:
        req (bytes): raw request body -- binary image data. Note: the
            stock python3 template passes the body as a str; for binary
            payloads use a template that hands the handler raw bytes.
    """
    try:
        image_data = io.BytesIO(req)
        img = Image.open(image_data)

        # Resize in place; thumbnail() preserves the aspect ratio
        # and keeps img.format intact for the save below
        img.thumbnail((128, 128))

        output = io.BytesIO()
        img.save(output, format=img.format)

        # In a real app, push `output` to S3/MinIO here
        return "Resized successfully"
    except Exception as e:
        return str(e)
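You can smoke-test the handler before it ever touches the cluster; a small sketch, assuming any test image saved as sample.jpg next to the file:
# test_handler.py -- local test, no watchdog or gateway involved
from handler import handle

with open("sample.jpg", "rb") as f:  # assumption: a local test image
    print(handle(f.read()))  # expect "Resized successfully"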
3. Deployment Config
Define the function in the stack.yml file. This is your infrastructure-as-code.
provider:
  name: faas
  gateway: http://127.0.0.1:8080

functions:
  img-resize:
    lang: python3
    handler: ./img-resize
    image: registry.coolvds.internal:5000/img-resize:latest
    environment:
      write_debug: true
    labels:
      com.openfaas.scale.min: "2"
      com.openfaas.scale.max: "15"
Deploying this takes seconds: faas-cli up -f stack.yml builds the image, pushes it to your registry, and deploys the function in one shot.
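Once deployed, the function is just an HTTP endpoint. A quick synchronous invocation for testing (production uploads should go through the /async-function/ route shown earlier):
# invoke.py -- synchronous test call against the deployed function
import requests

with open("photo.jpg", "rb") as f:  # assumption: a local test image
    resp = requests.post(
        "http://127.0.0.1:8080/function/img-resize",
        data=f.read(),
    )
print(resp.status_code, resp.text)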
Optimizing for Throughput and "Cold Starts"
One of the biggest complaints about AWS Lambda is the "cold start" penalty—the time it takes to spin up a container after inactivity. When you control the infrastructure, you control the warmth.
By setting com.openfaas.scale.min: "2" (as in the stack.yml above), we ensure at least two containers are always alive. The trade-off is that warm replicas consume RAM around the clock, and this is where the underlying hardware matters.
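If you would rather not pin extra replicas and spend the RAM, a scheduled warm-up ping (a trick borrowed from the Lambda world) keeps a function's container and interpreter caches hot. A rough sketch; the interval and endpoint are assumptions to tune:
# keepwarm.py -- run under cron, a systemd timer, or as a tiny service
import time

import requests

while True:
    try:
        # Any cheap request keeps the replica hot; the handler returns
        # an error body for an empty payload, which is fine here
        requests.post("http://127.0.0.1:8080/function/img-resize",
                      data=b"", timeout=10)
    except requests.RequestException as exc:
        print(f"warm-up ping failed: {exc}")
    time.sleep(300)  # every five minutes; adjust to your idle window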
Pro Tip: Docker containers are generally efficient, but heavy I/O (like writing temporary image files) can cause "iowait" spikes that stall every container on the host. This is why we default to NVMe storage for all CoolVDS instances. In our benchmarks, an NVMe drive handles parallel image processing 4x faster than the standard SSDs found in budget VPS offerings.
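Don't take anyone's storage numbers on faith, ours included; here is a crude sequential-write sketch to compare instances yourself (fio is the serious tool, and the sizes below are arbitrary assumptions):
# iobench.py -- naive fsync'd write benchmark
import os
import time

CHUNK = b"x" * (4 * 1024 * 1024)  # 4 MiB per write
TOTAL_MIB = 256

start = time.monotonic()
with open("/tmp/iobench.tmp", "wb") as f:
    for _ in range(TOTAL_MIB // 4):
        f.write(CHUNK)
        f.flush()
        os.fsync(f.fileno())  # force each chunk to disk, past the page cache
elapsed = time.monotonic() - start
os.unlink("/tmp/iobench.tmp")
print(f"wrote {TOTAL_MIB} MiB in {elapsed:.2f}s ({TOTAL_MIB / elapsed:.0f} MiB/s)")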
Nginx Tuning for Long-Running Functions
Nginx's default proxy timeouts kill connections that take longer than 60 seconds. If you are doing video transcoding, this is a failure point. You must inject custom configuration into the reverse proxy in front of the OpenFaaS gateway, and remember to raise the matching OpenFaaS timeouts (the read_timeout, write_timeout, and exec_timeout environment variables) so the watchdog doesn't cut the request off first.
# Inside your reverse proxy configuration
location / {
    proxy_pass http://gateway:8080;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;

    # Crucial for heavy processing
    proxy_read_timeout 300s;
    proxy_connect_timeout 300s;
    proxy_send_timeout 300s;

    # Buffering tuning
    proxy_buffer_size 128k;
    proxy_buffers 4 256k;
    proxy_busy_buffers_size 256k;
}
The Data Sovereignty Factor (Norway Context)
Since the implementation of GDPR last year, and with Datatilsynet (The Norwegian Data Protection Authority) increasing audits, data residency is critical. When you use a US-based public cloud, you are navigating a legal minefield regarding the US CLOUD Act.
Hosting your FaaS cluster on CoolVDS ensures that:
- Data Residency: The data processing happens physically in our datacenter.
- Audit Trails: You have root access to the underlying logs (syslog, auth.log, Docker logs). You are not relying on a sanitized "CloudWatch" stream.
- Predictable Billing: You pay for the VPS resources (CPU/RAM), not per-invocation. If a script goes rogue and triggers 10 million events, your bill remains the same; your queue just gets longer.
When NOT to Use This Pattern
Honesty builds trust. This architecture is not for everyone. If your traffic is truly "bursty" in the extreme—meaning zero traffic for three weeks and then 10 million requests in an hour—public cloud auto-scaling is theoretically superior (though expensive).
However, for 95% of businesses operating consistent workloads with predictable daily peaks, a fixed cluster of powerful VPS instances running a FaaS orchestrator offers superior TCO (Total Cost of Ownership).
Final Thoughts
The "Serverless" revolution is about developer velocity, not abandoning servers entirely. Somebody still has to manage the kernel. By taking ownership of that layer, you gain performance, compliance, and cost control.
Don't let legacy SATA drives or noisy neighbors bottleneck your event pipeline. Build your FaaS cluster on infrastructure designed for I/O heavy workloads.
Ready to construct your own serverless platform? Spin up a high-performance NVMe KVM instance on CoolVDS today and get your Swarm cluster live in under 60 seconds.