Serverless Patterns Without the Cloud Bill: Building FaaS Architectures on Bare-Metal VPS
There is a dangerous misconception spreading through CTO circles in Oslo right now: that "Serverless" is synonymous with AWS Lambda or Azure Functions. It is not. Serverless is an architectural pattern, not a billing model. And for many of us operating under strict GDPR mandates or managing predictable budgets, handing the keys to a US hyperscaler is simply not an option.
I recently audited a media processing startup here in Scandinavia. They went "all-in" on public cloud functions. Their bill was manageable in development. Then they hit production traffic. The combination of API Gateway request fees, NAT Gateway charges, and execution time costs exploded. Their infrastructure bill exceeded their payroll.
The solution wasn't to abandon the event-driven architecture—that part was sound. The solution was to repatriate the compute. By deploying a Function-as-a-Service (FaaS) framework on high-performance KVM instances, we cut their monthly spend by 70% while dropping latency to local Norwegian users by 15ms.
Today, we'll walk through the pragmatic architecture approach: implementing serverless patterns on your own terms, using OpenFaaS, Docker, and NVMe-backed infrastructure.
The "Hybrid-FaaS" Architecture
Pure serverless applications are rare. Most systems in 2019 are hybrids. You have a monolith (likely a Rails or Django app) handling the core CRUD operations, and you need to offload heavy, asynchronous tasks—image resizing, PDF generation, or data enrichment—to ephemeral workers.
Running these on a standard VPS using OpenFaaS gives you the best of both worlds: the developer experience of serverless (deploy code, not servers) with the cost predictability of a fixed monthly VPS.
The Stack:
- Orchestrator: Docker Swarm (simpler than K8s for small-to-medium clusters).
- FaaS Framework: OpenFaaS.
- Message Queue: NATS (embedded in OpenFaaS) or RabbitMQ for external triggers.
- Infrastructure: CoolVDS NVMe Instances (CentOS 7 or Ubuntu 18.04).
Implementation: The "Fan-Out" Pattern
Let's look at a common scenario: A user uploads a high-resolution image, and we need to generate thumbnails, extract EXIF data, and upload it to object storage. Doing this synchronously in your main web thread is a death sentence for performance.
Instead, we use the Fan-Out pattern. The monolith accepts the upload and pushes a message. The FaaS cluster picks it up.
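On the monolith side, that "push" can be as simple as an HTTP POST to the gateway's asynchronous route, which queues the payload via NATS and returns immediately. Here is a minimal sketch in Python; the gateway address and the img-resize function name (defined later in this article) are assumptions:
# producer.py -- runs inside the monolith's upload endpoint
import requests

GATEWAY = "http://gateway:8080"  # assumption: service name on the Swarm overlay network

def enqueue_resize(image_bytes):
    # POSTing to /async-function/ queues the job via NATS and returns
    # 202 Accepted without waiting for the resize to finish
    resp = requests.post(
        GATEWAY + "/async-function/img-resize",
        data=image_bytes,
        headers={"Content-Type": "application/octet-stream"},
    )
    resp.raise_for_status()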
1. Infrastructure Setup
First, we initialize a Swarm cluster. Why Swarm? Because for a team of five developers, Kubernetes v1.14 is often overkill. Swarm is baked into Docker Engine 18.09, and it just works.
# On the Manager Node (CoolVDS Instance 1)
$ docker swarm init --advertise-addr 192.168.10.2
# On Worker Nodes (CoolVDS Instances 2 & 3)
$ docker swarm join --token SWMTKN-1-49nj1... 192.168.10.2:2377
Next, we deploy OpenFaaS. This pulls the gateway, the queue worker, and Prometheus for metrics.
$ git clone https://github.com/openfaas/faas
$ cd faas && ./deploy_stack.sh
Check that your gateway is responding. If you are serving traffic from Oslo, verify the latency. On a CoolVDS instance in our local zone, internal networking latency between nodes should be sub-millisecond.
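A quick way to check both, responsiveness and latency, is a timed probe; a rough sketch, assuming the default port mapping from deploy_stack.sh and the gateway's unauthenticated health endpoint:
# probe.py -- crude gateway latency check
import time

import requests

start = time.monotonic()
resp = requests.get("http://127.0.0.1:8080/healthz", timeout=5)
elapsed_ms = (time.monotonic() - start) * 1000
print(f"gateway status={resp.status_code} latency={elapsed_ms:.1f}ms")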
2. The Function Code
Here is a Python 3 handler for the image resizing. Note that we don't manage the HTTP server; the OpenFaaS watchdog handles that and simply hands our function the request body.
# handler.py
import io

from PIL import Image


def handle(req):
    """Handle a request to the function.

    Args:
        req (bytes): raw request body -- binary image data. Note: the
            stock python3 template passes the body as a str; for binary
            payloads use a template that hands the handler raw bytes.
    """
    try:
        image_data = io.BytesIO(req)
        img = Image.open(image_data)

        # Resize in place; thumbnail() preserves the aspect ratio
        # and keeps img.format intact for the save below
        img.thumbnail((128, 128))

        output = io.BytesIO()
        img.save(output, format=img.format)

        # In a real app, push `output` to S3/MinIO here
        return "Resized successfully"
    except Exception as e:
        return str(e)
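You can smoke-test the handler before it ever touches the cluster; a small sketch, assuming any test image saved as sample.jpg next to the file:
# test_handler.py -- local test, no watchdog or gateway involved
from handler import handle

with open("sample.jpg", "rb") as f:  # assumption: a local test image
    print(handle(f.read()))  # expect "Resized successfully"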
3. Deployment Config
Define the function in the stack.yml file. This is your infrastructure-as-code.
provider:
  name: faas
  gateway: http://127.0.0.1:8080

functions:
  img-resize:
    lang: python3
    handler: ./img-resize
    image: registry.coolvds.internal:5000/img-resize:latest
    environment:
      write_debug: true
    labels:
      com.openfaas.scale.min: "2"
      com.openfaas.scale.max: "15"
Deploying this takes seconds: faas-cli up -f stack.yml builds the image, pushes it to your registry, and deploys the function in one shot.
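Once deployed, the function is just an HTTP endpoint. A quick synchronous invocation for testing (production uploads should go through the /async-function/ route shown earlier):
# invoke.py -- synchronous test call against the deployed function
import requests

with open("photo.jpg", "rb") as f:  # assumption: a local test image
    resp = requests.post(
        "http://127.0.0.1:8080/function/img-resize",
        data=f.read(),
    )
print(resp.status_code, resp.text)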
Optimizing for Throughput and "Cold Starts"
One of the biggest complaints about AWS Lambda is the "cold start" penalty—the time it takes to spin up a container after inactivity. When you control the infrastructure, you control the warmth.
By setting com.openfaas.scale.min: "2" (as in the stack.yml above), we ensure at least two containers are always alive. The trade-off is that warm replicas consume RAM around the clock, and this is where the underlying hardware matters.
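If you would rather not pin extra replicas and spend the RAM, a scheduled warm-up ping (a trick borrowed from the Lambda world) keeps a function's container and interpreter caches hot. A rough sketch; the interval and endpoint are assumptions to tune:
# keepwarm.py -- run under cron, a systemd timer, or as a tiny service
import time

import requests

while True:
    try:
        # Any cheap request keeps the replica hot; the handler returns
        # an error body for an empty payload, which is fine here
        requests.post("http://127.0.0.1:8080/function/img-resize",
                      data=b"", timeout=10)
    except requests.RequestException as exc:
        print(f"warm-up ping failed: {exc}")
    time.sleep(300)  # every five minutes; adjust to your idle window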
Pro Tip: Docker containers are generally efficient, but heavy I/O (like writing temporary image files) can cause "iowait" spikes that stall every container on the host. This is why we default to NVMe storage for all CoolVDS instances. In our benchmarks, an NVMe drive handles parallel image processing 4x faster than the standard SSDs found in budget VPS offerings.
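Don't take anyone's storage numbers on faith, ours included; here is a crude sequential-write sketch to compare instances yourself (fio is the serious tool, and the sizes below are arbitrary assumptions):
# iobench.py -- naive fsync'd write benchmark
import os
import time

CHUNK = b"x" * (4 * 1024 * 1024)  # 4 MiB per write
TOTAL_MIB = 256

start = time.monotonic()
with open("/tmp/iobench.tmp", "wb") as f:
    for _ in range(TOTAL_MIB // 4):
        f.write(CHUNK)
        f.flush()
        os.fsync(f.fileno())  # force each chunk to disk, past the page cache
elapsed = time.monotonic() - start
os.unlink("/tmp/iobench.tmp")
print(f"wrote {TOTAL_MIB} MiB in {elapsed:.2f}s ({TOTAL_MIB / elapsed:.0f} MiB/s)")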
Nginx Tuning for Long-Running Functions
Nginx's default proxy timeouts kill connections that take longer than 60 seconds. If you are doing video transcoding, this is a failure point. You must inject custom configuration into the reverse proxy in front of the OpenFaaS gateway, and remember to raise the matching OpenFaaS timeouts (the read_timeout, write_timeout, and exec_timeout environment variables) so the watchdog doesn't cut the request off first.
# Inside your reverse proxy configuration
location / {
    proxy_pass http://gateway:8080;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;

    # Crucial for heavy processing
    proxy_read_timeout 300s;
    proxy_connect_timeout 300s;
    proxy_send_timeout 300s;

    # Buffering tuning
    proxy_buffer_size 128k;
    proxy_buffers 4 256k;
    proxy_busy_buffers_size 256k;
}
The Data Sovereignty Factor (Norway Context)
Since the implementation of GDPR last year, and with Datatilsynet (The Norwegian Data Protection Authority) increasing audits, data residency is critical. When you use a US-based public cloud, you are navigating a legal minefield regarding the US CLOUD Act.
Hosting your FaaS cluster on CoolVDS ensures that:
- Data Residency: The data processing happens physically in our datacenter.
- Audit Trails: You have root access to the underlying logs (syslog, auth.log, Docker logs). You are not relying on a sanitized "CloudWatch" stream.
- Predictable Billing: You pay for the VPS resources (CPU/RAM), not per-invocation. If a script goes rogue and triggers 10 million events, your bill remains the same; your queue just gets longer.
When NOT to Use This Pattern
Honesty builds trust. This architecture is not for everyone. If your traffic is truly "bursty" in the extreme—meaning zero traffic for three weeks and then 10 million requests in an hour—public cloud auto-scaling is theoretically superior (though expensive).
However, for 95% of businesses operating consistent workloads with predictable daily peaks, a fixed cluster of powerful VPS instances running a FaaS orchestrator offers superior TCO (Total Cost of Ownership).
Final Thoughts
The "Serverless" revolution is about developer velocity, not abandoning servers entirely. Somebody still has to manage the kernel. By taking ownership of that layer, you gain performance, compliance, and cost control.
Don't let legacy SATA drives or noisy neighbors bottleneck your event pipeline. Build your FaaS cluster on infrastructure designed for I/O heavy workloads.
Ready to construct your own serverless platform? Spin up a high-performance NVMe KVM instance on CoolVDS today and get your Swarm cluster live in under 60 seconds.