Escaping the Lambda Trap: Self-Hosted Serverless Architectures on NVMe

Let’s be honest: "Serverless" is a lie. There are always servers. The only difference is whether you control them or whether you're renting execution time at a premium from a US tech giant, praying a runaway function doesn't bankrupt your department. I have seen startups burn through their entire Series A runway because of a misconfigured API Gateway feeding a recursive loop.

But the financial risk isn't even the biggest problem for those of us operating in Europe. It's data sovereignty. Since the Schrems II ruling last year (July 2020), relying on US-owned cloud providers for processing Norwegian citizen data has moved from "grey area" to "legal minefield." The Datatilsynet (Norwegian Data Protection Authority) is not known for its sense of humor regarding third-party data transfers.

The pragmatic solution in 2021 isn't to abandon the serverless pattern—event-driven architecture is too efficient to ignore. The solution is to own the platform. By deploying OpenFaaS on top of lightweight Kubernetes (K3s) on high-performance infrastructure, we get the developer velocity of serverless with the cost predictability and compliance of a local VPS.

The Architecture: Why K3s + OpenFaaS?

Running full-blown Kubernetes (k8s) on a single node is overkill. It eats RAM for breakfast. For a lean FaaS (Functions as a Service) implementation, we use K3s. It’s a certified Kubernetes distribution stripped of legacy and cloud-provider add-ons, and the whole thing ships as a single binary under 100MB.

On top of that, we layer OpenFaaS. Unlike the famously complex Knative, OpenFaaS is battle-tested, simpler to debug, and runs beautifully on standard hardware—provided that hardware has fast I/O.

Step 1: The Foundation (Infrastructure Matters)

Here is where most implementations fail. Serverless patterns generate massive container churn. Functions spin up, execute, and die in milliseconds. This puts immense pressure on the disk subsystem. If you try this on a budget VPS with spinning rust (HDD) or shared SATA SSDs, your etcd latency will spike, and your API response times will crawl.

In our reference architecture at CoolVDS, we strictly use NVMe storage. The IOPS the Kubernetes datastore needs to stay stable under load is non-negotiable (on a single K3s node that datastore is an embedded SQLite database by default, with etcd as an option; both are fsync-heavy). When you have 50 functions cold-starting simultaneously, you need read speeds that SATA simply cannot provide.
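
Don't take spec sheets on faith: benchmark the disk first. The etcd project documents a quick fio recipe that mimics its write-ahead log (small sequential writes with an fdatasync after each one). A rough sketch, using a throwaway directory as a stand-in for your datastore path:

# Benchmark fdatasync latency the way the etcd WAL exercises the disk
mkdir -p /tmp/etcd-disk-check
fio --rw=write --ioengine=sync --fdatasync=1 \
    --directory=/tmp/etcd-disk-check --size=22m --bs=2300 \
    --name=etcd-disk-check

# Check the fsync/fdatasync percentiles in the output (fio 3.5+):
# etcd wants the 99th percentile under ~10ms; NVMe usually lands well under 1ms.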

Step 2: The Setup

Assuming you are SSH'd into a fresh Ubuntu 20.04 LTS instance (standard deployment for CoolVDS), let's get the control plane running. We disable the Traefik ingress controller by default because we want fine-grained control over our own ingress later.

# Install K3s without default Traefik
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--no-deploy traefik" sh -

# Verify the node is ready (usually takes 20-30 seconds on our infrastructure)
sudo k3s kubectl get node
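
One gotcha before the next step: arkade and any other non-root tooling look for a kubeconfig, and K3s writes its own to a root-only file. Something like this gets you going (relax the permissions only on a single-user box, or copy the file to ~/.kube/config instead):

# Point your tooling at the K3s kubeconfig
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
sudo chmod 644 /etc/rancher/k3s/k3s.yaml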

Next, we install OpenFaaS with arkade, a CLI tool that wraps popular Kubernetes apps (and their Helm charts) behind one-line installs. It’s cleaner than managing raw Helm charts manually.

# Get arkade
curl -sLS https://get.arkade.dev | sudo sh

# Install OpenFaaS
arkade install openfaas

# Check the pods
sudo k3s kubectl get pods -n openfaas
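
Once the pods show Running, arkade prints the login steps. They boil down to fetching the generated admin password from the basic-auth secret, forwarding the gateway port, and authenticating faas-cli (arkade can fetch the CLI for you; it lands in ~/.arkade/bin, so add that to your PATH):

# Grab the faas-cli binary
arkade get faas-cli

# Fetch the generated admin password
PASSWORD=$(sudo k3s kubectl get secret -n openfaas basic-auth \
  -o jsonpath="{.data.basic-auth-password}" | base64 --decode)

# Expose the gateway locally and log in
sudo k3s kubectl port-forward -n openfaas svc/gateway 8080:8080 &
echo -n "$PASSWORD" | faas-cli login --username admin --password-stdin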

Kernel Tuning for High Concurrency

A stock Linux kernel is designed for general-purpose computing, not for handling thousands of ephemeral network connections per second. If you leave the defaults, your serverless platform will hit the `nf_conntrack` limit during a traffic spike, dropping packets silently.

I apply the following `sysctl` configurations to every node handling FaaS workloads. This reduces the time connections stay in `TIME_WAIT` and increases the file descriptor limits.

# /etc/sysctl.d/99-serverless-tuning.conf

# Increase max open files for heavy container loads
fs.file-max = 2097152

# Increase the connection tracking table size
net.netfilter.nf_conntrack_max = 262144

# Reuse connections in TIME_WAIT state
net.ipv4.tcp_tw_reuse = 1

# Decrease time to keep sockets in FIN-WAIT-2
net.ipv4.tcp_fin_timeout = 15

# Max backlog of connection requests
net.core.somaxconn = 65535

Apply these with sudo sysctl -p /etc/sysctl.d/99-serverless-tuning.conf. On a standard CoolVDS instance, these settings let you handle roughly 10x the concurrent function invocations of stock Ubuntu defaults.
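
During a load test, you can watch how close the connection-tracking table is getting to that ceiling:

# Live conntrack entries vs. the configured maximum
watch -n1 "cat /proc/sys/net/netfilter/nf_conntrack_count /proc/sys/net/netfilter/nf_conntrack_max"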

The "Cold Start" Problem vs. Local NVMe

In AWS Lambda, a cold start involves moving code from S3 to a worker, spinning up a microVM, and booting the runtime. This can take 200ms to 2 seconds. On your own VPS, the bottleneck is almost entirely disk I/O—pulling the Docker image and overlaying the filesystem.

Pro Tip: By using a local registry mirror on the same LAN (or same machine) and leveraging NVMe storage, we have clocked Node.js function cold starts at under 60ms. This is the performance difference that keeps users on your site.
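
A minimal sketch of that mirror setup, assuming Docker is available on the host to run the registry container (K3s itself uses containerd) and that port 5000 is free: run the stock registry:2 image as a pull-through cache for Docker Hub, then point K3s at it.

# Run a pull-through cache for Docker Hub on localhost:5000
docker run -d --name registry-mirror -p 5000:5000 \
  -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
  registry:2

# Tell K3s's containerd to try the mirror first
cat <<'EOF' | sudo tee /etc/rancher/k3s/registries.yaml
mirrors:
  docker.io:
    endpoint:
      - "http://localhost:5000"
EOF

# K3s only reads registries.yaml at startup
sudo systemctl restart k3s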

Code Example: A Python Image Resizer

Let's look at a practical use case: an image resizer. This is a classic heavy-compute task that is expensive on public clouds but cheap on a fixed-cost VPS.

handler.py

from PIL import Image
import io

def handle(req):
    """handle a request to the function
    Args:
        req (bytes): raw request body (image data)
    """
    try:
        # Some watchdog templates hand the body over as str; Pillow needs bytes.
        if isinstance(req, str):
            req = req.encode("latin-1")

        image = Image.open(io.BytesIO(req))

        # Resize in place, preserving aspect ratio
        image.thumbnail((128, 128))

        # JPEG has no alpha channel, so flatten RGBA/palette images first
        if image.mode != "RGB":
            image = image.convert("RGB")

        # Re-encode into an in-memory buffer
        buf = io.BytesIO()
        image.save(buf, format='JPEG', quality=85)
        return buf.getvalue()
    except Exception as e:
        return str(e)

When you deploy this via the faas-cli, the build process packages this into a Docker container. The key here is the CPU. Image processing requires raw clock speed. Virtual CPUs (vCPUs) on budget hosts often suffer from "noisy neighbor" syndrome, where another customer's database steals your cycles. At CoolVDS, our allocation guarantees prevent this specific type of CPU steal, ensuring your image resizing takes 400ms every time, not 400ms once and 1200ms the next time.
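
For reference, the whole scaffold-build-deploy loop is three commands (the function name resize is illustrative):

# Scaffold a new function from the python3 template
faas-cli new resize --lang python3

# Paste the code above into resize/handler.py and add Pillow to resize/requirements.txt

# Build the image, push it, and deploy it to the gateway in one shot
faas-cli up -f resize.yml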

Compliance and the "NIX" Advantage

For Norwegian businesses, the physical location of the server is paramount. By hosting your FaaS infrastructure on a VPS in Oslo, you drastically reduce latency for your local users. We are talking about 2-5ms ping times to major Norwegian ISPs via the Norwegian Internet Exchange (NIX).

Furthermore, when you control the entire stack—from the OS kernel to the K3s cluster—you can audit exactly where data flows. There is no opaque "Region: EU-North-1" abstraction. You know the IP. You know the drive serial number. That is an audit trail you can actually hand to the Datatilsynet.

Conclusion

Serverless is a powerful architectural pattern, but it shouldn't cost you your budget or your legal compliance. By bringing the serverless experience in-house using K3s and OpenFaaS, you regain control.

However, this architecture demands hardware that can keep up. High IOPS NVMe storage and stable CPU performance are not luxuries; they are requirements for a functioning FaaS platform. Don't let slow I/O kill your application's responsiveness.

Ready to build? Deploy a high-performance CoolVDS NVMe instance in Oslo today and start shipping functions in minutes, not days.