Scaling API Gateways: Kernel Tuning & Architecture for Low-Latency Nordic Infrastructure

It is November 2020. The dust from the CJEU's Schrems II ruling is still settling, and if you are running infrastructure in Norway or the broader EEA, you are likely scrambling to repatriate data from US-owned clouds. But moving your API gateway from a managed hyperscaler to a dedicated VPS in Oslo isn't just a "lift and shift" operation. It exposes you to the raw metal. Suddenly, you don't have a load balancer team managing your TCP stack. You have sysctl.conf and a looming deadline.

I recently audited a fintech setup migrating to a Norwegian datacenter to satisfy the Datatilsynet (Norwegian Data Protection Authority). Their application logic was sound, but their gateway was choking at 2,000 concurrent connections. The culprit? Default Linux kernel settings and an Nginx config that treated upstream connections like disposables. If you want sub-millisecond latency on your CoolVDS instance, you need to stop trusting defaults.

1. The Kernel is Your First Bottleneck

Most Linux distributions, including the Ubuntu 20.04 LTS images we commonly deploy, ship with conservative defaults intended for desktop usage or light web serving. When acting as an API Gateway, your server is essentially a packet shovel. It needs wide pipes.

The first wall you will hit is the file descriptor limit. In Linux, every TCP connection is a file descriptor. The default per-process soft limit of 1,024 (ulimit -n) is laughable for an API gateway.

Configuration: /etc/sysctl.conf

Edit your sysctl configuration to widen the networking stack. We need to allow the kernel to queue more connection requests and reuse sockets faster. Note that tcp_tw_recycle is dangerous in NAT environments and has been removed in newer kernels, so stick to tcp_tw_reuse.

# Increase system-wide file descriptor limit
fs.file-max = 2097152

# Widen the port range for outgoing connections (critical for proxying)
net.ipv4.ip_local_port_range = 10000 65535

# Increase the backlog of incoming connections
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535

# Reuse sockets in TIME_WAIT state for new outgoing connections
net.ipv4.tcp_tw_reuse = 1

# TCP Fast Open (TFO) to reduce handshake RTT (3 = enabled for client and server)
net.ipv4.tcp_fastopen = 3

Apply these changes immediately:

sysctl -p
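
One caveat: fs.file-max only raises the system-wide ceiling. The per-process limit that actually constrains Nginx stays at 1,024 unless you raise it separately. A minimal sketch of how to do that, assuming Nginx runs under systemd as on the Ubuntu 20.04 images above (the drop-in path and values are illustrative):

# /etc/systemd/system/nginx.service.d/limits.conf -- raise the service's fd ceiling
[Service]
LimitNOFILE=1048576

# /etc/nginx/nginx.conf, main context -- let each worker actually use the higher limit
worker_rlimit_nofile 1048576;

Run systemctl daemon-reload, restart Nginx, and confirm the new ceiling under "Max open files" in /proc/<worker pid>/limits.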

2. Nginx: The Art of Keepalives

Whether you are using standard Nginx, OpenResty, or Kong, the underlying mechanic is the same. The biggest performance killer I see in production is the lack of upstream keepalives.

By default, Nginx opens a connection to your backend service (Node.js, Go, Python), sends the request, receives the response, and closes the connection. This forces a full TCP handshake (SYN, SYN-ACK, ACK) for every single API call. In a microservices environment, this adds measurable latency and exhausts ephemeral ports.

Pro Tip: On CoolVDS NVMe instances, the I/O bottleneck is virtually non-existent, so your CPU becomes the constraint. Eliminating repeated TCP handshakes (and TLS handshakes, if you encrypt traffic to the backend) between the gateway and the upstream frees up CPU cycles for actual request processing.

Optimized Nginx Upstream Block

upstream backend_api {
    server 10.0.0.5:8080;
    server 10.0.0.6:8080;

    # The critical directive: keep up to 64 idle connections to the upstream
    # open per worker process instead of closing them after every request
    keepalive 64;
}

server {
    listen 443 ssl http2;
    server_name api.example.no;

    # SSL/TLS Tuning for 2020 Standards
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
    ssl_prefer_server_ciphers on;

    location / {
        proxy_pass http://backend_api;
        
        # Required for keepalive to work
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        
        # Buffer tuning
        proxy_buffers 16 16k;
        proxy_buffer_size 32k;
    }
}
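
Once this is live, verify that the gateway is actually reusing connections instead of churning through them. A quick sanity check, assuming your backends listen on port 8080 as in the block above:

# Count established gateway-to-backend connections; under steady load this
# number should stabilise instead of climbing while TIME_WAIT sockets pile up
ss -tn state established '( dport = :8080 )' | tail -n +2 | wc -l

If the count keeps growing, keepalive is not being honoured, and the usual suspects are a missing proxy_http_version 1.1 or Connection "" header.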

3. The Hardware Reality: Why Virtualization Matters

You can tune your kernel all day, but if your underlying hypervisor is stealing your CPU cycles, it is futile. This is the noisy neighbor problem.

In 2020, many budget providers still rely on OpenVZ or LXC containers where kernel resources are shared. If another tenant gets DDoS'd, your ksoftirqd processes spike, and your API latency jitters. This is unacceptable for financial or real-time applications.

This is why we architect CoolVDS on KVM (Kernel-based Virtual Machine). KVM provides hardware virtualization. Your RAM is yours. Your CPU cores are reserved. When you run top, what you see is reality, not a fabricated metric. Furthermore, API logging generates massive write operations. We utilize enterprise NVMe storage to ensure that flushing access logs to disk never blocks the Nginx worker process.

Feature                  | Standard HDD VPS            | CoolVDS NVMe
IOPS (4k Random Write)   | ~300 - 500                  | ~50,000+
Latency Spike Risk       | High (mechanical seek time) | Near zero
Database Reliability     | Buffer pool dependency      | Disk speed supports cache misses
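
Whichever provider you end up on, verify this yourself rather than trusting the datasheet. The kernel reports how much CPU time the hypervisor is withholding from your guest as "steal":

# The "st" column is CPU time stolen by the hypervisor; on a properly
# provisioned KVM instance it should sit at or near zero
vmstat 5 5

Anything consistently above a few percent means you are paying for cycles you never get.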

4. Local Context: Latency and Compliance

For Norwegian businesses, the physical location of the server is now a legal requirement for certain datasets. Hosting in Oslo or nearby Nordic hubs ensures you remain compliant with the strictest interpretations of GDPR following the Schrems II decision.

Beyond compliance, there is physics. Connectivity to NIX (Norwegian Internet Exchange) is vital. Testing from a standard fiber connection in Oslo, we typically see:

  • Frankfurt (AWS/Google): 18ms - 25ms
  • CoolVDS (Local Node): 1ms - 3ms

For a user interaction that chains five or six sequential API calls through the gateway, that 20ms difference compounds: you could be saving around 100ms per interaction simply by moving the compute closer to the user.
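
Do not take those figures on faith either; measure the round trip from wherever your clients actually sit. A simple sketch, substituting your own hostname for the api.example.no placeholder used earlier:

# 20-cycle report of per-hop latency and packet loss from the client's side
mtr --report --report-cycles 20 api.example.no

# Or, if mtr is not installed, plain ICMP round-trip times
ping -c 20 api.example.no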

5. Monitoring and Verification

Do not assume your tuning worked. Verify it. In late 2020, tools like `wrk` are standard for load testing. Here is how I stress test a new gateway deployment to ensure the file descriptor limits are holding:

# Run a benchmark with 12 threads and 400 open connections for 30 seconds
wrk -t12 -c400 -d30s --latency https://api.yourdomain.no/health

Watch for "Socket errors" in the output. If you see them, check dmesg on your server. If you see "possible SYN flooding on port 443", your listen backlog is still too small (net.core.somaxconn, plus the backlog parameter on Nginx's listen directive) or your application is accepting connections too slowly.
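
The kernel also keeps cumulative counters for overflowed and dropped listen queues, which make before/after comparisons easy. One way to read them (the exact counter wording varies slightly between kernel versions):

# Cumulative listen-queue overflows and dropped SYNs since boot
netstat -s | grep -iE 'listen|overflow'

# Equivalent counters via iproute2
nstat -az TcpExtListenOverflows TcpExtListenDrops

If these counters climb during a wrk run, raise the backlog or scale out before you go to production.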

Final Thoughts

Performance is an architectural feature, not a plugin. It requires a synergy between a tuned Linux kernel, a correctly configured reverse proxy, and hardware that doesn't lie to you. As we close out 2020, the demand for data sovereignty and speed is only increasing.

Don't let legacy configurations bottleneck your growth. Deploy a KVM-based, NVMe-powered instance on CoolVDS today and see what your code is actually capable of.