API Gateway Performance Tuning: Shaving Milliseconds in the Oslo Region
Let's be honest: default configurations are for hobbyists. If you are running a high-traffic API gateway (whether it's Kong, plain Nginx, or HAProxy) on a standard Linux distribution out of the box, you are leaving 30% to 50% of your performance on the table. I've seen it happen too many times. A startup in Oslo launches a shiny new microservices architecture, and their latency spikes the moment they hit 500 requests per second. They blame the code. They blame the database.
It's almost never the code. It's the gateway choking on file descriptors or ephemeral ports.
In this guide, we are going to look at how to tune a Linux-based API Gateway for maximum throughput. We are strictly talking about the stack available to us right now in mid-2019: Nginx 1.15+, Kernel 4.15+, and the reality of connectivity within the Nordics.
The "War Story": When Defaults Fail
Last month, I was debugging a latency issue for a fintech client. Their API was hosted in a containerized environment, fronted by Nginx. Every day at 09:00 CET, their 99th percentile (p99) latency jumped from 40ms to 2 seconds. The CPUs were idling at 20%. RAM was fine.
The culprit? Ephemeral port exhaustion. They were opening a new connection to their upstream microservices for every single request. The kernel couldn't recycle TCP connections fast enough, leaving thousands of sockets in TIME_WAIT state. The fix wasn't buying more servers; it was five lines of configuration.
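If you suspect the same failure mode, the check takes ten seconds. Here is a quick sketch using ss; on a healthy gateway the TIME_WAIT count stays modest, while an exhausted one climbs toward the size of the ephemeral port range:
# Count sockets stuck in TIME_WAIT on the gateway
ss -tan state time-wait | wc -l
# Or get a one-line summary of all TCP socket states
ss -s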
1. Kernel-Level Tuning: sysctl.conf
Before touching the application layer, we must prep the OS. Most VPS providers give you a generic image meant for web hosting, not high-concurrency API routing. On a CoolVDS instance running Ubuntu 18.04, I always apply these settings immediately to /etc/sysctl.conf.
# Increase the range of ephemeral ports
net.ipv4.ip_local_port_range = 1024 65535
# Allow reuse of sockets in TIME_WAIT state for new connections
# (Critical for high-throughput gateways)
net.ipv4.tcp_tw_reuse = 1
# Increase the maximum number of open files
fs.file-max = 2097152
# Max backlog of connection requests
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
# TCP Window Scaling
net.ipv4.tcp_window_scaling = 1
After saving, run sysctl -p. These settings ensure your server doesn't reject incoming connections just because the TCP stack is too polite.
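One caveat: fs.file-max only raises the system-wide ceiling, and the Nginx worker still has its own per-process limit. A quick verification pass (the worker_rlimit_nofile value below is illustrative, not a magic number) looks like this:
# Spot-check that the new values took effect
sysctl net.ipv4.tcp_tw_reuse net.core.somaxconn fs.file-max
# fs.file-max is only the system-wide ceiling; raise the per-process
# limit for Nginx too, in nginx.conf (illustrative value):
#   worker_rlimit_nofile 100000;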
2. The Nginx Upstream Keepalive Mistake
This is the most common error in 2019. By default, Nginx acts as a reverse proxy that closes the connection to the backend after every request. This forces a new TCP handshake (SYN, SYN-ACK, ACK) for every API call. In a microservices mesh, this overhead is disastrous.
You must enable keepalive connections to your upstreams. Here is the correct configuration pattern:
http {
    upstream backend_api {
        server 10.0.0.5:8080;
        server 10.0.0.6:8080;
        # Keep 64 idle connections open to the upstream
        keepalive 64;
    }
    server {
        location /api/ {
            proxy_pass http://backend_api;
            # REQUIRED for keepalive to work
            proxy_http_version 1.1;
            proxy_set_header Connection "";
        }
    }
}
Without clearing the Connection header, Nginx passes "Connection: close" to the backend on every proxied request, and the keepalive directive does nothing.
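To confirm the pool is actually doing its job, watch the sockets between the gateway and one upstream (the 10.0.0.5 address comes from the example above); you want a small, stable set of ESTABLISHED connections, not a churn of TIME_WAIT:
# Connections from this gateway to a single upstream
ss -tan dst 10.0.0.5:8080
# Tally them per TCP state for a quick before/after comparison
ss -tan dst 10.0.0.5:8080 | awk 'NR>1 {print $1}' | sort | uniq -c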
3. The Hardware Factor: Why NVMe and Steal Time Matter
You can tune software all day, but you cannot tune away bad hardware or noisy neighbors. In the virtualization world (specifically Xen or KVM), "Steal Time" is the percentage of time your virtual CPU was ready to run but had to wait for the hypervisor to schedule it onto a physical core. On oversold hosting platforms, this fluctuates wildly.
Pro Tip: Run top and look at the st value. If it is consistently above 0.5%, your provider is overselling their cores. Move your workload.
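If you want something scriptable instead of staring at top, vmstat reports the same counter in its last column:
# Five one-second samples; the "st" column is steal time
vmstat 1 5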
At CoolVDS, we use KVM with strict resource isolation. We don't play the "burst" game where we promise you CPU you can't use. Furthermore, API Gateways often do significant logging or caching. If you are writing access logs to a spinning HDD (or even a cheap SATA SSD), your I/O wait will block the worker process.
This is why we standardized on NVMe storage for all CoolVDS instances. In 2019, the difference between SATA SSD and NVMe is not just bandwidth; it's IOPS (Input/Output Operations Per Second). For high-logging gateways, NVMe is non-negotiable.
Comparison: SATA SSD vs CoolVDS NVMe (fio benchmark)
| Metric | Standard SATA SSD VPS | CoolVDS NVMe Instance |
|---|---|---|
| Random Read IOPS (4k) | ~5,000 - 10,000 | ~50,000+ |
| Random Write IOPS (4k) | ~3,000 - 8,000 | ~35,000+ |
| Latency | 0.5ms - 2ms | < 0.1ms |
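Numbers in that range come from a small-block random fio run along these lines; treat the parameters as a starting point and adjust size and runtime for your own volume:
# Representative 4k random read benchmark (run randwrite for the write row)
fio --name=randread --rw=randread --bs=4k --direct=1 \
    --ioengine=libaio --iodepth=32 --numjobs=4 \
    --size=1G --runtime=60 --time_based --group_reporting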
4. Local Nuances: The Oslo Advantage
Latency is determined by the speed of light and network peering. If your target audience is in Norway, hosting in Frankfurt or Amsterdam adds a mandatory 15-30ms round-trip time (RTT).
By deploying on CoolVDS in our Oslo data center, you are peering directly at NIX (Norwegian Internet Exchange). Your RTT to users in Oslo, Bergen, and Trondheim drops to single digits (often 1-3ms).
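Don't take peering claims on faith. Measure it from a machine in your users' region; the hostname below is a placeholder for your own endpoint:
# Round-trip time and per-hop path from the client side
ping -c 10 api.example.com
mtr --report --report-cycles 20 api.example.com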
Additionally, with the tightening grip of GDPR and the rigorous standards of Datatilsynet here in Norway, keeping your data logs and traffic within national borders is becoming a significant compliance advantage. It simplifies your legal posture regarding data sovereignty.
5. SSL/TLS Optimization
In 2019, if you aren't using TLS 1.3 yet, you are lagging. OpenSSL 1.1.1 (available on an up-to-date Ubuntu 18.04) supports it, and it shaves a full round trip off the handshake compared to TLS 1.2.
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384;
ssl_prefer_server_ciphers off;
# Enable OCSP Stapling to speed up verification
ssl_stapling on;
ssl_stapling_verify on;
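After reloading Nginx, verify that TLS 1.3 is actually negotiated (again, substitute your own hostname):
# Force a TLS 1.3 handshake and print the negotiated protocol and cipher
openssl s_client -connect api.example.com:443 -tls1_3 < /dev/null 2>/dev/null | grep -E 'Protocol|Cipher'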
Final Thoughts
Performance isn't magic. It's the sum of a tuned kernel, a properly configured application, and hardware that doesn't steal your cycles. Don't let your infrastructure be the bottleneck for your brilliant code.
If you need a platform that respects these technical realities, where NVMe is standard and KVM isolation is guaranteed, spin up a test environment today.
Ready to drop your latency? Deploy your optimized API Gateway on CoolVDS in under 55 seconds.