API Gateway Performance Tuning: Shaving Milliseconds in the Oslo Region
Let's be honest: default configurations are for hobbyists. If you are running a high-traffic API gateway (whether it's Kong, plain Nginx, or HAProxy) on a standard Linux distribution out of the box, you are leaving 30% to 50% of your performance on the table. I've seen it happen too many times. A startup in Oslo launches a shiny new microservices architecture, and their latency spikes the moment they hit 500 requests per second. They blame the code. They blame the database.
It's almost never the code. It's the gateway choking on file descriptors or ephemeral ports.
In this guide, we are going to look at how to tune a Linux-based API Gateway for maximum throughput. We are strictly talking about the stack available to us right now in mid-2019: Nginx 1.15+, Kernel 4.15+, and the reality of connectivity within the Nordics.
The "War Story": When Defaults Fail
Last month, I was debugging a latency issue for a fintech client. Their API was hosted in a containerized environment, fronted by Nginx. Every day at 09:00 CET, their 99th percentile (p99) latency jumped from 40ms to 2 seconds. The CPUs were idling at 20%. RAM was fine.
The culprit? Ephemeral port exhaustion. They were opening a new connection to their upstream microservices for every single request. The kernel couldn't recycle TCP connections fast enough, leaving thousands of sockets in TIME_WAIT state. The fix wasn't buying more servers; it was five lines of configuration.
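If you suspect the same failure mode, the check takes ten seconds. Here is a quick sketch using ss; on a healthy gateway the TIME_WAIT count stays modest, while an exhausted one climbs toward the size of the ephemeral port range:
# Count sockets stuck in TIME_WAIT on the gateway
ss -tan state time-wait | wc -l
# Or get a one-line summary of all TCP socket states
ss -s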
1. Kernel-Level Tuning: sysctl.conf
Before touching the application layer, we must prep the OS. Most VPS providers give you a generic image meant for web hosting, not high-concurrency API routing. On a CoolVDS instance running Ubuntu 18.04, I always apply these settings immediately to /etc/sysctl.conf.
# Increase the range of ephemeral ports
net.ipv4.ip_local_port_range = 1024 65535
# Allow reuse of sockets in TIME_WAIT state for new connections
# (Critical for high-throughput gateways)
net.ipv4.tcp_tw_reuse = 1
# Increase the maximum number of open files
fs.file-max = 2097152
# Max backlog of connection requests
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
# TCP Window Scaling
net.ipv4.tcp_window_scaling = 1
After saving, run sysctl -p. These settings ensure your server doesn't reject incoming connections just because the TCP stack is too polite.
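One caveat: fs.file-max only raises the system-wide ceiling, and the Nginx worker still has its own per-process limit. A quick verification pass (the worker_rlimit_nofile value below is illustrative, not a magic number) looks like this:
# Spot-check that the new values took effect
sysctl net.ipv4.tcp_tw_reuse net.core.somaxconn fs.file-max
# fs.file-max is only the system-wide ceiling; raise the per-process
# limit for Nginx too, in nginx.conf (illustrative value):
#   worker_rlimit_nofile 100000;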
2. The Nginx Upstream Keepalive Mistake
This is the most common error in 2019. By default, Nginx acts as a reverse proxy that closes the connection to the backend after every request. This forces a new TCP handshake (SYN, SYN-ACK, ACK) for every API call. In a microservices mesh, this overhead is disastrous.
You must enable keepalive connections to your upstreams. Here is the correct configuration pattern:
http {
    upstream backend_api {
        server 10.0.0.5:8080;
        server 10.0.0.6:8080;
        # Keep 64 idle connections open to the upstream
        keepalive 64;
    }
    server {
        location /api/ {
            proxy_pass http://backend_api;
            # REQUIRED for keepalive to work
            proxy_http_version 1.1;
            proxy_set_header Connection "";
        }
    }
}
Without clearing the Connection header, Nginx passes "Connection: close" to the backend on every proxied request, and the keepalive directive does nothing.
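To confirm the pool is actually doing its job, watch the sockets between the gateway and one upstream (the 10.0.0.5 address comes from the example above); you want a small, stable set of ESTABLISHED connections, not a churn of TIME_WAIT:
# Connections from this gateway to a single upstream
ss -tan dst 10.0.0.5:8080
# Tally them per TCP state for a quick before/after comparison
ss -tan dst 10.0.0.5:8080 | awk 'NR>1 {print $1}' | sort | uniq -c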
3. The Hardware Factor: Why NVMe and Steal Time Matter
You can tune software all day, but you cannot tune away bad hardware or noisy neighbors. In the virtualization world (specifically Xen or KVM), "Steal Time" is the percentage of time your virtual CPU was ready to run but had to wait for the hypervisor to schedule it onto a physical core. On oversold hosting platforms, this fluctuates wildly.
Pro Tip: Run top and look at the st value. If it is consistently above 0.5%, your provider is overselling their cores. Move your workload.
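If you want something scriptable instead of staring at top, vmstat reports the same counter in its last column:
# Five one-second samples; the "st" column is steal time
vmstat 1 5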
At CoolVDS, we use KVM with strict resource isolation. We don't play the "burst" game where we promise you CPU you can't use. Furthermore, API Gateways often do significant logging or caching. If you are writing access logs to a spinning HDD (or even a cheap SATA SSD), your I/O wait will block the worker process.
This is why we standardized on NVMe storage for all CoolVDS instances. In 2019, the difference between SATA SSD and NVMe is not just bandwidth; it's IOPS (Input/Output Operations Per Second). For high-logging gateways, NVMe is non-negotiable.
Comparison: SATA SSD vs CoolVDS NVMe (fio benchmark)
| Metric | Standard SATA SSD VPS | CoolVDS NVMe Instance |
|---|---|---|
| Random Read IOPS (4k) | ~5,000 - 10,000 | ~50,000+ |
| Random Write IOPS (4k) | ~3,000 - 8,000 | ~35,000+ |
| Latency | 0.5ms - 2ms | < 0.1ms |
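Numbers in that range come from a small-block random fio run along these lines; treat the parameters as a starting point and adjust size and runtime for your own volume:
# Representative 4k random read benchmark (run randwrite for the write row)
fio --name=randread --rw=randread --bs=4k --direct=1 \
    --ioengine=libaio --iodepth=32 --numjobs=4 \
    --size=1G --runtime=60 --time_based --group_reporting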
4. Local Nuances: The Oslo Advantage
Latency is determined by the speed of light and network peering. If your target audience is in Norway, hosting in Frankfurt or Amsterdam adds a mandatory 15-30ms round-trip time (RTT).
By deploying on CoolVDS in our Oslo data center, you are peering directly at NIX (Norwegian Internet Exchange). Your RTT to users in Oslo, Bergen, and Trondheim drops to single digits (often 1-3ms).
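Don't take peering claims on faith. Measure it from a machine in your users' region; the hostname below is a placeholder for your own endpoint:
# Round-trip time and per-hop path from the client side
ping -c 10 api.example.com
mtr --report --report-cycles 20 api.example.com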
Additionally, with the tightening grip of GDPR and the rigorous standards of Datatilsynet here in Norway, keeping your data logs and traffic within national borders is becoming a significant compliance advantage. It simplifies your legal posture regarding data sovereignty.
5. SSL/TLS Optimization
In 2019, if you aren't using TLS 1.3 yet, you are lagging. OpenSSL 1.1.1 (available on an up-to-date Ubuntu 18.04) supports it, and it shaves a full round trip off the handshake compared to TLS 1.2.
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384;
ssl_prefer_server_ciphers off;
# Enable OCSP Stapling to speed up verification
ssl_stapling on;
ssl_stapling_verify on;
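After reloading Nginx, verify that TLS 1.3 is actually negotiated (again, substitute your own hostname):
# Force a TLS 1.3 handshake and print the negotiated protocol and cipher
openssl s_client -connect api.example.com:443 -tls1_3 < /dev/null 2>/dev/null | grep -E 'Protocol|Cipher'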
Final Thoughts
Performance isn't magic. It's the sum of a tuned kernel, a properly configured application, and hardware that doesn't steal your cycles. Don't let your infrastructure be the bottleneck for your brilliant code.
If you need a platform that respects these technical realities, where NVMe is standard and KVM isolation is guaranteed, spin up a test environment today.
Ready to drop your latency? Deploy your optimized API Gateway on CoolVDS in under 55 seconds.