API Gateway Performance Tuning: The Sub-Millisecond Guide for 2024
If your API Gateway adds more than 15 milliseconds to a request, you are doing it wrong. In the high-frequency trading floors of Oslo or the data-heavy logistics hubs of Northern Europe, latency isn't just an annoyance; it's a direct revenue leak. I have seen too many architectures where a bloated Java-based gateway running on oversold cloud infrastructure creates a bottleneck that no amount of backend code optimization can fix.
We are not talking about switching web frameworks here. We are talking about the raw mechanics of packet processing, kernel interrupts, and file descriptors. This is about taking a standard Nginx or Envoy setup and stripping away the fat until it screams.
Let’s dissect how to tune your API gateway for maximum throughput on a Linux environment, specifically tailored for the high-bandwidth reality of the Norwegian infrastructure market.
1. The Hardware Lie: Steal Time is the Enemy
Before touching a config file, acknowledge the hardware reality. You cannot tune a gateway on a noisy VPS. If your %st (steal time) in top is above 0.0%, your benchmarks are invalid. API Gateways are CPU-bound during TLS termination and I/O-bound during request routing.
This is why we benchmark exclusively on CoolVDS instances. The KVM virtualization ensures that the CPU cycles you pay for are actually yours. When you pin a process to a core on a CoolVDS NVMe instance, it stays there. No neighbor on the same physical host is going to steal your L3 cache.
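Two quick checks before you benchmark anything, offered as a sketch: mpstat comes from the sysstat package, and worker_cpu_affinity is the Nginx-side way to pin workers once you trust the cores underneath you.
# Sample CPU counters for 5 seconds; the %steal column should sit at 0.00
mpstat 1 5 | tail -n 1
# Or read the "st" field straight from top in batch mode
top -bn1 | grep '%Cpu'
# nginx.conf (main context): pin each worker to its own core
worker_processes 4;
worker_cpu_affinity auto;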
2. Kernel Tuning: The `sysctl.conf` Essentials
Linux network defaults are deliberately conservative, tuned for general-purpose workloads rather than 50,000 concurrent connections. We need to widen the TCP pipe.
Open /etc/sysctl.conf. We are going to adjust the backlog queue and enable Fast Open. This is crucial for reducing the round-trip time (RTT) during the TCP handshake, especially for clients connecting from outside the Oslo region.
Key Kernel Directives
First, verify your current limit:
cat /proc/sys/net/core/somaxconn
If it says 128 or 1024, you are throttling your own success. Here is the production-grade configuration we deploy on high-performance nodes:
# /etc/sysctl.conf
# Maximize the backlog of incoming connections
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535
# Reuse sockets in TIME_WAIT state for new connections
# Critical for high-throughput API gateways talking to backends
net.ipv4.tcp_tw_reuse = 1
# Increase ephemeral port range to avoid port exhaustion
net.ipv4.ip_local_port_range = 1024 65535
# TCP Fast Open (TFO) reduces handshake RTT
net.ipv4.tcp_fastopen = 3
# Congestion control - BBR is generally superior for mixed WAN traffic
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr
Apply these changes immediately:
sysctl -p
Pro Tip: If you are hosting in Norway but serving users in Central Europe, tcp_fastopen lets returning clients send data in the SYN packet, saving a full round trip per connection. The longer the physical distance, the bigger the win.
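Before moving on, confirm the kernel actually took the settings. A quick sanity check (nstat ships with iproute2; the tcp_bbr module must be available, which it is on any mainstream 4.9+ kernel):
# Congestion control: bbr must be listed as available and be the active algorithm
sysctl net.ipv4.tcp_available_congestion_control
sysctl net.ipv4.tcp_congestion_control
sysctl net.core.default_qdisc
# Fast Open and the listen backlog ceiling
sysctl net.ipv4.tcp_fastopen
sysctl net.core.somaxconn
# TFO only pays off when clients negotiate it; the kernel keeps counters
nstat -az | grep -i fastopen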
3. Nginx: The Gateway Config That Matters
Most tutorials tell you to set worker_processes auto; and walk away. That is insufficient. For an API Gateway, we need to manage file descriptors and upstream keepalives aggressively.
Every connection to your gateway is a file descriptor. Every connection from your gateway to your microservice is another file descriptor. If you hit the limit, your users get 502s.
Raising the Limits
Check your system limits:
ulimit -n
If it's 1024, raise the ceiling at the OS level first: /etc/security/limits.conf covers login sessions, while a systemd-managed Nginx needs the limit set in its service unit. Then configure Nginx to actually use it.
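A minimal sketch of both, assuming Nginx runs as www-data under systemd (the nofile value should match worker_rlimit_nofile below):
# /etc/security/limits.conf  (applies to login sessions)
www-data  soft  nofile  65535
www-data  hard  nofile  65535
# For a systemd-managed Nginx, use a drop-in instead: systemctl edit nginx
# [Service]
# LimitNOFILE=65535
# ...then: systemctl daemon-reload && systemctl restart nginx
# Verify what the running master actually got (pgrep -o picks the oldest process, i.e. the master)
grep 'Max open files' /proc/$(pgrep -o nginx)/limits
With the OS ceiling raised, the Nginx side of the equation looks like this: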
# nginx.conf
user www-data;
worker_processes auto;
# The number of file descriptors per worker
worker_rlimit_nofile 65535;
events {
    # Essential for Linux high performance
    use epoll;
    # Allow a worker to accept all new connections at once
    multi_accept on;
    worker_connections 65535;
}
http {
    # ... logs and mime types ...
    # OPTIMIZATION: access_log off here disables logging globally to save disk I/O.
    # Re-enable it per location where you need an audit trail, or use buffered logging.
    access_log off;
    # sendfile copies data between descriptors inside the kernel, skipping userspace
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    # Keepalive timeout - balance between resource usage and connection setup cost
    keepalive_timeout 65;
    # Gzip settings for JSON payloads
    gzip on;
    gzip_min_length 1024;
    gzip_types application/json;
}
The Upstream Keepalive Trap
This is where 90% of setups fail. By default, Nginx acts as a reverse proxy that closes the connection to the backend service after every request. This forces a new TCP handshake for every API call internally. It burns CPU and adds latency.
You must configure the upstream block to keep connections open.
upstream backend_microservices {
    server 10.0.0.5:8080;
    server 10.0.0.6:8080;
    # KEEPALIVE: cache up to 64 idle connections to the backends per worker process
    keepalive 64;
}

server {
    listen 443 ssl http2;
    server_name api.coolvds-client.no;

    location / {
        proxy_pass http://backend_microservices;
        # Required for HTTP/1.1 keepalive to backends
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
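You can confirm the reuse is actually happening from the gateway itself. A rough check with ss, using the first backend from the upstream block above as a placeholder:
# Established connections to the backend should stay roughly flat under load (-H just drops the header line)
watch -n1 'ss -Htn state established dst 10.0.0.5:8080 | wc -l'
# A constantly growing TIME_WAIT count means connections are still being churned
ss -Htn state time-wait | wc -l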
4. TLS Termination: Speed vs. Security
TLS termination is expensive. Modern CPUs (like the ones backing CoolVDS instances) support AES-NI instructions, which accelerate AES-GCM in hardware. For mobile clients without AES hardware, however, ChaCha20-Poly1305 is often faster and less battery-intensive.
Test your OpenSSL speed:
openssl speed -evp aes-256-gcm
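For a fair picture of what mobile clients will see, run the same benchmark against the ChaCha20 AEAD as well (openssl speed has supported it since OpenSSL 1.1.0):
openssl speed -evp chacha20-poly1305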
Ensure you are using HTTP/3 (QUIC) if your gateway supports it (Nginx 1.25+). QUIC runs over UDP and eliminates the Head-of-Line blocking problem common in HTTP/2.
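As a sketch, the TLS side of the server block from section 3 can look like this on Nginx 1.25+ built with QUIC support, where the old listen 443 ssl http2; form gives way to a separate http2 directive (certificate directives omitted):
server {
    listen 443 ssl;
    listen 443 quic reuseport;    # HTTP/3 runs over UDP, alongside the TCP listener
    http2 on;

    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_session_cache shared:SSL:10m;   # resumed sessions skip the full handshake
    ssl_session_timeout 1h;
    ssl_prefer_server_ciphers off;      # let capable mobile clients pick ChaCha20-Poly1305

    # Advertise HTTP/3 so clients upgrade on their next request
    add_header Alt-Svc 'h3=":443"; ma=86400' always;
}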
5. Comparison: Nginx vs. Traefik vs. Kong
We see a lot of devs moving to Traefik for the "ease of use" with Docker. Be careful. Go's garbage collector (used in Traefik) can introduce latency spikes that C-based Nginx does not have. Here is the breakdown for 2024:
| Feature | Nginx (C) | Kong (LuaJIT/C) | Traefik (Go) |
|---|---|---|---|
| Throughput | Highest | High | Medium |
| Latency Consistency | Rock Solid | Very Good | Occasional GC Spikes |
| Config Complexity | High | Medium | Low |
| Ideal Use Case | Edge Ingress | API Management | Internal Service Mesh |
6. The Norwegian Context: NIX and GDPR
Latency is geography. Hosting your API Gateway in Frankfurt when your user base is in Bergen adds 20-30ms of pure physics to every request. By utilizing CoolVDS infrastructure located directly in Norway, you peer directly at NIX (Norwegian Internet Exchange).
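Measure it rather than trusting a map. From a client machine in your target region (the hostnames below are placeholders):
# Compare round-trip times to a Frankfurt-hosted endpoint and an Oslo-hosted one
ping -c 20 api-frankfurt.example.com
ping -c 20 api-oslo.example.com
# mtr adds the per-hop view, handy for spotting traffic that detours away from NIX
mtr -rw -c 50 api-oslo.example.com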
Furthermore, the Datatilsynet (Norwegian Data Protection Authority) is increasingly strict regarding Schrems II and data transfers to non-adequate countries. Running your gateway and termination logic on Norwegian soil isn't just a performance play; it's a compliance fortress. You retain full sovereignty over the data ingress.
7. Verification
Don't trust my word. Benchmarking is the only truth. Use wrk to hammer your endpoint.
wrk -t12 -c400 -d30s --latency https://your-api.no/endpoint
Look at the Stdev (standard deviation) and the percentile breakdown that --latency prints. If the deviation is high, or the 99th percentile sits an order of magnitude above the average, your gateway is choking on context switches. On a properly tuned CoolVDS instance, that spread should be negligible.
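To confirm whether context switches really are the problem, watch the workers from a second terminal while wrk runs (pidstat is part of sysstat; the pgrep pattern assumes standard Nginx process titles):
# System-wide view: the "cs" column is context switches per second
vmstat 1
# Per-worker voluntary (cswch/s) and involuntary (nvcswch/s) context switches
pidstat -w -p $(pgrep -d, -f 'nginx: worker') 1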
Conclusion
Performance isn't an accident. It is the result of deliberate architectural choices: selecting the right virtualization (KVM), tuning the Linux kernel for network throughput, and configuring your gateway to maintain persistent connections. The default settings are for safety; your settings should be for speed.
Don't let slow I/O kill your SEO or your user experience. Deploy a test instance on CoolVDS today, apply these sysctl configs, and watch your latency drop.