API Gateway Performance Tuning: Squeezing Milliseconds Out of Nginx on Linux
Let’s be honest: default Linux installs are tuned for general-purpose workloads, not for handling 10,000 concurrent API requests. If you are deploying a high-traffic API gateway on a vanilla Ubuntu 16.04 or CentOS 7 box without touching sysctl.conf, you are leaving performance on the table. In the mobile-first world of 2016, where 3G networks in rural Norway already add latency you cannot control, your infrastructure shouldn't be the bottleneck.
I recently audited a setup for a client in Oslo trying to scale a microservices architecture. They were throwing more RAM at the problem, but their latency kept spiking. The culprit wasn't their Go application code; it was the TCP stack and a misconfigured reverse proxy. Here is how we fixed it, and how you can tune your API gateway to handle the load without melting your servers.
1. The Foundation: Kernel Tuning
Before we even look at the application layer, we must look at the kernel. When your API gateway acts as a reverse proxy, it juggles a massive number of sockets: one facing the client and another facing the upstream backend for every in-flight request. Linux, by default, is conservative about how many files can be open and how quickly it recycles TCP connections.
Edit your /etc/sysctl.conf. These settings are aggressive but necessary for a high-throughput gateway.
# /etc/sysctl.conf
# Increase system-wide file descriptors
fs.file-max = 2097152
# Allow more connections to queue up
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535
# Increase the range of ephemeral ports for upstream connections
net.ipv4.ip_local_port_range = 1024 65535
# Reuse sockets in TIME_WAIT state for new outgoing connections
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 15
# Increase TCP buffer sizes for modern high-speed networks
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
Apply these with sysctl -p. The tcp_tw_reuse flag is particularly critical for API gateways: it only affects outgoing connections, which is exactly the traffic a reverse proxy generates towards its upstreams. Without it, the gateway will exhaust its ephemeral ports under load, because each closed upstream socket sits in TIME_WAIT for a full 60 seconds. In a high-velocity environment, that is a death sentence.
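To confirm the new values are live and to watch the TIME_WAIT count fall, the ss tool from iproute2 (preinstalled on both Ubuntu 16.04 and CentOS 7) is enough. A minimal check looks like this:
# Verify the reloaded values
sysctl net.ipv4.tcp_tw_reuse net.core.somaxconn
# Socket summary; watch the "timewait" figure before and after the change
ss -s
# Count sockets currently in TIME_WAIT (the first line of output is a header)
ss -tan state time-wait | wc -l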
2. Nginx: The Gatekeeper
Whether you are using raw Nginx, OpenResty, or Kong, the underlying engine is the same. The most common mistake I see is neglecting upstream keepalives.
By default, Nginx acts as a polite HTTP/1.0 client to your backend services. It opens a connection, sends the request, gets the response, and closes the connection. This means for every single API call, you are paying the price of a full TCP handshake (SYN, SYN-ACK, ACK) between the gateway and your microservice. If you are using SSL internally, you are also doing the TLS handshake again.
Enable Keepalives
You need to tell Nginx to keep that connection open.
upstream backend_service {
    server 10.0.0.10:8080;
    server 10.0.0.11:8080;
    # Keep up to 64 idle connections open per worker process
    keepalive 64;
}
server {
    location /api/ {
        proxy_pass http://backend_service;
        # Keepalive requires HTTP/1.1 and an empty Connection header
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
This simple change can reduce internal latency by 20-50ms per request.
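A quick sanity check that the keepalive pool is actually in use: on the gateway, connections to an upstream (10.0.0.10:8080 from the example above) should settle into a stable set of ESTABLISHED sockets instead of constantly churning through TIME_WAIT. Something along these lines works:
# Idle keepalive connections show up as long-lived ESTABLISHED sockets
watch -n1 'ss -tan dst 10.0.0.10:8080 | grep -c ESTAB'
# With keepalives working, TIME_WAIT towards the upstream should trend to zero
ss -tan state time-wait dst 10.0.0.10:8080 | wc -l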
3. SSL/TLS: HTTP/2 is Here
With Let's Encrypt leaving beta earlier this year, there is no excuse for unencrypted traffic. TLS does have a CPU cost, though. To offset it, enable HTTP/2 (supported in Nginx since 1.9.5). HTTP/2 multiplexes many requests over a single TCP connection, eliminating the HTTP-level head-of-line blocking of HTTP/1.1 and reducing the number of TLS handshakes each client needs.
Ensure you are using OpenSSL 1.0.2 or later to support ALPN, which is required for HTTP/2 in Chrome.
server {
    listen 443 ssl http2;
    server_name api.yourdomain.no;
    ssl_certificate /etc/letsencrypt/live/api.yourdomain.no/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/api.yourdomain.no/privkey.pem;
    # Optimize the cache
    ssl_session_cache shared:SSL:50m;
    ssl_session_timeout 1d;
    ssl_session_tickets off;
    # Modern Cipher Suite (2016 Standard)
    ssl_protocols TLSv1.2;
    ssl_ciphers 'ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256';
    ssl_prefer_server_ciphers on;
}
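Before rolling this out, confirm that your binaries are new enough and that ALPN is actually negotiated. A rough check (api.yourdomain.no is the placeholder host from the config above):
# Nginx must be 1.9.5+ and built against OpenSSL 1.0.2+ for ALPN
nginx -V 2>&1 | grep -o 'OpenSSL [0-9][^ ]*'
openssl version
# The handshake should report "ALPN protocol: h2"
echo | openssl s_client -connect api.yourdomain.no:443 -alpn h2 2>/dev/null | grep ALPN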
4. The Hardware Reality: Why "Cloud" Often Fails
You can tune your kernel and your Nginx config to perfection, but if your underlying hypervisor is stealing your CPU cycles, it won't matter. This is the "Noisy Neighbor" effect.
Pro Tip: Check your "Steal Time" inside your VM using top. If %st is consistently above 0.5%, your hosting provider is overselling their physical CPU cores. Move immediately.
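If you would rather log steal time over an interval than eyeball top, vmstat and mpstat (the latter from the sysstat package) both expose it. A rough one-minute sample:
# "st" is the last column in vmstat output; "%steal" in mpstat
vmstat 5 12
# Per-core view; install sysstat first (apt-get install sysstat / yum install sysstat)
mpstat -P ALL 5 12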
In traditional shared hosting or cheap OpenVZ containers, resources are not guaranteed. For an API gateway, inconsistent I/O is fatal: writing access logs and buffering large payloads to disk when in-memory buffers overflow both demand fast, predictable storage.
This is where CoolVDS differs fundamentally from budget providers. We utilize KVM virtualization to ensure strict resource isolation. When you buy 4 vCPUs on CoolVDS, those cycles are yours. Furthermore, we have fully transitioned to NVMe storage in our Oslo data centers. Standard SSDs are fast, but NVMe connects directly to the PCIe bus, drastically reducing the latency between your application asking for data and the disk providing it.
| Feature | Budget VPS (OpenVZ) | CoolVDS (KVM + NVMe) |
|---|---|---|
| IOPS | ~5,000 (Shared) | ~20,000+ (Dedicated) |
| Kernel Access | Restricted | Full (Load custom modules) |
| Latency Consistency | High Jitter | Stable |
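The IOPS figures above are ballpark numbers; the only benchmark that matters is your own disk. A quick 4K random-read test with fio (the file path and sizes here are arbitrary; install the fio package first) shows where you actually stand:
# 60 seconds of 4K random reads with direct I/O, queue depth 32
fio --name=randread --filename=/var/tmp/fio.test --size=1G \
    --rw=randread --bs=4k --direct=1 --ioengine=libaio \
    --iodepth=32 --runtime=60 --time_based --group_reporting
# Clean up the test file afterwards
rm /var/tmp/fio.test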
5. Local Context: Norway and Data Sovereignty
With the adoption of the GDPR this past April, the regulatory landscape is shifting. While enforcement doesn't start until 2018, forward-thinking CTOs are already moving data back to Europe. Datatilsynet (The Norwegian Data Protection Authority) is becoming increasingly strict about where user data is processed.
Hosting your API Gateway outside of Norway adds unnecessary network latency: ping times from Oslo to Frankfurt usually sit around 25-30 ms. That doesn't sound like much, but in a microservices architecture where one user click triggers ten internal API calls, that latency compounds. By hosting on CoolVDS infrastructure within Norway, you benefit from peering at NIX (Norwegian Internet Exchange), dropping that latency to single digits for local users while ensuring data stays within Norwegian jurisdiction.
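Measuring this yourself takes a minute: run mtr (or plain ping) from a machine in Oslo against your current gateway and against a NIX-peered box, then compare the averages. The hostname below is the placeholder from earlier:
# 20-cycle report: average RTT and per-hop loss
mtr --report --report-cycles 20 api.yourdomain.no
ping -c 20 api.yourdomain.no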
Summary
Performance isn't magic. It's a combination of efficient configuration, modern protocols like HTTP/2, and honest hardware. Don't let a default configuration file be the reason your app feels sluggish.
Ready to test real performance? Spin up a CoolVDS KVM instance with NVMe storage today. SSH in, run these sysctl tweaks, and watch your %st stay at zero.