API Gateway Performance Tuning: Nginx, Lua, and the 10ms Goal

It’s 3:00 AM on a Tuesday. Your monitoring dashboard—probably Nagios or Zabbix—is screaming red. Your microservices are fine, your database load is nominal, yet your mobile app users are seeing spinning wheels. The culprit? Your API Gateway is choking on SSL handshakes and connection overhead. I have seen this scenario play out in data centers from Oslo to Frankfurt, and the root cause is almost always default configurations.

In 2017, we are moving away from monolithic architectures to microservices, which means one user request triggers ten internal API calls. If your gateway adds 50ms of latency per call, you have just destroyed the user experience. You need raw speed.

This guide ignores the fluff. We are going to tune the Linux kernel, optimize Nginx for high concurrency, and discuss why the underlying hardware (specifically NVMe storage and KVM virtualization) makes or breaks your benchmarks.

1. The OS Layer: Tuning the Kernel for Concurrency

Most Linux distributions ship with sysctl settings designed for general-purpose computing, not high-throughput packet switching. On a stock CentOS 7 or Ubuntu 16.04 install, a busy gateway hits file descriptor limits and ephemeral port exhaustion long before the hardware gives out: with only about 64k ports per source IP, outbound connections to each backend run dry fast.

To handle a high-load API gateway, you must let the kernel recycle TCP connections faster. Open your /etc/sysctl.conf and verify these settings. Do not blindly copy-paste; understand that we are allowing sockets stuck in TIME_WAIT to be reused, which frees up ephemeral ports.

# /etc/sysctl.conf

# Increase system-wide file descriptor limit
fs.file-max = 2097152

# Increase the read/write buffer sizes for TCP
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216

# Increase the size of the backlog queue
net.core.netdev_max_backlog = 5000
net.core.somaxconn = 65535

# Reuse connections in TIME_WAIT state (Essential for API Gateways)
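# Note: avoid the related tcp_tw_recycle; it breaks clients behind NAT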
net.ipv4.tcp_tw_reuse = 1

# Expand the local port range
net.ipv4.ip_local_port_range = 1024 65535

Apply these with sysctl -p and verify the values took effect, as shown below. Without tcp_tw_reuse, a spike in traffic will exhaust your available ephemeral ports, and Nginx starts throwing generic 502 errors while your backend services sit idle.
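
To sanity-check, reload the settings and read a couple of values back. The nofile limits below are assumptions; size them to your traffic and mirror the value in nginx.conf via worker_rlimit_nofile.

# Load the new settings and confirm they took effect
sysctl -p
sysctl net.ipv4.tcp_tw_reuse net.core.somaxconn

# Raise the per-process file descriptor limit for the nginx user
# (illustrative values; adjust to your workload)
cat >> /etc/security/limits.conf <<'EOF'
nginx  soft  nofile  1048576
nginx  hard  nofile  1048576
EOF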

2. Nginx: The Upstream Keepalive Trap

Nginx is the industry standard for API Gateways today, whether you are using the raw open-source version, OpenResty, or Kong (which is just Nginx + Lua). However, there is a specific configuration mistake I see in 90% of setups: failing to enable upstream keepalives.
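
An aside on the Lua part: OpenResty and Kong work by scripting the Nginx request phases in Lua. As a minimal sketch of what that buys you at the gateway (the X-Api-Key header check and the backend_api upstream are illustrative):

location /api/ {
    access_by_lua_block {
        -- Reject unauthenticated requests here, before they
        -- ever consume a backend connection
        if not ngx.var.http_x_api_key then
            return ngx.exit(ngx.HTTP_UNAUTHORIZED)
        end
    }
    proxy_pass http://backend_api;
}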

By default, Nginx opens a new connection to your backend service (Node.js, Go, PHP-FPM) for every single request, then closes it. This creates massive overhead. You want Nginx to keep a pool of open connections to the backend.

The Wrong Configuration

upstream backend_api {
    server 10.0.0.5:8080;
}

server {
    location /api/ {
        proxy_pass http://backend_api;
    }
}

The Optimized Configuration

upstream backend_api {
    server 10.0.0.5:8080;
    # Cache up to 64 idle connections per worker process
    keepalive 64;
}

server {
    location /api/ {
        proxy_pass http://backend_api;
        
        # REQUIRED: speak HTTP/1.1 and clear the Connection header
        # so the client's "close" is not forwarded to the backend
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}

If you miss the proxy_set_header Connection ""; directive, Nginx will forward the client's "close" header to the backend, rendering the keepalive useless. Getting it right eliminates per-request connection setup and can drop internal latency by 20-30ms per call.
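
You can confirm the pool is working from the gateway itself: under steady load, the set of established connections to the backend should sit at roughly your keepalive value instead of churning through new ports (10.0.0.5:8080 matches the upstream above):

# Count established connections to the backend (subtract one for the header row)
ss -tn state established '( dport = :8080 )' | wc -l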

3. SSL/TLS: The CPU Bottleneck

Encryption is computationally expensive. With the rise of Let's Encrypt and Google forcing HTTPS for SEO, your gateway is doing heavy lifting. In 2017, we have access to AES-NI instruction sets on modern CPUs, which accelerate encryption. However, you must configure Nginx to use efficient ciphers and session caching.

Avoid the legacy SHA-1 ciphers. Prioritize ECDHE (Elliptic Curve Diffie-Hellman). This is not just for security; it is faster on modern hardware.

ssl_session_cache shared:SSL:50m;
ssl_session_timeout 1d;
ssl_session_tickets off;

# Modern cipher suite for 2017 standards
ssl_protocols TLSv1.2;
ssl_ciphers 'ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256';
ssl_prefer_server_ciphers on;
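
Once that is live, verify the negotiation from a client machine; you should see TLSv1.2 and one of the ECDHE suites above (swap in your gateway's hostname):

# Confirm the protocol and negotiated cipher
openssl s_client -connect api.example.com:443 -tls1_2 < /dev/null 2>/dev/null | grep -E 'Protocol|Cipher'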

Pro Tip: If you are serving static assets alongside your API, enable sendfile and tcp_nopush. But for pure API JSON responses, disable tcp_nopush and enable tcp_nodelay to force packets out immediately without buffering.
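
In practice that split looks something like this (the paths are illustrative):

# Static assets: let the kernel batch full TCP frames
location /static/ {
    sendfile on;
    tcp_nopush on;
}

# API responses: push small JSON payloads out immediately
location /api/ {
    tcp_nodelay on;
    proxy_pass http://backend_api;
}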

4. The Hardware Reality: Why Virtualization Matters

You can tune your configs all day, but if your host machine is stealing CPU cycles, you will experience "jitter"—unpredictable latency spikes. This is common in budget VPS providers using OpenVZ or LXC, where resources are oversold.

For an API Gateway, I/O Wait is the enemy. Logging requests to disk (access logs, error logs) can block the Nginx worker process if the disk is slow. This is why rotating rust (HDD) is dead for high-performance hosting.
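
You can also soften the blow in software before touching hardware. Buffered access logs make Nginx flush to disk in chunks instead of on every request; the buffer and flush values here are starting points, not gospel:

# Write access logs in 64k chunks, flushed at least every 5 seconds
access_log /var/log/nginx/access.log combined buffer=64k flush=5s;

# Skip logging for health-check noise entirely (path is illustrative)
location /healthz {
    access_log off;
    return 200;
}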

At CoolVDS, we made a specific architectural choice to use KVM virtualization exclusively. Unlike containers, KVM provides stronger isolation: your kernel is your kernel, and the sysctl tuning from section 1 actually applies. Furthermore, we deploy strictly on NVMe storage. In our benchmarks, NVMe drives dramatically cut the time Nginx spends flushing access logs compared to SATA SSDs. When you are pushing 5,000 requests per second, that disk write speed defines your throughput ceiling.

5. Local Context: Latency and Compliance in Norway

For developers targeting the Nordic market, physics is a factor. Hosting your API Gateway in a US East data center adds ~90ms of round-trip latency to Oslo. Hosting in Frankfurt adds ~25ms. Hosting in Oslo reduces this to <5ms.
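
Measure this yourself rather than trusting a provider's latency map; curl's timing variables show exactly where the milliseconds go (the URL is a placeholder):

# Connection, TLS, and total time from the client's region
curl -o /dev/null -s -w "connect: %{time_connect}s  tls: %{time_appconnect}s  total: %{time_total}s\n" https://api.example.com/health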

Beyond physics, we have the looming GDPR regulations (enforceable next year, 2018). The Norwegian Data Protection Authority (Datatilsynet) is already signaling stricter enforcement on data sovereignty. Storing and processing logs containing IP addresses within Norway is the safest bet for long-term compliance, especially given the uncertainty regarding US data transfers after the invalidation of Safe Harbor.

Conclusion

Performance is a stack, not a switch. It starts with the hardware (NVMe, KVM), moves to the kernel (sysctl), and finishes with the application config (Nginx keepalives). Don't let default settings throttle your growth.

If you need a test environment that doesn't suffer from the "noisy neighbor" effect, spin up a KVM instance. You can deploy a high-performance VPS in Norway on CoolVDS in under a minute. Compare your ab (Apache Bench) results against your current provider—the numbers usually speak for themselves.
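
A reasonable baseline run looks like the following; the -k flag enables HTTP keepalive so you benchmark the gateway rather than handshake overhead (URL and numbers are placeholders):

# 10,000 requests at 100 concurrent, with keepalive
ab -n 10000 -c 100 -k https://api.example.com/v1/status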