Squeezing Milliseconds: Advanced Nginx API Gateway Tuning on Linux

Let’s be honest: your API is probably slow. Not "load time 10 seconds" slow, but "add 200ms of unnecessary handshake latency" slow. In the world of mobile apps and microservices, that latency compounds. If your frontend makes five calls to populate a dashboard, you just lost a second of user attention.

I’ve spent the last week debugging a high-traffic payment gateway for a client in Oslo. The code was optimized, the database queries were indexed, yet the Time to First Byte (TTFB) was consistently mediocre. The culprit? Default Linux kernel settings and an unoptimized Nginx configuration. We are going to fix that today.

The Hardware Reality Check

Before we touch a single config file, we need to address the elephant in the rack. Software cannot fix physics.

In 2017, if you are hosting your API gateway on standard spinning HDDs or even cheap SATA SSDs shared with fifty other noisy neighbors, your tuning efforts are futile. API Gateways are I/O intensive—logging requests, SSL handshakes, and buffering payloads all hit the disk. I have seen iowait spike simply because another tenant on the same physical host decided to run a backup.

This is why we architect CoolVDS on KVM with NVMe storage. The difference isn't subtle.

| Metric | Standard SATA SSD | CoolVDS NVMe |
|---|---|---|
| IOPS (random read) | ~80,000 | ~400,000+ |
| Latency | ~150 µs | ~20 µs |
| Throughput | 550 MB/s | 3,000 MB/s |

If your infrastructure is ready, let's open the terminal.

1. Kernel Level Tuning (sysctl.conf)

Linux is tuned for general-purpose computing by default, not for handling 10,000 concurrent API connections. We need to modify the TCP stack to handle ephemeral ports faster and allow more open files.

Edit your /etc/sysctl.conf. These settings are aggressive but necessary for high-throughput gateways.

# Increase system-wide file descriptor limit
fs.file-max = 2097152

# Allow more connections to be handled
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535

# Reuse sockets in TIME_WAIT state for new connections
# (Critical for API gateways making upstream calls)
net.ipv4.tcp_tw_reuse = 1

# Increase TCP buffer sizes for 10Gbps+ networks (common in Nordic datacenters)
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216

# Protection against SYN flood attacks
net.ipv4.tcp_syncookies = 1

Apply these changes with sysctl -p. Without tcp_tw_reuse, your API gateway will run out of ephemeral ports during traffic spikes, resulting in connection timeouts even if your CPU is idle.
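One related knob worth checking while you are in sysctl.conf (not in the list above, so treat it as a suggested addition): the ephemeral port range itself. Widening it gives outbound upstream connections more headroom before TIME_WAIT pressure builds in the first place.

```ini
# Default on many distros is 32768-60999; widen it so the gateway
# has more client-side ports available for upstream connections.
net.ipv4.ip_local_port_range = 1024 65535
```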

2. Nginx: Beyond the Defaults

Most tutorials tell you to set worker_processes auto; and walk away. That is insufficient. For an API Gateway, we are often using Nginx as a reverse proxy (perhaps passing to Node.js, Go, or PHP-FPM).

Worker Rlimit & Connections

You must explicitly raise the file descriptor limit for the Nginx process. If your OS allows 100k open files but each Nginx worker is capped at the default 1024, you will bottleneck there regardless of your kernel tuning.

worker_processes auto;
worker_rlimit_nofile 100000;

events {
    worker_connections 4096;
    use epoll;
    multi_accept on;
}
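To sanity-check those numbers, remember that a reverse proxy holds two sockets per in-flight request: one to the client and one to the upstream. The theoretical ceiling is therefore roughly worker_processes × worker_connections ÷ 2. A back-of-envelope sketch (the 4-worker count is an assumption for illustration; `worker_processes auto` gives you one worker per core):

```shell
# Rough concurrency ceiling for a reverse proxy:
# each proxied request holds a client socket AND an upstream socket.
workers=4            # assumed; worker_processes auto = number of CPU cores
connections=4096     # matches worker_connections above
echo "~$((workers * connections / 2)) concurrent proxied requests"
# prints: ~8192 concurrent proxied requests
```

If that ceiling is lower than your expected peak concurrency, raise worker_connections (and make sure worker_rlimit_nofile stays comfortably above workers × connections).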

Keepalive to Upstream

This is the most common mistake I see. Nginx keeps the client connection alive with HTTP/1.1 Keep-Alive, but by default it speaks HTTP/1.0 to the backend (upstream) and tears the connection down after every request. That adds a full TCP handshake to every single API call.

Configure an upstream block with keepalive:

upstream backend_api {
    server 10.0.0.5:8080;
    # Keep 64 idle connections open to the backend
    keepalive 64;
}

server {
    location /api/ {
        proxy_pass http://backend_api;
        
        # Required for keepalive to work
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
Pro Tip: If your servers are located in Norway (like our CoolVDS Oslo zone), latency to local ISPs like Telenor or Altibox is extremely low (often <5 ms). But if your upstream services sit in Frankfurt or Amsterdam, that keepalive setting becomes critical: without it you pay a fresh handshake's round trip on every call.

3. SSL/TLS Optimization (Because HTTP/2 is here)

It is 2017. If you aren't using HTTP/2, you are falling behind. HTTP/2 multiplexing allows multiple requests over a single TCP connection, which drastically improves performance for mobile clients on high-latency 4G networks.

Ensure you are running OpenSSL 1.0.2 or later (standard on Ubuntu 16.04) so you get ALPN, which browsers now require for HTTP/2. Here is a modern, high-performance SSL block (pair it with an HSTS header if you are chasing an A+ on SSL Labs):

listen 443 ssl http2;

ssl_protocols TLSv1.2;
ssl_ciphers 'ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256';
ssl_prefer_server_ciphers on;

# Optimize SSL Session Caching
ssl_session_cache shared:SSL:50m;
ssl_session_timeout 1d;
ssl_session_tickets on;

4. Logging: The Silent Killer

Writing logs to disk is expensive. For a high-throughput API, standard access logs can consume significant I/O.

Option A: Buffer the logs.
This tells Nginx to hold log lines in memory and write to disk only when the buffer fills or the flush interval expires. You risk losing up to one buffer of logs (at most the 5-minute flush window below) during a crash, but the I/O savings are massive.

access_log /var/log/nginx/access.log combined buffer=64k flush=5m;

Option B: Disable access logs for assets.
Do you really need to log every request for `favicon.ico` or robot probes?
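A sketch of Option B, assuming the stock favicon and robots.txt paths (adjust to your own URI layout):

```nginx
# Skip logging for high-noise, low-value requests
location = /favicon.ico {
    access_log off;
    log_not_found off;
}
location = /robots.txt {
    access_log off;
    log_not_found off;
}
```

The log_not_found directive also keeps 404s for these paths out of the error log, which bots otherwise fill quickly.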

Data Sovereignty and Compliance

We are seeing stricter regulations regarding data storage. With the looming GDPR enforcement next year, knowing exactly where your logs and data reside is paramount. Using a US-based cloud provider introduces complexity regarding the Privacy Shield framework.

By hosting on CoolVDS in Norway, your data remains under Norwegian jurisdiction and the European Economic Area (EEA) rules, simplifying compliance for local enterprises. Plus, you get the benefit of routing traffic through NIX (Norwegian Internet Exchange) for local users, keeping latency ridiculously low.

Conclusion

Performance tuning is a game of inches. You optimize the kernel to gain connection capacity. You tune Nginx to reduce handshake overhead. You implement HTTP/2 to multiplex requests.

But ultimately, your code needs to run on hardware that respects your engineering efforts. Don't let IOPS bottlenecks strangle your API gateway.

Ready to test these configs? Deploy a high-performance NVMe instance on CoolVDS in under 55 seconds and see the latency drop for yourself.