Your API Gateway is Choking (And It's Not the Code's Fault)
Let's cut the pleasantries. If you are running microservices in 2020, you aren't connecting service-to-service directly. You have a gatekeeper. Maybe it's Kong, maybe it's a raw NGINX reverse proxy, or perhaps Traefik. You look at your New Relic dashboards and see application response times in the green, yet the client-side latency is dragging.
I see this every week. A CTO calls me, swearing their Node.js or Go services are optimized to the bone, yet they are bleeding milliseconds. 9 times out of 10, the bottleneck isn't the application logic. It's the gateway configuration and the underlying Linux OS limits.
We recently audited a high-frequency trading bot hosted here in Norway. They were losing money because their API Gateway—handling 20,000 requests per second (RPS)—was effectively DDoS-ing itself. The connection overhead was eating 40% of their CPU cycles. Here is how we fixed it, and how you can tune your CoolVDS instance to handle similar loads.
1. The "File Descriptor" Lie
By default, most Linux distros ship with conservative limits. Ubuntu 20.04 is better than its predecessors, but it still isn't ready for high-concurrency routing out of the box. If your gateway hits the nofile limit, it doesn't crash; it quietly stops accepting new connections and fills the error log with "Too many open files" while clients stall.
Check your current limits:
ulimit -n
If that returns 1024, you are in trouble. For a gateway on a CoolVDS production node, we need to crank this up at the OS level.
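One caveat before you edit anything: ulimit -n only reports the limit for your current shell, not for the gateway that is already running. A quick way to see what the running process actually got (assuming the gateway is NGINX and pgrep is available):
# Show the open-file limit of the oldest (master) nginx process
grep "Max open files" /proc/$(pgrep -o nginx)/limits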
Edit /etc/security/limits.conf:
* soft nofile 65535
* hard nofile 65535
root soft nofile 65535
root hard nofile 65535
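One gotcha worth flagging: on Ubuntu 20.04, NGINX is normally started by systemd, and systemd services do not read limits.conf. If your limit refuses to budge after a restart, a drop-in override is the usual fix (a minimal sketch, assuming the stock nginx.service unit name):
# Raise the file descriptor limit for the nginx service itself
sudo mkdir -p /etc/systemd/system/nginx.service.d
printf '[Service]\nLimitNOFILE=65535\n' | sudo tee /etc/systemd/system/nginx.service.d/limits.conf
sudo systemctl daemon-reload
sudo systemctl restart nginx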
But that's just the user space. You need to tell NGINX (or your gateway of choice) to actually utilize them. Inside your nginx.conf, at the main context level:
worker_rlimit_nofile 65535;

events {
    worker_connections 16384;
    use epoll;
    multi_accept on;
}
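It's worth a back-of-envelope check here, because every proxied request holds roughly two file descriptors: one on the client side, one on the upstream side. A rough sketch, assuming four workers on a 4 vCPU instance:
# Rough ceiling on concurrent proxied requests for this configuration
WORKERS=4                     # e.g. worker_processes auto on a 4 vCPU plan
WORKER_CONNECTIONS=16384      # must match the events block above
echo "$(( WORKERS * WORKER_CONNECTIONS / 2 )) concurrent proxied requests, roughly"
If that number is lower than the concurrency you expect, raise worker_connections, not just the OS limits.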
2. Kernel Tuning for High TCP Turnover
API gateways are distinct from standard web servers because they hold two connections for every request: one from the client and one to the upstream service. The outbound half is what hurts: every new upstream connection burns an ephemeral port, and under heavy turnover those ports pile up in TIME_WAIT faster than the kernel releases them. Run out, and new upstream connections start failing outright.
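You can confirm whether this is already biting you before touching a single sysctl. Counting sockets parked in TIME_WAIT with ss (part of iproute2, which ships with Ubuntu 20.04) takes two seconds:
# Count sockets currently stuck in TIME_WAIT
ss -tan state time-wait | wc -l
Tens of thousands on a busy gateway means you are burning through the ephemeral port range.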
We need to modify the sysctl.conf to recycle connections faster. This is critical for keeping latency low between your CoolVDS instance and external APIs or internal microservices.
Add this to /etc/sysctl.conf:
# Allow reuse of sockets in TIME_WAIT state for new connections
net.ipv4.tcp_tw_reuse = 1
# Increase the range of ephemeral ports
net.ipv4.ip_local_port_range = 1024 65535
# Max number of packets in the receive queue
net.core.netdev_max_backlog = 16384
# Max SYN backlog (half-open connections waiting for the final ACK)
net.ipv4.tcp_max_syn_backlog = 8192
# Max connections in listen queue
net.core.somaxconn = 8192
Load it with sysctl -p. These settings are aggressive. They assume you have the CPU power to handle the interrupts. This is why we insist on KVM virtualization at CoolVDS—we don't oversell CPU cycles. If you try this on a cheap OpenVZ container where the host kernel is shared and overloaded, you might actually degrade performance.
Pro Tip: Do not enable tcp_tw_recycle. It was removed in Linux 4.12 and breaks connections for users behind NAT (like mobile carrier networks). Stick to tcp_tw_reuse.
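Once the settings are loaded, verify they actually stuck. These are standard Linux sysctl names, so the check is safe to run on any kernel:
# Confirm the new values are live
sysctl net.ipv4.tcp_tw_reuse net.core.somaxconn net.ipv4.ip_local_port_range
# On kernels 4.12 and newer this should fail with "No such file or directory",
# which is exactly what you want:
sysctl net.ipv4.tcp_tw_recycle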
3. The Upstream Keepalive Mistake
This is the most common configuration error I see in NGINX. By default, NGINX acts as a polite HTTP/1.0 client to your backend services: it opens a connection, sends the request, gets the response, and closes the connection.
If you are routing to a local Node.js service or a database API, the TCP handshake (plus a TLS handshake, if the upstream is encrypted) on every single request adds real latency for zero benefit. You need to keep the connection open.
upstream backend_api {
    server 10.0.0.5:8080;

    # Keep 64 idle connections open to this upstream
    keepalive 64;
}

server {
    location /api/ {
        proxy_pass http://backend_api;

        # Required for keepalive to work
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
Without that empty Connection header, NGINX sends Connection: close to the upstream on every request, which kills the keepalive pool you just configured.
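To check that reuse is really happening, watch the established connections from the gateway to the upstream under load (10.0.0.5:8080 is the example backend from the block above):
# Count live connections from the gateway to the upstream
watch -n1 'ss -tan dst 10.0.0.5:8080 | grep -c ESTAB'
Under steady traffic that number should hover near the keepalive pool size instead of churning through a new ephemeral port for every request.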
4. I/O Latency: The Silent Killer
An API Gateway logs everything. Access logs, error logs, audit trails. If you are pushing 5,000 requests per second, you are writing to disk 5,000 times per second. If your disk I/O blocks, the NGINX worker process blocks. If the worker blocks, no requests get processed.
This is where hardware architecture becomes political. Many "budget" VPS providers in Europe still run on SSDs that are network-attached (Ceph or similar) with high latency, or worse, shared spindles.
For a gateway, you must buffer your logs to avoid disk locking. Modify your access log directive:
access_log /var/log/nginx/access.log main buffer=32k flush=5s;
This tells NGINX: "Wait until the buffer holds 32 KB of log data or 5 seconds have passed before physically writing to the disk."
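If you want to see whether log flushes, or anything else, are stalling on the disk, iostat from the sysstat package gives a quick read:
# Watch per-device write latency for five one-second samples
iostat -x 1 5
Keep an eye on the w_await column for the volume holding your logs; write latency that spikes into double-digit milliseconds will show up directly in your gateway's p99.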
The Hardware Reality
Even with buffering, eventually, you have to write to the disk. On CoolVDS, we use local NVMe storage. The I/O throughput is massive compared to standard SSDs. When your log buffer flushes, it happens instantly. On a legacy hosting platform, that flush operation could stall your API for 50-100ms. In the world of high-frequency trading or real-time bidding, that is an eternity.
5. Local Nuances: Norway and GDPR
Why host this gateway in Norway? Beyond the obvious legal benefits of keeping data within the EEA (especially with the uncertainty surrounding Privacy Shield right now), there is the physics of the network.
If your users are in Oslo, Bergen, or Trondheim, routing traffic through Frankfurt adds 20-30ms of round-trip time. Routing through a US cloud provider adds 100ms+. By placing your API Gateway on a CoolVDS instance in Oslo, you are peering directly at NIX (Norwegian Internet Exchange).
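Don't take those numbers on faith; measure the path from wherever your users actually sit (the hostname below is a placeholder):
# Baseline round-trip time to a candidate gateway location
ping -c 20 your-gateway.example.com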
Low latency isn't just about speed; it's about SEO (Core Web Vitals are becoming a ranking factor) and user retention. A fast gateway makes your entire architecture feel snappy, even if the backend is heavy.
Summary
Optimizing an API Gateway is an exercise in removing hurdles. You remove the file descriptor limits, you remove the TCP handshake overhead, and you remove the disk I/O blocking.
- Kernel: Tune sysctl for high concurrency.
- NGINX: Enable keepalive to upstreams.
- Storage: Buffer logs and demand NVMe.
Don't let a default configuration file dictate your performance ceiling. Deploy a high-performance instance on CoolVDS today, SSH in, and apply these configs. You'll see the difference in your p99 latency metrics immediately.