API Gateway Bottlenecks: A SysAdmin’s Guide to Surviving High Concurrency
It is 2018, and the monolith is dying. You have broken your application into twenty different microservices, deployed them via Docker, and placed an API Gateway in front to handle authentication, rate limiting, and routing. You feel modern. You feel agile.
Then you look at the latency graphs.
200ms overhead? For a simple loopback request? This is unacceptable. If your gateway adds latency, every single downstream service suffers. I have seen production environments in Oslo grind to a halt not because the application code was slow, but because the gateway was choking on context switches and I/O wait times. In this guide, we are going to fix that. We will tune the Linux kernel, optimize the NGINX core (which powers Kong and OpenResty), and discuss why the underlying hardware of your VPS provider is usually the silent killer of performance.
1. The OS Layer: Tuning the Kernel for Throughput
Most Linux distributions, including the Ubuntu 16.04 LTS images used by most providers, ship with conservative defaults intended for desktop usage or low-traffic web servers. When your API Gateway needs to handle 10,000 concurrent connections, these defaults fail hard.
The first limit you will hit is the file descriptor limit. In Linux, everything is a file, and every open socket consumes a descriptor. If your per-user limit is still the default 1024, your gateway will not so much crash as start rejecting connections with "Too many open files" errors the moment traffic ramps up.
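Before you touch the kernel, check what the gateway's own user is actually allowed to open, and raise it if needed. A minimal example, assuming the gateway runs as the nginx user (substitute your own service account):
# Show the current per-process file descriptor limit
ulimit -n
# /etc/security/limits.conf -- raise the limit for the gateway user
nginx soft nofile 100000
nginx hard nofile 100000
Note that on systemd-based distributions like Ubuntu 16.04, a service started by systemd ignores limits.conf; set LimitNOFILE=100000 in the unit file (or an override) instead.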
Edit /etc/sysctl.conf and apply these settings to widen the TCP highway:
# /etc/sysctl.conf
# Increase system-wide file descriptor limit
fs.file-max = 2097152
# Increase the backlog of incoming connections
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535
# Reuse sockets in TIME_WAIT state for new outbound connections (e.g. to upstreams)
net.ipv4.tcp_tw_reuse = 1
# Increase TCP buffer sizes for modern high-speed networks
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# Protect against SYN flood attacks while maintaining performance
net.ipv4.tcp_syncookies = 1
Apply them with sysctl -p. These settings ensure that when a burst of traffic hits your gateway, the kernel doesn't drop new connections simply because its backlog queues and socket buffers are full.
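Once the settings are live, verify them and keep an eye on the kernel's own drop counters; if the listen-queue counters keep climbing under load, something is still too small (the exact wording of the counters varies slightly between kernel versions):
# Confirm the new values took effect
sysctl net.core.somaxconn net.ipv4.tcp_tw_reuse fs.file-max
# Watch for listen queue overflows and dropped SYNs under load
netstat -s | grep -i -E "listen|syn"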
2. NGINX Configuration: Beyond the Defaults
Whether you are using raw NGINX, OpenResty, or Kong, the underlying engine is the same. The default configuration often wastes CPU cycles on context switching.
Here is a battle-tested configuration snippet for high-throughput gateways. These directives live at the top level of nginx.conf, in the main context:
worker_processes auto;
worker_rlimit_nofile 100000;
events {
    # epoll is the efficient event notification interface on Linux 2.6+
    use epoll;
    # Let a worker accept all pending connections at once
    multi_accept on;
    # Per-worker limit; the effective ceiling is worker_processes x worker_connections
    worker_connections 8192;
}
http {
    # Efficient file transmission; also required for tcp_nopush to take effect
    sendfile on;
    # Disable Nagle's algorithm so small packets are sent immediately
    tcp_nodelay on;
    # With sendfile, coalesce headers and the start of the body into full packets
    tcp_nopush on;
    # Keep client connections open to reduce handshake overhead
    keepalive_timeout 65;
    keepalive_requests 100000;
    # SSL optimization for 2018 standards
    ssl_protocols TLSv1.2;
    ssl_ciphers 'ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384';
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 10m;
}
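One more gateway-specific win: keep connections to your upstream services alive as well, so each proxied request does not pay a fresh TCP handshake to the backend. A minimal sketch, assuming your services sit behind an upstream block named backend_api (the name and addresses are placeholders for your own setup):
upstream backend_api {
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
    # Keep up to 64 idle connections per worker process to these backends
    keepalive 64;
}

server {
    location / {
        proxy_pass http://backend_api;
        # Upstream keepalive requires HTTP/1.1 and an empty Connection header
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}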
Pro Tip: If you are terminating SSL at the gateway (which you should be), the handshake is CPU intensive. Ensure your VPS has access to the AES-NI instruction set. At CoolVDS, we pass the host CPU flags directly to the KVM guest, so your OpenSSL library can offload encryption natively. Many budget providers mask these flags, forcing software emulation and spiking your CPU usage.
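You can verify the flag and get a rough feel for crypto throughput in two commands (exact numbers depend on your CPU and OpenSSL build):
# Does the guest CPU expose the AES-NI flag?
grep -m1 -o -w aes /proc/cpuinfo
# Rough AES benchmark through the EVP interface, which uses AES-NI when available
openssl speed -evp aes-256-cbc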
3. The Hardware Reality: Why HDD and SATA SSDs Are Not Enough
This is where most "cloud" setups fail. An API Gateway logs heavily. Access logs, error logs, and often request payloads for debugging. If you are using a database-backed gateway like Kong (running on Cassandra or Postgres), you are hammering the disk constantly.
In 2018, standard SATA SSDs top out at around 500-600 MB/s and, more importantly, sit behind AHCI's single command queue of 32 entries. When your API gets a spike, I/O wait climbs and your CPU sits idle, waiting for the disk to acknowledge a log write.
This is why we standardized on NVMe storage for all CoolVDS instances. NVMe connects directly to the PCIe bus, bypasses the SATA controller entirely, and supports tens of thousands of parallel command queues. The difference is not just theoretical.
Benchmark: SATA SSD vs NVMe
| Metric | Standard SATA SSD VPS | CoolVDS NVMe VPS |
|---|---|---|
| Random Write IOPS (4k) | ~8,000 | ~40,000+ |
| Write Latency | 2-5 ms | ~0.1 ms |
| Sequential Throughput | ~450 MB/s | ~2,500 MB/s |
If you are running a high-traffic API on SATA storage, you are driving a Ferrari with the handbrake on. The IOPS capability of NVMe ensures that logging and database transactions never block the request thread.
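Do not take the table's word for it; check what your current VPS actually exposes and run your own random-write test. fio is the standard tool here, and the parameters below are only a starting point, so expect your numbers to differ from the table above:
# NVMe devices show up as nvme0n1, nvme1n1, ...; SATA disks as sda, sdb, ...
lsblk -d -o NAME,ROTA,SIZE
# 4k random-write benchmark, 60 seconds, direct I/O
fio --name=randwrite-4k --ioengine=libaio --rw=randwrite --bs=4k \
    --iodepth=32 --direct=1 --size=1G --runtime=60 --time_based \
    --numjobs=1 --group_reporting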
4. The Norwegian Context: Latency and Legality
Latency is determined by physics. If your users are in Oslo or Bergen, hosting your API gateway in a massive data center in Frankfurt or Amsterdam adds 15-30ms of round-trip time (RTT) before the request is even processed. For real-time applications, that is an eternity.
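Measure it from a machine close to your users before you believe anyone's marketing, including ours. The hostnames below are placeholders for your own endpoints:
# Compare round-trip times to a gateway in Norway vs one on the continent
ping -c 10 api.example.no
ping -c 10 api-frankfurt.example.com
# mtr shows where the milliseconds are actually spent, hop by hop
mtr --report --report-cycles 10 api.example.no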
By placing your infrastructure directly in Norway, you slash that network latency to under 5ms. Furthermore, with the GDPR enforcement deadline of May 25, 2018 rapidly approaching, data sovereignty is critical. Hosting within Norway (a strong EEA member) simplifies your compliance posture regarding Datatilsynet requirements. You know exactly where your data physically resides.
Conclusion
Performance is an aggregate of marginal gains. You tune the kernel to stop dropping packets. You tune NGINX to keep connections alive. And you choose infrastructure that eliminates I/O bottlenecks. In the era of microservices, your API Gateway is the most critical component of your stack. Do not starve it of resources.
If you need to verify these configurations yourself, you need a sandbox that doesn't suffer from the "noisy neighbor" effect of cheap container hosting. Deploy a KVM-based, NVMe-powered instance on CoolVDS today. Test your sysctl settings, benchmark your throughput, and see what true raw performance looks like.