API Gateway Bottlenecks: A SysAdmin’s Guide to Surviving High Concurrency
It is 2018, and the monolith is dying. You have broken your application into twenty different microservices, deployed them via Docker, and placed an API Gateway in front to handle authentication, rate limiting, and routing. You feel modern. You feel agile.
Then you look at the latency graphs.
200ms overhead? For a simple loopback request? This is unacceptable. If your gateway adds latency, every single downstream service suffers. I have seen production environments in Oslo grind to a halt not because the application code was slow, but because the gateway was choking on context switches and I/O wait times. In this guide, we are going to fix that. We will tune the Linux kernel, optimize the NGINX core (which powers Kong and OpenResty), and discuss why the underlying hardware of your VPS provider is usually the silent killer of performance.
1. The OS Layer: Tuning the Kernel for Throughput
Most Linux distributions, including the Ubuntu 16.04 LTS images used by most providers, ship with conservative defaults intended for desktop usage or low-traffic web servers. When your API Gateway needs to handle 10,000 concurrent connections, these defaults fail hard.
The first limit you will hit is the file descriptor limit. In Linux, everything is a file, and every open socket consumes a descriptor. If your per-user limit is still the default 1024, your gateway will not so much crash as start rejecting connections with "Too many open files" errors the moment traffic ramps up.
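Before you touch the kernel, check what the gateway's own user is actually allowed to open, and raise it if needed. A minimal example, assuming the gateway runs as the nginx user (substitute your own service account):
# Show the current per-process file descriptor limit
ulimit -n
# /etc/security/limits.conf -- raise the limit for the gateway user
nginx soft nofile 100000
nginx hard nofile 100000
Note that on systemd-based distributions like Ubuntu 16.04, a service started by systemd ignores limits.conf; set LimitNOFILE=100000 in the unit file (or an override) instead.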
Edit /etc/sysctl.conf and apply these settings to widen the TCP highway:
# /etc/sysctl.conf
# Increase system-wide file descriptor limit
fs.file-max = 2097152
# Increase the backlog of incoming connections
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535
# Reuse sockets in TIME_WAIT state for new outbound connections (e.g. to upstreams)
net.ipv4.tcp_tw_reuse = 1
# Increase TCP buffer sizes for modern high-speed networks
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# Protect against SYN flood attacks while maintaining performance
net.ipv4.tcp_syncookies = 1
Apply them with sysctl -p. These settings ensure that when a burst of traffic hits your gateway, the kernel doesn't drop new connections simply because its backlog queues and socket buffers are full.
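Once the settings are live, verify them and keep an eye on the kernel's own drop counters; if the listen-queue counters keep climbing under load, something is still too small (the exact wording of the counters varies slightly between kernel versions):
# Confirm the new values took effect
sysctl net.core.somaxconn net.ipv4.tcp_tw_reuse fs.file-max
# Watch for listen queue overflows and dropped SYNs under load
netstat -s | grep -i -E "listen|syn"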
2. NGINX Configuration: Beyond the Defaults
Whether you are using raw NGINX, OpenResty, or Kong, the underlying engine is the same. The default configuration often wastes CPU cycles on context switching.
Here is a battle-tested configuration snippet for high-throughput gateways. These directives live at the top level of nginx.conf, in the main context:
worker_processes auto;
worker_rlimit_nofile 100000;
events {
    # epoll is the efficient event notification interface on Linux 2.6+
    use epoll;
    # Let a worker accept all pending connections at once
    multi_accept on;
    # Per-worker limit; the effective ceiling is worker_processes x worker_connections
    worker_connections 8192;
}
http {
    # Efficient file transmission; also required for tcp_nopush to take effect
    sendfile on;
    # Disable Nagle's algorithm so small packets are sent immediately
    tcp_nodelay on;
    # With sendfile, coalesce headers and the start of the body into full packets
    tcp_nopush on;
    # Keep client connections open to reduce handshake overhead
    keepalive_timeout 65;
    keepalive_requests 100000;
    # SSL optimization for 2018 standards
    ssl_protocols TLSv1.2;
    ssl_ciphers 'ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384';
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 10m;
}
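One more gateway-specific win: keep connections to your upstream services alive as well, so each proxied request does not pay a fresh TCP handshake to the backend. A minimal sketch, assuming your services sit behind an upstream block named backend_api (the name and addresses are placeholders for your own setup):
upstream backend_api {
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
    # Keep up to 64 idle connections per worker process to these backends
    keepalive 64;
}

server {
    location / {
        proxy_pass http://backend_api;
        # Upstream keepalive requires HTTP/1.1 and an empty Connection header
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}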
Pro Tip: If you are terminating SSL at the gateway (which you should be), the handshake is CPU intensive. Ensure your VPS has access to the AES-NI instruction set. At CoolVDS, we pass the host CPU flags directly to the KVM guest, so your OpenSSL library can offload encryption natively. Many budget providers mask these flags, forcing software emulation and spiking your CPU usage.
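You can verify the flag and get a rough feel for crypto throughput in two commands (exact numbers depend on your CPU and OpenSSL build):
# Does the guest CPU expose the AES-NI flag?
grep -m1 -o -w aes /proc/cpuinfo
# Rough AES benchmark through the EVP interface, which uses AES-NI when available
openssl speed -evp aes-256-cbc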
3. The Hardware Reality: Why HDD and SATA SSDs Are Not Enough
This is where most "cloud" setups fail. An API Gateway logs heavily. Access logs, error logs, and often request payloads for debugging. If you are using a database-backed gateway like Kong (running on Cassandra or Postgres), you are hammering the disk constantly.
In 2018, standard SATA SSDs top out at around 500-600 MB/s and, more importantly, sit behind AHCI's single command queue of 32 entries. When your API gets a spike, I/O wait climbs and your CPU sits idle, waiting for the disk to acknowledge a log write.
This is why we standardized on NVMe storage for all CoolVDS instances. NVMe connects directly to the PCIe bus, bypasses the SATA controller entirely, and supports tens of thousands of parallel command queues. The difference is not just theoretical.
Benchmark: SATA SSD vs NVMe
| Metric | Standard SATA SSD VPS | CoolVDS NVMe VPS |
|---|---|---|
| Random Write IOPS (4k) | ~8,000 | ~40,000+ |
| Write Latency | 2-5 ms | ~0.1 ms |
| Sequential Throughput | ~450 MB/s | ~2,500 MB/s |
If you are running a high-traffic API on SATA storage, you are driving a Ferrari with the handbrake on. The IOPS capability of NVMe ensures that logging and database transactions never block the request thread.
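Do not take the table's word for it; check what your current VPS actually exposes and run your own random-write test. fio is the standard tool here, and the parameters below are only a starting point, so expect your numbers to differ from the table above:
# NVMe devices show up as nvme0n1, nvme1n1, ...; SATA disks as sda, sdb, ...
lsblk -d -o NAME,ROTA,SIZE
# 4k random-write benchmark, 60 seconds, direct I/O
fio --name=randwrite-4k --ioengine=libaio --rw=randwrite --bs=4k \
    --iodepth=32 --direct=1 --size=1G --runtime=60 --time_based \
    --numjobs=1 --group_reporting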
4. The Norwegian Context: Latency and Legality
Latency is determined by physics. If your users are in Oslo or Bergen, hosting your API gateway in a massive data center in Frankfurt or Amsterdam adds 15-30ms of round-trip time (RTT) before the request is even processed. For real-time applications, that is an eternity.
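Measure it from a machine close to your users before you believe anyone's marketing, including ours. The hostnames below are placeholders for your own endpoints:
# Compare round-trip times to a gateway in Norway vs one on the continent
ping -c 10 api.example.no
ping -c 10 api-frankfurt.example.com
# mtr shows where the milliseconds are actually spent, hop by hop
mtr --report --report-cycles 10 api.example.no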
By placing your infrastructure directly in Norway, you slash that network latency to under 5ms. Furthermore, with the GDPR enforcement deadline of May 25, 2018 rapidly approaching, data sovereignty is critical. Hosting within Norway (a strong EEA member) simplifies your compliance posture regarding Datatilsynet requirements. You know exactly where your data physically resides.
Conclusion
Performance is an aggregate of marginal gains. You tune the kernel to stop dropping packets. You tune NGINX to keep connections alive. And you choose infrastructure that eliminates I/O bottlenecks. In the era of microservices, your API Gateway is the most critical component of your stack. Do not starve it of resources.
If you need to verify these configurations yourself, you need a sandbox that doesn't suffer from the "noisy neighbor" effect of cheap container hosting. Deploy a KVM-based, NVMe-powered instance on CoolVDS today. Test your sysctl settings, benchmark your throughput, and see what true raw performance looks like.