Surviving the Request Storm: API Gateway Tuning
I still remember the silence. It wasn't the silence of a job well done; it was the silence of a log file that had stopped moving because the API Gateway had choked on 50,000 concurrent connections. The load balancer was up, the backend services were idle, but the gateway—the choke point—had exhausted its file descriptors. We were running a standard cloud instance with default settings. That day, we learned that "default" is just another word for "broken" when you hit scale.
If you are running microservices, your API Gateway (likely Nginx, Kong, or HAProxy) is the most critical piece of infrastructure you own. It handles SSL termination, routing, rate limiting, and authentication. Yet, most teams in Oslo deploy these gateways on shared, noisy-neighbor VPS environments with stock sysctl settings. They wonder why their latency spikes to 200ms during peak hours.
Let’s fix that. We are going to look at how to tune a Linux stack for an API Gateway in 2020, specifically targeting the Norwegian market where low latency is expected.
1. The OS Layer: It Starts with the Kernel
Before touching Nginx config, you must look at the kernel. Linux is conservative by default. It assumes you are running a desktop, not a high-throughput packet cannon.
The first limit you will hit is the file descriptor limit. Everything in Linux is a file. A socket is a file. If you have 10,000 incoming users and your gateway talks to 5 backend microservices, that connection count multiplies fast.
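Before you touch sysctl, check how much headroom you actually have. Here is a minimal check, assuming nginx is already running; raising the ceiling itself happens in /etc/security/limits.conf (or the systemd unit) plus the worker_rlimit_nofile directive in nginx.conf.
# Per-process file descriptor limit for your current shell
ulimit -n
# Effective limit of a running nginx process
grep "Max open files" /proc/$(pidof -s nginx)/limits
Remember that fs.file-max below only raises the system-wide ceiling; the per-process limit is a separate knob, and both need lifting.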
Essential Sysctl Tuning
Open /etc/sysctl.conf. If you rely on the default values here, you are kneecapping yourself. Pay particular attention to the backlog queues and the local port range.
# Increase system-wide file descriptor limit
fs.file-max = 2097152
# Increase the size of the receive queue.
# The default is often too small for high-burst API traffic.
net.core.netdev_max_backlog = 16384
net.core.somaxconn = 65535
# TCP Stack Tuning
# Allow sockets in TIME_WAIT state to be reused for new outbound connections
net.ipv4.tcp_tw_reuse = 1
# Note: tcp_tw_recycle breaks clients behind NAT (common on standard VPS setups)
# and was removed entirely in kernel 4.12. Do NOT enable it.
# Increase the ephemeral port range to allow more outbound connections to upstreams
net.ipv4.ip_local_port_range = 1024 65535
# Protect against SYN flood while allowing legitimate bursts
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.tcp_syncookies = 1
Apply these with sysctl -p. If you are on CoolVDS, our KVM virtualization ensures these kernel changes actually stick. On some container-based hosting (OpenVZ/LXC), the host kernel overrides your settings, rendering your tuning useless.
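Either way, read the values back afterwards to confirm they actually took effect:
# Should echo the values you just set, not the defaults
sysctl net.core.somaxconn net.ipv4.tcp_max_syn_backlog fs.file-max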
2. Nginx Configuration: The "Keepalive" Trap
Whether you use raw Nginx or a derivative like Kong, the mistake is almost always the same: failing to enable HTTP keepalives to the upstream backends.
By default, Nginx acts as a polite HTTP/1.0 client to your backend services. It opens a connection, sends the request, gets the response, and closes the connection. This involves a full TCP handshake (and potentially SSL handshake) for every single internal API call. This burns CPU and adds massive latency.
You need to keep those connections open.
The Correct Upstream Block
upstream backend_service {
    server 10.0.0.5:8080;
    server 10.0.0.6:8080;

    # The magic number: the maximum number of IDLE keepalive connections
    # preserved in the cache of each worker process.
    keepalive 64;
}

server {
    location /api/v1/ {
        proxy_pass http://backend_service;

        # Required for upstream keepalive to work
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # Buffer tuning
        proxy_buffers 16 16k;
        proxy_buffer_size 32k;
    }
}
Pro Tip: Monitor your "TIME_WAIT" sockets. If you see thousands of them on your gateway server, you aren't reusing connections properly. Use netstat -n | grep TIME_WAIT | wc -l to check. Ideally, this number stays low.
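On newer distributions where netstat is deprecated in favor of ss, the same check looks like this (the tail skips the header line ss prints):
# Count sockets currently stuck in TIME_WAIT
ss -tan state time-wait | tail -n +2 | wc -l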
3. SSL Termination: The CPU Eater
In 2020, running non-SSL APIs is negligence, even internally. However, encryption costs CPU cycles. Handshakes are expensive.
To optimize this, make sure you are using TLS 1.3, which trims the full handshake to a single round trip compared to the two that TLS 1.2 needs. You also want a CPU that actually exposes AES-NI instructions to your guest. We see many "budget" VPS providers over-provisioning CPUs so heavily that "steal time" eats your encryption performance.
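A quick sanity check before blaming your cipher list, assuming a Linux guest with OpenSSL installed:
# The "aes" flag must be present, otherwise AES runs in software
grep -m1 -o aes /proc/cpuinfo
# Rough single-core throughput for the GCM cipher your API traffic will use
openssl speed -evp aes-128-gcm
With the hardware confirmed, the Nginx side is straightforward: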
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384;
ssl_prefer_server_ciphers on;
# Cache SSL sessions to avoid re-handshaking
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 10m;
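Once deployed, verify from a client that you are really negotiating TLS 1.3. The hostname below is a placeholder; point it at your own gateway.
# Force a TLS 1.3 handshake and print the negotiated protocol and cipher
openssl s_client -connect api.example.com:443 -tls1_3 </dev/null 2>/dev/null | grep -E "Protocol|Cipher"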
4. The Norway Factor: Network Topology & Data Residency
Latency is governed by the speed of light. If your users are in Oslo or Bergen, hosting your API Gateway in Frankfurt or Amsterdam adds a mandatory 20-30ms round-trip tax to every packet. For a real-time bidding app or a high-frequency trading interface, that lag is unacceptable.
Furthermore, we must talk about compliance. With the Datatilsynet keeping a close watch on GDPR compliance, keeping your data handling within Norwegian jurisdiction is a strategic advantage. It simplifies your legal posture regarding data sovereignty.
Peering Matters
When selecting a host, ask if they peer at NIX (Norwegian Internet Exchange). If your traffic has to route through international transit just to reach a user on Telenor or Altibox networks inside Norway, you are losing performance.
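You can verify the routing yourself. A simple trace from your gateway toward a Norwegian eyeball network shows immediately whether packets detour through international transit. The target hostname below is a placeholder; use a host on the network your users actually sit on.
# Ten probes, summarized as a report; watch the hop count and where latency jumps
mtr --report --report-cycles 10 target.example.no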
| Metric | Standard Cloud (Frankfurt) | CoolVDS (Oslo) |
|---|---|---|
| Ping to Oslo User | 25ms - 35ms | < 2ms |
| Data Residency | Germany (EU) | Norway (Local) |
| Disk I/O (Random Read) | Often Throttled | Unthrottled NVMe |
5. Why Infrastructure Choice Dictates Performance
You can tune sysctl until your fingers bleed, but if the underlying hypervisor is stealing your CPU cycles, it won't matter. API Gateways are "bursty". They need instant CPU access to handle a flood of requests.
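Steal time is easy to measure. A quick snapshot, assuming a standard Linux guest with vmstat available:
# Watch the "st" column on the far right: a value consistently above a few
# percent means the hypervisor is handing your CPU time to a neighbor
vmstat 1 5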
This is where the distinction between container-based VPS and KVM-based VDS becomes critical. In a container (LXC/OpenVZ), you often share the kernel and the network stack buffers with 50 other tenants. If one neighbor gets DDoS'd, your gateway stutters.
At CoolVDS, we use KVM. When you buy 4 vCPUs, they are yours. We use high-performance NVMe storage because API Gateways write massive amounts of access logs. If your disk I/O blocks, Nginx blocks. If Nginx blocks, your application is down.
The Bottom Line: Don't let your infrastructure be the bottleneck. Configure your kernel, enable upstream keepalives, and host your gateway where your users are.
Need to verify your latency? Spin up a CoolVDS instance in our Oslo datacenter today. It deploys in 55 seconds, barely enough time to pour a coffee before you start benchmarking.