API Gateway Performance Tuning: Breaking the 100ms Barrier
There is a specific kind of silence that fills a room when a lead developer realizes their API Gateway is the bottleneck. It’s not the database. It’s not the application logic. It’s the door frame itself. In the high-stakes environment of Nordic tech, where users expect instantaneous interaction whether they are in Oslo or Tromsø, a slow handshake is a death sentence for your application.
Most default VPS configurations are woefully inadequate for high-concurrency API traffic. They are tuned for general-purpose web serving, not the bursty, connection-heavy patterns of modern microservices. I have spent the last three weeks debugging a payment processing cluster that was timing out during peak loads. The culprit? Default file descriptor limits and a TCP stack behaving like it’s still 1999.
Today, we are going to fix that. We will tune an Nginx-based API Gateway running on CentOS 7 (or Ubuntu 18.04 LTS) to handle thousands of requests per second without breaking a sweat. And we are going to do it on hardware that actually supports high I/O, because tuning software on spinning rust is a waste of time.
The Hardware Foundation: Why I/O Wait Kills APIs
Before we touch a single config file, we need to address the infrastructure. An API Gateway logs heavily. Access logs, error logs, audit trails for GDPR compliance—especially relevant here in Norway under Datatilsynet's watchful eye. If your disk I/O is slow, your Nginx workers block while writing to disk. This is "I/O Wait," and it causes latency spikes that look like network issues but are actually disk issues.
This is why we standardized on CoolVDS for our reference architecture. They don't use standard SSDs; they use NVMe storage. In 2019, the difference between SATA SSD and NVMe is the difference between a bicycle and a Tesla. On a CoolVDS instance, the write latency is negligible, meaning your gateway creates logs and moves on instantly.
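Before blaming the network, confirm that disk writes are actually where the time goes. A quick check, assuming the sysstat package is installed (it provides iostat):

# Watch %util and await for the volume holding your logs
iostat -x 1 5
# The "wa" column is CPU time spent stuck in I/O wait
vmstat 1 5

On NVMe, await should stay in the sub-millisecond range even under heavy logging; if it climbs into double digits on a busy SATA disk, your "network" latency spikes are really storage.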
Step 1: The OS Layer (Kernel Tuning)
Linux defaults are conservative. For an API Gateway, we need to open the floodgates. We need to modify /etc/sysctl.conf to handle a massive number of open connections and rapid TCP recycling.
Pro Tip: Be careful with tcp_tw_recycle. It breaks clients behind NAT and was removed outright in Linux 4.12; stick to tcp_tw_reuse.
Add these lines to your sysctl configuration to optimize the TCP stack for low latency and high concurrency:
# /etc/sysctl.conf
# Maximize the backlog of incoming connections
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535
# Increase the range of ephemeral ports
net.ipv4.ip_local_port_range = 1024 65535
# Allow reusing sockets in TIME_WAIT state for new connections
net.ipv4.tcp_tw_reuse = 1
# Increase TCP buffer sizes for 10Gbps+ networks (common in Nordic datacenters)
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# Protection against SYN flood attacks
net.ipv4.tcp_syncookies = 1
Apply these changes with sysctl -p. If you are running on a standard shared hosting provider, you might not have permission to change these. This is another reason why a KVM-based VPS from CoolVDS is essential; you get full kernel control.
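To confirm the settings took effect, read the values back:

# Should echo back exactly what you set in /etc/sysctl.conf
sysctl net.core.somaxconn net.ipv4.tcp_tw_reuse net.ipv4.ip_local_port_range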
Step 2: Nginx Worker Configuration
Nginx is the gold standard for API Gateways in 2019. Whether you use raw Nginx or OpenResty (the engine behind Kong), the worker configuration dictates your concurrency limit. The standard worker_processes 1; is insufficient.
Open your nginx.conf and locate the main context block:
user nginx;
worker_processes auto;          # Automatically detects CPU cores
worker_rlimit_nofile 65535;     # Allows Nginx to open this many files/sockets

events {
    worker_connections 16384;   # Connections per worker
    use epoll;                  # Essential for Linux performance
    multi_accept on;            # Accept as many connections as possible
}
The worker_rlimit_nofile directive is critical. Without it, Nginx will hit the OS limit (often 1024) and start dropping connections with "Too many open files" errors, regardless of your RAM.
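worker_rlimit_nofile raises the limit for the worker processes themselves, but on a systemd-managed distro (CentOS 7, Ubuntu 18.04) it is worth raising the service-level limit as well and then verifying what the running workers actually got. A minimal sketch, assuming the stock nginx.service unit:

# Raise the file descriptor limit for the nginx unit via a drop-in
sudo mkdir -p /etc/systemd/system/nginx.service.d
sudo tee /etc/systemd/system/nginx.service.d/limits.conf > /dev/null <<'EOF'
[Service]
LimitNOFILE=65535
EOF
sudo systemctl daemon-reload && sudo systemctl restart nginx

# Check the limit a live worker actually inherited
grep "open files" /proc/$(pgrep -f "nginx: worker" | head -n1)/limits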
Step 3: Upstream Keepalives & SSL
One of the biggest latency killers is the SSL handshake. Establishing a secure connection is expensive computationally. To mitigate this, we use Keepalives to the backend services and optimized TLS settings for the client.
Here is a production-ready upstream configuration for a microservice architecture:
http {
    # ... other settings ...

    # Upstream definition with Keepalive
    upstream backend_service {
        server 10.0.0.5:8080;
        server 10.0.0.6:8080;

        # Keep 64 idle connections open to the backend
        keepalive 64;
    }

    server {
        listen 443 ssl http2;
        server_name api.yourdomain.no;

        # TLS Optimization
        ssl_protocols TLSv1.2 TLSv1.3;   # TLS 1.3 is faster and more secure
        ssl_ciphers EECDH+AESGCM:EDH+AESGCM;
        ssl_session_cache shared:SSL:10m;
        ssl_session_timeout 10m;

        # OCSP Stapling (Speeds up handshake by verifying cert on server side)
        ssl_stapling on;
        ssl_stapling_verify on;
        resolver 1.1.1.1 8.8.8.8 valid=300s;

        location / {
            proxy_pass http://backend_service;

            # Required for keepalive to work
            proxy_http_version 1.1;
            proxy_set_header Connection "";

            # Forwarding headers for logging/security
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}
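Once this is live, verify the stapling and handshake behaviour from a client machine. A quick check, assuming openssl and curl are available and api.yourdomain.no points at your gateway:

# Look for "OCSP Response Status: successful" in the output
echo | openssl s_client -connect api.yourdomain.no:443 -status 2>/dev/null | grep -i "OCSP"

# Break down where the time goes: TCP connect vs. TLS handshake vs. total
curl -o /dev/null -s -w 'tcp=%{time_connect}s tls=%{time_appconnect}s total=%{time_total}s\n' https://api.yourdomain.no/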
The Norwegian Latency Advantage
Code optimization can only take you so far. Physics is the final boss. If your target audience is in Scandinavia, hosting your API Gateway in Frankfurt or London adds 20-30ms of round-trip time (RTT) purely due to distance. Hosting in the US adds 100ms+.
| User Location | Server in US (Virginia) | Server in Germany | CoolVDS (Oslo/Nearby) |
|---|---|---|---|
| Oslo, Norway | ~110 ms | ~35 ms | ~2 ms |
| Bergen, Norway | ~115 ms | ~40 ms | ~8 ms |
| Stockholm, Sweden | ~115 ms | ~30 ms | ~12 ms |
By placing your CoolVDS instance locally, you are routing traffic through NIX (Norwegian Internet Exchange) peers, drastically reducing hops. For fintech or real-time bidding apps, this isn't a luxury; it's a requirement.
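These numbers are easy to reproduce yourself. From a client machine in the region, assuming ping and mtr are installed:

# Median round-trip time to the gateway
ping -c 10 api.yourdomain.no
# Hop-by-hop path; a locally hosted gateway should show only a handful of hops
mtr --report --report-cycles 20 api.yourdomain.no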
Step 4: Rate Limiting (DDoS Protection)
Performance isn't just about speed; it's about stability. A single abusive client can saturate your workers. Implementing a limit_req_zone is mandatory.
http {
    # Define a zone named 'api_limit' with 10MB memory, allowing 10 requests/sec
    limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

    server {
        location /api/ {
            # Burst allows brief spikes (up to 20), nodelay processes them instantly
            limit_req zone=api_limit burst=20 nodelay;
            proxy_pass http://backend_service;
        }
    }
}
This configuration is polite but firm. It allows legitimate users to burst traffic (loading a dashboard) but clamps down on scrapers or denial-of-service attempts. Combined with CoolVDS's network-level DDoS protection, your gateway remains resilient.
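Before trusting it in production, hammer the endpoint once from a test box and watch the status codes. A rough check against a hypothetical /api/ping route behind the limited location (with a nearby server, a serial loop like this easily exceeds 10 requests per second):

# With rate=10r/s and burst=20, the surplus requests should come back as 503
# (Nginx's default limit_req_status)
for i in $(seq 1 50); do
  curl -s -o /dev/null -w '%{http_code}\n' https://api.yourdomain.no/api/ping
done | sort | uniq -c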
Conclusion: Stop Tolerating Lag
In 2019, there is no excuse for a sluggish API. The tools are mature, HTTP/2 is standard, and hardware like NVMe is accessible. The difference between a mediocre platform and a market leader often comes down to the milliseconds shaved off during the handshake.
You have the config. You have the kernel tweaks. Now you need the engine to run it.
Don't let slow I/O kill your SEO or your user experience. Deploy a high-performance test instance on CoolVDS in 55 seconds and see the difference raw NVMe power makes.