You Can't Optimize Code Out of Bad Infrastructure
I recently audited a fintech startup in Oslo. Their dev team was brilliant. Their Go microservices were lean. Yet, their payment API was timing out during peak loads. They blamed the database. They blamed the code. They were wrong.
The problem was the door handle.
Their API Gateway (a standard Nginx reverse proxy setup) was choking on TCP handshakes because they were running on a shared, over-sold cloud instance with default Linux kernel settings. If you are running high-traffic APIs in 2022 without touching your sysctl.conf or understanding the underlying hypervisor, you aren't an engineer; you're a passenger.
Here is how we fixed it, dropped latency by 40ms, and why hardware locality in Norway matters more than you think.
1. The Kernel is the Limit
Most Linux distributions ship with conservative defaults intended for desktop usage or light web serving. When your API Gateway gets hit with 5,000 concurrent requests, the kernel drops packets not because it can't handle them, but because it wasn't told it's allowed to.
You need to widen the TCP bottlenecks. Open /etc/sysctl.conf and look at these parameters. If they aren't there, add them.
The Connection Backlog
When a request hits your server, it sits in a queue waiting for the application (Nginx/Traefik/Kong) to accept it. If the queue is full, the connection is rejected. The default is often 128 (newer kernels raised it to 4096, but plenty of distros in production haven't caught up). That is a joke for an API gateway.
# Increase the maximum number of connections queued for acceptance
net.core.somaxconn = 65535
# Increase the memory dedicated to the network buffers
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
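To confirm the kernel actually picked up the new values, and to see whether the old queue was already overflowing, two quick checks are enough (the exact wording of the netstat counters varies a little between kernel versions):
# Confirm the running value matches what you set
sysctl net.core.somaxconn
# Listen-queue overflow counters; if these keep climbing, the backlog is still too small
netstat -s | grep -i listen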
Port Exhaustion
If your gateway talks to upstream microservices, every connection uses a local port. You will run out of ports (ephemeral port exhaustion) under high load. Allow the kernel to reuse sockets in the TIME_WAIT state.
# Allow reuse of sockets in TIME_WAIT state for new connections
net.ipv4.tcp_tw_reuse = 1
# Expand the local port range
net.ipv4.ip_local_port_range = 1024 65535
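You can watch the pressure build in real time. A count in the tens of thousands here means you are burning ephemeral ports faster than the kernel recycles them (ss ships with iproute2 on any modern distro):
# Rough count of sockets currently parked in TIME_WAIT
ss -tan state time-wait | wc -l
# The port range the kernel is allowed to hand out right now
sysctl net.ipv4.ip_local_port_range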
Pro Tip: Apply these changes immediately with sysctl -p. But remember, they are useless if your hypervisor is oversold and your CPU steal time is through the roof. We see this constantly on budget hosts. CoolVDS guarantees KVM isolation so your kernel adjustments actually map to real hardware resources.
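Checking for that takes ten seconds. The last column vmstat prints (st) is steal time: cycles the hypervisor promised your VM but handed to someone else. Anything consistently above a few percent means you are sharing more than the invoice suggests.
# Sample CPU stats once per second for five seconds; st is the steal-time column
vmstat 1 5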
2. Nginx Configuration: Beyond the Basics
Whether you use raw Nginx, Kong, or OpenResty, the underlying engine configuration determines your throughput. In May 2022, HTTP/2 is the performance baseline, and TLS 1.3 is mandatory: it cuts the handshake to a single round trip, so you get the security without the latency penalty.
Here is a snippet from a production-grade nginx.conf tuned for an API Gateway role:
worker_processes auto;
worker_rlimit_nofile 65535;

events {
    multi_accept on;
    use epoll;
    worker_connections 16384;
}

http {
    # ... logs and mime types ...

    ##
    # Buffer Optimization
    ##
    client_body_buffer_size 10k;
    client_header_buffer_size 1k;
    client_max_body_size 8m;
    # Large enough for JWT/OAuth bearer tokens and fat cookies
    large_client_header_buffers 4 8k;

    ##
    # Keep-Alive is Critical for APIs
    ##
    keepalive_timeout 30;
    keepalive_requests 100000;

    ##
    # SSL Optimization
    ##
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 10m;
    ssl_protocols TLSv1.2 TLSv1.3;
}
Why this matters: Setting keepalive_requests high is crucial for API Gateways. Establishing a TCP connection is expensive. Establishing a TLS handshake is even more expensive. Don't tear down the connection after 100 requests. Keep it open.
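One thing the config above does not show: keepalive_timeout and keepalive_requests only cover the client side. To reuse connections to your upstream microservices, you also need a keepalive pool in the upstream block, plus HTTP/1.1 and an empty Connection header on the proxy_pass. A minimal sketch follows; the payments_backend name, internal IPs, hostname, and certificate paths are placeholders, not the actual Oslo setup.
upstream payments_backend {
    # Hypothetical internal service addresses
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
    # Keep up to 64 idle connections per worker open to the upstreams
    keepalive 64;
}

server {
    # HTTP/2 and TLS termination on the client side
    listen 443 ssl http2;
    server_name api.example.com;

    ssl_certificate     /etc/nginx/ssl/api.example.com.crt;
    ssl_certificate_key /etc/nginx/ssl/api.example.com.key;

    location / {
        proxy_pass http://payments_backend;
        # Upstream keepalive requires HTTP/1.1 and a cleared Connection header
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
Size keepalive to your worker count and backend capacity; 64 idle connections per worker is a starting point, not a law.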
3. The Hardware Variable: NVMe and KVM
Software tuning hits a hard wall called "Physics."
In an API setup, you are often logging requests, caching responses to disk (FastCGI cache / proxy cache), or reading SSL certificates. If your underlying storage is a standard SATA SSD or, god forbid, spinning rust, your I/O wait times will destroy your latency metrics.
We benchmarked a standard 4KB random read/write (common for logs and cache) on different storage backends available in the Nordic market right now.
| Storage Type | IOPS (Random 4K) | Latency |
|---|---|---|
| HDD (Shared) | ~300 | 15ms+ |
| SATA SSD (Shared) | ~5,000 | 2-5ms |
| CoolVDS NVMe | ~50,000+ | <0.1ms |
When we built the CoolVDS platform, we mandated NVMe not just for the speed, but for the queue depth. The NVMe spec allows up to 64K command queues, each 64K commands deep; SATA (AHCI) gives you one queue with a depth of 32. If your API gateway is writing access logs while caching a response, SATA serializes the work. NVMe just handles it.
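Don't take our table on faith, either. fio reproduces a 4K random read/write test on any box in about a minute; the flags below are an illustrative profile, not our exact benchmark harness:
# Mixed 4K random read/write with direct I/O so the page cache can't flatter the results
fio --name=gateway-io --rw=randrw --bs=4k --size=1G --runtime=60 --time_based \
    --ioengine=libaio --iodepth=32 --numjobs=4 --direct=1 --group_reporting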
4. The Norwegian Nuance: Latency and Law
If your users are in Oslo, Bergen, or Trondheim, hosting in Frankfurt adds a physical latency floor you cannot tune away. Light covers roughly 300km per millisecond in a vacuum, closer to 200km per millisecond in fiber, and routing is never a straight line.
Round trip time (RTT) from Oslo to Frankfurt is usually ~25-30ms. RTT from Oslo to a local datacenter is <3ms.
Furthermore, we are seeing the fallout of Schrems II. The Datatilsynet (Norwegian Data Protection Authority) is becoming increasingly strict about data transfers outside the EEA. Hosting locally isn't just a performance play anymore; it's a compliance shield.
Code Example: Testing Latency with Curl
Don't believe the marketing. Test the time_connect yourself from your local machine:
curl -w "Connect: %{time_connect} TTFB: %{time_starttransfer} Total: %{time_total}\n" -o /dev/null -s https://your-api-endpoint.com
If time_connect is over 0.050s, you have a network distance problem. If time_starttransfer is high but connect is low, you have a backend/gateway processing problem (see sections 1 and 2).
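One sample proves nothing: DNS lookups, TLS session resumption, and transient congestion all skew a single run. Take ten and judge the spread, not the best number:
# Ten samples against the same endpoint
for i in $(seq 1 10); do
  curl -w "Connect: %{time_connect} TTFB: %{time_starttransfer} Total: %{time_total}\n" \
       -o /dev/null -s https://your-api-endpoint.com
done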
5. Conclusion
Performance is a stack. You start with the hardware (NVMe, KVM isolation), you tune the OS (sysctl), and finally, you optimize the application (Nginx). Skipping the foundation makes the rest irrelevant.
If you are tired of fighting for CPU cycles on noisy shared hosting, it is time to move your gateway.
Deploy a CoolVDS NVMe instance in Oslo today. We give you the raw IOPS; you bring the config.