Stop Accepting 500ms Overhead on Your API Calls
It is 2015. Users are on 4G networks, and they expect instant data. If your API gateway introduces 200ms of latency before it even hits your backend logic, your mobile app feels broken. I have analyzed logs from dozens of Norwegian startups this year, and the bottleneck is rarely the Ruby or PHP code. It is the gateway configuration.
Most of you are slapping a default NGINX install in front of your upstream servers and calling it a day. That works for static files. For high-throughput APIs, it is negligence.
Here is how we tune the stack for sub-millisecond overhead, specifically for the infrastructure we run at CoolVDS.
1. The Kernel is Your First Bottleneck
Default Linux distributions (CentOS 7, Ubuntu 14.04) are tuned for general-purpose computing, not high-concurrency packet switching. When you have thousands of ephemeral connections hitting your gateway, you run out of ports fast.
Edit your /etc/sysctl.conf. These aren't suggestions; they are requirements for high load.
# Allow reuse of sockets in TIME_WAIT state for new connections
net.ipv4.tcp_tw_reuse = 1
# Increase the range of ephemeral ports
net.ipv4.ip_local_port_range = 1024 65535
# Maximize the backlog for incoming connections
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 4096
Apply it with sysctl -p. Without this, your kernel drops SYN packets when traffic spikes, and your clients see timeouts while your CPU is idle.
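A quick sanity check after applying: read back the live values and watch the socket counters under load. A few thousand sockets parked in TIME_WAIT on a busy gateway is normal; tens of thousands means you were bleeding ephemeral ports:
# Read back the values the kernel is actually using
sysctl net.core.somaxconn net.ipv4.tcp_tw_reuse net.ipv4.ip_local_port_range
# Socket summary; watch the timewait count climb under load
ss -s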
2. NGINX Upstream Keepalive: The Forgotten Directive
This is the most common mistake I see in 2015. By default, NGINX opens a new connection to your backend (Node.js, Python, etc.) for every single request. The TCP handshake overhead adds up.
You must configure the keepalive directive in your upstream block. This keeps the pipe open.
upstream backend_api {
    server 10.0.0.5:8080;
    keepalive 64;
}

server {
    location /api/ {
        proxy_pass http://backend_api;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
Note the empty Connection header and the explicit proxy_http_version 1.1. By default, NGINX proxies with HTTP/1.0 and sends Connection: close to the upstream, so without those two lines the keepalive pool never gets used. We recently fixed this for a client in Oslo, and their internal latency dropped from 45ms to 3ms.
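You can verify the pool from the gateway itself. With keepalive working, you should see a small, stable set of established connections to the backend; without it, constant churn as connections open and close per request. This assumes the upstream address from the block above (10.0.0.5:8080):
# Count live connections from NGINX to the upstream
# (the count includes ss's one header line)
watch -n1 "ss -tn state established '( dport = :8080 )' | wc -l"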
3. The "Noisy Neighbor" & CPU Steal Time
You can tune your config all day, but if your underlying host is oversold, you are dead in the water. In a virtualized environment, "Steal Time" is the percentage of time your virtual CPU waits for the physical CPU to serve another customer's VM.
Pro Tip: Run top inside your VPS and look at the %st value (steal time). If it is consistently above 0.5%, migrate immediately. You cannot tune your way out of resource contention.
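For monitoring scripts, the same number is available without the interactive UI. Both of these tools are standard on CentOS 7 and Ubuntu 14.04:
# One-shot CPU summary; 'st' is steal time
top -bn1 | grep -i 'cpu(s)'
# Five one-second samples; the rightmost column ('st') is steal
vmstat 1 5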
This is why we architect CoolVDS differently. We use KVM (Kernel-based Virtual Machine) with strict resource isolation. Unlike OpenVZ containers where resources are fluid (and often stolen), our KVM instances lock your RAM and CPU cycles to your account. When you parse a massive JSON payload, that CPU cycle is yours, not your neighbor's.
4. Data Sovereignty and Latency
With the EU currently negotiating the regulation that will replace the Data Protection Directive, and with the scrutiny from Datatilsynet, where your data physically sits matters more than ever. Routing API traffic through a cheap provider in Amsterdam adds 20-30ms of round-trip time (RTT) for your Norwegian users.
Hosting in Norway, specifically connected to NIX (Norwegian Internet Exchange), keeps that RTT negligible. Speed is a feature. Compliance is a requirement.
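Don't take the RTT numbers on faith; measure from where your users sit. The hostname and path below are placeholders for your own endpoint:
# Raw network round trip from an Oslo vantage point
ping -c 10 api.example.no
# TCP connect time and time-to-first-byte through the gateway
curl -o /dev/null -s -w "connect: %{time_connect}s ttfb: %{time_starttransfer}s\n" https://api.example.no/api/health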
5. A Note on HTTP/2
RFC 7540 (HTTP/2) was just finalized this month. NGINX is bringing experimental support to the 1.9.x mainline branch. While it is too early for mission-critical banking APIs, start testing it on your staging environments as soon as experimental builds land. Multiplexing requests over a single connection will change the game for mobile latency.
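Once it ships, enabling it should be a single parameter on the listen directive, much like the spdy parameter works today. A sketch of what a staging server block is expected to look like (hostname and certificate paths are placeholders):
server {
    listen 443 ssl http2;
    server_name api-staging.example.no;

    ssl_certificate     /etc/ssl/api-staging.crt;
    ssl_certificate_key /etc/ssl/api-staging.key;

    location /api/ {
        proxy_pass http://backend_api;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
Remember that NGINX speaks HTTP/2 only on the client side; the upstream connection stays HTTP/1.1 with keepalive, which is exactly why section 2 still matters.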
Final Thoughts
High-performance API hosting is not about magic; it is about removing barriers in the TCP stack and ensuring your hardware I/O isn't shared with a hundred other websites. Don't let your infrastructure be the reason your app fails.
Need a clean environment to test these configs? Spin up a pure KVM SSD instance on CoolVDS. It takes 55 seconds, and you get root on a kernel that is actually yours to tune.