
Taming Latency: Tuning NGINX as an API Gateway on Linux (2015 Edition)


Stop Accepting 500ms Overhead on Your API Calls

It is 2015. Users are on 4G networks, and they expect instant data. If your API gateway introduces 200ms of latency before it even hits your backend logic, your mobile app feels broken. I have analyzed logs from dozens of Norwegian startups this year, and the bottleneck is rarely the Ruby or PHP code. It is the gateway configuration.

Most of you are slapping a default NGINX install in front of your upstream servers and calling it a day. That works for static files. For high-throughput APIs, it is negligence.

Here is how we tune the stack for sub-millisecond overhead, specifically for the infrastructure we run at CoolVDS.

1. The Kernel is Your First Bottleneck

Default Linux distributions (CentOS 7, Ubuntu 14.04) are tuned for general-purpose computing, not high-concurrency packet switching. When you have thousands of ephemeral connections hitting your gateway, you run out of ports fast.

Edit your /etc/sysctl.conf. These aren't suggestions; they are requirements for high load.

# Allow reuse of sockets in TIME_WAIT state for new connections
net.ipv4.tcp_tw_reuse = 1

# Increase the range of ephemeral ports
net.ipv4.ip_local_port_range = 1024 65535

# Maximize the backlog for incoming connections
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 4096

Apply it with sysctl -p. Without this, your kernel drops SYN packets when traffic spikes, and your clients see timeouts while your CPU is idle.
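A quick sanity check that the kernel actually picked up the new values. This reads the standard procfs paths directly, so it works the same on CentOS 7 and Ubuntu 14.04:

```shell
# After `sysctl -p`, confirm the kernel is using the new values.
cat /proc/sys/net/core/somaxconn
cat /proc/sys/net/ipv4/ip_local_port_range

# How many source ports does that range give you per destination tuple?
# 65535 - 1024 + 1 = 64512
awk '{ print "usable ephemeral ports:", $2 - $1 + 1 }' /proc/sys/net/ipv4/ip_local_port_range
```

One gotcha: net.core.somaxconn is only a ceiling. NGINX's own listen backlog defaults to 511 on Linux, so add backlog=65535 to your listen directive if you actually want the deeper queue.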

2. NGINX Upstream Keepalive: The Forgotten Directive

This is the most common mistake I see in 2015. By default, NGINX opens a new connection to your backend (Node.js, Python, etc.) for every single request. The TCP handshake overhead adds up.

You must configure the keepalive directive in your upstream block. This keeps the pipe open.

upstream backend_api {
    server 10.0.0.5:8080;
    keepalive 64;    # idle connections cached per worker process
}

server {
    location /api/ {
        proxy_pass http://backend_api;
        proxy_http_version 1.1;          # keepalive requires HTTP/1.1
        proxy_set_header Connection "";  # strip the default "close"
    }
}

Note the empty Connection header. Without it, NGINX forwards Connection: close to the upstream and the keepalive pool never gets used. We recently fixed this for a client in Oslo, and their internal latency dropped from 45ms to 3ms.
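You can spot-check that the pool is working with ss from iproute2 (10.0.0.5:8080 here is the example backend from the block above; substitute your own). With keepalive in effect, the same source ports persist between requests instead of churning through TIME_WAIT:

```shell
# List TCP sockets from the gateway to the backend. Run it twice, a few
# requests apart: reused connections keep the same local ports.
ss -tn 2>/dev/null | grep ':8080' || echo "no connections to :8080 right now"
```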

3. The "Noisy Neighbor" & CPU Steal Time

You can tune your config all day, but if your underlying host is oversold, you are dead in the water. In a virtualized environment, "Steal Time" is the percentage of time your virtual CPU waits for the physical CPU to serve another customer's VM.

Pro Tip: Run top inside your VPS and look at the %st value. If it is consistently above 0.5%, migrate immediately. You cannot tune your way out of resource contention.
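If you prefer the raw numbers over eyeballing top, read them straight from the kernel. On the aggregate "cpu" line of /proc/stat, the eighth value (field 9 in awk, after the "cpu" label) is cumulative steal jiffies; divide by the row total for a since-boot percentage:

```shell
# Steal time as a percentage of all CPU time since boot.
grep '^cpu ' /proc/stat | awk '{ total = 0; for (i = 2; i <= NF; i++) total += $i;
    printf "steal: %.2f%% of CPU time since boot\n", 100 * $9 / total }'
```

For a live reading, sample the line twice a second apart and diff the counters; that is essentially what top does.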

This is why we architect CoolVDS differently. We use KVM (Kernel-based Virtual Machine) with strict resource isolation. Unlike OpenVZ containers where resources are fluid (and often stolen), our KVM instances lock your RAM and CPU cycles to your account. When you parse a massive JSON payload, that CPU cycle is yours, not your neighbor's.

4. Data Sovereignty and Latency

With the current discussions around the EU Data Protection Directive and the scrutiny from Datatilsynet, where your data physically sits matters more than ever. Routing API traffic through a cheap provider in Amsterdam adds 20-30ms of round-trip time (RTT) for your Norwegian users.

Hosting in Norway, specifically connected to NIX (Norwegian Internet Exchange), keeps that RTT negligible. Speed is a feature. Compliance is a requirement.

5. A Note on HTTP/2

RFC 7540 (HTTP/2) was finalized in May, and experimental support is landing in the NGINX 1.9.x branch. While it is too early for mission-critical banking APIs, start testing it on your staging environments now. Multiplexing requests over a single connection will change the game for mobile latency.
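For staging experiments, the listen directive is all it takes once you are on a 1.9.x build compiled with the http2 module. Browsers only speak HTTP/2 over TLS, so you need a certificate even for testing; the hostname and paths below are placeholders:

```nginx
# Experimental: requires NGINX 1.9.x built with --with-http_v2_module.
server {
    listen 443 ssl http2;                        # "http2" replaces the old "spdy" flag
    server_name api.example.no;                  # placeholder hostname

    ssl_certificate     /etc/nginx/ssl/api.crt;  # placeholder paths
    ssl_certificate_key /etc/nginx/ssl/api.key;

    location /api/ {
        proxy_pass http://backend_api;           # same upstream block as above
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
```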

Final Thoughts

High-performance API hosting is not about magic; it is about removing barriers in the TCP stack and ensuring your hardware I/O isn't shared with a hundred other websites. Don't let your infrastructure be the reason your app fails.

Need a clean environment to test these configs? Spin up a pure KVM SSD instance on CoolVDS. It takes 55 seconds, and you get full root access with your own kernel.
