
API Gateway Performance Tuning: Squeezing Milliseconds Out of Nginx in 2014

Let’s be honest: your API isn't slow because of your complex business logic. It’s slow because your TCP stack is choking, your gateway is misconfigured, or your hosting provider is stealing your CPU cycles. I've spent the last month debugging a high-traffic mobile backend for a client in Oslo, and the bottleneck wasn't in their PHP code; it was in the infrastructure glue holding it all together. If you think spinning up a default Ubuntu 14.04 instance and slapping `apt-get install nginx` onto it is enough for production, you are setting yourself up for a disaster.

In the Norwegian market, where latency to the NIX (Norwegian Internet Exchange) is scrutinized by enterprise clients, you cannot afford the overhead of a poorly tuned gateway. Here is how we strip away the fat and configure a system for raw speed using technologies available right now.

1. The Kernel is Your First Bottleneck

Before traffic even hits your application, it has to traverse the Linux kernel. The defaults on most distributions, CentOS 6 and Debian 7 included, are tuned for general-purpose workloads, not for a high-concurrency API gateway. When you have thousands of mobile devices opening and closing connections rapidly, you run out of file descriptors and ephemeral ports fast.

I see this error constantly in `dmesg`: `TCP: time wait bucket table overflow`. This means your server is effectively deaf to new connections while it waits for old ones to close.
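
Before changing anything, quantify the problem. This one-liner counts sockets per TCP state; tens of thousands stuck in TIME_WAIT is the smoking gun (it assumes net-tools is installed, which it is on most 2014-era distros):

# Count sockets per TCP state
$ netstat -ant | awk 'NR > 2 { print $6 }' | sort | uniq -c | sort -rn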

Here is the `sysctl.conf` hardening config I deploy on every CoolVDS node we provision for API workloads:

# /etc/sysctl.conf

# Increase system-wide file descriptors
fs.file-max = 2097152

# Allow more connections to be handled simultaneously
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535

# Reuse sockets in TIME_WAIT state for new connections
# Critical for API gateways with many short-lived connections
net.ipv4.tcp_tw_reuse = 1

# Increase ephemeral port range
net.ipv4.ip_local_port_range = 1024 65535

# Protect against SYN flood attacks
net.ipv4.tcp_syncookies = 1

# Increase TCP buffer sizes for high-speed connectivity
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864

Apply these with `sysctl -p`. If you are on a restrictive VPS provider using OpenVZ, some of these flags might fail because you share a kernel with the host. This is why we exclusively use KVM virtualization at CoolVDS; you need your own kernel to do serious tuning.
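
As a sanity check, read the values back after reloading. On a KVM guest they echo back exactly what you set; on OpenVZ containers some keys will be read-only or missing entirely:

$ sysctl -p
$ sysctl net.core.somaxconn net.ipv4.tcp_tw_reuse
net.core.somaxconn = 65535
net.ipv4.tcp_tw_reuse = 1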

2. Nginx: Stop Using Default Buffers

Nginx is the industry standard for a reason, but out of the box, it is too polite. For an API gateway, we need it to be aggressive. One of the biggest performance killers is SSL/TLS termination. The handshake is expensive. If you are serving clients across Europe, the latency penalty of that handshake adds up.

We need to enable the SSL session cache and tweak the buffer sizes. If a client sends a large JSON payload and Nginx writes it to a temporary file on disk because the buffer was too small, your performance tanks. Disk I/O, even on SSDs, is orders of magnitude slower than RAM.
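
You can check whether this is already happening before touching the config. Nginx logs a warning each time a request body outgrows the buffer; the log path below assumes a stock Debian/Ubuntu package install:

# Each hit here is a request that took a detour through the disk
$ grep "buffered to a temporary file" /var/log/nginx/error.log | wc -l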

Optimized Nginx Block

user www-data;
worker_processes auto;
worker_rlimit_nofile 100000;

events {
    worker_connections 4096;
    multi_accept on;
    use epoll;
}

http {
    # ... basic settings ...

    # Buffer settings to keep requests in RAM
    client_body_buffer_size 128k;
    client_max_body_size 10m;
    client_header_buffer_size 1k;
    large_client_header_buffers 4 4k;
    output_buffers 1 32k;
    postpone_output 1460;

    # Keepalive connections to upstream (Backend)
    upstream backend_api {
        server 127.0.0.1:8080;
        keepalive 64;
    }

    server {
        listen 80;
        listen 443 ssl;
        
        # SSL Optimization (Critical for 2014 security)
        ssl_session_cache shared:SSL:10m;
        ssl_session_timeout 10m;
        ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
        ssl_ciphers HIGH:!aNULL:!MD5;
        
        location /api/ {
            proxy_pass http://backend_api;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            proxy_set_header X-Real-IP $remote_addr;
        }
    }
}

Pro Tip: Notice the `proxy_set_header Connection "";` directive? By default, Nginx closes the connection to the upstream backend after every request, which forces your application server (be it Python/Django or Node.js) to open a new socket for every single API call. Clearing the header lets Nginx keep the connection open, drastically reducing internal latency.
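
To confirm the keepalive pool is actually being used, watch the sockets between Nginx and the backend while you generate load. The filter below assumes the backend listens on port 8080 as in the config above; the count should hover near the keepalive value instead of churning on every request:

# Established connections from Nginx to the upstream
$ ss -tn '( dport = :8080 )'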

3. The "Noisy Neighbor" Problem

You can tune your configs all day, but if your underlying hardware is inconsistent, your benchmarks are meaningless. In 2014, the hosting market is flooded with cheap VPS offers. The hidden cost there is "CPU Steal Time."

If you are running a critical API on a shared host where another user decides to mine Bitcoin or compile a massive C++ project, the hypervisor will steal CPU cycles from you. You can check this easily:

$ top
# Look at the %st (steal time) value in the CPU row.
Cpu(s):  1.5%us,  0.5%sy,  0.0%ni, 97.0%id,  0.0%wa,  0.0%hi,  0.0%si,  1.0%st

If `%st` is consistently above 0%, you are losing money. Your API requests are sitting in a queue waiting for the processor to wake up. This is unacceptable for professional environments.
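
If you want more than a glance at top, average the steal column over a minute. This is a rough sketch; the st column is the last field on recent procps versions, so check the vmstat header on your distro before trusting the number:

# Average the "st" column over 60 seconds
$ vmstat 1 60 | awk 'NR > 2 { sum += $NF; n++ } END { printf "avg steal: %.1f%%\n", sum / n }'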

At CoolVDS, we guarantee dedicated resources. We use pure SSD storage arrays (no spinning rust) and strict KVM isolation. When you pay for a core, that core executes your instructions, not your neighbor's.

4. Local Compliance and Latency

For those of us operating in Norway, the legal landscape is tightening. The Datatilsynet (Norwegian Data Protection Authority) is becoming increasingly strict about where personal data resides. While the Safe Harbor agreement currently allows data transfer to the US, the post-Snowden climate has made many Norwegian CIOs nervous about US-hosted clouds.

Hosting locally isn't just about compliance with the Personopplysningsloven; it's about physics. Light travels at a finite speed. Round-trip time (RTT) from Oslo to a server in Virginia is roughly 90-110 ms. RTT from Oslo to a server in Oslo is under 5 ms.

If your API requires 5 sequential calls to render a mobile dashboard:

  • US Hosting: 5 x 100ms = 500ms (Half a second lag!)
  • CoolVDS (Oslo): 5 x 5ms = 25ms (Imperceptible)
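
You can verify this arithmetic yourself with nothing more than curl. The hostname below is a placeholder for your own gateway, and /api/health is assumed to be a cheap endpoint:

# Five sequential requests, the way a chatty mobile client would make them
$ for i in 1 2 3 4 5; do curl -s -o /dev/null -w "request $i: %{time_total}s\n" https://api.example.no/api/health; done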

5. Testing the Results

Don't take my word for it. Use `wrk` (a modern HTTP benchmarking tool superior to the old Apache Bench) to stress-test your endpoint.

# Install wrk (requires compiling from source on most distros currently)
git clone https://github.com/wg/wrk.git
cd wrk
make

# Run a test: 12 threads, 400 connections, for 30 seconds
./wrk -t12 -c400 -d30s http://your-coolvds-ip/api/health
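
Averages hide the pain. Add wrk's --latency flag to the same run to get the percentile breakdown; the 99th percentile is what your slowest mobile users actually feel:

# Same test, with the latency distribution (50/75/90/99th percentiles)
./wrk -t12 -c400 -d30s --latency http://your-coolvds-ip/api/health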

When we moved a client from a general-purpose cloud provider to a tuned CoolVDS instance last week, we saw their Requests Per Second (RPS) jump from 1,200 to 4,500 on the same advertised specs. That is the power of proper configuration.

Performance is not magic. It is engineering. If you are ready to stop fighting with noisy neighbors and overloaded networks, it is time to upgrade.

Deploy a high-performance SSD VPS in Oslo today. Check out CoolVDS plans and get your latency down to where it belongs.