Stop Accepting 500ms Overhead on Your API Calls
It is 2015. Users are on 4G networks, and they expect instant data. If your API gateway introduces 200ms of latency before it even hits your backend logic, your mobile app feels broken. I have analyzed logs from dozens of Norwegian startups this year, and the bottleneck is rarely the Ruby or PHP code. It is the gateway configuration.
Most of you are slapping a default NGINX install in front of your upstream servers and calling it a day. That works for static files. For high-throughput APIs, it is negligence.
Here is how we tune the stack for sub-millisecond overhead, specifically for the infrastructure we run at CoolVDS.
1. The Kernel is Your First Bottleneck
Default Linux distributions (CentOS 7, Ubuntu 14.04) are tuned for general-purpose computing, not high-concurrency packet switching. When you have thousands of ephemeral connections hitting your gateway, you run out of ports fast.
Edit your /etc/sysctl.conf. These aren't suggestions; they are requirements for high load.
# Allow reuse of sockets in TIME_WAIT state for new connections
net.ipv4.tcp_tw_reuse = 1
# Increase the range of ephemeral ports
net.ipv4.ip_local_port_range = 1024 65535
# Maximize the backlog for incoming connections
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 4096
Apply it with sysctl -p. Without this, your kernel drops SYN packets when traffic spikes, and your clients see timeouts while your CPU is idle.
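A quick sanity check after applying: read back the live values and watch the socket counters under load. A few thousand sockets parked in TIME_WAIT on a busy gateway is normal; tens of thousands means you were bleeding ephemeral ports:
# Read back the values the kernel is actually using
sysctl net.core.somaxconn net.ipv4.tcp_tw_reuse net.ipv4.ip_local_port_range
# Socket summary; watch the timewait count climb under load
ss -s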
2. NGINX Upstream Keepalive: The Forgotten Directive
This is the most common mistake I see in 2015. By default, NGINX opens a new connection to your backend (Node.js, Python, etc.) for every single request. The TCP handshake overhead adds up.
You must configure the keepalive directive in your upstream block. This keeps the pipe open.
upstream backend_api {
    server 10.0.0.5:8080;
    keepalive 64;
}

server {
    location /api/ {
        proxy_pass http://backend_api;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
Note the empty Connection header and the explicit proxy_http_version 1.1. By default, NGINX proxies with HTTP/1.0 and sends Connection: close to the upstream, so without those two lines the keepalive pool never gets used. We recently fixed this for a client in Oslo, and their internal latency dropped from 45ms to 3ms.
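You can verify the pool from the gateway itself. With keepalive working, you should see a small, stable set of established connections to the backend; without it, constant churn as connections open and close per request. This assumes the upstream address from the block above (10.0.0.5:8080):
# Count live connections from NGINX to the upstream
# (the count includes ss's one header line)
watch -n1 "ss -tn state established '( dport = :8080 )' | wc -l"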
3. The "Noisy Neighbor" & CPU Steal Time
You can tune your config all day, but if your underlying host is oversold, you are dead in the water. In a virtualized environment, "Steal Time" is the percentage of time your virtual CPU waits for the physical CPU to serve another customer's VM.
Pro Tip: Run top inside your VPS and look at the %st value (steal time). If it is consistently above 0.5%, migrate immediately. You cannot tune your way out of resource contention.
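For monitoring scripts, the same number is available without the interactive UI. Both of these tools are standard on CentOS 7 and Ubuntu 14.04:
# One-shot CPU summary; 'st' is steal time
top -bn1 | grep -i 'cpu(s)'
# Five one-second samples; the rightmost column ('st') is steal
vmstat 1 5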
This is why we architect CoolVDS differently. We use KVM (Kernel-based Virtual Machine) with strict resource isolation. Unlike OpenVZ containers where resources are fluid (and often stolen), our KVM instances lock your RAM and CPU cycles to your account. When you parse a massive JSON payload, that CPU cycle is yours, not your neighbor's.
4. Data Sovereignty and Latency
With the EU currently negotiating the regulation that will replace the Data Protection Directive, and with the scrutiny from Datatilsynet, where your data physically sits matters more than ever. Routing API traffic through a cheap provider in Amsterdam adds 20-30ms of round-trip time (RTT) for your Norwegian users.
Hosting in Norway, specifically connected to NIX (Norwegian Internet Exchange), keeps that RTT negligible. Speed is a feature. Compliance is a requirement.
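Don't take the RTT numbers on faith; measure from where your users sit. The hostname and path below are placeholders for your own endpoint:
# Raw network round trip from an Oslo vantage point
ping -c 10 api.example.no
# TCP connect time and time-to-first-byte through the gateway
curl -o /dev/null -s -w "connect: %{time_connect}s ttfb: %{time_starttransfer}s\n" https://api.example.no/api/health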
5. A Note on HTTP/2
RFC 7540 (HTTP/2) was just finalized this month. NGINX is bringing experimental support to the 1.9.x mainline branch. While it is too early for mission-critical banking APIs, start testing it on your staging environments as soon as experimental builds land. Multiplexing requests over a single connection will change the game for mobile latency.
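Once it ships, enabling it should be a single parameter on the listen directive, much like the spdy parameter works today. A sketch of what a staging server block is expected to look like (hostname and certificate paths are placeholders):
server {
    listen 443 ssl http2;
    server_name api-staging.example.no;

    ssl_certificate     /etc/ssl/api-staging.crt;
    ssl_certificate_key /etc/ssl/api-staging.key;

    location /api/ {
        proxy_pass http://backend_api;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
Remember that NGINX speaks HTTP/2 only on the client side; the upstream connection stays HTTP/1.1 with keepalive, which is exactly why section 2 still matters.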
Final Thoughts
High-performance API hosting is not about magic; it is about removing barriers in the TCP stack and ensuring your hardware I/O isn't shared with a hundred other websites. Don't let your infrastructure be the reason your app fails.
Need a clean environment to test these configs? Spin up a pure KVM SSD instance on CoolVDS. It takes 55 seconds, and you get root on a kernel that is actually yours to tune.