The I/O Wait Trap
It is 2015, and if your API takes more than 200ms to respond, your mobile users are already gone. I recently audited a client's setup running a REST API for a high-traffic media app. They were running Ubuntu 14.04 on a budget US-based VPS provider. Their code was decent (Node.js 0.12), but their latency was erratic. One request took 50ms, the next took 2 seconds.
The culprit wasn't their code. It was noisy neighbors and spinning rust. They were on a shared OpenVZ container where another tenant was hammering the disk, causing their I/O wait to spike. In the world of high-performance API gateways, disk latency is the silent killer of concurrency.
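Don't guess at this; measure it. A quick look with iostat (from the sysstat package) shows whether the CPU is actually stalling on disk. This is a rough diagnostic sketch, not a full benchmark:

# Watch CPU I/O wait and per-device latency: 1-second intervals, 5 samples
sudo apt-get install sysstat
iostat -x 1 5
# Watch the %iowait column and the per-device await column (ms per request).
# Sustained double-digit %iowait on a box that should be idle usually means
# a saturated disk or a noisy neighbor on shared storage.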
Here is how we fixed it, and how you can tune your stack to handle thousands of requests per second without melting your server.
1. The New Heavyweight: HTTP/2 is Here
If you are still serving purely over HTTP/1.1, you are leaving performance on the table. Nginx 1.9.5 was released just last month (September 2015), and it finally brings HTTP/2 support. This is not a drill. Multiplexing requests over a single connection drastically reduces the overhead for rich API payloads.
To enable it, you need to be running the mainline branch. Don't rely on the default yum or apt-get repositories; they are often months behind. Compile from source or use the official Nginx repo.
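On Ubuntu 14.04 ("trusty"), pulling mainline from nginx.org looks roughly like this; verify the signing key and repository line against the official documentation rather than trusting a blog post:

# Add the official nginx mainline repository
wget -qO - http://nginx.org/keys/nginx_signing.key | sudo apt-key add -
echo "deb http://nginx.org/packages/mainline/ubuntu/ trusty nginx" | sudo tee /etc/apt/sources.list.d/nginx.list
sudo apt-get update && sudo apt-get install nginx
nginx -v    # should report 1.9.5 or newer

With a mainline build installed, enabling HTTP/2 is a one-line change to the listen directive: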
server {
    listen 443 ssl http2;
    server_name api.yourdomain.no;

    ssl_certificate     /etc/nginx/ssl/server.crt;
    ssl_certificate_key /etc/nginx/ssl/server.key;
    ...
}
Pro Tip: In practice, HTTP/2 requires TLS; every major browser will only negotiate it over an encrypted connection. If you haven't implemented TLS because of "performance fears," stop. Modern CPUs with AES-NI instructions handle encryption with negligible overhead. Security is no longer an excuse for slowness.
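To confirm the server actually negotiates HTTP/2, check it from the command line. This sketch assumes a curl build with HTTP/2 support (the stock 14.04 curl is too old; your browser's dev tools protocol column works as a fallback):

# Ask for HTTP/2 and inspect the status line of the response
curl -sI --http2 https://api.yourdomain.no/ | head -n 1
# Expect "HTTP/2 200" (older curl builds print "HTTP/2.0 200")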
2. Kernel Tuning: Open the Floodgates
Out of the box, Linux is tuned for a general-purpose desktop, not a high-throughput API gateway. When you have thousands of ephemeral connections (common with REST APIs), you will run out of file descriptors fast. You'll see Too many open files in your logs.
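Before raising any limits, see where you stand. These are standard procfs checks, nothing exotic:

# System-wide: allocated, unused, and maximum file handles
cat /proc/sys/fs/file-nr
# Per-process limit for your current shell
ulimit -n
# Effective limit of a running nginx worker
cat /proc/$(pgrep -f "nginx: worker" | head -n 1)/limits | grep "open files"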
Edit your /etc/sysctl.conf. We need to allow the kernel to reuse TIME-WAIT sockets and increase the range of ephemeral ports.
# /etc/sysctl.conf
fs.file-max = 2097152

# Network tuning for high concurrency
net.ipv4.tcp_tw_reuse = 1          # reuse TIME-WAIT sockets for new outbound connections
net.ipv4.tcp_max_tw_buckets = 1440000
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_max_syn_backlog = 3240000
Apply these with sysctl -p. Combine this with increasing the Nginx worker_rlimit_nofile to match your OS limits.
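The nginx side of that, as an illustrative sketch (the numbers are placeholders; size them against your own fs.file-max and expected concurrency):

# /etc/nginx/nginx.conf
worker_processes auto;
worker_rlimit_nofile 100000;     # per-worker fd ceiling, kept below fs.file-max

events {
    worker_connections 16384;    # each proxied request can consume two fds
    multi_accept on;
}

For daemons started through PAM (your Node.js app, for example), set matching nofile limits in /etc/security/limits.conf as well.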
3. The Hardware Reality: NVMe vs. SSD vs. HDD
Software tuning only gets you so far. The physical medium storing your database and logs dictates your floor for latency. Most "cloud" providers today (2015) are still selling you standard SSDs, or worse, spinning HDDs, as "high performance."
Here is the reality of Input/Output Operations Per Second (IOPS):
| Storage Type | Approx. Random Read IOPS | Likely Bottleneck |
|---|---|---|
| 7.2k RPM HDD | 80 - 120 | Everything. Avoid for APIs. |
| Standard SATA SSD | 5,000 - 80,000 | SATA Interface limits |
| NVMe (CoolVDS Standard) | 200,000+ | None. (CPU becomes the limit) |
At CoolVDS, we don't bother with legacy storage. Our instances are backed by NVMe drives. When your database (MySQL or MongoDB) has to write session data during a traffic spike, NVMe ensures the disk isn't the reason your request hangs.
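Don't take any provider's word for it, ours included. A short fio run against a throwaway file shows where your current host really lands in the table above (sketch; don't point it at a live data directory):

# 4K random reads with direct I/O: compare the reported IOPS to the table
sudo apt-get install fio
fio --name=randread --rw=randread --bs=4k --direct=1 --ioengine=libaio --size=1G --iodepth=32 --numjobs=1 --runtime=60 --time_based --filename=/tmp/fio-testfile
rm /tmp/fio-testfile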
4. The Legal Storm: Safe Harbor is Dead
We cannot talk about hosting architecture without addressing the elephant in the room. On October 6th, the European Court of Justice invalidated the Safe Harbor agreement. This is a massive blow to anyone hosting Norwegian user data on US-controlled servers (AWS, Google, Rackspace), regardless of where the datacenter is physically located.
Datatilsynet (The Norwegian Data Protection Authority) is likely to ramp up scrutiny. As a CTO or Systems Architect, you now have a legal liability if you are shipping data across the Atlantic without valid transfer mechanisms.
This is where sovereignty meets performance. Hosting on CoolVDS keeps your data legally within Norway/Europe, adhering to strict privacy standards while drastically reducing network latency. Why route packets to Frankfurt or Amsterdam when your users are in Oslo and Bergen? Connecting via NIX (Norwegian Internet Exchange) ensures your ping times stay in the single digits.
5. KVM: Real Virtualization
Finally, stop using container-based VPS (OpenVZ) for production workloads. You share the kernel with every other customer on the host. If they crash the kernel, you go down. If they exhaust the entropy pool, your SSL handshakes hang.
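Not sure what your current provider actually sold you? Two quick checks; virt-what is a small package you may need to install first:

# Identify the virtualization technology (prints kvm, openvz, xen, ...)
sudo apt-get install virt-what && sudo virt-what
# OpenVZ containers expose this file; KVM guests do not
ls /proc/user_beancounters
# Available entropy for TLS handshakes (persistently low values spell trouble)
cat /proc/sys/kernel/random/entropy_avail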
We use KVM (Kernel-based Virtual Machine). You get your own dedicated kernel, guaranteed RAM, and true isolation. For a high-performance API gateway, jitter is the enemy. KVM provides the consistency required for reliable sub-100ms response times.
Final Thoughts
Optimizing an API gateway is a game of millimeters. You tune the kernel, you enable HTTP/2 in Nginx, and you ensure your hardware path is clear of bottlenecks. But if you build your house on sand—slow disks and legally risky jurisdictions—the configuration flags won't save you.
Do not let slow I/O or legal ambiguity kill your project. Deploy a KVM instance on CoolVDS today and see what true low-latency infrastructure looks like.