Scaling API Throughput: Why Your Reverse Proxy is Choking
I recently audited a payment gateway backend for a client in Oslo. They were running a standard LAMP stack on a generic budget VPS, and every time traffic spiked during a marketing campaign, API latency shot up from 200 ms to 4 seconds. The database wasn't the bottleneck; the web server was running out of file descriptors and drowning in context switches. If you are serious about low latency, default configurations are your enemy.
In the world of high-performance hosting, we often obsess over code optimization but neglect the infrastructure that delivers it. Today, we aren't talking about optimizing PHP or Python code. We are going to look at the edge: the API Gateway (or Reverse Proxy) that sits between your users and your application logic. Whether you are using Nginx or HAProxy, the principles of tuning the Linux kernel remain the same.
1. The Software Layer: Tuning Nginx for Concurrency
Most distributions ship Nginx with a configuration tuned for a low memory footprint, not high concurrency. If you are serving an API, you are dealing with thousands of small, short-lived connections. The default worker_processes and connection limits will throttle you long before your CPU maxes out.
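Before touching the config, it helps to know what you are starting from. A quick baseline check, assuming a standard Linux shell on the box:
# Number of CPU cores (worker_processes should match this if 'auto' is unavailable)
grep -c ^processor /proc/cpuinfo
# Current open-file limit for this shell; the usual default of 1024 is far too low for a busy proxy
ulimit -n
# Nginx version and compile-time options
nginx -V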
Here is a production-ready snippet from a recent deployment on an Ubuntu 12.04 LTS node. This configuration aggressively keeps connections alive to reduce the TCP handshake overhead, which is critical for mobile clients on flaky 3G networks.
Key Directives to Change
Open your /etc/nginx/nginx.conf and look at these values:
user www-data;
worker_processes auto; # Requires Nginx 1.3.8+ or 1.2.5+, otherwise set to CPU core count
pid /var/run/nginx.pid;
worker_rlimit_nofile 100000; # Critical: Allows nginx to open more files than the default ulimit
events {
    worker_connections 4096;
    multi_accept on;
    use epoll;
}
http {
    ##
    # Basic Settings
    ##
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on; # Vital for APIs: send response data immediately instead of buffering it
    keepalive_timeout 30;
    types_hash_max_size 2048;
    # Reduce buffer sizes for small API payloads to save memory per connection
    client_body_buffer_size 10k;
    client_header_buffer_size 1k;
    client_max_body_size 8m;
    large_client_header_buffers 2 1k; # raise this if clients send large cookies or long query strings
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    ##
    # Logging Settings
    ##
    # Turn off access logs for high-traffic APIs to save disk I/O, or log to syslog
    access_log off;
    error_log /var/log/nginx/error.log crit;
}
Pro Tip: Setting access_log off; can improve throughput by 10-15% on high-load systems by eliminating disk writes for every single request. If you need analytics, offload logging to a dedicated syslog server or use a non-blocking logger.
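After reloading Nginx, it is worth confirming that worker_rlimit_nofile actually reached the worker processes. A quick sanity check, assuming pgrep is available and a worker is running:
# Pick one worker process and inspect its effective limits
WORKER_PID=$(pgrep -f 'nginx: worker' | head -n 1)
grep 'open files' /proc/$WORKER_PID/limits
# Rough count of file descriptors (mostly sockets) that worker currently holds
ls /proc/$WORKER_PID/fd | wc -l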
2. The Kernel Layer: Sysctl Optimization
Nginx can only do what the Linux kernel allows it to do. By default, the Linux TCP stack is tuned for a general-purpose desktop, not a high-throughput server. The most common error we see in dmesg is "possible SYN flooding on port 80". This often isn't a DDoS attack; it's just the kernel dropping valid connections because the backlog queue is full.
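You can confirm whether the kernel is really dropping connections before changing anything. These counters come from the kernel's own statistics; the exact wording of netstat -s output varies slightly between kernel versions:
# Look for listen-queue overflows and dropped SYNs
netstat -s | grep -i -E 'listen|SYN'
# Current backlog-related limits
sysctl net.core.somaxconn net.ipv4.tcp_max_syn_backlog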
We need to modify /etc/sysctl.conf to widen the TCP limits. These settings are aggressive but necessary for handling thousands of concurrent API requests.
# Increase system file descriptor limit
fs.file-max = 100000
# Increase the queue of incoming packets held when the NIC receives them faster than the kernel can process
net.core.netdev_max_backlog = 4096
# Increase the maximum listen backlog (completed connections waiting to be accepted)
net.core.somaxconn = 4096
# Increase the maximum amount of option memory buffers
net.core.optmem_max = 25165824
# TCP Window Scaling
net.ipv4.tcp_window_scaling = 1
# Reuse sockets in TIME_WAIT state for new outbound connections
# This is crucial for internal traffic from Nginx to PHP-FPM/upstream servers
net.ipv4.tcp_tw_reuse = 1
# Leave tcp_tw_recycle disabled: it silently breaks clients connecting from behind NAT
net.ipv4.tcp_tw_recycle = 0
# Buffer sizes for TCP
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
After editing, run sysctl -p to apply changes without a reboot.
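Two quick checks once the new values are live. Also worth noting: Nginx caps its own listen backlog at 511 on Linux by default, so to benefit fully from the raised somaxconn you may want to add backlog=4096 to your listen directive. A minimal verification sketch:
# Confirm the kernel picked up the new limits
sysctl net.core.somaxconn net.ipv4.tcp_tw_reuse fs.file-max
# Socket state summary: a large pile of TIME_WAIT sockets here is exactly what tcp_tw_reuse helps with
ss -s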
3. The Hardware Reality: Why Virtualization Matters
You can tune software all day, but if your underlying infrastructure steals your CPU cycles, it's pointless. In the budget hosting market, many providers use OpenVZ or Virtuozzo containerization. These technologies allow providers to oversell RAM and CPU significantly.
If a "noisy neighbor" on your physical node starts compiling code or running a heavy MySQL query, your API latency will spike because you are sharing the host's kernel.
At CoolVDS, we exclusively use KVM (Kernel-based Virtual Machine). KVM provides full hardware virtualization. Your RAM is allocated, your CPU cycles are reserved, and you run your own isolated kernel. This isolation is mandatory for consistent API performance.
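If you are not sure what your current provider runs, two quick checks tell you a lot. The sketch below assumes virt-what is available in your distribution's repositories; the 'st' (steal) column in vmstat shows how many CPU cycles the hypervisor is taking away from you:
# Identify the virtualization layer (reports kvm, openvz, xen, vmware, ...)
sudo apt-get install -y virt-what && sudo virt-what
# Watch the 'st' column: consistently non-zero steal means noisy neighbors are eating your cycles
vmstat 1 5
# OpenVZ/Virtuozzo containers expose this file; a KVM guest does not have it
ls /proc/user_beancounters 2>/dev/null && echo "OpenVZ container detected"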
The Storage Bottleneck
Database queries often stall waiting for disk I/O. In 2013, spinning rust (HDD) simply cannot keep up with the random read/write patterns of a busy API. While standard SSDs are becoming common, CoolVDS is deploying enterprise-grade SSD storage arrays connected via high-speed interfaces. This ensures that when your Nginx cache misses, the fetch from disk is almost instantaneous.
| Feature | Budget VPS (OpenVZ/HDD) | CoolVDS (KVM/SSD) |
|---|---|---|
| IOPS (Random Read) | ~80 - 150 | ~20,000+ |
| Kernel Access | Shared (Restricted) | Dedicated (Full Sysctl control) |
| Latency Stability | High Jitter | Consistent |
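To see where your current host lands in that table, a short random-read benchmark with fio gives a realistic IOPS figure. The file path and size below are placeholders; adjust to taste, and expect fio to lay out the test file first:
# 4k random reads with direct I/O (bypassing the page cache) for 30 seconds
fio --name=randread --filename=/tmp/fio-testfile --size=1G \
    --rw=randread --bs=4k --direct=1 --ioengine=libaio \
    --iodepth=16 --runtime=30 --time_based --group_reporting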
4. Local Nuances: Norway and Data Privacy Compliance
If your user base is in Scandinavia, hosting in Frankfurt or London adds unnecessary milliseconds. Physics is physics: a packet travelling from Oslo to Frankfurt and back typically costs roughly 20-30 ms before your application even starts working. By hosting on Norwegian VPS infrastructure, you cut that round-trip time (RTT) out of every single API call.
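Measuring this takes a minute. Run a quick RTT comparison from a machine near your users; the hostnames below are placeholders for your current host and a Norwegian endpoint:
# Round-trip time to each candidate location
ping -c 10 api.current-host.example
ping -c 10 api.norway-node.example
# Per-hop latency, if you want to see where the milliseconds go
mtr --report --report-cycles 10 api.current-host.example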
Furthermore, with the Data Inspectorate (Datatilsynet) tightening scrutiny on data privacy, knowing exactly where your data resides is becoming a competitive advantage. Keeping your customer data on Norwegian soil simplifies compliance with local regulations, a trend we expect to grow as European data sovereignty laws evolve.
Conclusion
Performance isn't an accident; it's an architecture. By moving from default configs to tuned Nginx workers, optimizing your Linux TCP stack, and upgrading to KVM-based managed hosting with SSD storage, you can handle 10x the traffic on the same size instance.
Don't let legacy hardware be the reason your app feels slow. Deploy a test instance on CoolVDS today and experience the difference raw I/O power makes.