API Gateway Performance Tuning: Surviving the Microservices Storm (2016 Edition)

Stop Letting Your Gateway Choke Your Architecture

Microservices are the new standard. Everyone is breaking their monoliths into tiny pieces. That sounds great in the boardroom, but for us in the trenches, it means one thing: network overhead.

If you have twenty services talking to each other, a 50ms delay at the gateway isn't just an annoyance; it is a compounding disaster. I recently audited a setup for a Norwegian e-commerce client expecting heavy traffic. They were running a stock Nginx config on a budget VPS hosted somewhere in Germany. The result? 500ms latency on simple GET requests. Unacceptable.

We fixed it. We dropped latency to under 30ms. Here is the exact playbook we used, focusing on the stack available right now in early 2016: Nginx 1.9 (with HTTP/2), CentOS 7, and raw Kernel tuning.

1. The OS Layer: It Starts with the Kernel

Your fancy API gateway software does not matter if the Linux kernel drops packets because you hit the file descriptor limit. Stock Linux distros ship conservative, general-purpose defaults, not settings tuned for high-throughput packet switching.

Open /etc/sysctl.conf. We need to aggressively tune the TCP stack. The goal here is to recycle connections fast and allow massive concurrency.

# /etc/sysctl.conf

# Maximize the number of open file descriptors
fs.file-max = 2097152

# Allow more connections to be handled
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535

# Reuse specific TCP connections in TIME_WAIT state
net.ipv4.tcp_tw_reuse = 1
# Note: tcp_tw_recycle is deprecated/dangerous in 2016 kernels if using NAT. Stick to reuse.

# Expand the port range for ephemeral ports
net.ipv4.ip_local_port_range = 1024 65535

# Survive SYN floods and connection bursts
net.ipv4.tcp_max_syn_backlog = 65535

# Increase TCP buffer sizes for high-speed networks (10Gbps+)
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864

Apply this with sysctl -p. If you are on a shared container environment (like OpenVZ), these settings often fail because you don't own the kernel. This is why we exclusively run KVM at CoolVDS. You need your own kernel to perform deep tuning.
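
A quick way to confirm the kernel actually took the new values is to read /proc directly. Note that fs.file-max is a system-wide ceiling; individual processes are still capped by their own rlimit, so the nofile limit typically needs raising separately (e.g. in /etc/security/limits.conf), which is an assumption about your setup, not part of the sysctl file above:

```shell
# Read the live values straight from /proc to confirm sysctl -p worked.
cat /proc/sys/net/core/somaxconn
cat /proc/sys/net/ipv4/tcp_tw_reuse

# fs.file-max is system-wide; the per-process cap comes from the rlimit,
# so verify it in the shell (or service environment) that launches nginx.
ulimit -n
```

If `ulimit -n` still shows 1024, no amount of fs.file-max will save you.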

2. The Engine: Nginx 1.9.x + Lua (OpenResty)

In 2016, if you aren't looking at OpenResty, you are doing it wrong. It bundles Nginx with LuaJIT, allowing us to handle complex routing logic without the performance hit of a full application server.
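
To make that concrete, here is a minimal sketch of the kind of logic OpenResty lets you push into the gateway itself. The header name, error handling, and upstream name are illustrative, not taken from the client audit:

```nginx
# Hypothetical OpenResty location: reject unauthenticated calls at the
# edge so they never consume a backend connection.
location /v1/ {
    access_by_lua_block {
        local key = ngx.req.get_headers()["X-Api-Key"]
        if not key then
            ngx.status = 401
            ngx.say("missing API key")
            return ngx.exit(ngx.HTTP_UNAUTHORIZED)
        end
    }
    proxy_pass http://backend_service;
}
```

Because this runs inside the Nginx event loop via LuaJIT, the rejection costs microseconds, not a round trip to an auth service.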

However, even standard Nginx needs specific directives to handle thousands of requests per second (RPS). The worker_processes and keepalive settings are critical.

The Configuration Block

Edit your nginx.conf:

user www-data;
worker_processes auto; # Detects CPU cores automatically
worker_rlimit_nofile 1048576; # Per-process fd limit for workers; keep at or below fs.file-max

events {
    use epoll;
    worker_connections 10240;
    multi_accept on; # Accept as many connections as possible
}

http {
    # ... logs and mime types ...

    # Optimization for API payloads
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;

    # KEEPALIVE IS MANDATORY FOR MICROSERVICES
    # This prevents opening a new TCP handshake for every internal request
    upstream backend_service {
        server 10.0.0.5:8080;
        keepalive 64;
    }

    server {
        listen 443 ssl http2; # HTTP/2 is the game changer of 2016
        server_name api.yourdomain.no;

        # SSL Tuning (Crucial for latency)
        ssl_certificate /etc/letsencrypt/live/domain/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/domain/privkey.pem;
        ssl_session_cache shared:SSL:50m;
        ssl_session_timeout 1d;
        ssl_session_tickets off;

        location / {
            proxy_pass http://backend_service;
            proxy_http_version 1.1; # Required for keepalive
            proxy_set_header Connection "";
            
            # Buffer tuning
            proxy_buffers 16 16k;
            proxy_buffer_size 32k;
        }
    }
}

Pro Tip: Notice the http2 parameter on the listen directive? HTTP/2 was standardized last year (2015, as RFC 7540). It multiplexes many requests over a single connection, which for an API Gateway serving mobile apps reduces latency drastically compared to HTTP/1.1.
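
Before blaming the server, make sure your client can even negotiate HTTP/2: curl only speaks it when built against nghttp2 (7.43+), which the stock CentOS 7 curl is not. A quick capability check, with the live probe left commented out since the hostname is a placeholder:

```shell
# Does this curl build advertise HTTP/2 support in its feature list?
curl --version | grep -q HTTP2 && echo "curl can speak HTTP/2" \
  || echo "no HTTP/2 in this curl build"

# If it can, probe the live endpoint (substitute your own hostname):
# curl -sI --http2 https://api.yourdomain.no/ | head -n1
```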

3. The Hardware Reality: NVMe and CPU Steal

You can tune software all day, but you cannot tune away bad physics. Most budget VPS providers in Europe are still running spinning rust (HDD) or cheap SATA SSDs in RAID arrays that degrade when neighbors get noisy.

For an API Gateway, I/O wait is the enemy. Logging access requests to disk can block your worker threads if the disk is slow. In our benchmarks, switching from SATA SSD to NVMe (Non-Volatile Memory Express) reduces log write latency by nearly 90%.
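
One cheap software mitigation before touching hardware: buffer the access log so workers are not making a write syscall per request. The path and sizes below are illustrative:

```nginx
# Batch log writes: flush to disk when the 64k buffer fills
# or every 5 seconds, whichever comes first.
access_log /var/log/nginx/api_access.log combined buffer=64k flush=5s;
```

On slow disks this alone can smooth out latency spikes, but it only masks the problem NVMe actually solves.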

Furthermore, check for CPU Steal. In a terminal, run:

top

Look at the %st value. If it is above 0.0, your host is overselling CPUs. Your API requests are sitting in a queue waiting for the hypervisor. At CoolVDS, we pin resources to ensure %st stays at flat zero. When you are processing 10k RPS, consistency is more important than raw burst speed.
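
If you want a number rather than eyeballing top, you can sample the steal counter yourself. This assumes a Linux guest, where steal time is field 9 of the aggregate cpu line in /proc/stat:

```shell
# Sample the aggregate steal and total tick counters one second apart,
# then report steal as a percentage of all CPU ticks in that window.
s1=$(awk '/^cpu /{print $9}' /proc/stat)
t1=$(awk '/^cpu /{for(i=2;i<=NF;i++)s+=$i;print s}' /proc/stat)
sleep 1
s2=$(awk '/^cpu /{print $9}' /proc/stat)
t2=$(awk '/^cpu /{for(i=2;i<=NF;i++)s+=$i;print s}' /proc/stat)
echo "steal: $(( (s2 - s1) * 100 / (t2 - t1 + 1) ))%"
```

Run it during peak hours; steal that only appears under load is exactly the noisy-neighbor effect described above.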

4. Data Sovereignty: The Norwegian Context

We are currently in a legal limbo. The Safe Harbor agreement was invalidated last year (Schrems I), and while the "Privacy Shield" is being discussed, uncertainty rules. The EU is also finalizing the texts for the upcoming General Data Protection Regulation (GDPR), which will likely change everything in the next two years.

For Norwegian businesses, the safest bet right now is keeping data on Norwegian soil. Hosting in Oslo reduces your legal exposure compared to hosting with US giants that pipe data across the Atlantic. Plus, the latency benefits are simple physics:

Route                            Approx. Latency
Oslo User -> AWS (Frankfurt)     ~35-45ms
Oslo User -> CoolVDS (Oslo)      ~2-5ms

That 30ms difference happens on every single request.
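
You can see that routing difference per request with curl's timing variables (times are in seconds; the hostname is a placeholder):

```shell
# Break a single request down into DNS, TCP connect, TLS handshake,
# and time-to-first-byte.
curl -so /dev/null \
  -w "dns=%{time_namelookup} tcp=%{time_connect} tls=%{time_appconnect} ttfb=%{time_starttransfer}\n" \
  https://api.yourdomain.no/
```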

5. Verify Your Gains

Do not guess. Measure. Use wrk, a modern HTTP benchmarking tool, to stress test your new configuration.

# Build wrk from source (on CentOS 7 you will usually need git, gcc, and openssl-devel)
git clone https://github.com/wg/wrk.git
cd wrk && make

# Run a test: 12 threads, 400 connections, 30 seconds
./wrk -t12 -c400 -d30s https://api.yourdomain.no/

If you see timeouts, check your ulimit -n. If you see high latency but no errors, check your upstream keepalive.
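
When wrk does report timeouts, triage from the client side first; the limits below cap wrk before your gateway is even tested (ss comes from the iproute package on minimal installs):

```shell
# 1. The benchmarking shell's own fd limit caps wrk's open connections.
ulimit -n

# 2. How many ephemeral ports the client can burn through per test run.
cat /proc/sys/net/ipv4/ip_local_port_range

# 3. TIME_WAIT buildup on either end (socket summary).
ss -s 2>/dev/null || echo "ss not available"
```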

Final Thoughts

Building a high-performance API gateway in 2016 requires a holistic approach: Kernel tuning, modern protocols like HTTP/2, and underlying hardware that doesn't steal your CPU cycles. Don't let your infrastructure be the bottleneck of your innovation.

Ready to drop your latency? Deploy a high-frequency KVM instance on CoolVDS today. We offer native NVMe storage and Oslo-based datacenters tailored for the Nordic market.