API Gateway Performance Tuning: The Sub-Millisecond Guide for 2024
If your API Gateway adds more than 15 milliseconds to a request, you are doing it wrong. In the high-frequency trading floors of Oslo or the data-heavy logistics hubs of Northern Europe, latency isn't just an annoyance; it's a direct revenue leak. I have seen too many architectures where a bloated Java-based gateway running on oversold cloud infrastructure creates a bottleneck that no amount of backend code optimization can fix.
We are not talking about switching web frameworks here. We are talking about the raw mechanics of packet processing, kernel interrupts, and file descriptors. This is about taking a standard Nginx or Envoy setup and stripping away the fat until it screams.
Let’s dissect how to tune your API gateway for maximum throughput on a Linux environment, specifically tailored for the high-bandwidth reality of the Norwegian infrastructure market.
1. The Hardware Lie: Steal Time is the Enemy
Before touching a config file, acknowledge the hardware reality. You cannot tune a gateway on a noisy VPS. If your %st (steal time) in top is above 0.0%, your benchmarks are invalid. API Gateways are CPU-bound during TLS termination and I/O-bound during request routing.
This is why we benchmark exclusively on CoolVDS instances. The KVM virtualization ensures that the CPU cycles you pay for are actually yours. When you pin a process to a core on a CoolVDS NVMe instance, it stays there. No neighbor on the same physical host is going to steal your L3 cache.
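Two quick checks before you benchmark anything, offered as a sketch: mpstat comes from the sysstat package, and worker_cpu_affinity is the Nginx-side way to pin workers once you trust the cores underneath you.
# Sample CPU counters for 5 seconds; the %steal column should sit at 0.00
mpstat 1 5 | tail -n 1
# Or read the "st" field straight from top in batch mode
top -bn1 | grep '%Cpu'
# nginx.conf (main context): pin each worker to its own core
worker_processes 4;
worker_cpu_affinity auto;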
2. Kernel Tuning: The `sysctl.conf` Essentials
Linux network defaults are deliberately conservative, tuned for general-purpose workloads rather than 50,000 concurrent connections. We need to widen the TCP pipe.
Open /etc/sysctl.conf. We are going to adjust the backlog queue and enable Fast Open. This is crucial for reducing the round-trip time (RTT) during the TCP handshake, especially for clients connecting from outside the Oslo region.
Key Kernel Directives
First, verify your current limit:
cat /proc/sys/net/core/somaxconn
If it says 128 or 1024, you are throttling your own success. Here is the production-grade configuration we deploy on high-performance nodes:
# /etc/sysctl.conf
# Maximize the backlog of incoming connections
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535
# Reuse sockets in TIME_WAIT state for new connections
# Critical for high-throughput API gateways talking to backends
net.ipv4.tcp_tw_reuse = 1
# Increase ephemeral port range to avoid port exhaustion
net.ipv4.ip_local_port_range = 1024 65535
# TCP Fast Open (TFO) reduces handshake RTT
net.ipv4.tcp_fastopen = 3
# Congestion control - BBR is generally superior for mixed WAN traffic
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr
Apply these changes immediately:
sysctl -p
Pro Tip: If you are hosting in Norway but serving users in Central Europe, tcp_fastopen lets returning clients send data in the SYN packet, saving a full round trip per connection. The longer the physical distance, the bigger the win.
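Before moving on, confirm the kernel actually took the settings. A quick sanity check (nstat ships with iproute2; the tcp_bbr module must be available, which it is on any mainstream 4.9+ kernel):
# Congestion control: bbr must be listed as available and be the active algorithm
sysctl net.ipv4.tcp_available_congestion_control
sysctl net.ipv4.tcp_congestion_control
sysctl net.core.default_qdisc
# Fast Open and the listen backlog ceiling
sysctl net.ipv4.tcp_fastopen
sysctl net.core.somaxconn
# TFO only pays off when clients negotiate it; the kernel keeps counters
nstat -az | grep -i fastopen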
3. Nginx: The Gateway Config That Matters
Most tutorials tell you to set worker_processes auto; and walk away. That is insufficient. For an API Gateway, we need to manage file descriptors and upstream keepalives aggressively.
Every connection to your gateway is a file descriptor. Every connection from your gateway to your microservice is another file descriptor. If you hit the limit, your users get 502s.
Raising the Limits
Check your system limits:
ulimit -n
If it's 1024, raise the ceiling at the OS level first: /etc/security/limits.conf covers login sessions, while a systemd-managed Nginx needs the limit set in its service unit. Then configure Nginx to actually use it.
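A minimal sketch of both, assuming Nginx runs as www-data under systemd (the nofile value should match worker_rlimit_nofile below):
# /etc/security/limits.conf  (applies to login sessions)
www-data  soft  nofile  65535
www-data  hard  nofile  65535
# For a systemd-managed Nginx, use a drop-in instead: systemctl edit nginx
# [Service]
# LimitNOFILE=65535
# ...then: systemctl daemon-reload && systemctl restart nginx
# Verify what the running master actually got (pgrep -o picks the oldest process, i.e. the master)
grep 'Max open files' /proc/$(pgrep -o nginx)/limits
With the OS ceiling raised, the Nginx side of the equation looks like this: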
# nginx.conf
user www-data;
worker_processes auto;
# The number of file descriptors per worker
worker_rlimit_nofile 65535;
events {
    # Essential for Linux high performance
    use epoll;
    # Allow a worker to accept all new connections at once
    multi_accept on;
    worker_connections 65535;
}
http {
    # ... logs and mime types ...
    # OPTIMIZATION: access_log off here disables logging globally to save disk I/O.
    # Re-enable it per location where you need an audit trail, or use buffered logging.
    access_log off;
    # sendfile copies data between descriptors inside the kernel, skipping userspace
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    # Keepalive timeout - balance between resource usage and connection setup cost
    keepalive_timeout 65;
    # Gzip settings for JSON payloads
    gzip on;
    gzip_min_length 1024;
    gzip_types application/json;
}
The Upstream Keepalive Trap
This is where 90% of setups fail. By default, Nginx acts as a reverse proxy that closes the connection to the backend service after every request. This forces a new TCP handshake for every API call internally. It burns CPU and adds latency.
You must configure the upstream block to keep connections open.
upstream backend_microservices {
    server 10.0.0.5:8080;
    server 10.0.0.6:8080;
    # KEEPALIVE: cache up to 64 idle connections to the backends per worker process
    keepalive 64;
}

server {
    listen 443 ssl http2;
    server_name api.coolvds-client.no;

    location / {
        proxy_pass http://backend_microservices;
        # Required for HTTP/1.1 keepalive to backends
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
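You can confirm the reuse is actually happening from the gateway itself. A rough check with ss, using the first backend from the upstream block above as a placeholder:
# Established connections to the backend should stay roughly flat under load (-H just drops the header line)
watch -n1 'ss -Htn state established dst 10.0.0.5:8080 | wc -l'
# A constantly growing TIME_WAIT count means connections are still being churned
ss -Htn state time-wait | wc -l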
4. TLS Termination: Speed vs. Security
TLS termination is expensive. Modern CPUs (like the ones backing CoolVDS instances) support AES-NI instructions, which accelerate AES-GCM in hardware. For mobile clients without AES hardware, however, ChaCha20-Poly1305 is often faster and less battery-intensive.
Test your OpenSSL speed:
openssl speed -evp aes-256-gcm
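For a fair picture of what mobile clients will see, run the same benchmark against the ChaCha20 AEAD as well (openssl speed has supported it since OpenSSL 1.1.0):
openssl speed -evp chacha20-poly1305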
Ensure you are using HTTP/3 (QUIC) if your gateway supports it (Nginx 1.25+). QUIC runs over UDP and eliminates the Head-of-Line blocking problem common in HTTP/2.
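As a sketch, the TLS side of the server block from section 3 can look like this on Nginx 1.25+ built with QUIC support, where the old listen 443 ssl http2; form gives way to a separate http2 directive (certificate directives omitted):
server {
    listen 443 ssl;
    listen 443 quic reuseport;    # HTTP/3 runs over UDP, alongside the TCP listener
    http2 on;

    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_session_cache shared:SSL:10m;   # resumed sessions skip the full handshake
    ssl_session_timeout 1h;
    ssl_prefer_server_ciphers off;      # let capable mobile clients pick ChaCha20-Poly1305

    # Advertise HTTP/3 so clients upgrade on their next request
    add_header Alt-Svc 'h3=":443"; ma=86400' always;
}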
5. Comparison: Nginx vs. Traefik vs. Kong
We see a lot of devs moving to Traefik for the "ease of use" with Docker. Be careful. Go's garbage collector (used in Traefik) can introduce latency spikes that C-based Nginx does not have. Here is the breakdown for 2024:
| Feature | Nginx (C) | Kong (LuaJIT/C) | Traefik (Go) |
|---|---|---|---|
| Throughput | Highest | High | Medium |
| Latency Consistency | Rock Solid | Very Good | Occasional GC Spikes |
| Config Complexity | High | Medium | Low |
| Ideal Use Case | Edge Ingress | API Management | Internal Service Mesh |
6. The Norwegian Context: NIX and GDPR
Latency is geography. Hosting your API Gateway in Frankfurt when your user base is in Bergen adds 20-30ms of pure physics to every request. By utilizing CoolVDS infrastructure located directly in Norway, you peer directly at NIX (Norwegian Internet Exchange).
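Measure it rather than trusting a map. From a client machine in your target region (the hostnames below are placeholders):
# Compare round-trip times to a Frankfurt-hosted endpoint and an Oslo-hosted one
ping -c 20 api-frankfurt.example.com
ping -c 20 api-oslo.example.com
# mtr adds the per-hop view, handy for spotting traffic that detours away from NIX
mtr -rw -c 50 api-oslo.example.com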
Furthermore, the Datatilsynet (Norwegian Data Protection Authority) is increasingly strict regarding Schrems II and data transfers to non-adequate countries. Running your gateway and termination logic on Norwegian soil isn't just a performance play; it's a compliance fortress. You retain full sovereignty over the data ingress.
7. Verification
Don't trust my word. Benchmarking is the only truth. Use wrk to hammer your endpoint.
wrk -t12 -c400 -d30s --latency https://your-api.no/endpoint
Look at the Stdev (standard deviation) and the percentile breakdown that --latency prints. If the deviation is high, or the 99th percentile sits an order of magnitude above the average, your gateway is choking on context switches. On a properly tuned CoolVDS instance, that spread should be negligible.
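To confirm whether context switches really are the problem, watch the workers from a second terminal while wrk runs (pidstat is part of sysstat; the pgrep pattern assumes standard Nginx process titles):
# System-wide view: the "cs" column is context switches per second
vmstat 1
# Per-worker voluntary (cswch/s) and involuntary (nvcswch/s) context switches
pidstat -w -p $(pgrep -d, -f 'nginx: worker') 1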
Conclusion
Performance isn't an accident. It is the result of deliberate architectural choices: selecting the right virtualization (KVM), tuning the Linux kernel for network throughput, and configuring your gateway to maintain persistent connections. The default settings are for safety; your settings should be for speed.
Don't let slow I/O kill your SEO or your user experience. Deploy a test instance on CoolVDS today, apply these sysctl configs, and watch your latency drop.