API Gateway Performance Tuning: Breaking the 100ms Barrier
There is a specific kind of silence that fills a room when a lead developer realizes their API Gateway is the bottleneck. It’s not the database. It’s not the application logic. It’s the door frame itself. In the high-stakes environment of Nordic tech, where users expect instantaneous interaction whether they are in Oslo or Tromsø, a slow handshake is a death sentence for your application.
Most default VPS configurations are woefully inadequate for high-concurrency API traffic. They are tuned for general-purpose web serving, not the bursty, connection-heavy patterns of modern microservices. I have spent the last three weeks debugging a payment processing cluster that was timing out during peak loads. The culprit? Default file descriptor limits and a TCP stack behaving like it’s still 1999.
Today, we are going to fix that. We will tune an Nginx-based API Gateway running on CentOS 7 (or Ubuntu 18.04 LTS) to handle thousands of requests per second without breaking a sweat. And we are going to do it on hardware that actually supports high I/O, because tuning software on spinning rust is a waste of time.
The Hardware Foundation: Why I/O Wait Kills APIs
Before we touch a single config file, we need to address the infrastructure. An API Gateway logs heavily. Access logs, error logs, audit trails for GDPR compliance—especially relevant here in Norway under Datatilsynet's watchful eye. If your disk I/O is slow, your Nginx workers block while writing to disk. This is "I/O Wait," and it causes latency spikes that look like network issues but are actually disk issues.
This is why we standardized on CoolVDS for our reference architecture. They don't use standard SSDs; they use NVMe storage. In 2019, the difference between SATA SSD and NVMe is the difference between a bicycle and a Tesla. On a CoolVDS instance, the write latency is negligible, meaning your gateway creates logs and moves on instantly.
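Before blaming the network, confirm that disk writes are actually where the time goes. A quick check, assuming the sysstat package is installed (it provides iostat):

# Watch %util and await for the volume holding your logs
iostat -x 1 5
# The "wa" column is CPU time spent stuck in I/O wait
vmstat 1 5

On NVMe, await should stay in the sub-millisecond range even under heavy logging; if it climbs into double digits on a busy SATA disk, your "network" latency spikes are really storage.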
Step 1: The OS Layer (Kernel Tuning)
Linux defaults are conservative. For an API Gateway, we need to open the floodgates. We need to modify /etc/sysctl.conf to handle a massive number of open connections and rapid TCP recycling.
Pro Tip: Be careful with tcp_tw_recycle. It breaks clients behind NAT and was removed outright in Linux 4.12; stick to tcp_tw_reuse.
Add these lines to your sysctl configuration to optimize the TCP stack for low latency and high concurrency:
# /etc/sysctl.conf
# Maximize the backlog of incoming connections
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535
# Increase the range of ephemeral ports
net.ipv4.ip_local_port_range = 1024 65535
# Allow reusing sockets in TIME_WAIT state for new connections
net.ipv4.tcp_tw_reuse = 1
# Increase TCP buffer sizes for 10Gbps+ networks (common in Nordic datacenters)
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# Protection against SYN flood attacks
net.ipv4.tcp_syncookies = 1
Apply these changes with sysctl -p. If you are running on a standard shared hosting provider, you might not have permission to change these. This is another reason why a KVM-based VPS from CoolVDS is essential; you get full kernel control.
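To confirm the settings took effect, read the values back:

# Should echo back exactly what you set in /etc/sysctl.conf
sysctl net.core.somaxconn net.ipv4.tcp_tw_reuse net.ipv4.ip_local_port_range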
Step 2: Nginx Worker Configuration
Nginx is the gold standard for API Gateways in 2019. Whether you use raw Nginx or OpenResty (the engine behind Kong), the worker configuration dictates your concurrency limit. The standard worker_processes 1; is insufficient.
Open your nginx.conf and locate the main context block:
user nginx;
worker_processes auto;          # Automatically detects CPU cores
worker_rlimit_nofile 65535;     # Allows Nginx to open this many files/sockets

events {
    worker_connections 16384;   # Connections per worker
    use epoll;                  # Essential for Linux performance
    multi_accept on;            # Accept as many connections as possible
}
The worker_rlimit_nofile directive is critical. Without it, Nginx will hit the OS limit (often 1024) and start dropping connections with "Too many open files" errors, regardless of your RAM.
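worker_rlimit_nofile raises the limit for the worker processes themselves, but on a systemd-managed distro (CentOS 7, Ubuntu 18.04) it is worth raising the service-level limit as well and then verifying what the running workers actually got. A minimal sketch, assuming the stock nginx.service unit:

# Raise the file descriptor limit for the nginx unit via a drop-in
sudo mkdir -p /etc/systemd/system/nginx.service.d
sudo tee /etc/systemd/system/nginx.service.d/limits.conf > /dev/null <<'EOF'
[Service]
LimitNOFILE=65535
EOF
sudo systemctl daemon-reload && sudo systemctl restart nginx

# Check the limit a live worker actually inherited
grep "open files" /proc/$(pgrep -f "nginx: worker" | head -n1)/limits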
Step 3: Upstream Keepalives & SSL
One of the biggest latency killers is the SSL handshake. Establishing a secure connection is expensive computationally. To mitigate this, we use Keepalives to the backend services and optimized TLS settings for the client.
Here is a production-ready upstream configuration for a microservice architecture:
http {
    # ... other settings ...

    # Upstream definition with Keepalive
    upstream backend_service {
        server 10.0.0.5:8080;
        server 10.0.0.6:8080;

        # Keep 64 idle connections open to the backend
        keepalive 64;
    }

    server {
        listen 443 ssl http2;
        server_name api.yourdomain.no;

        # TLS Optimization
        ssl_protocols TLSv1.2 TLSv1.3;   # TLS 1.3 is faster and more secure
        ssl_ciphers EECDH+AESGCM:EDH+AESGCM;
        ssl_session_cache shared:SSL:10m;
        ssl_session_timeout 10m;

        # OCSP Stapling (Speeds up handshake by verifying cert on server side)
        ssl_stapling on;
        ssl_stapling_verify on;
        resolver 1.1.1.1 8.8.8.8 valid=300s;

        location / {
            proxy_pass http://backend_service;

            # Required for keepalive to work
            proxy_http_version 1.1;
            proxy_set_header Connection "";

            # Forwarding headers for logging/security
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}
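Once this is live, verify the stapling and handshake behaviour from a client machine. A quick check, assuming openssl and curl are available and api.yourdomain.no points at your gateway:

# Look for "OCSP Response Status: successful" in the output
echo | openssl s_client -connect api.yourdomain.no:443 -status 2>/dev/null | grep -i "OCSP"

# Break down where the time goes: TCP connect vs. TLS handshake vs. total
curl -o /dev/null -s -w 'tcp=%{time_connect}s tls=%{time_appconnect}s total=%{time_total}s\n' https://api.yourdomain.no/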
The Norwegian Latency Advantage
Code optimization can only take you so far. Physics is the final boss. If your target audience is in Scandinavia, hosting your API Gateway in Frankfurt or London adds 20-30ms of round-trip time (RTT) purely due to distance. Hosting in the US adds 100ms+.
| User Location | Server in US (Virginia) | Server in Germany | CoolVDS (Oslo/Nearby) |
|---|---|---|---|
| Oslo, Norway | ~110 ms | ~35 ms | ~2 ms |
| Bergen, Norway | ~115 ms | ~40 ms | ~8 ms |
| Stockholm, Sweden | ~115 ms | ~30 ms | ~12 ms |
By placing your CoolVDS instance locally, you are routing traffic through NIX (Norwegian Internet Exchange) peers, drastically reducing hops. For fintech or real-time bidding apps, this isn't a luxury; it's a requirement.
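These numbers are easy to reproduce yourself. From a client machine in the region, assuming ping and mtr are installed:

# Median round-trip time to the gateway
ping -c 10 api.yourdomain.no
# Hop-by-hop path; a locally hosted gateway should show only a handful of hops
mtr --report --report-cycles 20 api.yourdomain.no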
Step 4: Rate Limiting (DDoS Protection)
Performance isn't just about speed; it's about stability. A single abusive client can saturate your workers. Implementing a limit_req_zone is mandatory.
http {
    # Define a zone named 'api_limit' with 10MB memory, allowing 10 requests/sec
    limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

    server {
        location /api/ {
            # Burst allows brief spikes (up to 20), nodelay processes them instantly
            limit_req zone=api_limit burst=20 nodelay;
            proxy_pass http://backend_service;
        }
    }
}
This configuration is polite but firm. It allows legitimate users to burst traffic (loading a dashboard) but clamps down on scrapers or denial-of-service attempts. Combined with CoolVDS's network-level DDoS protection, your gateway remains resilient.
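Before trusting it in production, hammer the endpoint once from a test box and watch the status codes. A rough check against a hypothetical /api/ping route behind the limited location (with a nearby server, a serial loop like this easily exceeds 10 requests per second):

# With rate=10r/s and burst=20, the surplus requests should come back as 503
# (Nginx's default limit_req_status)
for i in $(seq 1 50); do
  curl -s -o /dev/null -w '%{http_code}\n' https://api.yourdomain.no/api/ping
done | sort | uniq -c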
Conclusion: Stop Tolerating Lag
In 2019, there is no excuse for a sluggish API. The tools are mature, HTTP/2 is standard, and hardware like NVMe is accessible. The difference between a mediocre platform and a market leader often comes down to the milliseconds shaved off during the handshake.
You have the config. You have the kernel tweaks. Now you need the engine to run it.
Don't let slow I/O kill your SEO or your user experience. Deploy a high-performance test instance on CoolVDS in 55 seconds and see the difference raw NVMe power makes.