The TCP Lie: Why Your Load Balancer is Failing You
I recently watched a competent engineering team deploy a Go-based microservices architecture. They did everything right: protobuf definitions, generated code, solid unit tests. They deployed it to a cluster, threw a standard L4 load balancer in front of it, and went to lunch. By the time they came back, one server was melting at 100% CPU usage while the other three were effectively idling.
This is the gRPC trap. Unlike typical REST traffic, where HTTP/1.1 clients open and recycle many short-lived connections that an L4 balancer can spread across backends, gRPC rides on HTTP/2: it opens a single, persistent TCP connection and multiplexes every request over it as a stream. A standard Layer 4 load balancer sees that one TCP connection, assigns it to a backend, and considers its job done. It doesn't know, and doesn't care, that you are pushing 5,000 RPS through that single pipe.
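To make the failure mode concrete, here is a minimal sketch using the stock Greeter client from the grpc-go helloworld example (the service name and address are illustrative). Every one of these 5,000 calls travels over the single TCP connection that grpc.Dial opened, so from an L4 balancer's perspective this is one flow to one backend:

package main

import (
    "context"
    "log"
    "sync"

    "google.golang.org/grpc"
    pb "google.golang.org/grpc/examples/helloworld/helloworld"
)

func main() {
    // One Dial == one TCP connection. An L4 balancer pins this
    // connection to a single backend for its entire lifetime.
    conn, err := grpc.Dial("my-service.local:50051", grpc.WithInsecure())
    if err != nil {
        log.Fatalf("did not connect: %v", err)
    }
    defer conn.Close()
    client := pb.NewGreeterClient(conn)

    // All 5,000 RPCs are multiplexed as HTTP/2 streams over that
    // one connection, and therefore all land on the same server.
    var wg sync.WaitGroup
    for i := 0; i < 5000; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            if _, err := client.SayHello(context.Background(), &pb.HelloRequest{Name: "world"}); err != nil {
                log.Printf("rpc failed: %v", err)
            }
        }()
    }
    wg.Wait()
}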
If you are building distributed systems in 2018, you cannot rely on legacy balancing strategies. We need to talk about application-layer balancing: distributing individual HTTP/2 streams, not raw TCP connections.
Strategy 1: The Proxy Approach (Now Viable with NGINX 1.13.10)
Until March of this year, proxying gRPC was a headache: you either resorted to hacky TCP passthrough or pulled in the Envoy proxy, which adds significant complexity to the stack. NGINX 1.13.10 (released March 2018) finally introduced native gRPC support. This is a game-changer for those of us who prefer stable, battle-tested binaries over bleeding-edge service meshes.
With this update, NGINX can terminate the HTTP/2 connection, inspect the gRPC frames, and redistribute the calls to your upstream backend servers on a request-by-request basis, not just connection-by-connection.
Here is a production-ready snippet for nginx.conf that we are currently testing on our internal CoolVDS monitoring cluster:
http {
    upstream grpcservers {
        # The backend gRPC servers
        server 10.10.5.1:50051;
        server 10.10.5.2:50051;
    }

    server {
        listen 443 http2 ssl;
        server_name api.coolvds.com;

        # Standard SSL config omitted for brevity
        ssl_certificate /etc/ssl/certs/server.crt;
        ssl_certificate_key /etc/ssl/private/server.key;

        location /helloworld.Greeter/ {
            grpc_pass grpc://grpcservers;

            # Pass the SSL client certificate to the backend if needed
            grpc_set_header X-Ssl-Cert $ssl_client_escaped_cert;
        }
    }
}

Why this matters: By using grpc_pass, NGINX becomes gRPC-aware. It multiplexes incoming calls across the upstream pool request by request. However, be aware of the CPU overhead: terminating TLS and re-encrypting for the backend (if you run zero-trust networking) costs cycles. On shared hosting, this introduces jitter. This is why we enforce KVM isolation on all CoolVDS instances; you need guaranteed CPU time slices to handle high-throughput TLS termination without "noisy neighbor" interruptions.
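If you want to sanity-check that NGINX really is spreading calls, a quick client loop works. A minimal sketch, assuming the helloworld Greeter service from the config above and backends that echo their hostname in the reply (that echo behavior is our assumption, not part of the stock example):

package main

import (
    "context"
    "log"

    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials"
    pb "google.golang.org/grpc/examples/helloworld/helloworld"
)

func main() {
    // Connect through the NGINX front end, not to a backend directly.
    creds, err := credentials.NewClientTLSFromFile("/etc/ssl/certs/server.crt", "api.coolvds.com")
    if err != nil {
        log.Fatalf("could not load TLS cert: %v", err)
    }
    conn, err := grpc.Dial("api.coolvds.com:443", grpc.WithTransportCredentials(creds))
    if err != nil {
        log.Fatalf("did not connect: %v", err)
    }
    defer conn.Close()
    client := pb.NewGreeterClient(conn)

    // With grpc_pass, consecutive calls on this ONE connection should
    // land on different upstreams; if each backend puts its hostname in
    // the reply, you will see it alternate.
    for i := 0; i < 10; i++ {
        r, err := client.SayHello(context.Background(), &pb.HelloRequest{Name: "probe"})
        if err != nil {
            log.Fatalf("rpc failed: %v", err)
        }
        log.Printf("reply: %s", r.Message)
    }
}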
Strategy 2: Thick Client-Side Load Balancing
If you control both the client and the server (typical in internal microservices), you might skip the proxy entirely. The gRPC client libraries for Go and Java have built-in support for this, though it is often poorly documented.
Instead of connecting to a VIP (Virtual IP), your client connects to a service discovery backend (like Consul or etcd), fetches the list of backend IPs, and opens a sub-channel to each of them, distributing calls across those sub-channels internally.
Here is how you configure the client in Go (using the current gRPC-go v1.11 implementation):
package main

import (
    "log"

    "google.golang.org/grpc"
    "google.golang.org/grpc/balancer/roundrobin"
)

func main() {
    // This tells the client to look at all resolved addresses
    // and round-robin requests across them.
    conn, err := grpc.Dial(
        "dns:///my-service.local:50051",
        grpc.WithBalancerName(roundrobin.Name),
        grpc.WithInsecure(),
    )
    if err != nil {
        log.Fatalf("did not connect: %v", err)
    }
    defer conn.Close()
    // ... make RPC calls
}

Pro Tip: Client-side balancing offers the lowest latency because there is no middleman hop. However, it complicates your code: you now have to handle resolver logic inside your application, and if your service discovery (Consul/ZooKeeper) lags, your clients may try to hit dead nodes.
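If dead nodes worry you, you can be defensive at both dial time and call time. A sketch against the v1.11 API, using the same hypothetical service as above (the timeout values are arbitrary):

package main

import (
    "context"
    "log"
    "time"

    "google.golang.org/grpc"
    "google.golang.org/grpc/balancer/roundrobin"
    pb "google.golang.org/grpc/examples/helloworld/helloworld"
)

func main() {
    // WithBlock makes Dial wait until at least one backend is
    // reachable; the context caps how long we are willing to wait.
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()
    conn, err := grpc.DialContext(ctx,
        "dns:///my-service.local:50051",
        grpc.WithBalancerName(roundrobin.Name),
        grpc.WithInsecure(),
        grpc.WithBlock(),
    )
    if err != nil {
        log.Fatalf("no backend became ready: %v", err)
    }
    defer conn.Close()
    client := pb.NewGreeterClient(conn)

    // FailFast(false) queues the RPC until a healthy sub-connection
    // exists instead of erroring the moment the picker sees a dead node.
    callCtx, cancelCall := context.WithTimeout(context.Background(), 2*time.Second)
    defer cancelCall()
    if _, err := client.SayHello(callCtx, &pb.HelloRequest{Name: "probe"}, grpc.FailFast(false)); err != nil {
        log.Printf("rpc failed: %v", err)
    }
}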
The Infrastructure Factor: Latency and Compliance
Regardless of whether you choose NGINX or client-side balancing, the physical network dictates your floor latency. gRPC is often used for "chatter" between services—hundreds of calls to render a single page. If your latency between microservices is 10ms, and you make 50 sequential calls, your page load is delayed by half a second.
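To see that number in a log line rather than on a napkin, a throwaway helper like the one below will do (it reuses the hypothetical Greeter client from the earlier sketches). On a 10ms network, 50 calls should report roughly half a second:

package main

import (
    "context"
    "log"
    "time"

    pb "google.golang.org/grpc/examples/helloworld/helloworld"
)

// timeSequentialCalls demonstrates the floor-latency math from the text:
// n dependent round trips cost roughly n * RTT before the server does
// any real work. Pass in a client built as in the sketches above.
func timeSequentialCalls(client pb.GreeterClient, n int) {
    start := time.Now()
    for i := 0; i < n; i++ {
        if _, err := client.SayHello(context.Background(), &pb.HelloRequest{Name: "ping"}); err != nil {
            log.Printf("call %d failed: %v", i, err)
        }
    }
    // On a 10ms link, n=50 lands around 500ms of pure wire time.
    log.Printf("%d sequential calls took %v", n, time.Since(start))
}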
This is where geography hits hard. If you are serving Norwegian users or businesses, hosting your control plane in Frankfurt or London adds avoidable milliseconds.
GDPR is 10 Days Away
We are writing this in mid-May 2018. The GDPR enforcement date (May 25) is looming. gRPC headers often contain metadata that could be considered PII (Personally Identifiable Information). When configuring NGINX or your application logs, ensure you are not blindly dumping the full payload to disk.
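On the application side, one pattern worth considering (a sketch, not a prescription; the interceptor name is ours) is a client interceptor that records what an auditor needs without ever serializing the bodies:

package main

import (
    "context"
    "log"
    "time"

    "google.golang.org/grpc"
)

// piiSafeLogger logs method, duration, and outcome, but deliberately
// never logs req, reply, or metadata from ctx, any of which may
// carry PII.
func piiSafeLogger(ctx context.Context, method string, req, reply interface{},
    cc *grpc.ClientConn, invoker grpc.UnaryInvoker, opts ...grpc.CallOption) error {
    start := time.Now()
    err := invoker(ctx, method, req, reply, cc, opts...)
    log.Printf("rpc=%s duration=%v err=%v", method, time.Since(start), err)
    return err
}

Wire it in at dial time with grpc.WithUnaryInterceptor(piiSafeLogger), and apply the same discipline to your NGINX access log format.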
At CoolVDS, we've seen an uptick in Norwegian enterprises moving workloads from US-based clouds back to local infrastructure to simplify Datatilsynet compliance. Running your gRPC mesh on servers physically located in Oslo or nearby Nordic hubs ensures data sovereignty and drastically reduces RTT (Round Trip Time) via the NIX (Norwegian Internet Exchange).
Comparison: Proxy vs. Client-Side
| Feature | Proxy (NGINX) | Client-Side (Go/Java) |
|---|---|---|
| Complexity | Medium (Ops focus) | High (Dev focus) |
| Latency | Low (1 extra hop) | Lowest (Direct) |
| Language Support | Agnostic | Language specific |
| Caching | Centralized | Difficult |
Final Thoughts
For most teams I work with in 2018, the NGINX proxy approach is the pragmatic winner. It keeps complexity out of your application code and leverages a tool your Ops team already knows. Just ensure your underlying VPS has the I/O throughput to handle the logging and the CPU grunt for the crypto.
Don't let your infrastructure become the bottleneck for your code. If you need a sandbox to test NGINX 1.13.10 configurations without risking your production setup, spin up a high-performance instance on CoolVDS. We offer pure NVMe storage, which is critical when you start logging high-volume gRPC traces.