Microservices in Production: Battle-Tested Patterns for Low Latency Architecture
I have seen the logs. I have seen the stack traces. And I have seen a perfectly good eCommerce platform grind to a halt because someone decided to split a monolith into twenty services without calculating the network overhead. Moving to microservices isn't just about breaking up code; it's about trading in-memory function calls for network packets. If you don't respect the physics of latency, you are building a distributed disaster.
It is late 2018. Docker has stabilized, Kubernetes is winning the orchestration war against Swarm and Mesos, and yet, I still see teams deploying containers on distinct physical servers without a private network strategy. If your users are in Oslo and your API gateway is bouncing requests through a datacenter in Frankfurt before hitting a database in Ireland, you have already failed. Here is how we build microservices that actually scale, keeping the Nordic context—and the Datatilsynet—in mind.
1. The API Gateway Pattern: Nginx as the Guard Dog
Do not expose your microservices directly to the public internet. Just don't. It is a security nightmare and an SSL termination headache. You need a unified entry point. In 2018, while tools like Kong are maturing, a raw, well-tuned Nginx instance is still the most performant tool for the job if you know what you are doing.
The Gateway offloads SSL, handles basic load balancing, and can enforce rate limiting before a request ever hits your application logic. Here is a production-ready snippet for an Nginx gateway routing to a user service and an order service, optimized for high throughput.
worker_processes auto;
worker_rlimit_nofile 65535;

events {
    multi_accept on;
    worker_connections 16384;
}

http {
    # Optimizing for low latency
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;

    # Define upstreams (internal DNS or IP:Port)
    upstream user_service {
        least_conn;
        server user-service:8080 max_fails=3 fail_timeout=30s;
        keepalive 32;
    }

    upstream order_service {
        least_conn;
        server order-service:8080 max_fails=3 fail_timeout=30s;
        keepalive 32;
    }

    server {
        listen 443 ssl http2;
        server_name api.yourdomain.no;

        # SSL optimizations omitted for brevity

        location /v1/users {
            proxy_pass http://user_service;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            proxy_set_header X-Real-IP $remote_addr;
        }

        location /v1/orders {
            proxy_pass http://order_service;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            proxy_set_header X-Real-IP $remote_addr;
        }
    }
}
Pro Tip: Notice the `keepalive 32;` in the upstream blocks and `proxy_http_version 1.1;` in the locations. Without these, Nginx opens a new TCP connection to your backend microservice for every single request. That handshake overhead will kill your performance at scale. Reuse those connections.
2. Containerization: The Immutable Artifact
If "it works on my machine" is still a valid excuse in your office, you are doing it wrong. We use Docker to ensure that the environment is identical from the developer's laptop to the production KVM slice. Multi-stage builds (introduced recently in Docker 17.05) are mandatory now to keep image sizes down. No one wants to pull a 1GB image over the wire during a rollback.
Here is a lean Dockerfile for a Go-based microservice. It compiles in one container and ships only the binary in a tiny Alpine Linux container.
# Build Stage
FROM golang:1.11-alpine AS builder
# Install git for dependencies
RUN apk update && apk add --no-cache git
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# Build a static binary
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o main .
# Final Stage
FROM alpine:3.8
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /app/main .
EXPOSE 8080
CMD ["./main"]
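For reference, here is a minimal sketch of the kind of main.go this Dockerfile assumes: a hypothetical Go service listening on 8080 (matching the EXPOSE) with a /health endpoint, which the Kubernetes probes in the next section will poll. Your real service does more, but this is enough to produce the static binary the build stage expects.

package main

import (
    "encoding/json"
    "log"
    "net/http"
)

func main() {
    mux := http.NewServeMux()

    // Health endpoint; the Nginx gateway and the Kubernetes probes both use it.
    mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
        w.WriteHeader(http.StatusOK)
        w.Write([]byte("ok"))
    })

    // Hypothetical business endpoint, matching the gateway route above.
    mux.HandleFunc("/v1/users", func(w http.ResponseWriter, r *http.Request) {
        json.NewEncoder(w).Encode(map[string]string{"service": "users", "status": "ok"})
    })

    log.Println("user service listening on :8080")
    log.Fatal(http.ListenAndServe(":8080", mux)) // matches EXPOSE 8080
}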
3. Service Discovery and Orchestration
Hardcoding IP addresses in 2018 is a fireable offense. Services die, restart, and scale up. Their IPs change. You need Service Discovery. While Consul is excellent, if you are running Kubernetes (and you should be, provided you have the ops capacity), it handles this natively via CoreDNS (which became GA in Kubernetes 1.11).
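To make that concrete, here is a hedged Go sketch of the calling side: the peer is addressed by its Kubernetes Service name (resolved by CoreDNS), never by IP, and the name comes from the environment so it can differ per cluster. The user-service name and the /v1/users path are assumptions for illustration.

package main

import (
    "fmt"
    "io/ioutil"
    "net/http"
    "os"
    "time"
)

func main() {
    // Address the peer by its Service name; CoreDNS resolves it to whatever
    // healthy pods currently back the Service.
    host := os.Getenv("USER_SERVICE_HOST")
    if host == "" {
        host = "user-service:8080" // hypothetical default for in-cluster use
    }

    // Always bound inter-service calls; a hung dependency must not hang you.
    client := &http.Client{Timeout: 2 * time.Second}

    resp, err := client.Get(fmt.Sprintf("http://%s/v1/users", host))
    if err != nil {
        fmt.Fprintln(os.Stderr, "user-service unreachable:", err)
        os.Exit(1)
    }
    defer resp.Body.Close()

    body, _ := ioutil.ReadAll(resp.Body)
    fmt.Printf("status=%d body=%s\n", resp.StatusCode, body)
}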
However, running Kubernetes requires raw power. The control plane overhead is real. This is where the underlying infrastructure matters. We don't run containers on shared, oversold HDDs. The I/O wait time caused by "noisy neighbors" on cheap hosting will cause timeouts in your inter-service communication. At CoolVDS, we map NVMe storage directly to KVM instances. This ensures that when etcd writes state to disk, it happens instantly.
The Deployment Configuration
Here is a standard Kubernetes 1.12 deployment manifest. Note the resource limits. If you don't set these, one memory-leaking service will starve its neighbors on the node and can take the whole cluster down with it.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-processor
  labels:
    app: payment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payment
  template:
    metadata:
      labels:
        app: payment
    spec:
      containers:
      - name: payment-api
        image: registry.coolvds.com/payment:v1.4.2
        ports:
        - containerPort: 8080
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
            cpu: "500m"
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 20
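Both probes above hit /health on port 8080. For the readiness probe to mean anything, that handler should check a real dependency rather than blindly returning 200. A minimal sketch, assuming a Go service and a Postgres connection via lib/pq (the connection string and names are illustrative):

package main

import (
    "database/sql"
    "log"
    "net/http"

    _ "github.com/lib/pq" // Postgres driver (assumption: Postgres per service)
)

func main() {
    // In production the DSN comes from a Secret or env var, not a literal.
    db, err := sql.Open("postgres", "postgres://payment:secret@payment-db:5432/payments?sslmode=disable")
    if err != nil {
        log.Fatal(err)
    }

    // /health backs both probes: it returns 503 until Postgres is reachable,
    // so the readinessProbe keeps the pod out of rotation during slow startups.
    http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
        if err := db.Ping(); err != nil {
            http.Error(w, "database unavailable", http.StatusServiceUnavailable)
            return
        }
        w.WriteHeader(http.StatusOK)
    })

    log.Fatal(http.ListenAndServe(":8080", nil))
}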
4. The Data Sovereignty & Latency Nexus
In Norway, we have a unique challenge. We are outside the EU but part of the EEA. GDPR (General Data Protection Regulation), which went into full enforcement this May, hits us just as hard. If your microservices architecture involves sharding customer data across borders, you are inviting scrutiny from Datatilsynet.
Keeping data local isn't just a legal checkbox; it's a performance hack. The speed of light is constant. Round-trip time (RTT) from Oslo to Amsterdam is roughly 15-20ms. RTT from Oslo to a local CoolVDS node in Oslo is <2ms. In a microservices chain where Request A calls Service B which calls Service C, those latencies compound.
| Scenario | Network Hops | Est. Total Latency Overhead |
|---|---|---|
| Monolith (Local DB) | 0 (In-memory) | < 0.1ms |
| Microservices (CoolVDS Local LAN) | 3 Internal Hops | ~1-2ms |
| Microservices (Cross-Region) | 3 External Hops | ~60-100ms |
That 100ms delay is perceptible to users. It kills conversion rates. This is why infrastructure locality is the unsung hero of software architecture.
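One practical defense, assuming your services are written in Go: give the whole request a single deadline at the edge and propagate it with context, so every downstream hop shares one latency budget instead of stacking its own timeout on top. The service names and the 100ms budget below are illustrative.

package main

import (
    "context"
    "fmt"
    "net/http"
    "time"
)

// callWithBudget performs one downstream hop, honoring whatever is left of
// the caller's deadline.
func callWithBudget(ctx context.Context, url string) (*http.Response, error) {
    req, err := http.NewRequest(http.MethodGet, url, nil)
    if err != nil {
        return nil, err
    }
    return http.DefaultClient.Do(req.WithContext(ctx))
}

func main() {
    // The entire user-facing request gets 100ms, end to end.
    ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
    defer cancel()

    // Hypothetical chain: both hops draw from the same budget, so the second
    // call only gets whatever time the first one left over.
    for _, url := range []string{
        "http://order-service:8080/v1/orders",
        "http://user-service:8080/v1/users",
    } {
        resp, err := callWithBudget(ctx, url)
        if err != nil {
            fmt.Println("budget exhausted or hop failed:", err)
            return
        }
        resp.Body.Close()
        fmt.Println(url, "->", resp.Status)
    }
}

Within a single process the context does the bookkeeping for you; across process boundaries you forward the remaining budget along with the request (gRPC's deadline propagation does exactly this).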
5. The Database-Per-Service Dilemma
The hardest pill to swallow: Shared databases are an anti-pattern. If Service A and Service B both write to the same `users` table, you have created a distributed monolith. You have coupled them at the schema level.
Instead, use an event-driven approach. When a user registers, the User Service writes to its own Postgres instance and emits an event (perhaps via RabbitMQ or Kafka) that the Email Service listens to. Yes, this introduces eventual consistency. Yes, it is harder to debug. But it is the only way to scale writes independently.
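Here is a hedged sketch of that flow in Go, assuming the streadway/amqp RabbitMQ client and a Postgres instance owned exclusively by the User Service; the exchange, routing key, and table are made up for illustration.

package main

import (
    "database/sql"
    "encoding/json"
    "log"

    _ "github.com/lib/pq"       // Postgres driver
    "github.com/streadway/amqp" // RabbitMQ client
)

type UserRegistered struct {
    ID    int64  `json:"id"`
    Email string `json:"email"`
}

func main() {
    // The User Service's *own* database. No other service touches this schema.
    db, err := sql.Open("postgres", "postgres://users:secret@user-db:5432/users?sslmode=disable")
    if err != nil {
        log.Fatal(err)
    }

    conn, err := amqp.Dial("amqp://guest:guest@rabbitmq:5672/")
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()
    ch, err := conn.Channel()
    if err != nil {
        log.Fatal(err)
    }

    // 1. Write to our own store.
    var id int64
    err = db.QueryRow(
        `INSERT INTO users (email) VALUES ($1) RETURNING id`,
        "kari@example.no",
    ).Scan(&id)
    if err != nil {
        log.Fatal(err)
    }

    // 2. Emit the event. The Email Service consumes it on its own schedule;
    //    this is where the eventual consistency comes from.
    body, _ := json.Marshal(UserRegistered{ID: id, Email: "kari@example.no"})
    err = ch.Publish(
        "user-events",     // exchange (assumed to exist)
        "user.registered", // routing key
        false, false,
        amqp.Publishing{ContentType: "application/json", Body: body},
    )
    if err != nil {
        log.Fatal(err)
    }
    log.Printf("user %d registered and event published", id)
}

Note that the insert and the publish are two separate operations, not one transaction; if you cannot tolerate that gap, persist the event in an outbox table inside the same database transaction and publish from there, but that is beyond this sketch.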
To run this setup, you need high IOPS. A queue backlog on a slow disk creates a bottleneck that ripples through the system. We configure our KVM host nodes with deadline or noop I/O schedulers to pass throughput control directly to the NVMe controllers, ensuring your message queues never choke.
Final Thoughts
Microservices are not a silver bullet. They are a complexity exchange. You trade code complexity for operational complexity. If you are going to make that trade, ensure your foundation is solid.
You need raw compute that doesn't steal CPU cycles, storage that keeps up with asynchronous writes, and a network that keeps your packets within the Norwegian border. Don't let your infrastructure be the bottleneck your code has to work around.
Ready to test your architecture? Spin up a CoolVDS High-Frequency Compute instance in Oslo today. Ping times so low, you'll think it's localhost.