Microservices in Production: Patterns that Won't Wake You Up at 3 AM

Let's be honest. Most "microservices" deployments I see in Norway are just distributed monoliths. They have all the complexity of a distributed system with none of the decoupling benefits. I recently audited a platform for a retail client in Oslo; they split their application into twelve services, but if the 'User Service' went down, the entire platform returned a 500 error. That isn't resilience. That's just a monolith with network latency added for fun.

In mid-2018, the tools are finally mature. Kubernetes 1.11 is stable, Docker is the standard, and we have solid patterns to manage the chaos. But code patterns are only half the battle. You can write the cleanest Spring Boot 2.0 code in the world, but if your underlying infrastructure has "noisy neighbors" stealing your CPU cycles or network I/O, your service mesh will collapse.

Here are the three architectural patterns you need to implement right now, and the infrastructure reality check required to run them.

1. The API Gateway: Stop Exposing Your Mess

Never expose your internal microservices directly to the public internet. It’s a security nightmare and makes refactoring impossible. You need an API Gateway. In 2018, NGINX is still the king here, though Traefik is gaining traction for dynamic configurations.

The Gateway handles SSL termination, rate limiting, and routing. This offloads the heavy lifting from your application containers.

Here is a battle-hardened nginx.conf snippet for a gateway routing traffic to a user service and an order service. Note the upstream blocks—this allows us to load balance across multiple replicas running on different ports or hosts.

http {
    upstream user_service {
        server 10.10.0.5:8080;
        server 10.10.0.6:8080;
        keepalive 32;
    }

    upstream order_service {
        server 10.10.0.7:3000;
        server 10.10.0.8:3000;
        keepalive 32;
    }

    server {
        listen 443 ssl http2;
        server_name api.yourdomain.no;

        ssl_certificate /etc/letsencrypt/live/api.yourdomain.no/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/api.yourdomain.no/privkey.pem;

        location /users/ {
            proxy_pass http://user_service;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }

        location /orders/ {
            proxy_pass http://order_service;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            proxy_set_header Host $host;
        }
    }
}

Pro Tip: Enabling http2 and keepalive connections to your upstreams is not optional. Without keepalives, the TCP handshake overhead between your gateway and your microservices will destroy your latency metrics. I've seen latency drop from 120ms to 40ms just by tuning upstream keepalives. (Note that keepalives only work if you set proxy_http_version 1.1 and clear the Connection header, as in the config above.)
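
I listed rate limiting as a gateway responsibility earlier, and it costs only a few extra directives. Here is a minimal sketch using nginx's stock limit_req module — the zone name, rate, and burst values are illustrative, so tune them against your real traffic:

# In the http {} block: one shared zone keyed on client IP.
# 10m of shared memory tracks state for roughly 160,000 addresses.
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

# In the location block: absorb bursts of up to 20 requests,
# reject anything beyond that immediately (503) instead of queueing.
location /users/ {
    limit_req zone=api_limit burst=20 nodelay;
    proxy_pass http://user_service;
}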

2. Service Discovery: The End of Hardcoded IPs

If you are hardcoding IP addresses in application.properties, you are doing it wrong. Containers die. IPs change. You need a mechanism for services to find each other dynamically. While Kubernetes has internal DNS for this, external Service Discovery is vital if you are running a hybrid setup (some services on VMs, some in containers).

Consul by HashiCorp is the standard for this. It provides a DNS interface for your services. Here is how you start a Consul server agent on a CoolVDS instance; the -retry-join flags (the addresses are examples — point them at your other server nodes) let the three servers find each other and bootstrap the cluster:

# -retry-join addresses are examples; use your actual server node IPs
consul agent -server -bootstrap-expect=3 \
  -data-dir=/var/lib/consul \
  -node=agent-one \
  -bind=192.168.1.10 \
  -retry-join=192.168.1.11 \
  -retry-join=192.168.1.12 \
  -enable-script-checks=true \
  -config-dir=/etc/consul.d

When your 'Order Service' needs to call the 'Inventory Service', it doesn't ask for 192.168.1.50. It asks Consul for inventory.service.consul. This decoupling allows us to scale the backend tiers up or down without deploying new config files.
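
That lookup only works once the service has registered itself with the local agent. Here is a minimal sketch of a service definition dropped into /etc/consul.d/ — the service name matches the example above, but the port and health-check endpoint are assumptions you should adapt to your own service:

{
  "service": {
    "name": "inventory",
    "port": 8081,
    "check": {
      "http": "http://localhost:8081/health",
      "interval": "10s",
      "timeout": "2s"
    }
  }
}

Run consul reload to pick up the definition, then verify that the DNS interface answers on Consul's default DNS port (8600):

dig @127.0.0.1 -p 8600 inventory.service.consul SRV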

3. The Circuit Breaker: Failing Gracefully

This is where most developers fail. What happens when the database backing your 'Inventory Service' locks up? In a naive architecture, the 'Order Service' waits for a response until it times out. Threads pile up. Memory fills. The whole system crashes.

You need a Circuit Breaker. If a service fails repeatedly, the breaker "trips" and returns a default response or an error immediately, without waiting. If you are in the Java/Spring ecosystem, Netflix Hystrix is the tool. If you are using Go, gobreaker is excellent.

Here is a Hystrix implementation pattern:

import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import org.springframework.web.client.RestTemplate;

public class InventoryClient {
    private final RestTemplate restTemplate = new RestTemplate();

    @HystrixCommand(fallbackMethod = "getDefaultInventory")
    public Inventory getInventory(String sku) {
        // Network call to the remote service; may hang or fail
        return restTemplate.getForObject("http://inventory-service/items/" + sku, Inventory.class);
    }

    // Invoked by Hystrix when the call fails or the circuit is open
    public Inventory getDefaultInventory(String sku) {
        // Return cached data or a placeholder so the UI doesn't crash
        return new Inventory(sku, 0, "Availability unknown");
    }
}

The Infrastructure Bottleneck: Why IOPS Matter

Microservices are "chatty." They generate massive amounts of internal network traffic and logs. A monolith writes to one log file; microservices write to twenty. A monolith makes one SQL query; microservices might make five HTTP calls to satisfy one user request.

This shifts the bottleneck from CPU to I/O and Network Latency. This is specifically why we engineered CoolVDS with pure NVMe storage arrays. I recently migrated a client's Docker Swarm cluster from a generic "Cloud" provider (running on SATA SSDs) to our NVMe platform. We didn't change a line of code, but the API response time improved by 300% simply because the disk I/O wait times vanished.
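
Don't take my word for it — measure it. With the sysstat package installed, watch the %iowait and await columns while your cluster is under load; if %iowait climbs while the CPU sits idle, the disk is your bottleneck, not your code:

# Extended device stats, sampled every second
iostat -x 1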

Kernel Tuning for Microservices

Linux defaults are often set for general-purpose computing, not high-throughput container networking. You need to tune sysctl.conf to handle the ephemeral ports generated by thousands of service-to-service calls.

# /etc/sysctl.conf

# Allow reusing sockets in TIME_WAIT state for new connections
net.ipv4.tcp_tw_reuse = 1

# Increase the range of ephemeral ports
net.ipv4.ip_local_port_range = 1024 65000

# Maximize the backlog of pending connections
net.core.somaxconn = 4096

Apply these with sysctl -p. If your hosting provider restricts kernel-level tuning, you are on the wrong platform.
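
To confirm the tuning is addressing a real problem, count the sockets stuck in TIME_WAIT before and after the change — on a busy gateway this number is often in the tens of thousands:

# Socket summary; check the timewait count under the TCP section
ss -s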

The GDPR Reality in 2018

Since May 25th, the rules have changed. Data residency is no longer just a preference; it's a legal risk. Hosting your database microservice on a US-controlled cloud adds a layer of legal complexity regarding the CLOUD Act and Privacy Shield that most CTOs want to avoid.

By keeping your data within Norway or the EEA on CoolVDS, you simplify your compliance map. We adhere to Datatilsynet guidelines, and because we offer KVM virtualization, your data is strictly isolated from other tenants—a key requirement for demonstrating "integrity and confidentiality" under Article 32 of the GDPR.

Conclusion

Microservices offer agility, but they demand discipline. You need the Gateway pattern for sanity, Service Discovery for flexibility, and Circuit Breakers for stability. But above all, you need infrastructure that respects the physics of latency.

Don't let high latency and slow I/O kill your architecture. Spin up a KVM instance on CoolVDS today, configure your kernel correctly, and give your services the environment they deserve.