Taming the Microservices Hydra: Implementing a Service Mesh with Linkerd on Kubernetes

Let’s be honest with ourselves. We all bought the ticket to the microservices hype train in 2016. We chopped up our reliable (albeit messy) PHP and Java monoliths into twenty different Go and Node.js services. We containerized everything with Docker. We felt modern.

And then production went down at 3:00 AM.

The problem wasn't the code inside the containers. The problem was the network. When you replace a function call with a network call, you introduce latency, packet loss, and connection refusals. We traded a code-complexity problem for a distributed-systems problem. Suddenly, debugging a slow request means running tcpdump on five different nodes instead of reading a single stack trace.

This is where the concept of a Service Mesh comes in. Netflix's Hystrix and Ribbon libraries have handled this for Java shops, but the rest of us need a polyglot solution that lives outside the application. Right now, in early 2017, Linkerd is the tool actually delivering on that promise.

The Architecture: Sidecars and Proxies

A service mesh abstracts the network away from your application. Instead of your frontend service talking directly to your backend API, it talks to a local proxy. That proxy handles load balancing, circuit breaking, and retries. Your code assumes the network is perfect; the mesh handles the reality that it isn't.
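
To make that concrete, here is a minimal illustration. It is a sketch that assumes the per-node router on port 4140 we set up below, a placeholder node address in $NODE_IP, and a logical service named world:

# Talking to the backend directly: the caller hard-codes a concrete address
# and owns retries, timeouts, and failover itself.
curl http://world-v1.default.svc.cluster.local/

# Talking through the mesh: the caller names the logical service and hands
# the request to the local proxy, which resolves, load-balances, and retries.
http_proxy=$NODE_IP:4140 curl -s http://world/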

For this implementation, we are going to look at deploying Linkerd (built on Twitter's battle-tested Finagle) onto a Kubernetes 1.5 cluster. We will deploy it as a DaemonSet, ensuring one Linkerd router runs on every node in our cluster.

Pro Tip: Linkerd runs on the JVM. It is not lightweight. If you are trying to run this on a cheap $5 VPS with 512MB RAM, you will crash. You need dedicated resources. This is why we deploy our production meshes on CoolVDS NVMe instances. The high IOPS on the NVMe drives helps with the JVM startup, and the guaranteed RAM allocation ensures the Garbage Collector doesn't pause traffic during peak loads.

Step 1: The Linkerd Configuration

The heart of Linkerd is the routing configuration. We use dtabs (delegation tables) to define how logical names map to concrete addresses in Kubernetes. Here is a robust configuration file for a basic K8s environment.

admin:
  port: 9990

# The io.l5d.k8s namer asks the Kubernetes API where each service's pods
# live. It reaches the API through a local `kubectl proxy` on port 8001,
# which we run as a sidecar in the DaemonSet below.
namers:
- kind: io.l5d.k8s
  experimental: true
  host: localhost
  port: 8001

routers:
- protocol: http
  label: outgoing
  # Linkerd 0.8.x names HTTP requests as /http/1.1/<method>/<host>, so the
  # /http/*/* rule funnels every request into the /host tree.
  # /#/io.l5d.k8s/default/http resolves a service in the `default` namespace
  # on its port named `http`.
  baseDtab: |
    /srv        => /#/io.l5d.k8s/default/http;
    /host       => /srv;
    /http/*/*   => /host;
    /host/world => /srv/world-v1;
  interpreter:
    kind: default
  servers:
  - port: 4140
    ip: 0.0.0.0

telemetry:
- kind: io.l5d.prometheus
- kind: io.l5d.recentRequests
  sampleRate: 0.25

This configuration sets up an HTTP router listening on port 4140, and the io.l5d.k8s namer uses the Kubernetes API to discover where your pods are running. Notice the telemetry section? This is critical. It exposes metrics on the admin port (9990) that we can scrape with Prometheus. If you aren't monitoring latency percentiles (p95, p99), you are flying blind.
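
If you already run Prometheus, a scrape job for the per-node admin endpoints can look like the sketch below. The job name and node addresses are placeholders; Linkerd's io.l5d.prometheus telemeter serves its metrics under /admin/metrics/prometheus on the admin port:

scrape_configs:
- job_name: 'linkerd'
  metrics_path: /admin/metrics/prometheus
  static_configs:
  - targets:             # one entry per node running the Linkerd DaemonSet
    - '10.0.0.11:9990'   # placeholder node addresses
    - '10.0.0.12:9990'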

Step 2: Deploying the DaemonSet

We need to ensure Linkerd starts on every worker node. In Kubernetes 1.5, we use the DaemonSet resource. Create a file named linkerd-ds.yaml:

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  labels:
    app: linkerd
  name: linkerd
spec:
  template:
    metadata:
      labels:
        app: linkerd
    spec:
      volumes:
      - name: l5d-config
        configMap:
          name: "l5d-config"
      containers:
      - name: l5d
        image: buoyantio/linkerd:0.8.6
        env:
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        args:
        - /io.buoyant/linkerd/config/config.yaml
        ports:
        - name: outgoing
          containerPort: 4140
          hostPort: 4140
        - name: admin
          containerPort: 9990
          hostPort: 9990
        volumeMounts:
        - name: "l5d-config"
          mountPath: "/io.buoyant/linkerd/config"
          readOnly: true
      # Sidecar that runs `kubectl proxy` so the io.l5d.k8s namer can reach
      # the Kubernetes API on localhost:8001 without extra credentials.
      - name: kubectl
        image: buoyantio/kubectl:v1.4.0
        args:
        - "proxy"
        - "-p"
        - "8001"
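
Note that the DaemonSet mounts a ConfigMap named l5d-config, so create that first from the Step 1 config. Assuming you saved it locally as linkerd-config.yaml (the filename is an arbitrary choice; the key must be config.yaml to match the path Linkerd is started with):

kubectl create configmap l5d-config --from-file=config.yaml=linkerd-config.yaml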

Deploy this with the standard command:

kubectl create -f linkerd-ds.yaml
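
It is worth confirming that the scheduler actually placed one Linkerd pod on every node:

kubectl get ds linkerd
kubectl get pods -l app=linkerd -o wide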

Once running, every node in your cluster now has a smart router listening on port 4140. Your applications can simply set their http_proxy environment variable to $(NODE_NAME):4140, and Linkerd will handle the service discovery.
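
For an application deployed on Kubernetes, the snippet below is one way to wire that up with the Downward API. It is a sketch for an application pod spec, and it assumes your cluster's node names are resolvable from inside pods:

env:
- name: NODE_NAME
  valueFrom:
    fieldRef:
      fieldPath: spec.nodeName    # the node this pod landed on
- name: http_proxy
  value: "$(NODE_NAME):4140"      # send outbound HTTP through the local Linkerd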

Why Infrastructure Consistency Matters

Implementing a service mesh adds a