Stop SSH-ing Into Production: A Battle-Tested GitOps Workflow for 2019

I still remember the silence in the room. It was 3:00 AM on a Tuesday, and a fatigue-ridden developer had just typed kubectl delete namespace prod instead of dev. There were no guardrails. No audit trail. Just an instant void where our client's e-commerce platform used to be. That incident wasn't a failure of personnel; it was a failure of architecture. If you are still manually applying manifests or, god forbid, editing files directly on the server via SSH, you are essentially gambling with your infrastructure.

In late 2019, the standard for reliability isn't just "uptime"; it's recoverability. This is where GitOps comes in. By using Git as the single source of truth for your declarative infrastructure, you eliminate configuration drift and manual error. If the cluster state diverges from Git, the cluster corrects itself. If a node dies, the state remains safe in the repo.

Below, I’m going to walk you through a setup I’ve been deploying for clients across Oslo and Bergen who demand absolute data sovereignty and stability. We will use Kubernetes, Helm 2 (yes, we’ll secure Tiller), and Flux.

The Architecture: Pull vs. Push

Most CI/CD pipelines today are "Push" based: Jenkins or GitLab CI builds an artifact and runs a script to deploy it to the server. The problem? The CI server needs god-mode access to your production cluster. That is a massive attack vector.

We are implementing a "Pull" based workflow. The cluster reaches out to the Git repository to check for updates. This means you don't expose your Kubernetes API to the outside world. For those of us hosting in Norway to comply with strict GDPR mandates and keep data under the watchful eye of Datatilsynet, reducing external attack surfaces is paramount.
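
As a concrete illustration, with a pull model you can keep the Kubernetes API server completely off the public internet and only reachable from an internal admin network. A rough sketch with ufw (the subnet below is a placeholder for your own management range):

# Allow the API server port only from the internal admin subnet
ufw allow from 10.0.0.0/16 to any port 6443 proto tcp
# Drop everything else hitting the API server
ufw deny 6443/tcp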

Pro Tip: Network latency matters. If your Git repo is hosted in the US (GitHub) but your nodes are in Oslo, sync times can lag. Hosting a GitLab instance on a CoolVDS NVMe instance directly in Norway ensures your "source of truth" is milliseconds away from your production workload.

Step 1: The Infrastructure Foundation

Before we touch software, we need hardware. GitOps agents like Flux are constantly polling. If you run this on a budget VPS with "shared" vCPUs, the CPU steal (stolen time) from noisy neighbors will cause the reconciliation loop to hang. I’ve seen Flux time out simply because the host node was oversold.

We use CoolVDS KVM instances because they offer true resource isolation. For a standard Kubernetes master node, we strictly define the I/O scheduler to prioritize latency.

# Check your I/O scheduler
cat /sys/block/vda/queue/scheduler
# [mq-deadline] none

# If you are running etcd on the same node, ensure you are using NVMe.
# Spinning rust (HDD) will cause etcd leader elections to fail under load.
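
While you are checking the disks, it is worth confirming you are not losing cycles to noisy neighbors. A quick, rough check on any Linux guest is the "st" (steal) column in vmstat:

# Sample CPU stats five times, one second apart.
# A steal value that is consistently above zero means the host is oversold.
vmstat 1 5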

Step 2: Securing Helm 2 (The Tiller Problem)

As of late 2019, Helm 3 is still in beta, so we are sticking with Helm 2 for production stability. However, Tiller (the server-side component) is a known security risk if installed blindly. It runs with root privileges by default. We must lock this down.

Here is how to install Tiller with a dedicated ServiceAccount and limit its blast radius:

# Create a ServiceAccount for Tiller
kubectl -n kube-system create serviceaccount tiller

# Create a ClusterRoleBinding (or RoleBinding for namespace isolation)
kubectl create clusterrolebinding tiller \
  --clusterrole=cluster-admin \
  --serviceaccount=kube-system:tiller

# Initialize Helm with the service account
helm init --service-account tiller --wait
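
If you want to go further, Helm 2 also supports mutual TLS between the helm client and Tiller. A sketch of the idea (the certificate paths are placeholders; generate them yourself with openssl or cfssl):

# Install Tiller with TLS enabled and client certificate verification
helm init --service-account tiller \
  --tiller-tls --tiller-tls-verify \
  --tiller-tls-cert ./tiller.crt \
  --tiller-tls-key ./tiller.key \
  --tls-ca-cert ./ca.crt

# Client commands must then present a certificate as well
helm ls --tls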

Step 3: Implementing Flux

Flux will live inside your cluster and monitor a specific Git repository. When it sees a new Docker image tag or a change in YAML, it applies it.

First, add the Flux chart repo:

helm repo add fluxcd https://charts.fluxcd.io

Now, deploy Flux. Note that we are setting the git.pollInterval to 1 minute. On standard hosting, this aggressive polling might trigger CPU limit throttling. On high-performance infrastructure like CoolVDS, it barely registers.

helm install --name flux \
  --set git.url=git@gitlab.com:your-org/k8s-config.git \
  --set git.branch=master \
  --set git.pollInterval=1m \
  --set registry.pollInterval=1m \
  --namespace flux \
  fluxcd/flux

Once installed, you need the SSH key to add to your Git provider as a Deploy Key:

fluxctl identity --k8s-fwd-ns flux
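
Once the key is registered (give it write access; Flux pushes tags back to the repo when it automates a release), you can force an immediate reconciliation instead of waiting for the next poll:

# Trigger a sync right away and tail the controller logs to confirm it applied cleanly
fluxctl sync --k8s-fwd-ns flux
kubectl -n flux logs deployment/flux --tail=20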

Step 4: The Deployment Manifest

Your application repository and your configuration repository should be separate. In your config repo, you define the desired state. Here is a robust Nginx deployment example that includes resource quotas—something many devs forget until the server crashes.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nordic-app
  namespace: production
  annotations:
    flux.weave.works/automated: "true"
    flux.weave.works/tag.nginx: glob:1.17.*-alpine
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nordic-app
  template:
    metadata:
      labels:
        app: nordic-app
    spec:
      containers:
      - name: nginx
        image: nginx:1.17.3-alpine
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "64Mi"
            cpu: "50m"
          limits:
            memory: "128Mi"
            cpu: "100m"
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 10

Notice the annotation flux.weave.works/automated: "true". Combined with the per-container tag filter flux.weave.works/tag.nginx, it tells Flux: "If you see a new image in the registry whose tag matches 1.17.*-alpine, update this deployment automatically." This is the magic. Your CI pipeline builds the image, pushes it to the registry, and Flux handles the rollout inside the cluster.
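
For reference, the config repo itself does not need to be complicated. A layout I tend to use (the directory names are purely illustrative; Flux simply scans the paths you point it at) looks roughly like this:

k8s-config/
├── namespaces/
│   └── production.yaml
├── workloads/
│   ├── nordic-app-deployment.yaml
│   └── nordic-app-service.yaml
└── secrets/
    └── sealed-db-credentials.yaml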

Handling Secrets in Git

You cannot commit raw secrets to Git. That is a GDPR violation waiting to happen. In 2019, the best practice is Sealed Secrets by Bitnami. It uses asymmetric cryptography. You encrypt the secret on your laptop using a public key, and only the controller running inside your CoolVDS cluster (which holds the private key) can decrypt it.

# Client side encryption
kubeseal --format=yaml < secret.yaml > sealed-secret.yaml

You can safely commit sealed-secret.yaml to a public repo. Even if someone clones it, they cannot read your database credentials.
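
To put the whole flow together, a minimal example looks like this (the secret name and key are made up for illustration); note that the intermediate secret.yaml must never be committed:

# Generate a plain Secret manifest locally without touching the cluster
kubectl create secret generic db-credentials \
  --namespace production \
  --from-literal=password='CHANGE-ME' \
  --dry-run -o yaml > secret.yaml

# Encrypt it against the controller's public key, then commit only the sealed version
kubeseal --format=yaml < secret.yaml > sealed-secret.yaml
rm secret.yaml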

Why Infrastructure Choice Dictates Success

You can have the most elegant GitOps workflow in the world, but if the underlying virtualization layer introduces latency or packet loss, your synchronization will suffer. We regularly see etcd timeouts in the logs of clients migrating away from budget providers; etcd is incredibly sensitive to disk write latency.

At CoolVDS, we don't just throw "SSD" on the spec sheet. We use enterprise-grade NVMe storage with high queue depths specifically to handle the I/O patterns of Kubernetes and databases. When you are serving traffic to Oslo, every millisecond of latency at the disk layer eventually bubbles up to the user.
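
If you want to verify this yourself before trusting a provider, the fio benchmark commonly used by the etcd community is a decent proxy: it mimics etcd's small, fdatasync-heavy WAL writes. Run it against the disk that will hold /var/lib/etcd; the 99th percentile fdatasync latency should stay comfortably below 10 ms.

# The benchmark directory is a placeholder; point it at the target disk
mkdir -p /var/lib/etcd-bench
fio --rw=write --ioengine=sync --fdatasync=1 \
  --directory=/var/lib/etcd-bench --size=22m --bs=2300 --name=etcd-bench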

Final Thoughts

The transition to GitOps requires a mental shift. You must resist the urge to hot-fix production manually. But the payoff is immense: a self-healing infrastructure that documents its own history. For European companies navigating the complexities of data privacy and uptime requirements, this isn't just a trend; it's the new baseline for professional operations.

Ready to build a cluster that doesn't keep you awake at night? Don't let slow I/O kill your API server. Deploy a high-performance KVM instance on CoolVDS in 55 seconds and see the difference dedicated resources make.