GitOps Workflows: Stop SSH-ing Into Production

If I catch you running kubectl apply -f from your laptop against a production cluster, we are going to have a problem. It is late 2019. We have moved past the era of "Cowboy DevOps." I have seen too many platforms crash because a junior dev manually patched a config map, went home, and left the night shift to deal with the configuration drift when the autoscaler kicked in.

The concept is simple, yet implementation is painful: Git is the single source of truth. If it is not in the repo, it does not exist in the cluster. This is GitOps.

In this guide, we are going to build a reconciliation loop that actually works, using the current stable stack: Kubernetes 1.16, ArgoCD, and GitLab CI. We will also discuss why underlying hardware performance (IOPS) is the silent killer of CD controllers.

The Architecture: Pull vs. Push

For years, we relied on Jenkins pipelines to "push" changes. The CI server had the cluster credentials. This is a security nightmare. If your Jenkins instance gets compromised, your entire infrastructure is open.

The GitOps "Pull" model reverses this. An agent inside the cluster (the Controller) watches the Git repository. When it detects a change (a commit), it pulls the manifest and applies it. No cluster credentials ever leave the secure environment.

The Stack for Late 2019

  • Infrastructure: CoolVDS NVMe KVM Instances (Oslo Datacenter).
  • Orchestration: Kubernetes 1.16 (No more extensions/v1beta1, update your manifests!).
  • CD Controller: ArgoCD v1.2.
  • CI: GitLab CI (Self-hosted or SaaS).

Step 1: The Infrastructure Layer

GitOps relies heavily on the control plane. Your CD controller (ArgoCD) is constantly hashing the state of your Git repo against the live state of your cluster. This is CPU and I/O intensive.

Pro Tip: Do not put your control plane on cheap, oversold VPS hosting. I use CoolVDS because they offer true KVM virtualization. When ArgoCD starts reconciling 50+ microservices, shared CPU steal time on lesser providers will cause your sync operations to time out. You need dedicated cycles.

Furthermore, if you are operating in Norway, latency to the repo matters. With CoolVDS peering directly at NIX (Norwegian Internet Exchange), the round-trip time for pulling large manifests is negligible.

Step 2: The Application Manifest

Let's look at a standard stateless application deployment. Note the API version. In Kubernetes 1.16, deprecated APIs are finally being rejected. If you are migrating legacy manifests, this will break.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend-api
  namespace: production
  labels:
    app: backend-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: backend-api
  template:
    metadata:
      labels:
        app: backend-api
    spec:
      containers:
      - name: api
        image: registry.coolvds.com/my-org/backend:v1.4.2
        ports:
        - containerPort: 8080
        resources:
          requests:
            memory: "128Mi"
            cpu: "250m"
          limits:
            memory: "256Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 3
          periodSeconds: 3
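
Before ArgoCD ever syncs legacy manifests, it is cheap to fail the pipeline on API groups that 1.16 no longer serves. A minimal lint job for the manifest repository's own pipeline, sketched under the assumption that the YAML lives under k8s/ (paths and stage names are yours to adjust):

check-deprecated-apis:
  stage: test
  image: alpine:3.10
  script:
    # Fail fast if any manifest still references API groups removed in Kubernetes 1.16.
    - |
      if grep -rnE "apiVersion: (extensions/v1beta1|apps/v1beta1|apps/v1beta2)" k8s/; then
        echo "Deprecated apiVersion found - migrate to apps/v1 before syncing."
        exit 1
      fi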

Step 3: Configuring the CD Controller

We are using ArgoCD here. It is rapidly outpacing Flux v1 because of its UI and easier visualization of dependencies. However, do not get addicted to the UI. The configuration of Argo itself should be declarative.

Here is how we define the "Application" CRD to tell ArgoCD what to watch:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: production-backend
  namespace: argocd
spec:
  project: default
  source:
    repoURL: git@gitlab.com:my-org/infra-manifests.git
    targetRevision: HEAD
    path: k8s/production/backend
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

Critical Analysis: Look at selfHeal: true. This is the magic. If a developer manually changes the replica count on the cluster to fix a load issue, ArgoCD will immediately detect the drift and revert it back to the state defined in Git. This forces discipline. If you want to scale, you commit the change to Git.
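
One caveat before you turn selfHeal loose everywhere: if a HorizontalPodAutoscaler owns the replica count, ArgoCD and the HPA will fight each other. ArgoCD has an ignoreDifferences field for exactly this case; a sketch, assuming your ArgoCD release supports it:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: production-backend
  namespace: argocd
spec:
  project: default
  source:
    repoURL: git@gitlab.com:my-org/infra-manifests.git
    targetRevision: HEAD
    path: k8s/production/backend
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
  ignoreDifferences:
  - group: apps
    kind: Deployment
    jsonPointers:
    - /spec/replicas   # let the HPA own the replica count; drift here is not reverted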

Step 4: The CI Pipeline (Build & Tag)

The CI system (GitLab/Jenkins) should never touch the cluster. Its only job is to build the Docker image, push it to the registry, and then update the manifest repository with the new tag.

Here is a snippet for .gitlab-ci.yml using kaniko (which is safer than Docker-in-Docker):

build:
  stage: build
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  script:
    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"$CI_REGISTRY_USER\",\"password\":\"$CI_REGISTRY_PASSWORD\"}}}" > /kaniko/.docker/config.json
    - /kaniko/executor --context $CI_PROJECT_DIR --dockerfile $CI_PROJECT_DIR/Dockerfile --destination $CI_REGISTRY_IMAGE:$CI_COMMIT_TAG
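
The other half of the contract is the tag bump. A rough sketch of a follow-up job that clones the manifest repo, rewrites the image tag, and pushes; MANIFEST_REPO_TOKEN is an assumed CI variable holding a write token, and deployment.yaml is the assumed file name in the manifest repo:

update-manifests:
  stage: deploy
  image: alpine/git
  script:
    # Clone the manifest repo, bump the image tag, push. ArgoCD picks up the new commit.
    - git clone "https://oauth2:${MANIFEST_REPO_TOKEN}@gitlab.com/my-org/infra-manifests.git"
    - cd infra-manifests/k8s/production/backend
    - sed -i "s|backend:.*|backend:${CI_COMMIT_TAG}|" deployment.yaml
    - git config user.email "ci@example.com"
    - git config user.name "gitlab-ci"
    - git commit -am "Bump backend image to ${CI_COMMIT_TAG}"
    - git push origin master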

Compliance and Data Sovereignty

We are seeing stricter enforcement of GDPR locally. The Datatilsynet (Norwegian Data Protection Authority) is watching. While code manifests usually don't contain PII (Personally Identifiable Information), the Secrets often managed alongside them do.

Never commit raw secrets to Git. In 2019, the standard approach is Sealed Secrets (by Bitnami) or Mozilla SOPS. These tools encrypt the secret locally. You commit the encrypted blob. The controller inside the cluster (running on your secure CoolVDS instance) has the private key to decrypt it.
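
For reference, this is roughly what a committed Sealed Secret looks like; the ciphertext below is a placeholder, and the real blob comes from piping a plain Secret through the kubeseal CLI against your cluster's controller:

apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: backend-db-credentials
  namespace: production
spec:
  encryptedData:
    DB_PASSWORD: AgBy8hCi...   # placeholder ciphertext; safe to commit, only the in-cluster controller can decrypt it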

Feature             | Push (Jenkins/GitLab CI)        | Pull (GitOps/ArgoCD)
Cluster Credentials | Stored in CI server (risky)     | Stay inside cluster (secure)
Drift Detection     | No                              | Yes (instant reversion)
Disaster Recovery   | Manual redeploy                 | kubectl apply -f gitops-repo
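
The disaster recovery row deserves one concrete sketch. If the cluster burns down, you reinstall ArgoCD and apply a single root Application that points at a directory of Application manifests (the app-of-apps pattern). Assumed layout: your Application definitions live under argocd/apps in the same manifest repo.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root
  namespace: argocd
spec:
  project: default
  source:
    repoURL: git@gitlab.com:my-org/infra-manifests.git
    targetRevision: HEAD
    path: argocd/apps   # directory of Application manifests (assumed layout)
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true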

Why Performance Matters for GitOps

I recently audited a setup where the GitOps agent was timing out. The culprit? Slow disk I/O on the etcd datastore. Kubernetes is chatty. When you add a GitOps controller scanning hundreds of objects every minute, you generate significant IOPS.

This is where standard VPS providers fail. They throttle your IOPS. For a robust GitOps implementation, I recommend deploying on CoolVDS NVMe instances. The low latency storage ensures etcd remains stable, and the reconciliation loops happen instantly. If your infrastructure is sluggish, your "automated" deployments become a bottleneck.
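
Hardware aside, do not let the controller itself get starved. A hedged sketch of a kustomize strategic-merge patch that gives the repo server dedicated resources on top of the upstream install manifests; the component and container names follow the stock install, but verify them against your release:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: argocd-repo-server
  namespace: argocd
spec:
  template:
    spec:
      containers:
      - name: argocd-repo-server
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "1"
            memory: "1Gi"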

Final Thoughts

GitOps is not about tools; it is about workflow. It forces a contract between Dev and Ops. If you are still manually editing YAML on the server, stop. It takes 10 minutes to spin up a new CoolVDS instance and bootstrap a clean K8s cluster. Do it right, or don't do it at all.