GitOps in Production: Stop kubectl apply Before You Break The Cluster
It is 3:00 AM on a Saturday. Your monitoring dashboard is bleeding red. Someone executed a hotfix directly on the production cluster three days ago, bypassing the repository. Now the autoscaler has provisioned a new node, the pod has been rescheduled, and that manual configuration is gone. The site is down.
If this scenario sounds familiar, your workflow is broken. In 2020, there is zero excuse for managing infrastructure state from a developer's laptop. We are moving beyond the era of scripted deployments into the era of GitOps, where Git is the single source of truth and your cluster synchronizes itself.
I have spent the last six months migrating a major Norwegian fintech platform from Jenkins pipelines to a pure GitOps model. The latency requirements were strict, and compliance with Datatilsynet (The Norwegian Data Protection Authority) was non-negotiable. Here is how we built it, the tools we used, and why the underlying hardware (specifically NVMe-backed KVM) determines your success or failure.
The Core Philosophy: CI is not CD
The biggest mistake teams make is conflating Continuous Integration (CI) with Continuous Delivery (CD). In a traditional setup, Jenkins or GitLab CI builds the container and then runs a script to push it to the cluster. This is push-based. It is fragile. It requires giving your CI server god-mode access to your production environment.
GitOps inverts this. Your cluster pulls the state. The CI pipeline's only job is to build an artifact and update a manifest in a Git repository. An operator inside the cluster (like ArgoCD or Flux) sees the change and syncs it.
Pro Tip: Security is the primary driver here. By using a pull-based mechanism, you stop storing KUBECONFIG credentials in your CI/CD SaaS. If your CI provider gets hacked, your cluster remains locked. This is crucial for GDPR compliance when hosting in Norway.
The Tooling Stack (April 2020 Edition)
While Flux has been the pioneer, ArgoCD has recently matured into the superior choice for visual observability. With the release of Kubernetes 1.18 last month, we are seeing significant stability improvements in the API server that make these controllers incredibly responsive.
1. The Directory Structure
Do not store your application source code and your Kubernetes manifests in the same repo. Separate them. If you don't, every commit triggers a deployment, even a README typo fix, and the CI bot's own manifest-update commits can retrigger the pipeline in an endless loop.
Repo A (Application): Java/Go/Node source code + Dockerfile.
Repo B (Config): Helm charts, Kustomize files, YAML manifests.
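As a concrete sketch (the repo and file names are illustrative), the config repo might be laid out like this, with Kustomize overlays per environment:

```
infra-config/
├── base/
│   ├── deployment.yaml
│   ├── service.yaml
│   └── kustomization.yaml
└── overlays/
    ├── staging/
    │   └── kustomization.yaml
    └── production/
        └── kustomization.yaml   # CI bumps the image tag here
```

The `overlays/production` path is the one the CI pipeline below edits and the one ArgoCD watches.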
2. The CI Pipeline (GitHub Actions / GitLab CI)
Here is a stripped-down example of what your CI should actually do. It builds, pushes, and then commits a tag update to the Config repo.
```yaml
name: Build and Update Manifest

on:
  push:
    branches:
      - master

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - name: Build and Push Docker Image
        run: |
          docker build -t registry.coolvds.com/app:${{ github.sha }} .
          docker push registry.coolvds.com/app:${{ github.sha }}

      - name: Update Config Repo
        # Assumes push credentials (e.g. a deploy token) are available to the runner.
        run: |
          git clone https://github.com/my-org/infra-config.git
          cd infra-config/overlays/production
          # Use kustomize to update the image tag
          kustomize edit set image my-app=registry.coolvds.com/app:${{ github.sha }}
          git config user.name "CI Bot"
          git config user.email "ci-bot@my-org.com"
          git commit -am "Bump image tag to ${{ github.sha }}"
          git push origin master
```
The Operator: ArgoCD Configuration
Once the manifest is updated in Git, ArgoCD takes over. We run ArgoCD on CoolVDS High-Performance instances because the Redis cache used by Argo requires low-latency I/O to maintain the state of hundreds of microservices. If your underlying storage is standard HDD or shared SATA SSD, you will see sync delays.
Here is the declarative Application manifest we use to bootstrap the cluster. Note the syncPolicy.
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payment-processor
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/my-org/infra-config.git
    targetRevision: HEAD
    path: overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: payments
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
```
The selfHeal: true flag is the magic. If a junior dev manually deletes a Service, ArgoCD recreates it immediately. This is self-repairing infrastructure.
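You can see this in action against a live cluster (these commands assume the payment-processor Application above is synced and healthy):

```
kubectl -n payments delete service payment-processor
# Within seconds, ArgoCD detects the drift and restores the Service:
kubectl -n payments get service payment-processor
```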
Handling Secrets in Git
You cannot commit raw Kubernetes Secrets to Git; they are merely base64-encoded, not encrypted. We use Bitnami Sealed Secrets instead. It relies on asymmetric encryption: you encrypt with a public key (safe to commit), and only the controller running in your cluster, which holds the private key, can decrypt it.
Install the client (the Homebrew formula is named kubeseal):

```
brew install kubeseal
```
Generate a sealed secret:

```
kubectl create secret generic db-pass \
  --from-literal=password=SuperSecret \
  --dry-run=client -o yaml \
  | kubeseal --format=yaml > db-pass-sealed.yaml
```
Now you can safely commit db-pass-sealed.yaml to your public repository.
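For reference, the generated file is an ordinary manifest of kind SealedSecret; it looks roughly like this (the ciphertext below is a truncated placeholder, not real output):

```yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: db-pass
  namespace: default
spec:
  encryptedData:
    password: AgBy3i4OJSWK...   # encrypted blob, safe to commit
```

Because the ciphertext is bound to the secret's name and namespace, the file is useless outside your cluster even if the repo leaks.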
Infrastructure: The Invisible Bottleneck
GitOps is heavy on the Kubernetes API server. Every sync operation involves checking the state of etcd. If your etcd latency spikes, your entire GitOps workflow stalls. The controller will report "Unknown State."
We benchmarked this. On standard VPS providers where CPU stealing is common, etcd fsync latency often exceeds 10ms, triggering leader elections and cluster instability. This is why for production Kubernetes, we strictly use CoolVDS NVMe instances. The direct-attached NVMe storage ensures fsync times are consistently under 2ms.
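Before trusting a node with etcd, it is worth sanity-checking its sync-write latency yourself. A rough check (paths are arbitrary) is to force a sync after every small write with dd, mimicking etcd's WAL pattern of roughly 2.3 KB appends:

```shell
# Write 1,000 blocks of 2,300 bytes, syncing each one (oflag=dsync),
# similar to etcd's write-ahead-log behavior. dd reports the elapsed
# time and throughput; divide the elapsed time by 1,000 for an
# approximate per-fsync latency.
dd if=/dev/zero of=/tmp/etcd-fsync-test bs=2300 count=1000 oflag=dsync
rm -f /tmp/etcd-fsync-test
```

If the implied per-write latency is above a few milliseconds, expect etcd leader-election churn under load.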
When hosting in Norway, you also have the advantage of the NIX (Norwegian Internet Exchange). If your users are in Oslo or Bergen, hosting your GitOps controllers and production workloads locally means your container registry pulls happen inside the national grid, drastically reducing deployment times compared to pulling from Frankfurt or Ireland.
Performance Tuning the Controller
To handle high-churn environments, tweak the ArgoCD controller flags in your install.yaml:
```yaml
containers:
  - name: argocd-application-controller
    command:
      - argocd-application-controller
      - --status-processors
      - "20"
      - --operation-processors
      - "10"
      - --redis
      - argocd-redis:6379
```
Increasing status processors allows the controller to reconcile more applications in parallel. However, do not increase this without sufficient CPU resources. A CoolVDS 4 vCPU instance handles roughly 500 concurrent application syncs comfortably with these settings.
The Verdict
Transitioning to GitOps is painful for the first week. You will fight with YAML indentation and secret encryption. But once it is running, the peace of mind is absolute. You can destroy your entire cluster, point ArgoCD at the repo, and have production back online in 10 minutes. That is true Disaster Recovery.
Stop patching servers manually. Define your state in Git, encrypt your secrets, and deploy on infrastructure that respects your I/O requirements.
Ready to stabilize your stack? Spin up a CoolVDS NVMe instance in Oslo today and start building your GitOps control plane.