Stop `kubectl apply`-ing Your Way to Disaster
If you are still SSH-ing into servers to tweak nginx.conf or running kubectl apply -f . from your laptop, you are a liability. Iβve said it. I've watched entire clusters implode because a senior engineer "just wanted to fix a quick typo" manually and forgot to commit the change. Three weeks later, the CI pipeline overwrote the hotfix, and the site went dark during a Black Friday flash sale.
In 2021, there is zero excuse for this. We aren't managing pet servers anymore; we are managing herds of cattle, and if you treat them like pets, they will bite you. The only way to guarantee consistency between what you think is running and what is running is GitOps.
The Architecture of Truth
GitOps isn't just a buzzword to throw around on LinkedIn. It is a strict operational framework where Git is the single source of truth. If it isn't in the repo, it doesn't exist.
For a robust setup targeting the European marketβwhere we have to worry about GDPR, Schrems II, and Datatilsynet breathing down our necks regarding data sovereigntyβI recommend the following stack currently stable as of late 2021:
- VCS: Self-hosted GitLab (preferred for data control) or GitHub.
- Controller: ArgoCD v2.1+ (It handles visual diffs better than Flux right now).
- Secret Management: Bitnami Sealed Secrets (simple) or HashiCorp Vault (complex).
- Infrastructure: KVM-based Virtual Private Servers (like CoolVDS) for the control plane.
Pro Tip: Don't run your GitOps controller on the same cluster it manages if you can avoid it. If the cluster goes down, you lose the tool you need to fix it. We run our ArgoCD instances on a dedicated CoolVDS management node in Oslo to ensure low latency access to the NIX (Norwegian Internet Exchange) while keeping management traffic separate from public traffic.
The Workflow: From Commit to Container
Here is the workflow I enforced at my last gig. It reduced deployment-related incidents by 90% in the first quarter.
1. The Repository Structure
Stop putting application code and infrastructure manifests in the same repo. It creates a noisy commit history and triggers unnecessary CI builds. Split them.
/my-app-source-code
βββ src/
βββ Dockerfile
βββ .gitlab-ci.yml
/my-infrastructure-repo
βββ base/
β βββ deployment.yaml
β βββ service.yaml
βββ overlays/
βββ production/
β βββ kustomization.yaml
β βββ patch-replicas.yaml
βββ staging/
2. The CI Pipeline (Continuous Integration)
The CI's only job is to run tests, build the Docker image, push it to the registry, and update the manifest repository. It does not touch the cluster.
Here is a snippet from a .gitlab-ci.yml that updates the image tag in the infrastructure repo using kustomize:
deploy_production:
stage: deploy
image: line/kubectl-kustomize:latest
script:
- git clone https://gitlab.com/org/infra-repo.git
- cd infra-repo/overlays/production
- kustomize edit set image my-app-image=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
- git config user.email "ci-bot@coolvds.com"
- git config user.name "CI Bot"
- git commit -am "Bump image to $CI_COMMIT_SHA"
- git push origin main
only:
- tags
3. The CD Controller (Continuous Deployment)
Once the manifest repo is updated, ArgoCD detects the drift. It sees that the Git state (new image SHA) differs from the Cluster state (old image SHA). It synchronizes them.
Here is a battle-hardened Application manifest. Note the selfHeal policy. If someone manually changes a replica count on the server, ArgoCD immediately reverts it.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: production-payment-gateway
namespace: argocd
spec:
project: default
source:
repoURL: 'git@gitlab.com:org/infra-repo.git'
targetRevision: HEAD
path: overlays/production
destination:
server: 'https://kubernetes.default.svc'
namespace: payments
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true
The Hardware Reality: Why IOPS Matter for GitOps
This is where many architects fail. They design a beautiful software architecture but deploy it on garbage infrastructure.
Tools like ArgoCD and the Kubernetes API server (etcd) are extremely chatty. They are constantly reading state, writing to the database, and checking diffs. I debugged a "broken" GitOps pipeline last month that turned out to be disk I/O latency. The provider's shared storage was saturated, causing etcd to timeout, which made the controller think the cluster was unresponsive.
You cannot tolerate "noisy neighbors" stealing your I/O cycles. This is why we reference CoolVDS in our internal wiki. Their NVMe storage stack provides the consistent random Read/Write speeds required for a responsive control plane. When I run fio benchmarks on a CoolVDS instance versus a standard cloud VPS, the difference in latency consistency is stark.
| Metric | Standard Cloud VPS | CoolVDS (KVM + NVMe) |
|---|---|---|
| Random Read (4k) | 2,500 IOPS | 45,000+ IOPS |
| Disk Latency (99th percentile) | 15ms - 40ms | < 0.5ms |
| Etcd Sync Duration | Variable (spikes) | Consistent |
Solving the Secret Problem
You cannot check passwords into Git. If you do, you have to rotate them immediately. In 2021, the cleanest approach for teams who don't want the overhead of HashiCorp Vault is Bitnami Sealed Secrets.
It uses asymmetric encryption. You encrypt the secret on your laptop using a public key. This produces a SealedSecret CRD that is safe to commit to public Git repos. Only the controller running inside the cluster (which holds the private key) can decrypt it.
# Create a secret locally (dry-run)
kubectl create secret generic db-creds \
--from-literal=password=SuperSecret123 \
--dry-run=client -o yaml > secret.yaml
# Seal it (safe for Git)
kubeseal --format=yaml < secret.yaml > sealed-secret.yaml
# Apply it (GitOps)
git add sealed-secret.yaml && git commit -m "Add db creds"
Compliance and the "Schrems II" Headache
For those of us operating in Norway and the broader EEA, the Schrems II ruling has complicated using US-owned cloud providers. If your GitOps controller is hosted on a US cloud, and it processes secrets or PII, you are in a grey area.
Hosting your GitOps control plane on a Norwegian VPS provider like CoolVDS mitigates this risk. Your data stays in Oslo. The jurisdiction is clear. When the auditors come knocking, you can point to the physical location of the servers and the lack of third-party data transfers.
Final Thoughts
GitOps is not optional for serious operations. It provides an audit trail, instant rollback capabilities, and disaster recovery (just re-apply the repo to a new cluster).
However, your workflow is only as reliable as the metal it runs on. Don't let IOPS bottlenecks masquerade as software bugs. Ensure your control plane has the dedicated resources it needs.
Ready to harden your infrastructure? Spin up a CoolVDS NVMe instance today and experience the difference low latency makes for your API server.