GitOps in Production: Stop Manual Kubectl Deployments Before You Break Prod
If you are still SSHing into your production servers to run `docker pull` or, worse, running `kubectl apply -f .` from your laptop, you are a ticking time bomb. I’ve seen it happen too many times. A developer hot-patches a ConfigMap to fix a "critical bug" at 2 AM. Two weeks later, the cluster autoscales, the pod restarts, and the patch vanishes. The site goes down. The logs are useless because the change was never committed.
It is 2018. We have better ways to handle this. The industry is coalescing around a concept Weaveworks calls GitOps. It is not just a buzzword; it is the only sane way to manage distributed systems at scale.
At its core, GitOps forces a simple rule: Git is the single source of truth. If it is not in the repo, it does not exist in the cluster. Period.
The Architecture of Truth
In a traditional push-based pipeline (like Jenkins jobs of yore), the CI server has the "keys to the kingdom." It builds the artifact and pushes it to the server. If the CI server is compromised, your production environment is toast. Furthermore, the CI server doesn't know if the actual state of the cluster matches the desired state. It only knows it ran a script.
GitOps flips this. You use an operator inside the cluster (like Weave Flux) that pulls changes.
- Code: Developer pushes code to Git.
- Build: CI (GitLab CI, CircleCI) builds the Docker image and pushes it to a private registry.
- Config Update: The CI (or a developer) updates the deployment manifest in a separate config repository with the new image tag.
- Sync: The cluster operator detects the change in the config repo and applies it.
This separates the build process from the deployment process. It is safer. It is cleaner. And it provides an audit trail that makes compliance auditors in the EU very happy.
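To make the pull side concrete, here is a minimal sketch of what the in-cluster operator can look like as a Deployment, assuming Flux v1 and a hypothetical config repository URL. The SSH deploy key Secret and the RBAC objects are omitted, and the flag names and image tag should be checked against the Flux release you actually run.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flux
  namespace: flux
spec:
  replicas: 1
  selector:
    matchLabels:
      app: flux
  template:
    metadata:
      labels:
        app: flux
    spec:
      serviceAccountName: flux   # needs RBAC permission to apply manifests (not shown)
      containers:
      - name: flux
        image: quay.io/weaveworks/flux:1.8.1
        args:
        # Hypothetical config repo -- the only Git credential the cluster
        # needs is a deploy key for this repository.
        - --git-url=git@git.example.com:acme/k8s-config.git
        - --git-branch=master
        - --git-path=production
        - --git-poll-interval=1m   # how often to check Git for new commits
        - --sync-interval=5m       # how often to re-apply the desired state
```

The important part is the trust direction: the CI system never holds cluster credentials; the cluster only holds a key for the config repo.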
Implementing the Workflow
Let's look at how this works in practice. We assume you are running a Kubernetes cluster (version 1.10+ recommended). For the underlying infrastructure, you need raw compute stability. Containers add abstraction overhead; running them on oversold shared VPS infrastructure is asking for I/O waits. We use CoolVDS KVM instances for our worker nodes because the dedicated resources ensure the kubelet doesn't time out due to noisy neighbors.
1. The Container Build
First, optimize your build. We are seeing a lot of bloat in images lately. Use multi-stage builds (available since Docker 17.05) to keep your runtime artifacts small.
```dockerfile
# Build stage: compile a static binary
FROM golang:1.11-alpine AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o main .

# Runtime stage: ship only the binary on a minimal base
FROM alpine:3.8
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /app/main .
CMD ["./main"]
```
2. The CI Pipeline
In your `.gitlab-ci.yml`, do not deploy. Just build and tag. Here is a stripped-down example of what a 2018-era pipeline stage looks like:
```yaml
stages:
  - build
  - release

docker_build:
  stage: build
  image: docker:stable
  services:
    - docker:dind
  script:
    - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN $CI_REGISTRY
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
```
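The `release` stage declared above is where the config repo gets bumped. Here is a sketch, assuming a hypothetical `k8s-config` repository and a `CONFIG_REPO_TOKEN` CI variable with push rights; Flux's own image-update automation can also handle this step for you.

```yaml
update_manifest:
  stage: release
  image: alpine:3.8
  only:
    - master
  script:
    - apk add --no-cache git
    - git clone https://oauth2:${CONFIG_REPO_TOKEN}@git.example.com/acme/k8s-config.git
    - cd k8s-config
    # Point the Deployment at the image we just pushed
    - sed -i "s|payment-service:.*|payment-service:${CI_COMMIT_SHA}|" production/payment-deployment.yaml
    - git config user.email "ci@example.com"
    - git config user.name "GitLab CI"
    - git commit -am "Deploy payment-service ${CI_COMMIT_SHA}"
    - git push origin master
```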
3. The Deployment Manifest
This is where the magic happens. In a separate repository (infrastructure-as-code), you define your state. This is what the cluster monitors.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payment
  template:
    metadata:
      labels:
        app: payment
    spec:
      containers:
      - name: payment
        # The tag below is updated automatically by your CI or Flux
        image: registry.coolvds.com/payment-service:a1b2c3d
        ports:
        - containerPort: 8080
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
            cpu: "500m"
```
Pro Tip: Always define resource limits. If you don't, a memory leak in one pod can trigger the kernel's OOM (Out of Memory) killer and take down the entire node. On CoolVDS, the VM's resources are enforced strictly at the hypervisor level, but Kubernetes needs to know about per-pod limits to schedule effectively.
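If you cannot guarantee that every manifest sets limits, a LimitRange in the namespace supplies defaults for containers that omit them. A minimal sketch; the numbers are placeholders, not recommendations:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
  - type: Container
    defaultRequest:      # applied when a container omits resources.requests
      cpu: 100m
      memory: 64Mi
    default:             # applied when a container omits resources.limits
      cpu: 500m
      memory: 256Mi
```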
The Norwegian Context: Latency and Law
Why does infrastructure location matter in a GitOps workflow? Two reasons: Sync Latency and GDPR.
When you update your Git repository, your cluster pulls the new image. If your registry is in the US and your nodes are in Oslo, you are dragging every image layer across the Atlantic. That inflates your Mean Time to Recovery (MTTR). By keeping your container registry and your worker nodes in the same datacenter, preferably peered directly at NIX (Norwegian Internet Exchange), you cut pull times drastically.
Furthermore, GDPR is in full effect as of May this year. Datatilsynet is not to be trifled with. If you are mounting PersistentVolumes (PVs) to your pods, that data must reside within the legal framework you promised your customers. Using a US-based cloud provider introduces complexities regarding the Cloud Act. Hosting on CoolVDS in Norway simplifies this compliance architecture significantly. Your data stays here.
Handling "Drift"
The beauty of GitOps is drift detection. If someone manually changes the replica count to 5 using `kubectl scale`, the operator (Weave Flux, for example) notices on its next sync that the Git repo says 3 and scales it back down.
This enforces discipline. No more "cowboy engineering." If you want to scale, you make a Pull Request. The team reviews it. You merge it. The cluster updates.
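You can watch this happen yourself, assuming the payment-service Deployment above and a sync interval of a few minutes:

```bash
# Drift the cluster away from Git by hand (illustration only)
kubectl -n production scale deployment payment-service --replicas=5

# The change sticks only until the operator's next sync, at which point the
# manifest in Git (replicas: 3) is re-applied and the count drops back to 3.
kubectl -n production get deployment payment-service --watch
```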
Comparison: Push vs. Pull
| Feature | CIOps (Jenkins Push) | GitOps (Cluster Pull) |
|---|---|---|
| Security Credentials | CI Server has Root Cluster Access | Cluster has Read-Only Git Access |
| Drift Detection | None (Fire and Forget) | Constant (Self-Healing) |
| Disaster Recovery | Complex Re-run of jobs | `kubectl apply -f git-repo` |
Conclusion
Automation is not about being lazy. It is about being consistent. In 2018, manual server management is a professional negligence risk. By adopting GitOps, you gain an audit trail, automated recovery, and a stronger security posture.
However, your orchestration is only as good as the metal it runs on. A Kubernetes node in a NotReady state because the underlying storage is thrashing will break your pipeline regardless of how clean your Git history is.
For your next cluster, ensure you are building on high-performance NVMe infrastructure that guarantees the IOPS your etcd and workloads demand. Deploy a CoolVDS instance today and stop fighting your infrastructure.