Operationalizing GitOps: Surviving the GDPR Crunch with Immutable Infrastructure
It is May 16, 2018. Look at your calendar. You have exactly nine days until General Data Protection Regulation (GDPR) enforcement begins. If you are still SSH-ing into production servers to "fix a quick config issue" in /etc/nginx/sites-available, you are not just risking downtime; you are walking into a compliance minefield.
The concept of GitOps, using Git as the single source of truth for declarative infrastructure, is not just a trend pushed by Weaveworks. It is currently the only sane way to manage the complexity of distributed systems while maintaining the audit trail Datatilsynet (the Norwegian Data Protection Authority) will demand if you ever suffer a breach.
I have spent the last month migrating a high-traffic e-commerce platform in Oslo from a chaotic mess of Ansible scripts and manual hotfixes to a strict GitOps workflow on Kubernetes 1.10. Here is how we did it, why storage latency nearly killed the project, and why your choice of infrastructure matters more than your CI tool.
The Core Principle: If It's Not in Git, It Doesn't Exist
In a traditional ops setup, a sysadmin might bump max_connections on a MySQL server directly to handle a load spike. The problem? The next time the server reboots or the node is replaced by autoscaling, that change is lost. Worse, nobody knows who changed it or why.
With GitOps, the workflow changes:
- Declare: You change the config in a YAML file in a Git repository.
- Review: A pull request is created. A senior engineer reviews the diff.
- Merge: The change is merged to the master branch.
- Sync: An automated agent (a CI pipeline or a cluster operator) detects the change and applies it to the cluster.
Pro Tip: Stop using kubectl edit. It encourages "config drift." Disable write access to the cluster for individual developers; CI/CD should be the only user with admin privileges.
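Enforcing that rule is an RBAC exercise. Here is a minimal sketch of read-only access for humans, assuming your engineers authenticate as members of a group called developers (the group name and resource list are our own example, adjust to your identity setup):

# Read-only access for humans; only the CI service account keeps write access.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: developer-read-only
rules:
- apiGroups: ["", "apps", "extensions"]
  resources: ["pods", "pods/log", "deployments", "replicasets", "services", "configmaps"]
  verbs: ["get", "list", "watch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: developers-read-only
subjects:
- kind: Group
  name: developers            # assumption: your OIDC/LDAP group for engineers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: developer-read-only
  apiGroup: rbac.authorization.k8s.io

The CI runner's service account keeps its elevated role; everyone else gets get, list, and watch, and nothing more.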
The Pipeline: GitLab CI + Kubernetes
For this architecture, we rely on GitLab CI (widely used across Europe for its robust built-in registry) to drive the state. While tools like Weave Flux are gaining traction for the "pull" model, a "push" model via CI is often easier for teams to adopt immediately.
Here is a stripped-down version of a .gitlab-ci.yml that builds a container and updates a deployment. Note the use of Docker 18.03 multi-stage builds to keep images small.
1. The Build Stage
image: docker:18.03.0-ce

services:
  - docker:dind

stages:
  - build
  - deploy

build_image:
  stage: build
  script:
    - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN $CI_REGISTRY
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
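The multi-stage Dockerfile itself is not shown in this excerpt, but the pattern is simple. A rough sketch, assuming a Go service (the builder image and binary name are placeholders for your own stack):

# Stage 1: build with the full toolchain
FROM golang:1.10-alpine AS builder
WORKDIR /go/src/payment-gateway
COPY . .
RUN go build -o /payment-gateway .

# Stage 2: ship only the compiled binary on a minimal base
FROM alpine:3.7
RUN apk add --no-cache ca-certificates
COPY --from=builder /payment-gateway /payment-gateway
EXPOSE 8080
ENTRYPOINT ["/payment-gateway"]

The final image contains the binary and little else, which keeps pulls fast and the attack surface small.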
2. The Deployment Manifest
You cannot just script imperative commands; you need declarative YAML. Here is a standard Deployment for a stateless app. Pay attention to the resource requests and limits: without them, a single memory leak can starve every other workload on the node.
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: payment-gateway
  labels:
    app: payment-gateway
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payment-gateway
  template:
    metadata:
      labels:
        app: payment-gateway
    spec:
      containers:
      - name: payment-gateway
        # IMAGE_TAG is a placeholder; the deploy job substitutes the commit SHA (see below)
        image: registry.gitlab.com/org/repo:IMAGE_TAG
        ports:
        - containerPort: 8080
        resources:
          requests:
            memory: "128Mi"
            cpu: "250m"
          limits:
            memory: "256Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 20
3. The Sync Command
In your CI deploy stage, you do not run an imperative script; you apply the desired state. We use a simple sed substitution to swap the IMAGE_TAG placeholder for the commit SHA before applying.
deploy_production:
  stage: deploy
  image: google/cloud-sdk:alpine
  script:
    - echo "$KUBE_CONFIG" > /tmp/kubeconfig
    - export KUBECONFIG=/tmp/kubeconfig
    - sed -e "s/IMAGE_TAG/$CI_COMMIT_SHA/" deployment.yaml | kubectl apply -f -
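One refinement we recommend (our own addition, not something GitLab requires): make the job block until the rollout actually completes, so a crash-looping image fails the pipeline instead of reporting a false green.

    # Append to the script above: wait for the new ReplicaSet to become ready
    - kubectl rollout status deployment/payment-gateway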
The Infrastructure Bottleneck: Why etcd Needs NVMe
This is where theory meets the harsh reality of hardware. Kubernetes relies on etcd as its brain. Every time you update a deployment, every time a pod changes state or an event is recorded, etcd commits the change to disk, and etcd is extremely sensitive to disk write latency.
In our initial testing on a budget VPS provider (spinning disks, shared IOPS), we saw frequent leader elections. The control plane would stall because the disk could not commit etcd's write-ahead log fast enough.
The etcd guidance is to keep 99th-percentile WAL fsync latency under 10ms. If your hosting provider over-sells storage, your GitOps syncs will time out, leaving the cluster in a half-deployed zombie state.
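Before trusting any node with etcd, measure it. One common approach is an fio job that mimics etcd's small, fdatasync-heavy writes (the target directory below is an assumption; point it at the disk etcd will actually use):

# Small sequential writes with an fdatasync after each, similar to etcd's WAL pattern
fio --name=etcd-fsync-test --directory=/var/lib/etcd-test \
    --rw=write --ioengine=sync --fdatasync=1 --size=22m --bs=2300

# In the output, check the fsync/fdatasync latency percentiles; the 99th
# percentile should stay well below 10ms.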
| Storage Type | Fsync Latency (Avg) | Kubernetes Stability |
|---|---|---|
| Standard HDD (Shared) | 40ms - 100ms | Critical Failures |
| Standard SSD (SATA) | 5ms - 15ms | Acceptable for Dev |
| CoolVDS NVMe | < 1ms | Production Ready |
This is why we deploy our control plane nodes strictly on CoolVDS instances. The underlying NVMe storage guarantees that etcd never chokes, even when we are pushing updates to 50 microservices simultaneously.
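Once the cluster is running, you can watch the same number from etcd's own Prometheus metrics. A quick check, assuming you can reach the metrics endpoint locally (secured clusters will need the client certificates):

# Histogram of write-ahead-log fsync latency reported by etcd itself
curl -s http://127.0.0.1:2379/metrics | grep etcd_disk_wal_fsync_duration_seconds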
Data Sovereignty and The Norwegian Context
With GDPR looming, the physical location of your Git repository and your worker nodes is paramount. If you are processing data for Norwegian citizens, keeping that data within the EEA (European Economic Area) is mandatory. Hosting on US-controlled clouds adds a layer of legal complexity regarding the Privacy Shield framework.
CoolVDS operates out of compliant data centers with direct peering to NIX (Norwegian Internet Exchange). This offers two benefits:
- Legal Compliance: You know exactly where your bits live.
- Latency: Pulling Docker images from a local registry to a local node is significantly faster than traversing the Atlantic. When you are doing 50 deploys a day, those seconds add up.
Dealing with Secrets (The Hard Part)
The biggest flaw in GitOps is secret management. You cannot commit .env files to Git. In 2018 we have a few options; the most robust today are Helm with encrypted values (despite the security concerns around Tiller) or Bitnami's Sealed Secrets, which let you commit an encrypted form of the secret to the repository.
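For the Sealed Secrets route, the flow looks roughly like this; a sketch assuming the sealed-secrets controller is already installed in the cluster, reusing the production-db-creds name from the check below (the literal values are obviously placeholders):

# Build the Secret locally (never committed) and encrypt it with the
# controller's public key; only the resulting SealedSecret goes into Git.
kubectl create secret generic production-db-creds \
  --from-literal=username=payments \
  --from-literal=password='REPLACE_ME' \
  --dry-run -o json \
  | kubeseal --format yaml > sealed-production-db-creds.yaml

# Safe to commit: only the controller's private key can decrypt it
git add sealed-production-db-creds.yaml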
Here is how our deploy stage verifies that the required secret already exists in the cluster before rolling out pods that depend on it:
# Check that the secret exists before deploying anything that mounts it
if ! kubectl get secret production-db-creds; then
  echo "CRITICAL: Database credentials missing! Aborting deploy."
  exit 1
fi
Conclusion: Automate or Expire
The era of manual sysadmin work is ending. The legal, technical, and operational risks are too high. By adopting a GitOps workflow, you gain a time machine for your infrastructure. If a deployment breaks on Friday at 4:55 PM, you don't debug; you git revert.
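In practice, that Friday rollback is two commands (the SHA is a placeholder, and the branch matches the master-based flow above):

git revert --no-edit <sha-of-the-bad-deploy>   # creates a commit restoring the previous manifests
git push origin master                         # CI picks it up and re-applies the known-good state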
But remember: software automation cannot fix hardware limitations. A declarative workflow on unstable I/O is just an automated disaster.
Ready to build a GDPR-ready cluster? Deploy a high-performance NVMe KVM instance on CoolVDS today and get your control plane responding in microseconds, not milliseconds.