Stop Your CI/CD Pipelines From Bleeding Money: Optimizing Build Latency in 2020
I recently watched a senior developer spend forty minutes playing ping-pong while waiting for a deployment pipeline to finish. It wasn't a massive monolith. It was a standard microservice. The culprit? Shared runners on a major public cloud provider choking on I/O wait.
If you are relying on default shared runners for your CI/CD, you are essentially trying to run a library out of a noisy apartment. It doesn't work. In the Nordic market, where developer salaries are among the highest in the world, having engineers sit and wait on npm install is not just annoying; it is fiscal negligence.
This guide dives into the architecture of high-performance self-hosted runners using GitLab CI (the standard for many Norwegian dev shops) and how raw infrastructure choices impact your Time-to-Deployment.
The Hidden Bottleneck: I/O Wait
Most people blame CPU when their builds are slow. They are usually wrong. CI/CD processes are heavily I/O bound. Think about what happens during a pipeline:
- Git cloning (disk write)
- Restoring cache (network + disk write)
- Extracting dependencies (massive small-file random write)
- Building Docker images (layer extraction and compression)
On a standard cloud instance with network-attached storage (Ceph or EBS equivalents), you are capped by IOPS. When your neighbor on the physical host decides to re-index their Elasticsearch cluster, your build stalls. I have seen docker build times fluctuate between 2 minutes and 12 minutes on the exact same commit, purely due to noisy neighbors.
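Don't take my word for it: next time a build crawls, SSH into the runner and watch the I/O wait column while the pipeline runs. The commands below are standard sysstat/procps tooling, nothing exotic; they tell you in seconds whether you are CPU-bound or disk-bound.

```bash
# Live view of per-device utilisation and latency while a pipeline runs.
# High %iowait / %util with modest CPU usage means the disk is the bottleneck.
# (iostat ships with the sysstat package: sudo apt-get install sysstat)
iostat -x 2

# Quick overall snapshot; keep an eye on the "wa" (I/O wait) column.
vmstat 2 5
```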
The Fix: Self-Hosted Runners on NVMe
To fix this, we move from shared infrastructure to isolated, self-hosted runners. We need KVM virtualization to ensure our resources aren't stolen, and we need local NVMe storage. Spinning rust or network storage won't cut it.
Here is the setup we deployed for a client in Oslo recently. They needed to keep data within Norwegian borders due to strict interpretation of Datatilsynet guidelines regarding intellectual property, and they needed speed.
1. The Infrastructure Layer
We provisioned a CoolVDS NVMe instance. Why? Because the disk I/O is local to the hypervisor. We aren't waiting for the network.
Pro Tip: When choosing a VPS for CI/CD, ignore the CPU core count marketing. Look at the disk benchmarks. A 2-core server with local NVMe will beat a 6-core server with SATA SSDs for pipeline workloads every single time.
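If you want hard numbers before committing to a provider, run a quick 4k random-write test yourself. Here is a rough fio sketch (the parameters are illustrative; tune size and runtime to your disk) that mimics the small-file write pattern of npm install and Docker layer extraction:

```bash
# 4k random writes: the access pattern that dominates CI workloads.
# Run it in a scratch directory, not on a disk holding data you care about.
mkdir -p /tmp/fio-test
fio --name=ci-randwrite --directory=/tmp/fio-test --ioengine=libaio \
    --rw=randwrite --bs=4k --size=1G --numjobs=4 --iodepth=32 \
    --direct=1 --runtime=60 --time_based --group_reporting
```

Compare the IOPS figure across candidate servers; local NVMe typically lands an order of magnitude above network-attached volumes.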
2. System Tuning
Before installing the runner, we need to prep the kernel for high-throughput container churning. Docker creates and destroys veth pairs rapidly. Default Linux settings are often too conservative.
Edit your /etc/sysctl.conf:
```
# /etc/sysctl.conf
# Increase max open files for heavy dependency trees (node_modules, I'm looking at you)
fs.file-max = 2097152

# Optimize ARP cache for rapid container creation/destruction
net.ipv4.neigh.default.gc_thresh1 = 1024
net.ipv4.neigh.default.gc_thresh2 = 2048
net.ipv4.neigh.default.gc_thresh3 = 4096

# Increase backlog for incoming connections
net.core.somaxconn = 65535
```
Apply with sysctl -p. This prevents the dreaded "Resource temporarily unavailable" errors during parallel testing stages.
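A quick sanity check that the kernel actually picked up the new values (the output should match what we set above):

```bash
sysctl fs.file-max net.core.somaxconn net.ipv4.neigh.default.gc_thresh3
```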
3. GitLab Runner Configuration
We use the Docker executor. This keeps the host clean. However, the default configuration is not optimized for speed. We need to leverage the Docker socket correctly (with care) or use privileged mode for Docker-in-Docker (dind).
Here is the installation command for Debian/Ubuntu (standard for 2020):
```bash
curl -L https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh | sudo bash
sudo apt-get install gitlab-runner
```
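Registration is what writes the initial [[runners]] entry into config.toml. Here is a non-interactive sketch of that step, assuming the classic registration-token flow; grab the token from your project's Settings > CI/CD > Runners page:

```bash
sudo gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.com/" \
  --registration-token "YOUR_REGISTRATION_TOKEN" \
  --executor "docker" \
  --docker-image "docker:19.03.1" \
  --docker-privileged \
  --description "CoolVDS-Oslo-NVMe-01"
```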
Now, let's tune /etc/gitlab-runner/config.toml. The magic happens in the concurrent setting and the [runners.docker] block, particularly the privileged and volumes options.
```toml
concurrent = 4
check_interval = 0

[[runners]]
  name = "CoolVDS-Oslo-NVMe-01"
  url = "https://gitlab.com/"
  token = "YOUR_REGISTRATION_TOKEN"
  executor = "docker"
  [runners.custom_build_dir]
  [runners.docker]
    tls_verify = false
    image = "docker:19.03.1"
    privileged = true
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    # Mount the docker socket to reuse layers (use with caution on public projects!)
    # volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache"]
    shm_size = 0
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
```
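After editing the file, bounce the runner and confirm it can still talk to GitLab:

```bash
sudo gitlab-runner restart
sudo gitlab-runner verify
```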
4. Docker-in-Docker Optimization
If you are building Docker images inside your CI, you are likely using the dind service. This is heavy. On a slow disk, pulling the dind image adds 30-40 seconds. On CoolVDS NVMe, it takes about 3 seconds.
Here is an optimized .gitlab-ci.yml snippet that forces the overlay2 storage driver, which is far more efficient than the old vfs or aufs drivers found in older setups.
```yaml
variables:
  DOCKER_DRIVER: overlay2
  DOCKER_HOST: tcp://docker:2375
  DOCKER_TLS_CERTDIR: ""

services:
  # Use the specific 19.03 dind version to match our runner image for stability
  - name: docker:19.03.1-dind
    alias: docker

build:
  stage: build
  image: docker:19.03.1
  before_script:
    # Authenticate against the GitLab registry so the push below succeeds
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    - docker info
    # Attempt to pull the latest image to use as a layer cache for the build
    - docker pull $CI_REGISTRY_IMAGE:latest || true
  script:
    - docker build --cache-from $CI_REGISTRY_IMAGE:latest -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
```
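One caveat: --cache-from only helps if the :latest tag keeps moving. A simple approach (a sketch; add it to the script on your default branch only) is to refresh :latest alongside the commit-tagged image:

```bash
# Keep the layer cache warm for the next pipeline
docker tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA $CI_REGISTRY_IMAGE:latest
docker push $CI_REGISTRY_IMAGE:latest
```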
The Results: Data from the Trenches
We ran a benchmark comparing a standard shared runner provided by GitLab against a CoolVDS 4GB RAM / 2 vCPU instance hosted in a datacenter near Oslo. The project was a standard React frontend + Node.js backend monorepo.
| Metric | Shared Cloud Runner | Self-Hosted CoolVDS (NVMe) | Improvement |
|---|---|---|---|
| Cache Restore | 48s | 12s | 4x Faster |
| npm install | 145s | 38s | 3.8x Faster |
| Docker Build | 210s | 65s | 3.2x Faster |
| Total Pipeline | ~7 mins | ~2 mins | 3.5x Faster |
Data Sovereignty and GDPR
Speed isn't the only factor. For Norwegian companies, where your data lives matters. While we wait to see how the legal landscape evolves regarding EU-US data transfers, keeping your artifacts and code execution on servers physically located in Europe (or better, Norway) is a sound risk mitigation strategy.
By using a local provider like CoolVDS, you ensure that your temporary build artifacts—which often contain intellectual property or even accidentally dumped database schemas—never leave the jurisdiction. Plus, the latency to NIX (Norwegian Internet Exchange) is negligible, meaning your developers in Oslo or Bergen push code and get feedback almost instantly.
Conclusion
If your developers are complaining about slow builds, don't just throw more RAM at the problem. Attack the I/O bottleneck. Moving to a self-hosted runner model on high-performance NVMe VPS infrastructure is the single most effective change you can make for your pipeline performance in 2020.
Stop renting noisy neighbors. Deploy a dedicated runner on CoolVDS today and get your deployment times down to where they belong.