Did you ever stop to think how much actual operational overhead we’ve been carrying around, simply because a critical piece of software—Gitaly, in this case—just didn’t play nice with our shiny new container orchestrator?
That’s the question that’s been buzzing in the back of my mind since GitLab announced that Gitaly on Kubernetes has hit general availability with version 18.11. For years, teams wrestling with Kubernetes for their GitLab deployments have been stuck in a kind of hybrid purgatory. Most of the stack was living and breathing in pods, while Gitaly, the heart of Git operations, stubbornly remained tethered to virtual machines. This wasn’t just a minor inconvenience; it was a constant source of complexity, a drag on efficiency, and a persistent thorn in the side of anyone trying to achieve true infrastructure consolidation.
But now? That era of uneasy cohabitation is over. Gitaly on Kubernetes is officially supported, and it feels like unlocking a new level in the DevOps game.
The Cgroup Conundrum and the Kubernetes Waltz
Now, I know what you’re thinking: “Just put it in a pod, right?” Well, it turns out Gitaly has some rather… particular requirements. Git operations are notoriously fickle beasts. They can guzzle memory like it’s going out of style, and their usage patterns are about as predictable as a rogue asteroid. To keep the main Gitaly process from sputtering and dying in an Out-Of-Memory (OOM) event, the traditional setup isolates each Git process within its own control group (cgroup). The main Gitaly process lives in one cgroup, while the grunt work of Git operations happens in others. If a Git process goes rogue on memory, it gets a swift termination, but the main Gitaly process remains blissfully unaware, continuing its noble work. It’s like having a dedicated bodyguard for your critical processes.
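To make the isolation idea concrete, here is a minimal Python sketch of what "one cgroup per Git process" boils down to on a cgroup v2 host: create a child directory, write a limit to memory.max, and write a PID to cgroup.procs. This is not Gitaly's actual code (Gitaly is written in Go and ships its own cgroup manager); the function names, the "repos-0" cgroup name, and the 512 MiB limit are all made up for illustration. The demo runs against a scratch directory so it works without root.

```python
import os
import tempfile
from pathlib import Path


def create_limited_cgroup(root: Path, name: str, memory_bytes: int) -> Path:
    """Create a child cgroup under `root` and cap its memory (cgroup v2 layout)."""
    cgroup = root / name
    cgroup.mkdir(parents=True, exist_ok=True)
    # memory.max is the cgroup v2 hard limit. On a real cgroupfs, a process
    # exceeding it is OOM-killed inside this cgroup only.
    (cgroup / "memory.max").write_text(f"{memory_bytes}\n")
    return cgroup


def move_process(cgroup: Path, pid: int) -> None:
    """Place a process in the cgroup by writing its PID to cgroup.procs."""
    (cgroup / "cgroup.procs").write_text(f"{pid}\n")


if __name__ == "__main__":
    # Hypothetical demo: a scratch directory stands in for /sys/fs/cgroup.
    with tempfile.TemporaryDirectory() as scratch:
        repo_cgroup = create_limited_cgroup(
            Path(scratch), "repos-0", 512 * 1024 * 1024
        )
        move_process(repo_cgroup, os.getpid())
        print((repo_cgroup / "memory.max").read_text().strip())
```

On a real host the root would be a delegated subtree of /sys/fs/cgroup, and that is exactly why write access to that directory matters so much for the Kubernetes story.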
Translating this elegant isolation into the Kubernetes world, however, wasn’t exactly a walk in the park. Most Kubernetes clusters, as you know, are built around containerd. And containerd, until recently, had a bit of a hang-up: it would only let containers write to cgroupfs if they were running in full-on privileged mode. That’s a security no-no for most environments. The ingenious workaround involved mounting the /sys/fs/cgroup directory via an init container, essentially giving the pods the write access they need without compromising the cluster’s security posture. It’s a bit like giving your delivery driver a special key to a locked room, rather than leaving the whole building unlocked.
Dancing Through Downtime: The Retry Revolution
And then there’s the dreaded pod restart. On a VM, you could often perform a graceful reload, swapping out binaries while keeping the socket alive. Kubernetes, particularly with StatefulSets, is often a bit more… abrupt. A Helm upgrade, a node shuffle, a quick config tweak – poof! The pod stops, the process terminates. For sharded Gitaly, which doesn’t inherently offer high availability, this meant actual downtime. Unacceptable, right?
GitLab’s solution? Making client retries configurable. Imagine this: during a brief pod restart, the main Gitaly process is down for a blink. Normally, this would be a dropped connection, a failed operation. But by tuning clients like Rails to retry requests for just a little while longer, the operation can wait out the brief interruption. You might see a tiny hiccup in latency for a moment, but the request ultimately succeeds. It’s the digital equivalent of taking a deep breath and trying again – and it works.
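If you want to picture the mechanics, here is a rough Python sketch of a client that keeps retrying until a deadline passes, backing off between attempts. It is a generic illustration of the pattern, not GitLab's implementation: the Unavailable exception, the call_with_retries helper, and every default value are assumptions invented for this example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class Unavailable(Exception):
    """Hypothetical stand-in for a 'server unavailable' error from an RPC."""


def call_with_retries(
    operation: Callable[[], T],
    max_wait_seconds: float = 30.0,
    initial_backoff: float = 0.1,
    max_backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Retry `operation` on Unavailable until it succeeds or the deadline passes."""
    deadline = clock() + max_wait_seconds
    backoff = initial_backoff
    while True:
        try:
            return operation()
        except Unavailable:
            remaining = deadline - clock()
            if remaining <= 0:
                # Out of patience: surface the original error to the caller.
                raise
            # Never sleep past the deadline; double the wait up to a cap.
            sleep(min(backoff, remaining))
            backoff = min(backoff * 2, max_backoff)
```

The knob that matters is max_wait_seconds: set it a little longer than a typical pod restart and the request rides out the gap, at the cost of some extra latency on that one call.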
The Numbers Don’t Lie: Performance in the Wild
To prove this wasn’t just wishful thinking, GitLab ran some serious benchmarks. They pitted a Gitaly-on-VM setup against its new Kubernetes counterpart, triggering upgrades mid-operation and meticulously tracking success rates. The results? Well, they’re pretty darn impressive:
| Operation | VM Success Rate | Kubernetes Success Rate |
|---|---|---|
| git clone | 100% | 100% |
| git pull | 100% | 99.16% |
| git push | 99.66% | 100% |
Nearly identical. And the fact that Kubernetes, with its abrupt pod terminations and sudden socket closures, can achieve success rates this high is frankly remarkable. Achieving a perfect 100% across the board for every operation would still require the high-availability solution, Gitaly Cluster (with Praefect), which is actively being developed for Kubernetes. But even without it, these numbers are a testament to the engineering effort involved.
Unification: The True Prize
This isn’t just about making Gitaly fit into Kubernetes. It’s about consolidation. If you’ve been running that hybrid stack, painstakingly maintaining separate VM fleets alongside your Kubernetes infrastructure, this is your golden ticket. Moving Gitaly into the cluster means your entire GitLab stack—from the tiniest web server to the Git repositories themselves—lives under the singular, powerful management of Kubernetes. It’s cleaner, it’s simpler, and it dramatically reduces operational complexity.
For new adopters, especially those already committed to Kubernetes, this means you can finally deploy GitLab in a truly native, end-to-end Kubernetes fashion right out of the box with the Helm chart. No more awkward compromises.
The recommended path forward is crystal clear: use the GitLab Helm chart. And for goodness’ sake, read the official Gitaly on Kubernetes documentation. It’s your roadmap to navigating the nuances and sidestepping the common traps. Whether you’re deploying Gitaly as part of a full GitLab installation or as a standalone component, the documentation has you covered for both scenarios.
This feels like the inflection point we’ve been waiting for. The final piece of the puzzle is falling into place, allowing us to build truly unified, cloud-native DevOps platforms. The future? It looks incredibly streamlined.
Frequently Asked Questions
What does Gitaly on Kubernetes actually do?
Gitaly on Kubernetes allows the Git repository storage service, Gitaly, to run directly within Kubernetes pods, eliminating the need for separate virtual machines and enabling a fully Kubernetes-native GitLab deployment.
Will this eliminate all downtime for GitLab upgrades?
While client retries significantly reduce downtime during pod restarts for most operations, achieving zero downtime for every scenario, especially for sharded Gitaly, will likely require Gitaly Cluster (Praefect) support on Kubernetes, which is under active development.