A developer hunched over a terminal, squinting at lines of code that look… almost right. That scene, now replicated across countless organizations, is the backdrop to a stark new reality: artificial intelligence is turbocharging software development, but it’s also creating a fresh set of headaches.
A survey of 213 IT leaders by CloudBees drops a data bomb: while a staggering 93% report productivity gains from AI tools, a full 81% have also witnessed an increase in production issues directly attributable to AI-generated code.
This isn’t a niche problem. CloudBees, presenting findings at Agentic DevOps World 2026, highlighted that nearly two-thirds (64%) of respondents have either widely adopted or fully integrated AI into their engineering workflows. The genie is out of the bottle, and it’s writing code.
The Double-Edged Sword of AI Code Generation
The numbers paint a clear picture of AI’s dual impact. On one hand, organizations are awash in more code than ever. The survey notes a 67% increase in code volume over the past 12 months. Yet output measured in tangible features and pull requests has risen less broadly: just 52% report higher development output. That gap suggests a significant portion of the added code is not translating into shipped features, or is more error-prone.
This efficiency gap is compounded by a fundamental disconnect in tracking value. Only 31% of respondents can reliably correlate AI spending with specific business outcomes. Worse, a concerning 36% admit to tracking AI spend without measuring ROI, or not measuring it at all. It’s like buying premium fertilizer without knowing if your crops are growing.
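One way to start closing that attribution gap is to divide AI spend by a delivery outcome for each team. The sketch below is illustrative only: the team names, dollar figures, and the choice of merged pull requests as the outcome are all assumptions, not survey data or a CloudBees method.

```python
from typing import Optional


def cost_per_outcome(ai_spend_usd: float, outcomes: int) -> Optional[float]:
    """Return AI spend per delivered outcome, or None if nothing shipped."""
    if outcomes <= 0:
        # Avoid dividing by zero; spend with no outcome is its own signal.
        return None
    return ai_spend_usd / outcomes


# Hypothetical monthly figures: (AI spend in USD, merged pull requests).
teams = {
    "payments": (12_000.0, 240),
    "search": (9_000.0, 60),
    "platform": (4_000.0, 0),
}

report = {
    name: cost_per_outcome(spend, merged_prs)
    for name, (spend, merged_prs) in teams.items()
}

for name, cost in report.items():
    label = "no merged PRs" if cost is None else f"${cost:.2f} per merged PR"
    print(f"{name}: {label}")
```

Any outcome a team already counts (features shipped, incidents avoided) could stand in for merged pull requests. The point is only that spend and outcome end up in the same record.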
The Rising Tide of Infrastructure Costs
And it’s costing them. More than half of those surveyed (54%) report a significant jump in CI/CD infrastructure spend in the last year. Alongside this, 53% have seen testing, security scanning, and deployment costs escalate, directly mirroring the growing code volume. The automation promised by AI is, ironically, driving up the very infrastructure it’s supposed to streamline. Only a quarter of organizations (27%) have implemented hard limits or quotas on token usage, and a paltry 18% have automated controls in place. This is the Wild West of AI adoption, with little regard for fiscal prudence.
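A hard limit of the kind only 27% of respondents report having can be as simple as a counter that refuses requests once a cap is reached. This is a minimal sketch under stated assumptions: `TokenBudget`, its fields, and the numbers are hypothetical, and a real control would sit in front of whichever AI provider a team actually uses.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    """Hypothetical per-team token quota for one billing period."""

    limit: int     # hard cap on tokens for the period
    used: int = 0  # tokens consumed so far

    def try_consume(self, tokens: int) -> bool:
        """Record the usage if it fits under the cap; otherwise refuse it."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used + tokens > self.limit:
            return False  # request would exceed the hard limit
        self.used += tokens
        return True

    @property
    def remaining(self) -> int:
        """Tokens still available in this period."""
        return self.limit - self.used


budget = TokenBudget(limit=1_000_000)
for request_tokens in (400_000, 500_000, 200_000):
    allowed = budget.try_consume(request_tokens)
    print(f"{request_tokens} tokens: {'allowed' if allowed else 'blocked'}")
print(f"remaining: {budget.remaining}")
```

Refusing outright is the bluntest policy. The same check could instead trigger an alert or require approval, which is closer to what the survey calls automated controls.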
Here’s a stat that should give DevOps teams pause: 70% of IT leaders now view test suite maintenance as a bigger burden than writing code itself. AI-generated code is a likely contributor. If code is harder to test, harder to validate, and more prone to bugs, maintenance overhead climbs with it.
Shawn Ahmed, CloudBees’ chief product officer, didn’t mince words. He stated that organizations are failing to grapple with the governance, validation, and accountability issues inherent when AI becomes a co-pilot in DevOps. The “token maxing” approach, in which developers are encouraged to use AI tools liberally without considering the escalating cost of consumed tokens, is a clear symptom of this immaturity.
His point about predictability is sharp: fewer than half (45%) of organizations find their AI spend to be very predictable quarter-to-quarter. This isn’t sustainable. DevOps teams must re-evaluate their workflows as code volume keeps climbing in this AI-driven era. The pipelines that process this code are crying out for optimization and, frankly, for better governance.
It’s not yet clear how much AI-generated code is actually making it into production environments, but there is little doubt that the pipelines it flows through will need to be optimized.
While some of that optimization may require new tools, the fundamental governance issues are non-negotiable. The question isn’t whether these issues will arise, but how many avoidable problems will surface due to a lack of foresight.
Why This Matters for the Open Source Ecosystem
This data has profound implications for the open-source community. If AI-generated code is increasing production issues, and much of this code will eventually be built upon or integrated with open-source projects, then the pressure on maintainers will intensify. They’ll be tasked with validating code whose origin is opaque, potentially riddled with subtle errors, and lacking clear accountability. This survey is a stark warning: the rapid adoption of AI-powered coding tools, without strong governance and validation frameworks, risks turning the productivity gains into a maintenance nightmare for everyone, especially those stewarding our most critical open-source libraries.
Frequently Asked Questions
What does CloudBees’ CARE Index measure? The CARE Index is a proprietary score designed to assess an enterprise’s ability to track, attribute, and forecast AI-driven costs against productivity outcomes, establishing a baseline for AI governance maturity.
Are organizations seeing more features delivered with AI code? While AI adoption correlates with increased code volume, only 52% of organizations report higher development output in terms of features and pull requests, suggesting a gap between code generation and tangible feature delivery.
What’s the biggest challenge for IT leaders regarding AI code? The survey indicates a significant challenge lies in addressing governance, validation, and accountability for AI-generated code, with many organizations prioritizing AI tool usage over cost control and code quality assessment.