Why 42% of Enterprise AI Projects Never Reach Production—and How Two Giants Beat the Odds


A staggering 42% of companies abandoned most AI initiatives in 2025, more than double the previous year's rate. The culprit isn't bad technology—it's AI pilot sprawl, unclear ownership and governance, and data that's never production-ready. MassMutual and Mass General Brigham show what discipline looks like: 30% developer productivity gains, help desk times slashed from 11 minutes to one, and a shift from ungoverned experiments to measurable business results.


Enterprise AI Projects Struggle to Escape Pilot Purgatory

Enterprise AI is stuck in a troubling pattern. According to S&P Global's 2025 Voice of the Enterprise survey, 42% of companies abandoned most AI initiatives last year, more than double the 17% recorded the year before [2]. On average, organizations scrapped 46% of their proof-of-concept (PoC) projects before they ever reached production. Budgets get approved, pilots get built, demos get applauded, and then nothing ships. The problem isn't the technology itself. AI projects rarely fail because of bad ideas or underperforming models. Instead, they collapse under AI pilot sprawl, unclear ownership and governance, and organizational hurdles that were never addressed during the planning phase.

Alex Potapov, an NTT DATA consultant who oversees GenAI implementations for global clients in energy and insurance, puts it bluntly: "The model is rarely the main problem" [2]. Most projects break down at the intersection of three things: data readiness, integration with enterprise systems, and unclear ownership across teams. Without proper data quality and governance, even the best models cannot produce reliable outputs. And when GenAI initiatives sit between IT, data, legal, security, and the business unit with no clear owner, the project stalls, no matter how impressive the demo was.

How MassMutual Turned AI Pilot Projects into Measurable Gains

MassMutual, a 175-year-old company serving millions of policy owners, offers a counterexample. The insurer has pushed AI into production across customer support, IT, customer acquisition, underwriting, servicing, and claims, achieving concrete results: 30% developer productivity gains, IT help desk resolution times reduced from 11 minutes to one, and customer service calls cut from 15 minutes to just one or two [1].

Sears Merritt, MassMutual's head of enterprise technology and experience, credits a disciplined approach rooted in metrics and feedback loops. "We're always starting with why do we care about this problem?" Merritt explained at a recent VentureBeat event. "If we solve the problem, how are we gonna know we solved it? And, how much value is associated with doing that?" [1]. His team follows the scientific method, beginning with a hypothesis and testing whether it will tangibly drive the business forward. Some ideas are great but may be "intractable in the business" due to lack of data, access, or regulatory constraints.

Crucially, MassMutual won't advance an idea until there's crystal clarity on how success will be measured. Different departments and stakeholders define what quality means, choose a metric, and set minimum quality thresholds before a tool is placed into the hands of teams. "We don't go to production until there is a business partner that says, 'Yes, that works,'" Merritt said [1]. His team performs trust scoring to lower hallucination rates, establishes evaluation criteria, and monitors for feature and model drift. They also operate with a no-commitment policy on models, building common service layers, microservices, and APIs that sit between the AI layer and underlying systems, so that when a better model emerges, swapping it in doesn't mean starting over.
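The no-commitment policy on models comes down to a familiar pattern: business code depends on a stable interface in the service layer, and concrete model backends plug in behind it. A minimal sketch of that idea follows; the class and vendor names are hypothetical, not MassMutual's actual architecture.

```python
from abc import ABC, abstractmethod


class ChatModel(ABC):
    """Stable interface the rest of the stack codes against."""

    @abstractmethod
    def complete(self, prompt: str) -> str: ...


class VendorAModel(ChatModel):
    # Hypothetical backend; a real implementation would call a vendor SDK here.
    def complete(self, prompt: str) -> str:
        return f"[vendor-a] {prompt}"


class VendorBModel(ChatModel):
    # A newer, better model can be dropped in without touching callers.
    def complete(self, prompt: str) -> str:
        return f"[vendor-b] {prompt}"


class CompletionService:
    """Service layer between business workflows and the AI layer.

    Swapping models is a one-line configuration change, not a rewrite.
    """

    def __init__(self, model: ChatModel):
        self._model = model

    def answer(self, question: str) -> str:
        return self._model.complete(question)


service = CompletionService(VendorAModel())
print(service.answer("reset my password"))

service = CompletionService(VendorBModel())  # new model, same callers
print(service.answer("reset my password"))
```

Because every caller depends only on `CompletionService`, the migration cost of a model change stays constant no matter how many workflows sit on top of it.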

Mass General Brigham Shuts Down AI Pilot Sprawl to Focus on ROI

Mass General Brigham (MGB) took a different route, initially embracing a "spray and pray" approach before course-correcting. Around 15,000 researchers in the not-for-profit health system have been using AI, ML, and deep learning for 10 to 15 years, but last year, CTO Nallan "Sri" Sriraman made a bold choice: his team shut down a sprawl of non-governed AI pilots [1]. "We did follow the thousand flowers bloom [methodology], but we didn't have a thousand flowers, we had probably a few tens of flowers trying to bloom," he said.

Like MassMutual, MGB pivoted to a more holistic view, examining why they were developing certain tools for specific workflows. They questioned what capabilities they wanted and needed, and what investment those required. Sriraman's team also spoke with their primary platform providers (Epic, Workday, ServiceNow, Microsoft) about their roadmaps. This was a "pivotal moment," as they realized they were building in-house tools that vendors were already providing or planning to roll out [1]. Instead of a Wild West environment, Sriraman's team now distributes Microsoft Copilot to users across the business and uses a "small landing zone" where they can safely test more sophisticated products and control token use. They also began "consciously embedding AI champions" across business groups to drive adoption and accountability.

What Separates Successful AI Implementation Cycles from Abandoned Pilots

Potapov, who manages GenAI programs exceeding $30 million in potential revenue for industrial giants, sees a clear pattern in what separates the AI projects that ship from those that get abandoned. Data is often the first bottleneck. Many organizations believe they have high-quality knowledge bases, but when building a GenAI solution, information turns out to be fragmented across SharePoint, PDFs, internal tools, and outdated repositories [2]. Integration is the second challenge: a GenAI solution becomes valuable only when it fits into existing workflows, whether CRM systems or internal support platforms. That work often takes longer than the AI component itself.

The most underestimated challenge, however, is organizational ownership. Without clearly defined responsibility after the PoC phase, projects stall. Even in short pilots, certain elements cannot be compromised: the data pipeline should resemble the production architecture as closely as possible, security and compliance must be maintained, and the PoC must tie to a measurable business case with clear KPIs. Warning signs that a prototype was never designed for scalability include heavy manual intervention, no integration strategy, and workflows that exist in isolation from real enterprise systems [2].
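The gating discipline both companies converge on (a pilot advances only when it clears pre-agreed quality thresholds on named metrics) can be sketched as a simple promotion check. The metric names and threshold values below are illustrative assumptions, not figures from either company.

```python
from dataclasses import dataclass


@dataclass
class QualityGate:
    """A pre-agreed metric and minimum threshold, fixed before the pilot is built."""

    metric: str
    minimum: float


def ready_for_production(results: dict[str, float], gates: list[QualityGate]) -> bool:
    """A pilot is promoted only if every gate passes; a missing metric fails it."""
    return all(results.get(g.metric, float("-inf")) >= g.minimum for g in gates)


# Illustrative thresholds a business partner might sign off on.
gates = [
    QualityGate("answer_accuracy", 0.95),
    QualityGate("hallucination_free_rate", 0.99),
]

pilot = {"answer_accuracy": 0.97, "hallucination_free_rate": 0.92}
print(ready_for_production(pilot, gates))  # fails the hallucination gate
```

The point of writing the gates down as data rather than prose is that "how will we know we solved it?" gets answered once, up front, and the same check runs on every pilot afterwards.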

Gartner forecasts that by 2028 more than half of enterprise AI models will be domain-specific, and that by 2030 most organizations will shift to smaller, AI-augmented units; as those shifts play out, the pressure to move AI from pilot to production will only intensify. The companies that achieve tangible business results will be those that address governance, establish strong feedback loops, and treat integration with enterprise systems as a first-class concern rather than an afterthought.

TheOutpost.ai

© 2026 Triveous Technologies Private Limited