5 Sources
[1]
OpenAI report reveals a 6x productivity gap between AI power users and everyone else
The tools are available to everyone. The subscription is company-wide. The training sessions have been held. And yet, in offices from Wall Street to Silicon Valley, a stark divide is opening between workers who have woven artificial intelligence into the fabric of their daily work and colleagues who have barely touched it.

The gap is not small. According to a new report from OpenAI analyzing usage patterns across its more than one million business customers, workers at the 95th percentile of AI adoption are sending six times as many messages to ChatGPT as the median employee at the same companies. For specific tasks, the divide is even more dramatic: frontier workers send 17 times as many coding-related messages as their typical peers, and among data analysts, the heaviest users engage the data analysis tool 16 times more frequently than the median.

This is not a story about access. It is a story about a new form of workplace stratification emerging in real time -- one that may be reshaping who gets ahead, who falls behind, and what it means to be a skilled worker in the age of artificial intelligence.

Everyone has the same tools, but not everyone is using them

Perhaps the most striking finding in the OpenAI report is how little access explains. ChatGPT Enterprise is now deployed across more than 7 million workplace seats globally, a nine-fold increase from a year ago. The tools are the same for everyone. The capabilities are identical. And yet usage varies by orders of magnitude.

Among monthly active users -- people who have logged in at least once in the past 30 days -- 19 percent have never tried the data analysis feature. Fourteen percent have never used reasoning capabilities. Twelve percent have never used search. These are not obscure features buried in submenus; they are core functionality that OpenAI highlights as transformative for knowledge work.

The pattern inverts among daily users. Only 3 percent of people who use ChatGPT every day have never tried data analysis; just 1 percent have skipped reasoning or search. The implication is clear: the divide is not between those who have access and those who don't, but between those who have made AI a daily habit and those for whom it remains an occasional novelty.

Employees who experiment more are saving dramatically more time

The OpenAI report suggests that AI productivity gains are not evenly distributed across all users but concentrated among those who use the technology most intensively. Workers who engage across approximately seven distinct task types -- data analysis, coding, image generation, translation, writing, and others -- report saving five times as much time as those who use only four. Employees who save more than 10 hours per week consume eight times more AI credits than those who report no time savings at all.

This creates a compounding dynamic. Workers who experiment broadly discover more uses. More uses lead to greater productivity gains. Greater productivity gains presumably lead to better performance reviews, more interesting assignments, and faster advancement -- which in turn provides more opportunity and incentive to deepen AI usage further.

Seventy-five percent of surveyed workers report being able to complete tasks they previously could not perform, including programming support, spreadsheet automation, and technical troubleshooting. For workers who have embraced these capabilities, the boundaries of their roles are expanding. For those who have not, the boundaries may be contracting by comparison.
The corporate AI paradox: $40 billion spent, 95 percent seeing no return

The individual usage gap documented by OpenAI mirrors a broader pattern identified by a separate study from MIT's Project NANDA. Despite $30 billion to $40 billion invested in generative AI initiatives, only 5 percent of organizations are seeing transformative returns. The researchers call this the "GenAI Divide" -- a gap separating the few organizations that succeed in transforming processes with adaptive AI systems from the majority that remain stuck in pilots.

The MIT report found limited disruption across industries: only two of nine major sectors -- technology and media -- show material business transformation from generative AI use. Large firms lead in pilot volume but lag in successful deployment.

The pattern is consistent across both studies. Organizations and individuals are buying the technology. They are launching pilots. They are attending training sessions. But somewhere between adoption and transformation, most are getting stuck.

While official AI projects stall, a shadow economy is thriving

The MIT study reveals a striking disconnect: while only 40 percent of companies have purchased official LLM subscriptions, employees in over 90 percent of companies regularly use personal AI tools for work. Nearly every respondent reported using LLMs in some form as part of their regular workflow. "This 'shadow AI' often delivers better ROI than formal initiatives and reveals what actually works for bridging the divide," MIT's Project NANDA found.

The shadow economy offers a clue to what's happening at the individual level within organizations. Employees who take initiative -- who sign up for personal subscriptions, who experiment on their own time, who figure out how to integrate AI into their workflows without waiting for IT approval -- are pulling ahead of colleagues who wait for official guidance that may never come. These shadow systems, largely unsanctioned, often deliver better performance and faster adoption than corporate tools. Worker sentiment reveals a preference for flexible, responsive tools -- precisely the kind of experimentation that separates OpenAI's frontier workers from the median.

The biggest gaps show up in technical work that used to require specialists

The largest relative gaps between frontier and median workers appear in coding, writing, and analysis -- precisely the task categories where AI capabilities have advanced most rapidly. Frontier workers are not just doing the same work faster; they appear to be doing different work entirely, expanding into technical domains that were previously inaccessible to them. Among ChatGPT Enterprise users outside of engineering, IT, and research, coding-related messages have grown 36 percent over the past six months. Someone in marketing or HR who learns to write scripts and automate workflows is becoming a categorically different employee than a peer who has not -- even if they hold the same title and started with the same skills.

The academic research on AI and productivity offers a complicated picture. Several studies cited in the OpenAI report find that AI has an "equalizing effect," disproportionately helping lower-performing workers close the gap with their higher-performing peers. But the equalizing effect may apply only within the population of workers who actually use AI regularly. A meaningful share of workers are not in that group at all. They remain light users or non-users, even as their more adventurous colleagues pull away.
Companies are divided too, and the gap is widening by the month

The divide is not only between individual workers. It exists between entire organizations. Frontier firms -- those at the 95th percentile of adoption intensity -- generate approximately twice as many AI messages per employee as the median enterprise. For messages routed through custom GPTs, purpose-built tools that automate specific workflows, the gap widens to seven-fold.

These numbers suggest fundamentally different operating models. At median companies, AI may be a productivity tool that individual workers use at their discretion. At frontier firms, AI appears to be embedded in core infrastructure: standardized workflows, persistent custom tools, systematic integration with internal data systems. The OpenAI report notes that roughly one in four enterprises still has not enabled connectors that give AI access to company data -- a basic step that dramatically increases the technology's utility.

The MIT study found that companies that purchased AI tools from specialized vendors succeeded 67 percent of the time, while internal builds had only a one-in-three success rate. For many organizations, the AI era has technically arrived but has not yet begun in practice.

The technology is no longer the problem -- organizations are

For executives, the data presents an uncomfortable challenge. The technology is no longer the constraint. OpenAI notes that it releases a new feature or capability roughly every three days; the models are advancing faster than most organizations can absorb. The bottleneck has shifted from what AI can do to whether organizations are structured to take advantage of it.

"The dividing line isn't intelligence," the MIT authors write. The problems with enterprise AI have to do with memory, adaptability, and learning capability: they stem less from regulations or model performance, and more from tools that fail to learn or adapt.

Leading firms, according to the OpenAI report, consistently invest in executive sponsorship, data readiness, workflow standardization, and deliberate change management. They build cultures where custom AI tools are created, shared, and refined across teams. They track performance and run evaluations. They make AI adoption a strategic priority rather than an individual choice.

The rest are leaving it to chance -- hoping that workers will discover the tools on their own, experiment on their own time, and somehow propagate best practices without infrastructure or incentive. The six-fold gap suggests this approach is not working.

The window to catch up is closing faster than most companies realize

With enterprise contracts locking in over the next 18 months, there's a shrinking window for vendors and adopters to cross the divide. The GenAI Divide identified by the MIT report is not going to last forever. But the organizations that figure out a way across it soonest will be the ones that define the next era of business.

Both reports carry caveats. The OpenAI data comes from a company with an obvious interest in promoting AI adoption. The productivity figures are self-reported by customers already paying for the product. The MIT study, while independent, relies on interviews and surveys rather than direct measurement. The long-term effects of this technology on employment, wages, and workplace dynamics remain uncertain.
But the core finding -- that access alone does not produce adoption, and that adoption varies enormously even within organizations that have made identical tools available to all -- is consistent with how previous technologies have diffused through the economy. Spreadsheets, email, and the internet all created similar divides before eventually becoming universal. The question is how long the current gap persists, who benefits during the transition, and what happens to workers who find themselves on the wrong side of it.

For now, the divide is stark. Ninety percent of users said they prefer humans for "mission-critical work," while AI has "won the war for simple work." The workers who are pulling ahead are not doing so because they have access their colleagues lack. They are pulling ahead because they decided to use what everyone already has -- and kept using it until they figured out what it could do.

The 6x gap is not about technology. It is about behavior. And behavior, unlike software, cannot be deployed with a company-wide rollout.
[2]
Struggling to get AI agents to work? This Google research could help | Fortune
Hello. 2025 was supposed to be the year of AI agents. But as the year draws to a close, it is clear such prognostications from tech vendors were overly optimistic. Yes, some companies have started to use AI agents. But most are not yet doing so, especially not in company-wide deployments.

A McKinsey "State of AI" survey from last month found that a majority of businesses had yet to begin using AI agents, while 40% said they were experimenting. Less than a quarter said they had deployed AI agents at scale in at least one use case; and when the consulting firm asked people about whether they were using AI in specific functions, such as marketing and sales or human resources, the results were even worse. No more than 10% of survey respondents said they had AI agents "fully scaled" or were "in the process of scaling" in any of these areas. The one function with the most usage of scaled agents was IT (where agents are often used to automatically resolve service tickets or install software for employees), and even here only 2% reported having agents "fully scaled," with an additional 8% saying they were "scaling."

A big part of the problem is that designing workflows for AI agents that will enable them to produce reliable results turns out to be difficult. Even the most capable of today's AI models sit on a strange boundary -- capable of doing certain tasks in a workflow as well as humans, but unable to do others. Complex tasks that involve gathering data from multiple sources and using software tools over many steps represent a particular challenge. The longer the workflow, the greater the risk that an error in one of the early steps will compound, resulting in a failed outcome. Plus, the most capable AI models can be expensive to use at scale, especially if the workflow involves the agent having to do a lot of planning and reasoning.

Many firms have sought to solve these problems by designing "multi-agent workflows," where different agents are spun up, each assigned just one discrete step in the workflow -- sometimes including one agent that checks the work of another. This can improve performance, but it too can wind up being expensive -- sometimes too expensive to make the workflow worth automating.

Now a team at Google has conducted research that aims to give businesses a good rubric for deciding when it is better to use a single agent, as opposed to building a multi-agent workflow, and what type of multi-agent workflow might be best for a particular task. The researchers conducted 180 controlled experiments using AI models from Google, OpenAI, and Anthropic, testing them against four different agentic AI benchmarks that covered a diverse set of goals: retrieving information from multiple websites; planning in a Minecraft game environment; planning and tool use to accomplish common business tasks such as answering emails, scheduling meetings, and using project management software; and a finance agent benchmark. That finance test requires agents to retrieve information from SEC filings and perform basic analytics, such as comparing actual results to management's forecasts from the prior quarter, figuring out how revenue derived from a specific product segment has changed over time, or figuring out how much cash a company might have free for M&A activity.

In the past year, the conventional wisdom has been that multi-agent workflows produce more reliable results.
(I've previously written about this view, which has been backed up by the experience of some companies, such as Prosus, here in Eye on AI.) But the Google researchers found instead that whether the conventional wisdom held was highly contingent on exactly what the task was.

If the task was sequential, which was the case for many of the Minecraft benchmark tasks, then it turned out that so long as a single AI agent could perform the task accurately at least 45% of the time (which is a pretty low bar, in my opinion), it was better to deploy just one agent. Using multiple agents, in any configuration, reduced overall performance by huge amounts, ranging between 39% and 70%. The reason, according to the researchers, is that if a company had a limited token budget for completing the entire task, the demands of multiple agents trying to figure out how to use different tools would quickly overwhelm the budget.

But if a task involved steps that could be performed in parallel, as was true for many of the financial analysis tasks, then multi-agent systems conveyed big advantages. What's more, the researchers found that exactly how the agents are configured to work with one another makes a big difference, too. For the financial-analysis tasks, a centralized multi-agent system -- where a single coordinator agent directs and oversees the activity of multiple sub-agents and all communication flows to and from the coordinator -- produced the best result. This system performed 80% better than a single agent. Meanwhile, an independent multi-agent system, in which there is no coordinator and each agent is simply assigned a narrow role to complete in parallel, was only 57% better than a single agent.

Research like this should help companies figure out the best ways to configure AI agents and enable the technology to finally begin to deliver on last year's promises. For those selling AI agent technology, late is better than never. For the people working in the businesses using AI agents, we'll have to see what impact these agents have on the labor market. That's a story we'll be watching closely as we head into 2026.

- A grassroots NIMBY revolt is turning voters in Republican strongholds against the AI data-center boom -- by Eva Roytburg
- Accenture exec gets real on transformation: 'The data and AI strategy is not a separate strategy, it is the business strategy' -- by Nick Lichtenberg
- AWS CEO says replacing young employees with AI is 'one of the dumbest ideas' -- and bad for business: 'At some point the whole thing explodes on itself' -- by Sasha Rogelberg
- What happens to old AI chips? They're still put to good use and don't depreciate that fast, analyst says -- by Jason Ma

President Trump signs executive order to stop state-level AI regulation. President Trump signed an executive order giving the U.S. Attorney General broad power to challenge and potentially overturn state laws that regulate artificial intelligence, arguing they hinder U.S. "global AI dominance." The order also allows federal agencies to withhold funding from states that keep such laws. Trump said he wanted to replace what he called a confusing patchwork of state rules with a single federal framework -- but the order did not contain any new federal requirements for those building AI models. Tech companies welcomed the move, but the executive order drew bipartisan criticism and is expected to face legal challenges from states and consumer groups who argue that only Congress can pre-empt state laws.
Read more here from the New York Times.

Oracle stock hammered on reports of data center delays, huge lease obligations. Oracle denied a Bloomberg report that it had delayed completion of data centers being built for OpenAI, saying all projects remain on track to meet contractual commitments despite labor and materials shortages. The report rattled investors already worried about Oracle's debt-heavy push into AI infrastructure under its $300 billion OpenAI deal, and investors pummeled Oracle's stock price. You can read more on Oracle's denial from Reuters here. Oracle was also shaken by reports that it has $248 billion in rental payments for data centers that will commence between now and 2028. That was covered by Bloomberg here.

OpenAI launches new image generation model. The company debuted a new image generation AI model that it says offers more fine-grained editing control and generates images four times faster than its previous image creators. The move is being widely viewed as an effort by OpenAI to show that it has not lost ground to competitors, in particular Google, whose Nano Banana Pro image generation model has been the talk of the internet since it launched in late November. You can read more from Fortune's Sharon Goldman here.

OpenAI hires Shopify executive in push to make ChatGPT an 'operating system.' The AI company hired Glen Coates, who had been head of "core product" at Shopify, to be its new head of app platform, working under ChatGPT product head Nick Turley. "We're going to find out what happens if you architect an OS ground-up with a genius at its core that use its apps just like you can," Coates wrote in a LinkedIn post announcing the move.

A Google DeepMind agent that can make complex plans in a virtual world. The AI lab debuted an updated version of its SIMA agent, called SIMA 2, that can navigate complex, 3D digital worlds, including those from different video games. Unlike earlier systems that only followed simple commands, SIMA 2 can understand broader goals, hold short conversations, and figure out multi-step plans on its own. In tests, it performed far better than its predecessor and came close to human players on many tasks, even in games it had never seen before. Notably, SIMA 2 can also teach itself new skills by setting its own challenges and learning from trial and error. The paper shows progress toward AI that can act, adapt, and learn in environments rather than just analyze text or images. The approach, which is based on reinforcement learning -- a technique where an agent learns by trial and error to accomplish a goal -- should help power more capable virtual assistants and, eventually, real-world robots. You can read the paper here.

Is it safe? A few weeks ago, the Future of Life Institute (FLI) released its latest AI Safety Index, a report that grades leading AI labs on how they are doing on a range of safety criteria. A clear gap has emerged between three of the leading AI labs and pretty much everyone else. OpenAI, Google, and Anthropic all received grades in the "C" range. Anthropic and OpenAI both scored a C+, with Anthropic narrowly beating OpenAI on its total safety score. Google DeepMind's solid C was an improvement from the C- it scored when FLI last graded the field on their safety efforts back in July. But the rest of the pack is doing a pretty poor job. X.ai, Meta, and DeepSeek all received Ds, while Alibaba, which makes the popular open source AI model Qwen, got a D-.
(DeepSeek's grade was actually a step up from the F it received in the summer.) Despite this somewhat dismal picture, FLI CEO Max Tegmark -- ever an optimist -- told me he actually sees some good news in the results. Not only did all the labs pull up their raw scores by at least some degree, but more AI companies agreed to submit data to FLI in order to be graded. Tegmark sees this as evidence that the AI Safety Index is starting to have its intended effect of creating "a race to the top" on AI safety. But Tegmark also allows that all three of the top-marked AI labs saw their scores for "current harms" from AI -- such as the negative impacts their models can have on mental health -- slip since they were assessed in the summer. And when it comes to potential "existential risks" to humanity, none of the labs gets a grade above D. Somehow that doesn't cheer me.

Businesses took big steps forward on the AI journey in 2025, from hiring Chief AI Officers to experimenting with AI agents. The lessons learned -- both good and bad -- combined with the technology's latest innovations will make 2026 another decisive year. Explore all of Fortune AIQ, and read the latest playbook below:

- AI coding tools exploded in 2025. The first security exploits show what could go wrong.
- The big AI New Year's resolution for businesses in 2026: ROI.
[3]
IBM, AWS veteran says 90% of your employees are stuck in first gear with AI, just asking it to 'write their mean email in a slightly more polite way' | Fortune
Employers are shelling out millions on artificial intelligence (AI) tools to boost productivity, but workers are still using a tiny fraction of the tech's potential, according to a presentation from a top executive in the space who advises Fortune 500 companies on strategy and tech adoption.

Allie K. Miller, the CEO of Open Machine, addressed the Fortune Brainstorm AI conference last week in San Francisco. Speaking from decades of experience at companies including IBM and Amazon Web Services (AWS), she argued that AI has four different, increasingly useful interaction modes. Miller, who helped launch the first multimodal AI team at IBM, said that AI can be a microtasker, companion, delegate, or teammate, depending on the desired outcome.

The problem, Miller said, is that most users never get beyond the first mode, using AI as a "microtasker" -- basically a glorified search engine, returning results for simple queries. Her central critique focused on the rudimentary way that most employees interact with large language models (LLMs). While traditional software ("Software 1.0") required exact inputs to get exact outputs, AI allows for reasoning and adaptation. Treating the latter like the former adds up to a waste of your annual ChatGPT, Gemini, or other subscription, she argued.

"Ninety percent of your employees are stuck in this mode. And so many employees think that they are an AI super user when all they are doing is asking AI to write their mean email in a slightly more polite way," Miller said.

This roadblock is holding companies back from true productivity gains, Miller added. "Your annual subscriptions are made worthless because people are stuck in this mode," she said, implicitly encouraging organizations to rethink their AI investment budgets.

Miller's ideas are backed by data. According to a November study from software company Cornerstone OnDemand, there is an increasingly split "shadow AI economy" thriving beneath the surface of corporate America. The study found that 80% of employees are using AI at work, yet fewer than half had received proper AI training.

To unlock the actual value of enterprise AI, Miller's presentation outlined a shift toward three more advanced modes: "Companion," "Delegate," and the most critical evolution, "AI as a Teammate." Used in this mode, the tech serves not as a reactive answer provider but as a collaborative partner that can sit in on meetings, field questions, and take actions. Engineers at OpenAI are already doing this by incorporating the company's software engineering agent Codex into Slack and treating it essentially as a coworker, she added.

While a "Delegate" might handle a 40-minute task like managing an inbox, the "Teammate" mode represents a fundamental shift in infrastructure. In this mode, AI is not transactional but ambient. "The big difference for AI as a teammate is that AI is lifting up a system or a group and not the individual," she said. Miller predicted a near-future inversion of the current workflow: "We will no longer be prompting AI ... AI will be prompting us because it will be in our systems and helping our team as a whole." Even for non-AI companies, incorporating the technology in this way makes it the foundation of the business tasks employees complete daily -- a productivity booster rather than a stand-alone curiosity for trivia questions.
To bridge the gap between rewriting emails and deploying autonomous systems, Miller introduced the concept of "Minimum Viable Autonomy" (MVA), a spin on the old product-design principle of the minimum viable product -- the simplest version of a product that can be released and tested. This approach encourages leaders to stop treating AI like a chatbot requiring "perfect 18-page prompts" and start treating it as goal-oriented software. "We are no longer giving step-by-step perfect instructions ... we are going to provide goals and boundaries and rules and AI systems are going to work from the goal backwards," she explained.

To operationalize this safely, Miller suggested implementing "agent protocols" -- strict guidelines that group tasks into categories: "always do," "please ask first," and "never do." She recommended a risk distribution portfolio for these agents: 70% on low-risk tasks, 20% on complex cross-department tasks, and 10% on strategic tasks that fundamentally change organizational structure.

The presentation concluded with aggressive predictions for the immediate future. Miller forecast that within months, AI will be capable of working autonomously for over eight hours uninterrupted. Furthermore, as costs drop, companies will move from single queries to running hundreds of thousands of simulations for every market launch.

These advancements come with a caveat for legacy-minded leadership, however. Miller closed with a reminder that evaluating whether AI is "good or not" is the new essential product requirement. "AI is not just a tool," she concluded, "and the organizations who continue to treat it like one are going to wonder over the next decade what happened."
[4]
Business leaders agree AI is the future. They just wish it worked right now
Since ChatGPT exploded three years ago, companies big and small have leapt at the chance to adopt generative artificial intelligence and stuff it into as many products as possible. But so far, the vast majority of businesses are struggling to realize a meaningful return on their AI investments, according to company executives, advisors and the results of seven recent executive and worker surveys.

Last spring, CellarTracker, a wine-collection app, built an AI-powered sommelier to make unvarnished wine recommendations based on a person's palate. The problem was the chatbot was too nice. "It's just very polite, instead of just saying, 'It's really unlikely you'll like the wine,'" CellarTracker CEO Eric LeVine said. It took six weeks of trial and error to coax the chatbot into offering an honest appraisal before the feature was launched.

One survey of 1,576 executives conducted during the second quarter by research and advisory firm Forrester Research showed just 15% of respondents saw profit margins improve due to AI over the last year. Consulting firm BCG found that only 5% of 1,250 executives surveyed between May and mid-July saw widespread value from AI.

Executives say they still believe generative AI will eventually transform their businesses, but they are reconsidering how quickly that will happen within their organizations. Forrester predicts that in 2026 companies will delay about 25% of their planned AI spending by a year. "The tech companies who have built this technology have spun this tale that this is all going to change quickly," Forrester analyst Brian Hopkins said. "But we humans don't change that fast."

AI companies including OpenAI, Anthropic and Google are all doubling down on courting business customers in the next year. During a recent lunch with media editors in New York, OpenAI CEO Sam Altman said developing AI systems for companies could be a $100 billion market.

All this is happening against the backdrop of unprecedented tech investment in everything from chips, to data centers, to energy sources. Whether these investments can be justified will be determined by companies' ability to figure out how to use AI to boost revenue, fatten margins or speed innovation. Failing that, the infrastructure build-out could trigger the kind of crash reminiscent of the dot-com bust in the early 2000s, some experts say.

The 'easy' button

Soon after ChatGPT's launch, companies worldwide created task forces dedicated to finding ways to embrace generative AI, a type of AI that can create original content like essays, software code and images through text prompts.

One well-known issue with AI models is their tendency to please the user. This bias -- what's called "sycophancy" -- encourages users to chat more, but can impair the model's ability to give better advice. CellarTracker ran into this problem with its wine-recommendation feature, built on top of OpenAI's technology, CEO LeVine said. The chatbot performed well enough when asked for general recommendations.
But when asked about specific vintages, the chatbot remained positive -- even if all signals showed a person was highly unlikely to enjoy them. "We had to bend over backwards to get the models (any model) to be critical and suggest there are wines I might not like," LeVine said. Part of the solution was designing prompts that gave the model permission to say no.

Companies have also struggled with AI's lack of consistency. Jeremy Nielsen, general manager at North American railroad service provider Cando Rail and Terminals, said the company recently tested an AI chatbot for employees to study internal safety reports and training materials. But Cando ran into a surprising stumbling block: the models couldn't consistently and correctly summarize the Canadian Rail Operating Rules, a roughly 100-page document that lays out the safety standards for the industry. Sometimes the models forgot or misinterpreted the rules; other times they invented them from whole cloth. AI researchers say models often struggle to recall what appears in the middle of a long document. Cando has dropped the project for now, but is testing other ideas. So far the company has spent $300,000 on developing AI products. "We all thought it'd be the easy button," Nielsen said. "And that's just not what happened."

Humans make a comeback

Human-staffed call centers and customer service were supposed to be heavily disrupted by AI, but companies quickly learned there are limits to the amount of human interaction that can be delegated to chatbots. In early 2024, Swedish payments company Klarna rolled out an OpenAI-powered customer service agent that it said could do the work of 700 full-time customer service agents. In 2025, however, CEO Sebastian Siemiatkowski was forced to dial that back and acknowledge that some customers preferred to talk with humans. Siemiatkowski said AI is reliable on simple tasks and can now do the work of about 850 agents, but more complex issues quickly get referred to human agents. For 2026, Klarna is focused on building its second-generation AI chatbot, which it hopes to ship soon, but human beings will remain a big part of the mix. "If you want to stay customer-obsessed, you can't rely [entirely] on AI," he said.

Similarly, U.S. telecommunications giant Verizon is leaning back into human customer service agents in 2026 after attempts to delegate calls to AI. "I think 40% of consumers like the idea of still talking to a human, and they're frustrated that they can't get to a human agent," said Ivan Berg, who leads Verizon's AI-driven efforts to enhance service operations for business customers, in an interview this fall. The company, which has about 2,000 frontline customer service agents, still uses AI to screen calls, get information on customers, and direct them to either self-service systems or to human agents. Using AI to handle routine questions frees up agents to handle complex issues and try new things, such as making outbound calls and doing sales. "Empathy is probably the key thing that's holding us from having AI agents talk to customers holistically right now," Berg said.

Shashi Upadhyay, president of product, engineering and AI at customer-service platform Zendesk, says AI excels in three areas: writing, coding and chatting. Zendesk's clients rely on generative AI to handle between 50% and 80% of their customer-support requests. But, he said, the idea that generative AI can do everything is "oversold."
The 'jagged frontier'

Large language models are rapidly conquering complex tasks in math and coding, but can still fail at comparatively trivial tasks. Researchers call this contradiction in capabilities the "jagged frontier" of AI. "It might be a Ferrari in math but a donkey at putting things in your calendar," said Anastasios Angelopoulos, the CEO and cofounder of LMArena, a popular benchmarking tool.

Seemingly small issues can unexpectedly trip up AI systems. Many financial firms rely on data compiled from a broad range of sources, all of which can be formatted very differently. These differences might prompt an AI tool to "read patterns that don't exist," said Clark Shafer, director at advisory firm Alpha Financial Markets Consulting. Many companies are now looking into the potentially expensive, lengthy and complex process of reformatting their data to take advantage of AI, Shafer said.

Dutch technology investment group Prosus says one of its in-house AI agents is meant to answer questions about its portfolio, similar to what the group's data analysts on staff already do. Theoretically, an employee could ask how often a Prosus-backed food-delivery firm was late to deliver sushi orders in Berlin last week. But for now, the tool doesn't always understand what neighborhoods are part of Berlin or what "last week" means, said Euro Beinat, head of AI for Prosus. "People thought AI was magic. It's not magic," Beinat said. "There's a lot of knowledge that needs to be encoded in these tools to work well."

More handholding

OpenAI is working on a new product for businesses and recently created internal teams, such as the Forward Deployed Engineering team, to work directly with clients to help them use OpenAI's technology to tackle specific problems, a spokesperson said. "Where we do see failure is people that jump in too big, they find that billion-dollar problem -- that's going to take a few years," said Ashley Kramer, OpenAI's head of revenue, during an onstage interview at the Reuters Momentum AI conference in November. Specifically, OpenAI is working with companies to find areas where AI can have a "high impact but maybe low lift at first," said Kramer.

Rival AI lab Anthropic, which draws 80% of its revenue from business customers, is hiring "applied AI" experts who will embed with companies. For AI companies to succeed, they will have to view themselves as "partners and educators, rather than just deployers of technology," said Mike Krieger, Anthropic's head of product, in an interview earlier this year.

An increasing number of startups, many founded by former OpenAI employees, are developing AI tools for specific sectors such as financial services or legal. These founders say companies will benefit from specialized models more than general-purpose or consumer tools like ChatGPT. It's a playbook that Writer, a San Francisco-based AI application startup, has been adopting. The company, which is now building AI agents for finance and marketing teams at large firms such as Vanguard and Prudential, puts its engineers on calls directly with clients to understand their workflows and co-build the agents. "Companies need more handholding in actually making AI tools useful for them," said May Habib, CEO of Writer.
[5]
OpenAI: AI Business Adoption Soars as Employees Save Over an Hour Daily | PYMNTS.com
A new report from OpenAI, "The state of enterprise AI 2025," released Monday (Dec. 8), doesn't just chronicle growth; it signals a structural pivot where generative AI is transitioning from a supplemental productivity tool to embedded, mission-critical infrastructure within the global financial and digital economy. Organizations are moving beyond pilot programs to scaled, operational deployment, effectively codifying institutional knowledge into intelligent, persistent assistants.

The average enterprise worker now sends 30% more ChatGPT messages weekly than a year ago, while API reasoning token consumption per organization increased 320-fold, demonstrating both broader adoption and deeper integration across business functions. More than 7 million workplace seats now use ChatGPT Enterprise, up ninefold year-over-year.

The adoption curve varies sharply by readiness and resources. Large enterprises employing more than 1,000 workers prove twice as likely to build custom GPTs compared with smaller organizations, reflecting greater technical capacity to standardize AI-driven tasks, the report said. Custom GPT and Project usage rose 19-fold this year, with these tools now processing roughly 20% of all Enterprise messages as firms codify institutional knowledge into persistent assistants.

Return-on-investment metrics suggest the technology delivers measurable value. The share of CFOs reporting very positive ROI from generative AI jumped from 27% to 85%, according to a PYMNTS Intelligence survey. Three quarters of those executives now deploy AI for cybersecurity management.

According to OpenAI's report, enterprise workers attribute 40 to 60 minutes of daily time savings to AI use, with data science, engineering and communications roles reporting the highest gains at 60 to 80 minutes per active day. Survey data covering nearly 100 enterprises shows 87% of IT workers report faster issue resolution, 85% of marketing teams cite accelerated campaign execution and 73% of engineers describe shortened code delivery cycles.

OpenAI's report said 70% of enterprise AI activity now takes place inside Projects, a configurable workspace that supports multi-step tasks with custom instructions and internal knowledge bases. The shift reflects deeper operational use, with some organizations running AI development at scale. BBVA, for example, maintains more than 4,000 active GPTs, a sign that AI-driven workflows are becoming embedded infrastructure rather than supplemental productivity tools.

PYMNTS data shows the same pattern on the demand side. Workflow optimization is the strongest area of agreement among product leaders, with 98% expecting generative AI to improve internal processes, up sharply from 70% one year earlier.

Technology companies lead API consumption, using the interface at rates 5 times higher year-over-year as they scale customer-facing applications including in-product assistants and search functions. Non-technology firms increased API use fivefold, suggesting adoption extends beyond product embedding toward operational deployments in customer service and content generation, which now represent approximately 20% of API activity.
International expansion accelerated over the past six months, with Australia, Brazil, the Netherlands and France posting customer growth exceeding 143% year-over-year. The United Kingdom and Germany rank among the largest ChatGPT Enterprise markets outside the United States by customer count, while Japan leads corporate API customers internationally.

That global expansion comes as the foundation-model market undergoes its sharpest shift in years. According to Menlo Ventures, Anthropic now earns 40% of enterprise large language model (LLM) spend, up from 24% last year and 12% in 2023, overtaking OpenAI as the enterprise leader. OpenAI's share fell to 27%, down from 50% in 2023, while Google increased its enterprise share from 7% in 2023 to 21% in 2025. Together, these three providers now account for 88% of enterprise LLM API usage, with the remaining 12% spread across Meta's Llama, Cohere, Mistral and a long tail of smaller models.

The shift in model share aligns with how enterprises deploy AI today: 76% of AI use cases are purchased rather than built internally, according to data from Menlo. Despite continued strong investment in internal builds, ready-made AI solutions are reaching production more quickly and demonstrating immediate value while enterprise tech stacks continue to mature.
A new OpenAI report analyzing over one million business customers reveals workers at the 95th percentile of AI adoption send six times as many messages to ChatGPT as median employees. Despite widespread access to enterprise AI tools across 7 million workplace seats, most organizations struggle to realize meaningful returns on their generative AI investments, with only 5% seeing transformative results from $30-40 billion in spending.
A striking divide is emerging in workplaces worldwide as enterprise AI tools become ubiquitous but usage patterns reveal dramatic disparities. According to OpenAI's latest report analyzing more than one million business customers, workers at the 95th percentile of AI adoption send six times as many messages to ChatGPT as the median employee at the same companies[1]. For specific tasks, the AI productivity gap widens even further: frontier workers send 17 times as many coding-related messages as typical peers, while heavy users of data analysis tools engage 16 times more frequently than the median[1].

ChatGPT Enterprise is now deployed across more than 7 million workplace seats globally, representing a nine-fold increase from a year ago[1]. The average enterprise worker now sends 30% more ChatGPT messages weekly than a year ago, while API reasoning token consumption per organization increased 320-fold[5]. This acceleration in enterprise AI activity demonstrates both broader adoption and deeper AI integration in workflows across business functions.
Despite widespread access to sophisticated tools, underutilization of AI tools remains a critical problem. Among monthly active users who have logged in at least once in the past 30 days, 19% have never tried the data analysis feature, 14% have never used reasoning capabilities, and 12% have never used search[1]. These core functionalities are highlighted by OpenAI as transformative for knowledge work, yet remain untapped by significant portions of users.

Allie K. Miller, CEO of Open Machine and veteran of IBM and Amazon Web Services, argues that 90% of employees are stuck using AI as a "microtasker," essentially a glorified search engine for simple queries[3]. "Ninety percent of your employees are stuck in this mode. And so many employees think that they are an AI super user when all they are doing is asking AI to write their mean email in a slightly more polite way," Miller said at the Fortune Brainstorm AI conference[3]. This fundamental misunderstanding of AI capabilities means annual subscriptions deliver minimal value when workers fail to progress beyond basic microtasks.
Workers who experiment across approximately seven distinct task types -- including data analysis, coding, image generation, translation, and writing -- report saving five times as much time as those who use only four[1]. Employees who save more than 10 hours per week consume eight times more AI credits than those reporting no time savings at all[1]. Enterprise workers attribute 40 to 60 minutes of daily time savings to AI use, with data science, engineering, and communications roles reporting the highest productivity gains at 60 to 80 minutes per active day[5].

This creates a compounding dynamic where workers who experiment broadly discover more uses, leading to greater productivity gains that presumably translate to better performance reviews and faster advancement. Seventy-five percent of surveyed workers report being able to complete tasks they previously could not perform, including programming support, spreadsheet automation, and technical troubleshooting[1]. For workers embracing these capabilities, role boundaries are expanding, while those who have not risk falling behind comparatively.

The individual usage gap mirrors a broader pattern identified by MIT's Project NANDA. Despite $30 billion to $40 billion invested in generative AI initiatives, only 5% of organizations are seeing transformative returns[1]. The researchers call this the "GenAI Divide" -- a gap separating the few organizations that succeed in transforming processes with adaptive AI systems from the majority stuck in pilots[1].

A Forrester Research survey of 1,576 executives showed just 15% saw profit margins improve due to AI over the last year[4]. Consulting firm BCG found that only 5% of 1,250 executives surveyed between May and mid-July saw widespread value from AI[4]. The return on investment for generative AI remains elusive for most companies, despite widespread belief that the technology will eventually transform their businesses. Forrester predicts that in 2026, companies will delay about 25% of their planned AI spending by a year[4].
While only 40% of companies have purchased official LLM subscriptions, employees in over 90% of companies regularly use personal AI tools for work[1]. Nearly every respondent in the MIT study reported using LLMs in some form as part of their regular workflow[1]. A Cornerstone OnDemand study found that 80% of employees are using AI at work, yet fewer than half had received proper AI training[3]. This "shadow AI" phenomenon often delivers better ROI than official initiatives, creating an informal economy of AI for business productivity beneath the surface of corporate structures.
A McKinsey "State of AI" survey found that a majority of businesses had yet to begin using AI agents, while 40% said they were experimenting[2]. Less than a quarter had deployed AI agents at scale in at least one use case[2]. The challenges in deploying AI agents stem from designing reliable workflows -- even the most capable AI models struggle with complex tasks involving multiple data sources and software tools over many steps, where errors made early in the process compound.
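The fragility of long workflows is easy to see with a back-of-the-envelope calculation. The sketch below is illustrative only; the 95% per-step accuracy is an assumption chosen for the example, not a figure from any of the cited reports:

```python
# Illustrative only: if steps fail independently, per-step reliability
# compounds multiplicatively across a sequential agent workflow.
# The 0.95 per-step accuracy is an assumed figure, not from the reports.
def workflow_success_rate(per_step_accuracy: float, num_steps: int) -> float:
    """Probability that every step in a sequential workflow succeeds."""
    return per_step_accuracy ** num_steps

for steps in (5, 10, 20):
    rate = workflow_success_rate(0.95, steps)
    print(f"{steps} steps at 95% per step -> {rate:.0%} end-to-end")
# 5 steps -> 77%, 10 steps -> 60%, 20 steps -> 36%
```

Under these assumptions, an agent that is quite reliable at any single step still fails most 20-step tasks, which is why workflow design matters as much as raw model quality.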
Google researchers conducted 180 controlled experiments using AI models from Google, OpenAI, and Anthropic to determine when single agents versus multi-agent systems work best[2]. For sequential tasks, single agents proved more effective whenever they could perform the task accurately at least 45% of the time; using multiple agents reduced overall performance by 39% to 70% due to token budget constraints[2]. However, for parallel tasks like financial analysis, centralized multi-agent systems with a coordinator agent performed 80% better than single agents[2].
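The two topologies the researchers compared can be sketched in a few lines. This is a hypothetical skeleton, not code from the Google study; `call_model` is an assumed stand-in for whatever LLM API a team uses:

```python
# Hypothetical sketch of the two agent topologies compared above.
# `call_model` is an assumed placeholder for an LLM API call.
from concurrent.futures import ThreadPoolExecutor

def call_model(prompt: str) -> str:
    raise NotImplementedError("wire up an LLM provider here")

def single_agent(task: str, steps: list[str]) -> str:
    # One agent carries the full context through every sequential step.
    context = task
    for step in steps:
        context += "\n" + call_model(f"{context}\n\nNext step: {step}")
    return context

def centralized_multi_agent(task: str, subtasks: list[str]) -> str:
    # A coordinator fans independent subtasks out to sub-agents in
    # parallel, then synthesizes their findings; all communication
    # flows through the coordinator, never between sub-agents.
    with ThreadPoolExecutor() as pool:
        findings = list(pool.map(
            lambda sub: call_model(f"Overall task: {task}\nYour subtask: {sub}"),
            subtasks,
        ))
    return call_model(
        f"Task: {task}\nSub-agent findings: {findings}\nSynthesize a final answer."
    )
```

The study's trade-off falls directly out of this shape: parallel fan-out amortizes the coordinator's overhead across sub-agents, while on sequential tasks every extra agent is just another drain on the same token budget.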
Companies face persistent technical challenges beyond deployment strategies. CellarTracker's wine-recommendation chatbot struggled with "sycophancy" -- the tendency of AI models to please users rather than provide honest assessments[4]. It took six weeks to coax the chatbot into offering critical appraisals before launch[4].
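CellarTracker's actual prompts are not public; the article says only that part of the fix was prompts giving the model "permission to say no." A minimal sketch of that pattern, assuming the standard OpenAI Python SDK and a placeholder model name, might look like this:

```python
# Illustrative sketch only: CellarTracker's real prompts are not public.
# Shows the general pattern of explicitly granting the model permission
# to give a negative verdict. Uses the standard OpenAI Python SDK;
# the model name is a placeholder assumption.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM_PROMPT = (
    "You are a sommelier making honest recommendations. "
    "A negative verdict is a successful answer: if the user's tasting "
    "history suggests they are unlikely to enjoy a wine, say so plainly "
    "and explain why. Never soften an assessment just to please the user."
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; any chat-capable model works
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "I dislike high-tannin reds. "
                                    "Would I enjoy a young Barolo?"},
    ],
)
print(response.choices[0].message.content)
```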
Cando Rail and Terminals spent $300,000 testing an AI chatbot for employees to study safety reports, but models couldn't consistently summarize the Canadian Rail Operating Rules -- sometimes forgetting, misinterpreting, or inventing rules entirely[4].
The foundation-model market is experiencing its sharpest shift in years. According to Menlo Ventures, Anthropic now earns 40% of enterprise LLM spend, up from 24% last year and 12% in 2023, overtaking OpenAI as the enterprise leader[5]. OpenAI's share fell to 27%, down from 50% in 2023, while Google increased its enterprise share from 7% in 2023 to 21% in 2025[5]. These three providers now account for 88% of enterprise LLM API usage, with the remaining 12% spread across Meta's Llama, Cohere, Mistral, and smaller models[5].
Miller advocates for "Minimum Viable Autonomy," encouraging leaders to stop treating AI like a chatbot and start treating it as goal-oriented software with clear protocols: tasks grouped into "always do," "please ask first," and "never do" categories[3]. She recommends a risk distribution portfolio: 70% on low-risk tasks, 20% on complex cross-department tasks, and 10% on strategic tasks that fundamentally change organizational structure[3].
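Miller describes the categories but not an implementation. One plausible way to encode such an agent protocol -- the task names and structure below are hypothetical illustrations, not from her talk -- is as a permission table the agent consults before acting:

```python
# Hypothetical encoding of the "agent protocol" idea described above.
# The three category names come from Miller's talk; the task names and
# structure are illustrative assumptions.
from enum import Enum

class Permission(Enum):
    ALWAYS_DO = "always do"         # agent may act autonomously
    ASK_FIRST = "please ask first"  # agent must get human approval
    NEVER_DO = "never do"           # agent must refuse outright

AGENT_PROTOCOL: dict[str, Permission] = {
    "summarize_inbox": Permission.ALWAYS_DO,            # low-risk (~70%)
    "draft_customer_reply": Permission.ASK_FIRST,       # cross-department (~20%)
    "change_org_reporting_lines": Permission.NEVER_DO,  # strategic (~10%)
}

def authorize(task_name: str) -> Permission:
    # Unknown tasks default to the most conservative category.
    return AGENT_PROTOCOL.get(task_name, Permission.NEVER_DO)
```

Defaulting unknown tasks to the most restrictive bucket mirrors the "goals and boundaries" framing: autonomy is granted per task, never assumed.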
As the technology matures, organizations face pressure to bridge the gap between rewriting emails and deploying autonomous systems that deliver measurable business transformation.