3 Sources
[1]
Demand for data center CPUs has surged, and AI agents are responsible - why the CPU to GPU ratio is more important than ever for hyperscalers
While most of the attention has been on GPUs, CPUs have continued to maintain their relevance - and they're starting to attract more attention.

The AI revolution that shows no signs of stopping appears at times to have echoes of the gold rush. Whisper networks spread quickly through communities about new scarce commodities, and suddenly there's a surge of interest as people snap up resources. For most of the ChatGPT era, you've struggled to get hold of a GPU for love or money, with Nvidia practically able to manage its own waitlist, so great is the demand. Much of the media's attention - and plenty of investment - has been focused on the dash to grab as many GPUs as possible; most recently, memory has become a focal point. But in recent weeks and months, there's been a focus on ensuring that people have CPUs to match.

For decades, the CPU has been the anonymous workhorse of the hardware stack, running operating systems, scheduling workloads, and keeping everything ticking over, rarely grabbing headlines unless there's a supply crunch or a generational leap in performance. Suddenly, it's being talked about in the same breath as scarce-as-gold GPUs. What's going on?

"AI deployment at scale has forced organizations to look at the infrastructure underneath the hype," said Jason Beckett, chief technology officer in Europe, the Middle East and Africa at Hitachi Vantara, in comments to Tom's Hardware Premium. As Beckett points out, while most of the attention is focused on GPUs because they run the AI models, the CPUs are vital because they handle "everything else". And as agentic AI becomes the norm, there's a greater need for that CPU backbone to keep things running properly.

"Always-on, multi-step reasoning systems don't create brief orchestration bursts around GPU workloads," said Beckett. "They demand high-core-count CPUs running at sustained loads, continuously. The infrastructure requirement was always structural. It's just now unavoidable."

Readjusting ratios

When data centers were previously being specced to deliver AI training and inference in the early days of the generative AI revolution, those building them planned for a heavy bias in favor of GPUs. Chatbot workloads called for between four and eight GPUs for every CPU, because the parallel computations needed to meet user requests were GPU-inference heavy.

But as the main use case of AI changes from chatbots to agents, the requirements have also altered. A slight delay for in-depth inference while an AI model 'thinks' was seen as an acceptable interface choice. But as agentic AI requires rapid responses and the smooth coordination of tool calls and much more, latency can be a killer. Bolstering CPU counts can help avoid any problems that can quickly spin out into something more significant, breaking the entire agentic stack.

AMD, one of the major manufacturers of CPUs, has seen that shift first-hand. The company had previously forecast that the CPU market would grow at a rate of around 18% annually, but says that the change in requirements has materially changed the market. The rate of growth has now doubled to 35% a year, AMD claims, and the market will reach $120 billion by the end of the decade.

"What AMD and Arm's results are telling us is that this is a structural, not cyclical requirement," said Roger Cummings, CEO of PEAK:AIO, in an interview with Tom's Hardware Premium. "In actuality, two structural shifts are driving the demand surge: the rise of agentic AI and the need for deterministic, predictable performance at rack scale."

Much of that CPU demand is being driven by hyperscalers, who recognize the integral role that CPUs play in developing the AI clusters that are likely to power the economy in the years to come.

"As GPU clusters scale, CPUs are taking on larger roles in orchestration, memory management, networking, storage coordination, and inference handling," said Jeff Moore, vice president of strategic partnerships at Aegis Cooling, which specializes in next-gen liquid cooling solutions for AI and high-performance computing infrastructure, in an interview with Tom's Hardware Premium. There's a rise in CPU-to-GPU ratios inside AI deployments, said Moore, "particularly because distributed AI workloads generate significant demand for general-purpose compute, memory bandwidth, and east-west data movement."

A recent TrendForce analysis points out that CPUs' contribution to latency - accounting for nearly 91% of all the delay in responses - is something that AI deployments are trying desperately to counteract.

Changing designs

That shift is now visible not just in financial forecasts, but in the physical design of AI infrastructure itself. In early generative AI deployments, racks were often built around dense GPU configurations, with CPUs effectively treated as supporting components - enough to keep the system running, but not a bottleneck concern. Things are shifting now.

"In the media, an AI rack is pictured as a giant box of GPUs," said Hommer Zhao, founder of OurPCB, a PCB manufacturer with more than 15 years' experience, in comments to Tom's Hardware Premium. "But from a hardware design perspective, a GPU is just a very fast, very dumb engine. It cannot talk to the internet or pull data from a hard drive."

Rather than a single host CPU loosely paired with multiple GPUs, hyperscalers are deploying configurations with higher core-count CPUs, more memory channels, and, in some cases, multiple CPUs per node to keep pace with data movement demands. There are also thermal and power considerations shaping how racks are populated. High-core-count CPUs, especially those optimized for cloud workloads, are being selected not just for raw performance but for efficiency under sustained load. In liquid-cooled environments, CPUs are increasingly part of the same thermal design envelope as GPUs, rather than an afterthought cooled separately with air.

Financial signs of success

Recent results from AMD and Arm reinforce the idea that this is not a short-term correction but a deeper architectural shift. AMD has reported strong growth in its data center CPU segment, driven in large part by hyperscaler demand for its EPYC processors, which offer high core counts and memory bandwidth well suited to AI orchestration tasks.

Arm, meanwhile, is benefiting from hyperscalers designing their own custom silicon. "Arm accounts for close to half of all compute shipped to top hyperscalers in 2025, with over a billion Neoverse cores deployed," said Beckett. "Those are rack-level architectural decisions made years ago."

AWS's Graviton, Google's Axion, and Microsoft's Cobalt chips all reflect a move toward CPU architectures tailored for specific workloads: high-throughput, energy-efficient, and tightly integrated with networking and storage. Arm's licensing model positions it at the center of this trend, and its recent financial results highlight how significant that hyperscaler-driven demand has become.

Both sets of results point to a change in how CPUs are being valued. In traditional enterprise contexts, the hardware was often general-purpose and interchangeable. In hyperscaler environments, it's becoming a specialized infrastructure component, tuned for specific roles within AI systems, whether orchestration, inference at the edge, or data preprocessing.

Taken together, the changes in rack design and vendor performance suggest that CPUs aren't a secondary consideration in AI infrastructure planning any more. Instead, they are becoming a critical factor in determining overall system efficiency and cost.

"The spotlight hasn't revealed something new," said Beckett. "It's just finally illuminating what serious infrastructure teams never stopped building on."
[2]
Intel and pals cram 36,864 CPU cores into a 100kW rack while chasing the agentic AI dragon
COMPUTEX 2026

Intel is working with Foxconn and other infrastructure providers to develop rack-scale reference designs based on the chipmaker's Xeon processors. Announced during Intel's Computex keynote on Tuesday, these blueprints aim to provide greater CPU compute densities for running AI agents at scale.

While AI models predominantly run on GPUs and other AI accelerators, the agent harnesses, like OpenClaw, which are used to connect them to tools, terminal shells, code interpreters, and other APIs, still run on CPUs.

"Our customers are asking us to think at the system level to help them serve real agentic workloads at scale," Intel CEO Lip-Bu Tan said.

On stage, Tan revealed two examples of these blueprints. One is aimed at latency-sensitive agentic workloads and the other is designed for maximum density. Both designs support up to 128 of either Intel's 128-core Granite Rapids Xeon 6 or 288-core Clearwater Forest Xeon 6+ processors, totaling between 16,384 P-cores and 36,864 E-cores, alongside up to 384 TB of DDR5 in a 100kW power envelope.

The reference designs come just months after Nvidia announced a similar rack-scale CPU platform packing 256 of its 88-core Vera CPUs. Arm is also working on a pair of rack-scale reference designs for agentic workloads based on its new AGI CPUs: a 36 kW air-cooled system with 8,160 cores and a 200 kW liquid-cooled rack with 45,696 cores.

Tan expects systems based on these reference designs to be broadly available from Intel's ODM and OEM partners.

Alongside agentic AI workloads, the company also revealed that newly launched inference cloud provider Vector Core Compute will be among the first to deploy the platform, and that Together.AI is its first commercial customer. The approach is based on an earlier disaggregated AI blueprint that Intel co-developed with partner SambaNova. The architecture disaggregates inference, sending compute-heavy prefill operations to Nvidia GPUs while using SambaNova's AI accelerators for bandwidth-intensive decode operations, which boosts per-user token output by 2-3x. If that sounds familiar, it's not dissimilar to what Nvidia is doing with Groq's LPUs or what AWS is doing with Trainium and Cerebras' wafer-scale AI accelerators.
[3]
Computex 2026: Intel announces Xeon 6+ processors, says AI will make CPUs important again
It's that time of the year again. Computex 2026 is happening in Taipei, Taiwan, and almost every other brand is once again talking about AI. We have AI-powered laptops, home appliances, PC chips, and much more. Now, when we say AI, chances are you automatically thought of a GPU. But Intel has a different take. The brand believes that AI will put the spotlight firmly on CPUs.

In a press release, Intel said, "With the emergence of agentic AI, the growing demand for AI inference is changing the balance of power in the data center, returning the CPU to a position of prominence." In simpler words, this means that the next phase of AI may rely heavily on CPUs once again. And the reason behind this could be agentic AI. But before we delve deep into this, let us take a moment and talk about agentic AI and what it really is.

What is agentic AI?

Agentic AI is simply an AI tool that can do tasks on its own instead of just replying to your questions. For instance, in the case of ChatGPT, you ask it something and it gives you an answer. That's it. But agentic AI goes a step further. In this case, instead of waiting for commands one by one, it can plan actions, make decisions, and complete multi-step tasks automatically.

For example, if you ask a normal AI assistant about some cheap flights to Tokyo, it will give you some links and call it a day. But an agentic AI assistant will search flights, compare prices, check your calendar, book the tickets, reserve a hotel, and remind you before the trip. And it will do all of that with minimal input from you.

Agentic AI systems need to constantly coordinate tasks, manage memory, handle workflows, and communicate with other software. And according to companies like Intel, CPUs become more important for managing all that behind the scenes.

Why Intel thinks CPUs matter again

Intel CEO Lip-Bu Tan, during his keynote, said that with the rise of 'inference, agentic, and physical AI, Intel is poised to bring the world new innovations from the chip to systems level that promise to transform industry and society for the better'. The Intel CEO also added that they are proud to join all their partners in building 'great products that will delight customers and bring the power of AI to more people'.

For the last few years, GPUs have dominated almost every AI conversation. Whether it is ChatGPT, image generators, or advanced enterprise AI systems, GPUs are usually the hardware doing the heavy lifting. But Intel believes the industry is slowly moving into a different phase now.

The company says AI is shifting from simply training models to actually deploying and running them at scale. This process is called inference. In simple words, training is when you teach an AI model. Inference is when people actually start using it in the real world. And according to Intel, inference workloads are growing very quickly because of agentic AI systems.

Think about it this way. If millions of people start using AI agents that continuously perform tasks, coordinate apps, process workflows, and manage requests in real time, data centres will suddenly have to deal with a lot more orchestration and communication between systems. That is where CPUs come in.

Intel argues that CPUs are extremely important for handling things like scheduling, memory allocation, task coordination, concurrency, and moving data between components. GPUs are still incredibly important for AI processing itself, but CPUs become the "manager" that keeps the whole system running smoothly.

Interestingly, Intel even referenced analyst Ben Bajarin during its announcement. According to him, AI infrastructure could slowly move from a one-CPU-to-four-GPU setup toward something much closer to one CPU per GPU in the future.

No, this does not mean GPUs are suddenly becoming irrelevant. NVIDIA is still dominating the AI space right now. But Intel clearly believes CPUs are about to become much more important than people think.

Intel's Xeon 6+ processors

To support this AI push, Intel announced its new Xeon 6+ processors at Computex 2026. These are next-generation server CPUs built using Intel's 18A process technology, which is a major step for the company itself. Intel says the chips are specifically designed for cloud-native AI workloads, networking, and large-scale inference systems.

But what the company repeatedly highlighted was efficiency and density. According to Intel, a single liquid-cooled rack powered by Xeon 6+ processors can deliver up to 36,864 cores inside 32U of compute space. Intel says this allows extremely high "agent density" for AI infrastructure while operating at around 100 kilowatts of rack power.

Admittedly, most consumers will never directly interact with hardware like this. But these systems quietly power a huge chunk of the internet. Everything from AI chatbots and cloud apps to streaming services and enterprise platforms relies heavily on data centre infrastructure.

And power efficiency is becoming a massive issue in this space. Modern AI data centres consume enormous amounts of electricity. As AI workloads continue growing, companies are desperately trying to find ways to deliver more performance without massively increasing energy consumption. Intel appears to be positioning Xeon 6+ as a solution for exactly that problem.
The rise of agentic AI is fundamentally reshaping data center infrastructure, driving an unexpected surge in demand for data center CPUs. Intel unveiled rack-scale reference designs packing up to 36,864 cores at Computex 2026, while AMD reports CPU market growth has doubled to 35% annually. As AI agents require continuous orchestration and multi-step task coordination, the CPU to GPU ratio is shifting dramatically from the early days of generative AI.
While GPUs have dominated headlines throughout the generative AI revolution, data center CPUs are experiencing an unexpected resurgence. The surge in demand stems from a fundamental shift in how AI systems operate, moving beyond simple chatbot interactions to complex agentic AI systems that autonomously plan, decide, and execute multi-step tasks [1]. AMD now forecasts the CPU market will reach $120 billion by the end of the decade, with annual growth roughly doubling from 18% to 35% [1].

Agentic AI represents a departure from traditional AI assistants that simply respond to queries. Instead of waiting for commands one by one, AI agents can plan actions, make decisions, and complete tasks automatically with minimal human input [3]. Jason Beckett, CTO for Europe, the Middle East and Africa at Hitachi Vantara, explains that "always-on, multi-step reasoning systems don't create brief orchestration bursts around GPU workloads. They demand high-core-count CPUs running at sustained loads, continuously" [1]. These systems need constant coordination of tasks, memory management, workflow handling, and communication with other software, all operations where CPUs excel.
Early generative AI deployments prioritized GPUs heavily, with ratios ranging from four to eight GPUs per CPU [1]. But as AI infrastructure evolves to support inference and agentic workloads, this balance is changing. Analyst Ben Bajarin suggests the industry could move from a one-CPU-to-four-GPU setup toward something much closer to one CPU per GPU in the future [3]. Jeff Moore from Aegis Cooling notes that "as GPU clusters scale, CPUs are taking on larger roles in orchestration, memory management, networking, storage coordination, and inference handling" [1]. A recent TrendForce analysis attributes nearly 91% of the delay in AI responses to CPUs, making their optimization critical for agentic systems [1].
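To make the ratio shift concrete, the short sketch below works out how many host CPUs a GPU fleet would need at the ratios quoted above: eight GPUs per CPU, four GPUs per CPU, and one GPU per CPU. The fleet size of 1,024 GPUs and the function name `cpus_needed` are illustrative assumptions, not figures or terms from the sources.

```python
def cpus_needed(gpu_count: int, gpus_per_cpu: int) -> int:
    """Return the host CPUs needed for a GPU fleet at a given GPUs-per-CPU ratio.

    Rounds up, because a partially used CPU still has to be installed.
    """
    if gpu_count < 0 or gpus_per_cpu <= 0:
        raise ValueError("gpu_count must be >= 0 and gpus_per_cpu must be > 0")
    return (gpu_count + gpus_per_cpu - 1) // gpus_per_cpu


# Hypothetical fleet size, chosen only for illustration.
FLEET_GPUS = 1024

# GPUs-per-CPU ratios mentioned in the article: 8:1 and 4:1 for early
# generative AI deployments, 1:1 as the direction suggested for agentic AI.
for gpus_per_cpu in (8, 4, 1):
    cpus = cpus_needed(FLEET_GPUS, gpus_per_cpu)
    print(f"{gpus_per_cpu} GPU(s) per CPU -> {cpus} CPUs for {FLEET_GPUS} GPUs")
```

Under these assumptions, moving the same fleet from eight GPUs per CPU to one GPU per CPU multiplies the CPU count by eight, which is the scale of change the analysts are describing.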

At Computex 2026, Intel announced rack-scale reference designs developed with Foxconn and other infrastructure providers, specifically targeting agentic workloads [2]. Intel CEO Lip-Bu Tan revealed two blueprints: one optimized for latency-sensitive operations and another for maximum compute density [2]. Both designs support up to 128 Intel Xeon processors, either the 128-core Granite Rapids Xeon 6 or the 288-core Clearwater Forest Xeon 6+, for a total of 16,384 P-cores or 36,864 E-cores alongside up to 384 TB of DDR5 memory in a 100kW power envelope [2].

Hyperscalers recognize the integral role CPUs play in developing AI clusters that will power future economies. Roger Cummings, CEO of PEAK:AIO, says "two structural shifts are driving the demand surge: the rise of agentic AI and the need for deterministic, predictable performance at rack scale" [1]. Intel's new Xeon 6+ processors, built using the company's 18A process technology, are designed specifically for cloud-native AI workloads, networking, and large-scale inference systems [3]. While AI models predominantly run on GPUs and accelerators, the agent harnesses connecting them to tools, terminal shells, code interpreters, and APIs still run on CPUs [2]. Nvidia and Arm are also developing competing rack-scale CPU platforms, with Nvidia announcing a system packing 256 of its 88-core Vera CPUs, and Arm working on designs with up to 45,696 cores [2]. As millions of users deploy AI agents that continuously perform tasks and coordinate applications in real time, data centers must handle significantly more orchestration and system communication, work in which CPUs serve as the critical "manager" keeping infrastructure running smoothly.

Summarized by Navi
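The rack-level core counts quoted in the summary follow from multiplying sockets by cores per socket, and a minimal sketch can check them. The socket and core figures come from the reporting; the Nvidia total, the memory-per-socket figure, and the watts-per-core figure are derived here and do not appear in the sources. The watts-per-core value divides the whole rack's power budget by the core count, so it is only a rough upper bound that also covers memory, networking, and cooling. The class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackDesign:
    """One rack-scale CPU design, described by socket count and cores per socket."""
    name: str
    sockets: int
    cores_per_socket: int

    @property
    def total_cores(self) -> int:
        return self.sockets * self.cores_per_socket


# Socket and per-socket core counts as reported in the article.
INTEL_P_CORE = RackDesign("Intel Xeon 6 (Granite Rapids)", sockets=128, cores_per_socket=128)
INTEL_E_CORE = RackDesign("Intel Xeon 6+ (Clearwater Forest)", sockets=128, cores_per_socket=288)
NVIDIA_VERA = RackDesign("Nvidia Vera", sockets=256, cores_per_socket=88)

for rack in (INTEL_P_CORE, INTEL_E_CORE, NVIDIA_VERA):
    print(f"{rack.name}: {rack.sockets} x {rack.cores_per_socket} = {rack.total_cores:,} cores")

# Rack-level limits reported for the Intel designs ("up to" figures).
INTEL_RACK_MEMORY_TB = 384
INTEL_RACK_POWER_W = 100_000

# Derived figures (not stated in the sources): maximum memory per socket, and
# the whole-rack power budget divided by E-core count.
memory_per_socket_tb = INTEL_RACK_MEMORY_TB / INTEL_E_CORE.sockets
watts_per_e_core = INTEL_RACK_POWER_W / INTEL_E_CORE.total_cores

print(f"Up to {memory_per_socket_tb:.0f} TB of DDR5 per socket")
print(f"About {watts_per_e_core:.2f} W of rack power budget per E-core")
```

The two Intel totals reproduce the 16,384 and 36,864 figures reported from Computex, which confirms that the headline numbers are simply 128 sockets multiplied by 128 or 288 cores.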