3 Sources
[1]
Nvidia GTC 2026: What to expect at AI Burning Man
From Groq-ing about tokenomics to OpenClaw and the silicon that powers it, our predictions for the hottest ticket in town

Nvidia has a bit of a problem. Popular generative AI workloads like code assistants and agentic systems generate massive quantities of tokens and need to move them at speed. But the GPU giant's chips currently struggle to deliver. That will start to change next week when Nvidia CEO Jensen Huang uses his company's GPU Technology Conference (better known as GTC) to explain how he will use the token-spewing accelerator tech he acquired with upstart Groq late last year.

Market-watching firm SemiAnalysis' latest InferenceX benchmarks show how Groq's tech helps to fill the gap in Nvidia's current portfolio. While Nvidia's NVL72 rack systems scale well at lower per-user token generation rates, they become progressively less efficient as user interactivity increases. By contrast, SRAM-heavy architectures, like those championed by Groq and Cerebras, excel in latency-sensitive scenarios and can achieve token generation rates often exceeding 500 or even 1,000 tokens a second - many more than GPU-based architectures can deliver. In fact, this capability is how Cerebras won OpenAI's business earlier this year to power its Codex model. Nvidia didn't own anything to match Cerebras until it acquired Groq's intellectual property and talent for a staggering $20 billion in December.

By combining its GPU tech and CUDA software libraries with Groq's dataflow architecture, Nvidia has the opportunity to shift the Pareto frontier dramatically, reducing the cost per token while bolstering output speeds. Extending Nvidia's CUDA software stack to cover Groq's dataflow architecture will not be easy, but at GTC, Nvidia might announce it will add limited support for Groq's existing architecture relatively quickly.

This GTC already feels a bit different, as Nvidia spilled the beans on its Rubin GPUs back at CES in January. To recap, Rubin packs up to 288 GB of HBM4 memory good for 22 TB/s of bandwidth and 35-50 petaFLOPS of dense NVFP4 performance depending on the use case. The launch represents a major performance uplift over Nvidia's current Blackwell-generation parts, delivering 5x the dense floating point throughput. So far, Nvidia has announced the chips will be available in both an eight-way HGX platform and its NVL72 rack system, which, as the name suggests, crams 72 Rubin SXM modules into a single system. There's also Rubin GPX, announced back at Computex in June 2025, which will slot into select NVL racks to provide additional compute capacity for large context and video processing workflows.

We expect to see Huang hammer on the performance optimizations and efficiency gains delivered by his company's growing portfolio of GPUs. But with those GPUs growing ever hotter - estimates put Rubin's thermal design power at 1.8 kW or perhaps even higher - liquid cooling isn't optional. Some buyers may balk at that requirement, which would benefit AMD and its air-cooled kit. However, given the generational gains delivered by the Rubin architecture, there's nothing stopping Nvidia from releasing a single-die, air-cooled version of the chip with five or six HBM stacks rather than eight. Such a chip would still deliver a 2.5x uplift in performance over Blackwell - without requiring liquid cooling. That's just speculation, but we have a sneaking suspicion we might see something along these lines during next week's festivities.
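Those headline token rates fall out of simple bandwidth arithmetic: during decode, a dense model must stream essentially all of its weights from memory for every token it generates. Here's a minimal back-of-envelope sketch of that ceiling. The 22 TB/s figure is Rubin's quoted HBM4 bandwidth from above; the model size, 4-bit quantization, and SRAM bandwidth number are illustrative assumptions of ours, not vendor specs.

```python
# Back-of-envelope sketch (our numbers, not Nvidia's): why per-user token
# rates are bandwidth-bound during decode, and why SRAM-heavy designs pull
# ahead. Assumes a dense model whose weights are all read once per token.

def tokens_per_second(bandwidth_tb_s: float, params_b: float,
                      bytes_per_param: float = 0.5) -> float:
    """Upper bound on single-user decode rate for a dense model.

    bandwidth_tb_s: effective memory bandwidth in TB/s
    params_b: parameter count in billions
    bytes_per_param: 0.5 for 4-bit weights (e.g. NVFP4)
    """
    bytes_per_token = params_b * 1e9 * bytes_per_param  # weights streamed per token
    return bandwidth_tb_s * 1e12 / bytes_per_token

# One Rubin GPU at its quoted 22 TB/s of HBM4, hypothetical 70B dense model:
print(f"HBM-bound ceiling: {tokens_per_second(22, 70):.0f} tok/s")   # ~630 tok/s
# An SRAM-heavy part with, say, 80 TB/s of on-die bandwidth (assumed figure):
print(f"SRAM-bound ceiling: {tokens_per_second(80, 70):.0f} tok/s")  # ~2300 tok/s
```

Real deployments batch many users, so each user sees only a slice of that ceiling - which is why GPU racks that look great in aggregate throughput fall behind when a single user wants 500-plus tokens a second.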
Alongside its latest datacenter GPUs, we anticipate more details on Nvidia's standalone Vera CPU. First teased at last year's GTC, Vera features 88 custom Arm cores that add support for simultaneous multithreading and a slew of confidential computing features previously only available on x86 platforms. So far, we've only seen the CPU packaged as part of Nvidia's Vera-Rubin superchip. However, we've since learned Nvidia will offer the chip as a standalone processor that will compete with Intel and AMD for some mainstream applications. Previously, Nvidia had offered Grace CPU superchips, but those were primarily for use in supercomputers and other HPC applications. However, last month the GPU giant revealed Meta would be its first partner to deploy Grace at scale, and that the Social Network was already evaluating Vera CPUs for use in its datacenters as well.

Alongside new datacenter silicon, we also anticipate Huang will share more details about Nvidia's next-gen Kyber racks and Feynman GPUs, which should debut in 2027 and 2028. We first saw Kyber at last year's GTC. The 600 kW behemoth is set to cram 144 GPU sockets, each with four Rubin Ultra GPU dies, into a standard rack form factor. Nvidia disclosed the existence of Kyber in part because datacenter operators were already struggling with the 120 kW NVL72 systems announced the year before. By revealing Kyber, Nvidia lit a fire under datacenter physical infrastructure providers so they could provision the power supplies and cooling kit necessary to support such a system by 2027. With a yearly release cadence, Nvidia can't wait for the rest of the industry to catch up - it must telegraph its next move years in advance. With Feynman just two years out, we suspect Huang may repeat the exercise, setting new power and cooling targets, likely exceeding a megawatt per rack. (A quick back-of-envelope on those power densities follows below.)

Nvidia has long been rumored to be working on an Arm-based system-on-chip for PCs. A part capable of doing that job arrived last year in the form of the DGX Spark and the GB10 partner systems that put it to work. So far, however, OEMs have only used the chip in workstation-class mini-PCs running Linux. Recent reports indicate Nvidia is working with the likes of Lenovo and Dell to bring a similar product to the Windows PC market. As we previously reported, Nvidia is also working with Intel to integrate its GPU dies into Chipzilla's next-gen processors. GTC seems like as good a time as any to throw gamers a bone and give Nvidia a new market to chase beyond its side hustles in pro visualization. Integrated Nvidia graphics might not be the RTX 50 Super series cards that many had hoped to see at CES, but given the state of the memory market, it seems unlikely those cards will make an appearance at GTC either.

Beyond big iron and the remote possibility of some consumer hardware, you can bet on OpenClaw being a major talking point at GTC. Jensen Huang is apparently quite fond of the agentic framework in spite of its many security vulnerabilities, reportedly describing it as the "most important software release probably ever." The company is reportedly working on its own, presumably safer, version of the platform called NemoClaw.

Speaking of claws, we also expect to see a fair few more robots take the stage. Since announcing its Isaac GR00T robotics platform nearly two years ago, Nvidia has launched a steady supply of new toolkits, frameworks, and hardware development platforms aimed at giving generative AI physical form.
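Picking up the Kyber and Feynman numbers above, here's the rough rack-density arithmetic that explains why Nvidia telegraphs these designs years ahead. The only vendor figures are the 600 kW Kyber budget, 144 sockets, and 120 kW NVL72 number cited in this article; everything else is our own illustration.

```python
# Quick arithmetic on rack power density. Illustrative assumptions of ours,
# beyond the publicly cited Kyber and NVL72 figures.

KYBER_RACK_KW = 600   # disclosed Kyber rack budget
GPU_SOCKETS = 144     # Rubin Ultra sockets per Kyber rack
NVL72_RACK_KW = 120   # current NVL72 figure cited above

per_socket_kw = KYBER_RACK_KW / GPU_SOCKETS  # per socket, incl. share of overhead
print(f"~{per_socket_kw:.1f} kW per GPU socket")               # ~4.2 kW
print(f"{KYBER_RACK_KW / NVL72_RACK_KW:.0f}x NVL72 density")   # 5x

# If Feynman-era racks cross 1 MW as speculated, a 10 MW data hall tops out
# around ten such racks before cooling overhead - hence the two-year
# heads-up to facility builders.
print(f"Racks per 10 MW hall at 1 MW/rack: {10_000 // 1_000}")
```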
And to teach them to function in an unpredictable world, you can count on Nvidia's Omniverse digital twin platform to make another appearance. Introduced in 2019 at a time of rising Metaverse hype, the platform aimed to create a virtual environment in which physical processes could be simulated in the digital world before real-life implementation. Developers have since integrated Omniverse into a variety of simulation platforms, including those used to design and build AI bit barns.

El Reg will be on the ground in San Jose next week for GTC to bring you the latest news from what has become one of the world's most-watched tech conferences. ®
[2]
Nvidia to focus on competition-beating AI advances at megaconference
SAN FRANCISCO, March 13 (Reuters) - When Jensen Huang strides onto the stage of a packed hockey arena to kick off Nvidia's (NVDA.O) annual developer conference on Monday, he is likely to reveal products and partnerships geared toward keeping the AI chipmaker atop a growing array of competitors.

Taking over the heart of Silicon Valley for most of a week, Nvidia GTC, as the conference is known, has become CEO Huang's preferred event to show off Nvidia's AI advances in chips, data centers, its chip programming software CUDA, digital assistants known as AI agents, and physical AI such as robots. This year, the four-day event is even more crucial as investors will seek assurance that Nvidia's strategy of plowing back its profits into the AI ecosystem is paying off.

"I expect Nvidia to present a full-stack roadmap update from Rubin to Feynman while emphasizing inference, agentic AI, networking, and AI factory infrastructure," said eMarketer analyst Jacob Bourne, using the names for Nvidia's current and forthcoming generations of chips.

Nvidia's chips sit at the center of hundreds of billions of dollars in investments in data centers by governments and companies around the globe, but the company is facing competition from other chipmakers and even from some of its customers who are developing their own chips. Analysts told Reuters they expect the overall AI chip market to keep growing, but Nvidia's slice to shrink somewhat as the market shifts rapidly to one where AI agents scurry back and forth among computer applications, carrying out tasks on behalf of humans. That is a shift from training, where AI labs link many Nvidia chips together into one computer to chew through huge amounts of data to perfect their AI models.

Those agents are expected to become so numerous that the humans asking them to do work will even need a new layer of AI middle managers - what technologists call an "orchestration" layer - to sit between human users and their fleets of agents. In some ways, analysts say, that's a good thing for Nvidia, because it signals that AI is becoming more useful. But those tasks, broadly known as "inference" in the AI industry, can also run on other kinds of chips, including the ones that big Nvidia customers such as OpenAI and Meta (META.O), which recently said it plans to release new AI chips every six months, can build for themselves.

"Nvidia is definitely going to see more competition compared to a year ago," said KinNgai Chan, a managing director at Summit Insights Group. "Nvidia still has close to over 90% market share in both training and inference markets today."

"We think Nvidia will begin to see share loss starting in 2027, once in-house ASIC programs gain some scale, especially in the inference market," he said, referring to application-specific integrated circuits, chips tailored for a single function or custom workload, offering higher efficiency than general-purpose graphics processing units.

NVIDIA IS SHORING UP DEFENSES

The company spent $17 billion in December to purchase Groq, a chip startup that specializes in fast and cheap inference computing work. Talking about Groq on the company's earnings call last month, Huang said the company would showcase at GTC how Nvidia can plug Groq's ultra-fast AI technology into its existing CUDA platform.
William McGonigle, analyst at Third Bridge, said his firm expects Nvidia to roll out a new line of servers that will combine Groq's chips with Nvidia's networking technologies to create a speedy and cost-efficient product.

Another type of chip that poses an increasing competitive threat to Nvidia is the central processing unit, or CPU, the kind of chip long championed by Intel (INTC.O) and Advanced Micro Devices (AMD.O). While those chips took a backseat to Nvidia's graphics processing units (GPUs) in recent years, McGonigle said they are "back in focus" and expects Nvidia to show off servers that use only its CPUs, which Huang talked up on a recent earnings call. "With the rise of agentic AI, the bottleneck is now at the agent orchestration level, which is carried out by the CPUs," McGonigle said.

Analysts also expect Nvidia to elaborate on why it invested $2 billion each in Lumentum and Coherent, both of which make lasers for sending information between chips in the form of beams of light. Use of those lasers in what are called co-packaged optics could help speed up the connections among Nvidia's chips inside huge data centers, but they are not currently made in big enough volumes to match the number of chips Nvidia sells each year. "Nvidia will likely frame co-packaged optics as key to connecting massive AI clusters more efficiently, but the challenge is making it affordable enough to deploy at scale," said eMarketer's Bourne.

Reporting by Stephen Nellis in San Francisco and Zaheer Kachwala in Bengaluru; Editing by Sayantani Ghosh and Sonali Paul
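To make the "orchestration layer" idea concrete - the CPU-side coordinator McGonigle describes sitting between a user and a fleet of agents - here is a minimal, hypothetical sketch. Every class, name, and the naive planning logic are our own illustration; none of it reflects an actual Nvidia or vendor API.

```python
# Hypothetical sketch of an agent orchestration layer: a coordinator that
# fans tasks out to a fleet of agents and gathers the results.
import asyncio
from dataclasses import dataclass

@dataclass
class Task:
    description: str

class Agent:
    def __init__(self, name: str):
        self.name = name

    async def run(self, task: Task) -> str:
        # Stand-in for a model call; in practice this hits GPU-served inference.
        await asyncio.sleep(0.01)
        return f"{self.name} finished: {task.description}"

class Orchestrator:
    """Middle manager between a human request and a fleet of agents."""

    def __init__(self, agents: list[Agent]):
        self.agents = agents

    async def handle(self, request: str) -> list[str]:
        # Naive planner: split the request into one task per clause.
        tasks = [Task(part.strip()) for part in request.split(";")]
        # Round-robin dispatch across the fleet, all tasks in flight at once.
        coros = [self.agents[i % len(self.agents)].run(t)
                 for i, t in enumerate(tasks)]
        return await asyncio.gather(*coros)

async def main():
    fleet = [Agent(f"agent-{i}") for i in range(4)]
    results = await Orchestrator(fleet).handle(
        "book travel; draft summary; file expenses")
    print("\n".join(results))

asyncio.run(main())
```

Even in this toy version, the planning and dispatch loop is ordinary CPU work, which is the analysts' point: as agent fleets grow, that coordination layer, not the GPU, can become the bottleneck.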
[3]
Nvidia to focus on competition-beating AI advances at megaconference
Jensen Huang will unveil Nvidia's AI innovations at its developer conference. The company faces increasing competition from rivals and customers developing their own chips. Nvidia is investing heavily to maintain its lead in the AI chip market. Analysts expect shifts in the AI landscape, with a focus on agentic AI and inference tasks.

When Jensen Huang strides onto the stage of a packed hockey arena to kick off Nvidia's annual developer conference on Monday, he is likely to reveal products and partnerships geared toward keeping the AI chipmaker atop a growing array of competitors.

Taking over the heart of Silicon Valley for most of a week, Nvidia GTC, as the conference is known, has become CEO Huang's preferred event to show off Nvidia's AI advances in chips, data centers, its chip programming software CUDA, digital assistants known as AI agents, and physical AI such as robots. This year, the four-day event is even more crucial as investors will seek assurance that Nvidia's strategy of plowing back its profits into the AI ecosystem is paying off.

"I expect Nvidia to present a full-stack roadmap update from Rubin to Feynman while emphasizing inference, agentic AI, networking, and AI factory infrastructure," said eMarketer analyst Jacob Bourne, using the names for Nvidia's current and forthcoming generations of chips.

Nvidia's chips sit at the center of hundreds of billions of dollars in investments in data centers by governments and companies around the globe, but the company is facing competition from other chipmakers and even from some of its customers who are developing their own chips. Analysts told Reuters they expect the overall AI chip market to keep growing, but Nvidia's slice to shrink somewhat as the market shifts rapidly to one where AI agents scurry back and forth among computer applications, carrying out tasks on behalf of humans. That is a shift from training, where AI labs link many Nvidia chips together into one computer to chew through huge amounts of data to perfect their AI models.

Those agents are expected to become so numerous that the humans asking them to do work will even need a new layer of AI middle managers - what technologists call an "orchestration" layer - to sit between human users and their fleets of agents. In some ways, analysts say, that's a good thing for Nvidia, because it signals that AI is becoming more useful. But those tasks, broadly known as "inference" in the AI industry, can also run on other kinds of chips, including the ones that big Nvidia customers such as OpenAI and Meta, which recently said it plans to release new AI chips every six months, can build for themselves.

"Nvidia is definitely going to see more competition compared to a year ago," said KinNgai Chan, a managing director at Summit Insights Group. "Nvidia still has close to over 90% market share in both training and inference markets today."

"We think Nvidia will begin to see share loss starting in 2027, once in-house ASIC programs gain some scale, especially in the inference market," he said, referring to application-specific integrated circuits, chips tailored for a single function or custom workload, offering higher efficiency than general-purpose graphics processing units.

Nvidia is shoring up defences

The company spent $17 billion in December to purchase Groq, a chip startup that specialises in fast and cheap inference computing work.
Talking about Groq on the company's earnings call last month, Huang said the company would showcase at GTC how Nvidia can plug Groq's ultra-fast AI technology into its existing CUDA platform. William McGonigle, analyst at Third Bridge, said his firm expects Nvidia to roll out a new line of servers that will combine Groq's chips with Nvidia's networking technologies to create a speedy and cost-efficient product.

Another type of chip that poses an increasing competitive threat to Nvidia is the central processing unit, or CPU, the kind of chip long championed by Intel and Advanced Micro Devices. While those chips took a backseat to Nvidia's graphics processing units (GPUs) in recent years, McGonigle said they are "back in focus" and expects Nvidia to show off servers that use only its CPUs, which Huang talked up on a recent earnings call. "With the rise of agentic AI, the bottleneck is now at the agent orchestration level, which is carried out by the CPUs," McGonigle said.

Analysts also expect Nvidia to elaborate on why it invested $2 billion each in Lumentum and Coherent, both of which make lasers for sending information between chips in the form of beams of light. Use of those lasers in what are called co-packaged optics could help speed up the connections among Nvidia's chips inside huge data centers, but they are not currently made in big enough volumes to match the number of chips Nvidia sells each year. "Nvidia will likely frame co-packaged optics as key to connecting massive AI clusters more efficiently, but the challenge is making it affordable enough to deploy at scale," said eMarketer's Bourne.
Nvidia's annual GTC developer conference begins Monday with CEO Jensen Huang expected to unveil how the company will integrate its $17 billion Groq acquisition into existing platforms. Despite holding over 90% market share in AI chips, Nvidia confronts rising competition from AMD, Intel, and customers like Meta building their own chips as the industry shifts toward inference and agentic AI workloads.
When Jensen Huang takes the stage Monday at Nvidia's annual GTC developer conference in a packed Silicon Valley hockey arena, the AI chip giant faces a critical moment. Investors and industry watchers will scrutinize how Nvidia plans to maintain dominance in the rapidly evolving AI chip market while confronting competition from chipmakers like AMD and Intel, as well as customers including OpenAI and Meta developing their own hardware [2]. The four-day event serves as Huang's preferred platform to showcase advances in AI chips, AI data centers, CUDA programming software, digital assistants known as AI agents, and physical AI such as robotics [2].
Nvidia spent $17 billion in December to acquire Groq, a chip startup specializing in fast and cheap inference computing work [3]. This acquisition directly addresses a significant weakness in Nvidia's portfolio. Popular generative AI workloads like code assistants and agentic AI systems generate massive quantities of tokens and need to move them at speed, but Nvidia's graphics processing units currently struggle to deliver [1]. According to SemiAnalysis' latest InferenceX benchmarks, while Nvidia's NVL72 rack systems scale well at lower per-user token generation rates, they become progressively less efficient as user interactivity increases [1].

By contrast, SRAM-heavy architectures championed by Groq and Cerebras excel in latency-sensitive scenarios and can achieve token generation rates often exceeding 500 or even 1,000 tokens per second - far more than GPU-based architectures can deliver [1]. This capability is how Cerebras won OpenAI's business earlier this year to power its Codex model [1]. Huang indicated during last month's earnings call that the company would showcase at GTC how Nvidia can plug Groq's ultra-fast artificial intelligence technology into its existing CUDA platform [3]. Third Bridge analyst William McGonigle expects Nvidia to roll out a new line of servers that will combine Groq's chips with Nvidia's networking technologies to create a speedy and cost-efficient product [2].

Nvidia already revealed details about its Rubin GPUs at CES in January: they pack up to 288 GB of HBM4 memory good for 22 TB/s of bandwidth and 35-50 petaFLOPS of dense NVFP4 performance depending on the use case [1]. The launch represents a major performance uplift over Nvidia's current Blackwell-generation parts, delivering 5x the dense floating point throughput [1]. The chips will be available in both an eight-way HGX platform and an NVL72 rack system, which crams 72 Rubin SXM modules into a single system [1].
EMarketer analyst Jacob Bourne expects Nvidia to present a full-stack roadmap update from Rubin to Feynman while emphasizing inference, agentic AI, networking, and AI factory infrastructure [2]. However, with Rubin's thermal design power estimated at 1.8 kW or perhaps even higher, liquid cooling isn't optional [1]. Some buyers may balk at that requirement, which would benefit AMD and its air-cooled kit [1]. Industry observers speculate Nvidia might release a single-die, air-cooled version of the chip with five or six HBM stacks rather than eight, which would still deliver a 2.5x uplift in performance over Blackwell without requiring liquid cooling [1].
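The cooling claim is simple arithmetic. Here's a hedged sketch using the 1.8 kW Rubin TDP estimate cited above; the ~50 kW air-cooling ceiling is our own rough assumption, since practical limits vary by facility.

```python
# Why liquid cooling "isn't optional": rack-level heat load from GPUs alone,
# using the 1.8 kW Rubin TDP estimate cited above. The ~50 kW air-cooled
# ceiling is our own rough assumption - practical limits vary by facility.

RUBIN_TDP_KW = 1.8
MODULES_PER_NVL72 = 72
AIR_COOLED_CEILING_KW = 50   # assumed practical limit for an air-cooled rack

gpu_load_kw = RUBIN_TDP_KW * MODULES_PER_NVL72
print(f"GPU heat alone: {gpu_load_kw:.0f} kW per NVL72 rack")              # ~130 kW
print(f"Fits an air-cooled rack? {gpu_load_kw <= AIR_COOLED_CEILING_KW}")  # False

# The speculated cut-down part: single die, five or six HBM stacks, lower
# power. Even at half the TDP, a full 72-module rack stays liquid-cooled;
# air cooling only pencils out in smaller eight-way HGX-style boxes.
print(f"Half-TDP rack load: {gpu_load_kw / 2:.0f} kW")                  # ~65 kW
print(f"Eight-way HGX at half TDP: {8 * RUBIN_TDP_KW / 2:.1f} kW")      # 7.2 kW
```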
Another type of chip that poses an increasing competitive threat to Nvidia is the central processing unit, long championed by Intel and Advanced Micro Devices [2]. While those chips took a backseat to Nvidia's graphics processing units in recent years, McGonigle said they are "back in focus" and expects Nvidia to show off servers that use only its CPUs, which Huang talked up on a recent earnings call [2]. First teased at last year's GTC, the Vera CPU features 88 custom Arm cores that add support for simultaneous multithreading and a slew of confidential computing features previously only available on x86 platforms [1]. While previously offered primarily for supercomputers and other HPC applications, Nvidia revealed last month that Meta would be its first partner to deploy Grace at scale and that the social network was already evaluating Vera CPUs for use in its AI data centers as well [1]. "With the rise of agentic AI, the bottleneck is now at the agent orchestration level, which is carried out by the CPUs," McGonigle explained [3].

Nvidia's chips sit at the center of hundreds of billions of dollars in investments in AI data centers by governments and companies around the globe, but analysts expect the overall AI chip market to keep growing while Nvidia's slice shrinks somewhat [2]. The AI chip market is changing rapidly as AI agents scurry back and forth among computer applications carrying out tasks on behalf of humans - a shift from training, where AI labs link many Nvidia chips together into one computer to chew through huge amounts of data to perfect their AI models [2].

"Nvidia is definitely going to see more competition compared to a year ago," said KinNgai Chan, a managing director at Summit Insights Group. "Nvidia still has close to over 90% market share in both training and inference markets today." However, Chan added: "We think Nvidia will begin to see share loss starting in 2027, once in-house ASIC programs gain some scale especially in the inference market" [3]. These application-specific integrated circuits are chips tailored for a single function or custom workload, offering higher efficiency than general-purpose graphics processing units [3].

Analysts also expect Nvidia to elaborate on why it invested $2 billion each in Lumentum and Coherent, both of which make lasers for sending information between chips in the form of beams of light [2]. Use of those lasers in what are called co-packaged optics could help speed up the connections among Nvidia's chips inside huge AI data centers, though Bourne noted: "Nvidia will likely frame co-packaged optics as key to connecting massive AI clusters more efficiently, but the challenge is making it affordable enough to deploy at scale" [2]. This year's event is crucial as investors will seek assurance that Nvidia's strategy of plowing back its profits into the AI ecosystem is paying off [3].
Summarized by Navi
[1] Technology
[2] Technology
[3] Policy and Regulation