2 Sources
[1]
Nvidia's new AI system Vera Rubin is 10 times more efficient than its predecessor -- here's a first look
Nvidia's earnings on Wednesday are expected to show booming sales of the company's current rack-scale system. But all eyes are on its next AI system, Vera Rubin, which is scheduled to roll out later this year. Vera Rubin, which is made up of 1.3 million components, will deliver 10 times more performance per watt than its predecessor, Grace Blackwell, the company claims. That's a significant development when energy consumption is one of the most critical issues facing the artificial intelligence build-out.

CNBC got an exclusive first look at Vera Rubin at Nvidia's headquarters in Santa Clara, California. Nvidia says the new AI system is a complex web of parts sourced from around the world. Its core chips include 72 Rubin graphics processing units, or GPUs, and 36 Vera central processing units, or CPUs, primarily made by Taiwan Semiconductor Manufacturing Co. The other parts, from liquid cooling elements to power systems and compute trays, come from more than 80 suppliers in at least 20 countries, including China, Vietnam, Thailand, Mexico, Israel and the U.S.

One big challenge the company faces is the soaring costs of memory due to a global shortage from all the AI-driven demand. Dion Harris, Nvidia's AI infrastructure head, said in an interview that the company has been giving suppliers "very detailed forecasts." "We're aligning to make sure that everything we're shipping will be met by our supply chain," he said. "We're in good shape."

It's a critical moment for Nvidia, which dominates the market for AI processors but faces intensifying competition from Advanced Micro Devices as well as custom silicon from Broadcom and Google's homegrown tensor processing units. Nvidia has plans to manufacture up to $500 billion of AI infrastructure in the U.S. through 2029, including making Blackwell GPUs at TSMC's new Arizona fabs.
Grace Blackwell went into production in 2024, and changed the game on how much compute was possible with a single system. Vera Rubin, which is expected to ship in the second half of 2026, takes the company to another level. Nvidia CEO Jensen Huang announced in January that the system was in full production.
[2]
Here's a Look at One of the World's Most Complex AI Systems, the NVIDIA Vera Rubin, Integrating a Million Components
NVIDIA's next-gen Vera Rubin is currently in full production, and the company has provided an extensive overview of the rack architecture, diving into individual components. NVIDIA is set to feature major upgrades with Vera Rubin, and a recent CNBC video on the Vera Rubin architecture gave an extensive look at multiple components, ranging from the main compute node to networking and cooling elements. More importantly, NVIDIA's Senior Director of Infrastructure, Dion Harris, calls Vera Rubin one of the "world's most complex AI systems," arguing that what NVIDIA does is unique and difficult to execute.

Given that Rubin is expected to see customer commitments soon, it is worth diving into what an NVL72 rack actually looks like. One of the most essential elements of the rack is, of course, the Vera Rubin SuperChip itself. We have already covered how the Rubin GPU and Vera CPU configuration looks from a technical perspective, but one important point to note is that major performance improvements come from NVIDIA integrating HBM4 with the GPU, along with dedicated SOCAMM modules. Altogether, memory bandwidth reaches a whopping 1.2 TB/s.

NVIDIA's major upgrade with Vera Rubin also comes in the cooling department, since Team Green plans to integrate modular liquid cooling designs, covering SuperChip elements such as the Rubin GPU and Vera CPU through dedicated cold plates. NVIDIA's executives argue that Rubin deployment will indeed convince hyperscalers to switch to upgraded liquid-cooling systems, and, interestingly, the current implementation reduces water use, another benefit touted by NVIDIA.

NVLink is an important aspect of Vera Rubin NVL72, and with the 6th-generation interconnection fabric, often called the "NVLink Spine", NVIDIA plans to deliver a total aggregate bandwidth of 260 TB/s per rack.
Harris says that with the latest NVLink generation, the company has taken modularity to a whole new level, which is why it claims the NVLink 6 spine supports zero-downtime maintenance and rack-level RAS services. While estimates suggest that Vera Rubin will debut with a notable price hike, NVIDIA says the architecture brings a 10x reduction in inference token cost and a 4x reduction in the number of GPUs needed to train MoE models vs. Blackwell GB200, which means the "more you buy, the more you save" rule from NVIDIA's CEO is still intact.
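The claimed 10x and 4x figures can be put in rough numerical terms. A minimal back-of-envelope sketch, assuming the claimed ratios apply directly; the GB200 baseline numbers below are arbitrary placeholders, not real prices or cluster sizes:

```python
# Hypothetical illustration of NVIDIA's claimed Rubin-vs-GB200 gains.
# Baseline values are placeholders for the sake of the arithmetic only.
gb200_cost_per_million_tokens = 1.00   # placeholder unit cost
gb200_gpus_for_moe_training = 1000     # placeholder cluster size

# Claimed: 10x lower inference token cost, 4x fewer GPUs for MoE training.
rubin_cost_per_million_tokens = gb200_cost_per_million_tokens / 10
rubin_gpus_for_moe_training = gb200_gpus_for_moe_training / 4

print(rubin_cost_per_million_tokens)  # 0.1
print(rubin_gpus_for_moe_training)    # 250.0
```

Whether that translates into "the more you buy, the more you save" depends, of course, on the actual Rubin price premium, which the article does not quantify.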
Nvidia has revealed its Vera Rubin AI system, featuring 1.3 million components and delivering 10 times more performance per watt than its predecessor. The rack-scale system integrates 72 Rubin GPUs and 36 Vera CPUs, addressing critical energy consumption challenges as the company prepares for second-half 2026 shipments.
Nvidia has provided an exclusive first look at Vera Rubin, its next-generation AI system that promises to reshape the AI infrastructure landscape with dramatic efficiency improvements [1]. The system delivers 10 times more performance per watt than its predecessor, Grace Blackwell, addressing one of the most pressing concerns in artificial intelligence deployment: high energy consumption [1]. This energy-efficient breakthrough comes at a critical moment when data centers worldwide struggle with power demands from AI workloads.

Dion Harris, Nvidia's Senior Director of AI infrastructure, describes Vera Rubin as one of the "world's most complex AI systems," comprising 1.3 million components sourced from more than 80 suppliers across at least 20 countries [1][2]. The complexity underscores what Nvidia does uniquely and why execution remains difficult to replicate, Harris argues [2].
Source: Wccftech
At the heart of the NVL72 rack sits the Vera Rubin SuperChip configuration, featuring 72 Rubin GPUs and 36 Vera CPUs primarily manufactured by Taiwan Semiconductor Manufacturing Co [1]. Major performance improvements stem from Nvidia integrating HBM4 memory with the GPU, alongside dedicated SOCAMM modules [2]. This integration pushes memory bandwidth to a remarkable 1.2 TB/s, enabling faster data processing for demanding AI workloads [2].

The AI system also features significant upgrades in cooling technology through modular liquid cooling designs that cover SuperChip elements with dedicated cold plates [2]. Nvidia executives believe Rubin deployment will convince hyperscalers to adopt upgraded liquid cooling systems, with the current implementation reducing water usage, another environmental benefit [2].

The 6th-generation NVLink interconnection fabric, termed the "NVLink Spine," delivers a total aggregate bandwidth of 260 TB/s per rack [2]. Harris emphasizes that the latest NVLink generation takes modularity to new levels, supporting zero-downtime maintenance and rack-level RAS services [2]. This connectivity boost enables more efficient distributed training and inference across multiple GPUs.
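To put the 260 TB/s figure in per-GPU terms, here is a quick back-of-envelope calculation; it assumes (the article does not say) that the aggregate bandwidth is split evenly across the rack's 72 Rubin GPUs:

```python
# Back-of-envelope: per-GPU share of the NVLink Spine's aggregate bandwidth.
# Assumption (not stated in the article): the 260 TB/s aggregate is divided
# evenly among the 72 Rubin GPUs in an NVL72 rack.
AGGREGATE_TB_S = 260
GPUS_PER_RACK = 72

per_gpu_tb_s = AGGREGATE_TB_S / GPUS_PER_RACK
print(f"~{per_gpu_tb_s:.1f} TB/s per GPU")  # prints "~3.6 TB/s per GPU"
```

Even under that simplifying assumption, each GPU's share is several terabytes per second, which is what makes rack-scale training and inference behave like a single large accelerator.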
Nvidia faces significant supply chain pressures, particularly with soaring memory costs driven by global shortages from AI-driven demand [1]. Harris stated the company has been providing suppliers "very detailed forecasts" to ensure alignment, adding "we're in good shape" [1]. Components for Vera Rubin arrive from China, Vietnam, Thailand, Mexico, Israel, and the U.S., creating a complex logistics network [1].

Jensen Huang announced in January that the system entered full production, with shipments expected in the second half of 2026 [1][2]. Grace Blackwell went into production in 2024 and changed expectations for compute capability in a single system [1].

While estimates suggest Vera Rubin will debut with a price increase, Nvidia claims the architecture delivers a 10x reduction in inference token cost and requires 4x fewer GPUs to train Mixture-of-Experts (MoE) models compared to Blackwell GB200 [2]. This means Nvidia's "the more you buy, the more you save" principle remains intact despite higher upfront costs [2].

The launch comes as Nvidia dominates the AI processor market but faces intensifying competition from Advanced Micro Devices, custom silicon from Broadcom, and Google's homegrown tensor processing units [1]. Nvidia plans to manufacture up to $500 billion of AI infrastructure in the U.S. through 2029, including producing Blackwell GPUs at TSMC's new Arizona facilities [1]. Watch for customer commitments as Vera Rubin approaches availability, which will signal whether hyperscalers view the efficiency gains as worth the investment in upgraded infrastructure.