Nvidia Vera Rubin architecture slashes AI costs by 10x with advanced networking at its core

Reviewed by Nidhi Govil


Nvidia CEO Jensen Huang announced the Vera Rubin architecture at CES 2026, declaring it in full production. The next-generation AI superchip platform promises a 10x reduction in inference costs and needs only a quarter as many GPUs to train certain models compared with Blackwell. But the real innovation lies in its six-chip design, where advanced networking components work in concert to handle distributed AI workloads across data centers.

Nvidia Vera Rubin Enters Full Production with Dramatic Cost Reductions

Nvidia CEO Jensen Huang made a surprise announcement at the Consumer Electronics Show in Las Vegas this week, revealing that the company's Vera Rubin architecture is already in full production. The next-generation AI superchip platform, set to reach customers in the second half of 2026, promises to dramatically transform AI computing economics. According to Nvidia's performance data, the system will reduce AI inference costs by up to 10x and require only one-fourth as many GPUs to train certain large models compared with the current Blackwell architecture.

Source: TweakTown

Named after astronomer Vera Florence Cooper Rubin, the Rubin architecture represents a fundamental shift in how Nvidia approaches AI infrastructure challenges. "Vera Rubin is designed to address this fundamental challenge that we have: The amount of computation necessary for AI is skyrocketing," Huang told the CES audience. The platform will replace the Blackwell architecture, which has driven Nvidia's record-breaking data center revenue growth of 66 percent year-over-year.

Six-Chip Design Prioritizes Networking for Distributed AI Workloads

The Rubin architecture comprises six integrated chips working in what Nvidia calls "extreme co-design." At the center sits the Vera CPU, built with 88 custom Olympus cores and full Armv9.2 compatibility, alongside the Rubin GPU, which delivers 50 petaflops of 4-bit compute for transformer-based inference workloads, five times Blackwell's 10 petaflops. Both the Vera CPU and Rubin GPU are fabricated on Taiwan Semiconductor Manufacturing Company's 3nm process.

Source: Wccftech

But focusing solely on the GPU misses the bigger picture. Four advanced networking components complete the architecture: the NVLink 6 switch, the ConnectX-9 SuperNIC, the BlueField-4 data processing unit, and the Spectrum-6 Ethernet switch. "The same unit connected in a different way will deliver a completely different level of performance," explains Gilad Shainer, senior vice president of networking at Nvidia. "That's why we call it extreme co-design."

Advanced Networking Components Enable Scale-Out AI Infrastructure

The networking innovations address a critical shift in AI model training and inference. "Two years back, inferencing was mainly run on a single GPU, a single box, a single server," Shainer notes. "Right now, inferencing is becoming distributed, and it's not just in a rack. It's going to go across racks." The NVLink 6 switch doubles GPU-to-GPU bandwidth to 3,600 gigabytes per second, up from 1,800 GB/s in the previous generation, while also doubling the number of SerDes and expanding in-network computing capabilities.
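As a rough illustration of what those bandwidth figures imply, here is a back-of-envelope sketch (the per-rack aggregate assumes an NVL72-style rack of 72 GPUs, which is an assumption for illustration, not a figure from Nvidia):

```python
# Back-of-envelope NVLink bandwidth comparison using the article's numbers.
# The rack aggregate assumes 72 GPUs per rack (illustrative assumption).
NVLINK5_GBPS = 1800   # GB/s per GPU, previous generation
NVLINK6_GBPS = 3600   # GB/s per GPU, NVLink 6 on Rubin
GPUS_PER_RACK = 72

speedup = NVLINK6_GBPS / NVLINK5_GBPS
aggregate_tb_s = NVLINK6_GBPS * GPUS_PER_RACK / 1000  # rack total in TB/s

print(f"Per-GPU speedup: {speedup:.0f}x")            # 2x
print(f"Rack aggregate:  {aggregate_tb_s:.1f} TB/s")  # 259.2 TB/s
```

The doubling comes directly from the quoted generational jump; the aggregate simply scales that per-GPU figure across a hypothetical rack.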

In-network computing allows certain operations to be performed within the network itself rather than on individual GPUs, saving both time and power. For AI model training, this means collective operations like all-reduce, in which GPUs share and average their computed gradients, can be performed once on the network switch instead of by every GPU individually. The scale-out network, comprising the ConnectX-9, the BlueField-4 paired with two Vera CPUs, and the Spectrum-6 Ethernet switch with co-packaged optics, connects racks within data centers while minimizing jitter to keep distributed computation synchronized.
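The all-reduce pattern described above can be sketched in plain Python. This is a toy, single-process stand-in for what the switch does in hardware; the function name and data are illustrative, not any real API:

```python
# Toy all-reduce: each "GPU" holds a gradient vector. In-network computing
# lets the switch do the reduction (here, an element-wise average) once and
# broadcast the result, instead of every GPU computing it independently.
def all_reduce_mean(gradients_per_gpu):
    n = len(gradients_per_gpu)
    dims = len(gradients_per_gpu[0])
    # One reduction "on the switch": element-wise sum across all GPUs.
    summed = [sum(g[d] for g in gradients_per_gpu) for d in range(dims)]
    averaged = [s / n for s in summed]
    # Broadcast: every GPU receives the same averaged gradient.
    return [averaged[:] for _ in range(n)]

grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]  # 4 GPUs, 2 params
result = all_reduce_mean(grads)
print(result[0])  # [4.0, 5.0] — identical on every GPU
```

In real systems this collective is provided by libraries such as NCCL; offloading it to the switch means the reduction happens once in transit rather than n times at the endpoints.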

Source: Analytics Insight

Major Cloud Providers Commit to Rubin Deployment

Microsoft and CoreWeave will be among the first to offer services powered by Rubin chips later this year, with two major AI data centers that Microsoft is building in Georgia and Wisconsin set to include thousands of Rubin systems. Amazon Web Services, Google Cloud, Anthropic, and OpenAI have also committed to the platform, which will power HPE's Blue Lion supercomputer and the upcoming Doudna supercomputer at Lawrence Berkeley National Lab.

The Rubin architecture will be available in multiple configurations, including the Nvidia Vera Rubin NVL72, which combines 36 Vera CPUs, 72 Rubin GPUs, NVLink 6 switches, multiple ConnectX-9 SuperNICs, and BlueField-4 DPUs. According to Nvidia's tests, the platform runs AI model training 3.5 times faster than Blackwell and inference five times faster, while supporting eight times more inference compute per watt for improved power efficiency.
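Taking the per-GPU figure quoted earlier at face value, the NVL72's rack-scale compute follows directly. This is a derived estimate from the article's numbers, not an official rack specification:

```python
# Rack-level 4-bit compute for the NVL72 configuration, derived from the
# article's 50-petaflops-per-GPU figure (an estimate, not an official spec).
PFLOPS_PER_GPU = 50   # FP4 petaflops per Rubin GPU, per the article
GPUS = 72             # GPUs in the NVL72 rack

rack_exaflops = PFLOPS_PER_GPU * GPUS / 1000
print(f"{rack_exaflops} exaflops of 4-bit compute per rack")  # 3.6
```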

Implications for Large-Scale AI Deployment Economics

The dramatic cost reductions target a critical bottleneck in AI adoption. For mixture-of-experts models, the Rubin architecture can complete training in the same time as Blackwell while using a quarter of the GPUs at one-seventh the cost per token. Dion Harris, Nvidia's senior director of AI infrastructure solutions, points to growing memory demands from agentic AI and long-term tasks. "We've introduced a new tier of storage that connects externally to the compute device, which allows you to scale your storage pool much more efficiently," Harris explained.
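The mixture-of-experts claim combines two ratios; a quick sketch shows what they imply for a hypothetical training run (the baseline cluster size and cost below are made up purely for illustration):

```python
# Illustrative MoE training economics under the article's claimed ratios:
# same wall-clock time, 1/4 the GPUs, 1/7 the cost per token vs. Blackwell.
blackwell_gpus = 1000          # hypothetical baseline cluster size
blackwell_cost_per_mtok = 7.0  # hypothetical $ per million training tokens

rubin_gpus = blackwell_gpus / 4
rubin_cost_per_mtok = blackwell_cost_per_mtok / 7

print(f"GPUs needed: {rubin_gpus:.0f}")           # 250
print(f"$/M tokens:  {rubin_cost_per_mtok:.2f}")  # 1.00
```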

These gains arrive as competition to build AI infrastructure intensifies, with Huang estimating that between $3 trillion and $4 trillion will be spent on AI infrastructure over the next five years. The platform's efficiency improvements could make it harder for Nvidia's customers to justify moving away from its hardware ecosystem, while potentially accelerating mainstream adoption of advanced AI models by making large-scale AI deployment more economically viable.

TheOutpost.ai

© 2026 Triveous Technologies Private Limited