Nvidia Unveils Rubin CPX: A New GPU Optimized for Long-Context AI Inference

Nvidia Introduces Rubin CPX: A New Era in AI Inference

Nvidia, the leading GPU manufacturer, has unveiled the Rubin CPX, a new GPU designed specifically for long-context AI inference. Announced at the AI Infrastructure Summit, the chip is built for workloads that require very large context windows [1].

Technical Specifications and Performance

Nvidia lists the following specifications for the Rubin CPX:

30 petaFLOPs of NVFP4 compute power
128GB of GDDR7 memory
Hardware-accelerated attention, which Nvidia says is 3x faster than on GB300 NVL72 systems
Four NVENC and four NVDEC units for video acceleration [2]

Disaggregated Inference: A New Approach

Nvidia's Rubin CPX is part of a broader strategy called 'disaggregated inference'. This approach splits AI workloads into two phases:

Context phase: Handled by compute-optimized GPUs like the Rubin CPX
Generation phase: Managed by memory bandwidth-optimized GPUs like the standard Rubin [2]

This strategy aims to improve efficiency and performance for AI tasks requiring extensive context processing, such as video generation and software development [3].
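The two-phase split can be illustrated with a toy pipeline. This is a minimal sketch and not Nvidia's software: the names `Request`, `ContextWorker`, `GenerationWorker`, and `serve` are hypothetical, and a plain Python list stands in for the attention cache that a real system would pass between GPU pools.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    prompt_tokens: list[str]        # input context to be processed
    max_new_tokens: int             # number of tokens to generate
    kv_cache: list[str] = field(default_factory=list)   # stand-in for attention state
    output: list[str] = field(default_factory=list)


class ContextWorker:
    """Hypothetical compute-optimized worker: processes the whole prompt once."""

    def run(self, req: Request) -> Request:
        # A real context phase computes attention over all prompt tokens;
        # this sketch only records that each token has been processed.
        req.kv_cache = list(req.prompt_tokens)
        return req


class GenerationWorker:
    """Hypothetical bandwidth-optimized worker: emits tokens one at a time."""

    def run(self, req: Request) -> Request:
        for i in range(req.max_new_tokens):
            # Every step depends on the full cache built so far, so the
            # cache grows by one entry per generated token.
            token = f"tok{i}(ctx={len(req.kv_cache)})"
            req.output.append(token)
            req.kv_cache.append(token)
        return req


def serve(req: Request, context: ContextWorker, generation: GenerationWorker) -> Request:
    # Phase 1 runs on one pool; its cache is handed to the other pool for phase 2.
    return generation.run(context.run(req))


result = serve(
    Request(["def", "add", "(", "a", ",", "b", ")"], max_new_tokens=3),
    ContextWorker(),
    GenerationWorker(),
)
print(result.output)  # ['tok0(ctx=7)', 'tok1(ctx=8)', 'tok2(ctx=9)']
```

Because the context phase touches the entire prompt at once while the generation phase advances a single token per step, the two phases stress hardware differently, which is the rationale the article gives for assigning them to different GPUs.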

Applications and Industry Impact

The Rubin CPX is designed to excel in scenarios where AI models need to process massive amounts of context:

Video generation: Processing up to 1 million tokens for an hour of video content
Code generation: Handling large codebases for AI-assisted programming
Research and high-definition video workflows [4]

Nvidia claims that a $100 million investment in systems using Rubin CPX could generate $5 billion in token revenue, a figure the company uses to illustrate the technology's economic potential [4].
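The figures in this section reduce to two simple ratios. The sketch below only restates the article's numbers as arithmetic; the variable names are illustrative, and the revenue figure is Nvidia's own claim rather than a measured result.

```python
investment_usd = 100_000_000         # Nvidia's example investment
token_revenue_usd = 5_000_000_000    # Nvidia's claimed token revenue

revenue_multiple = token_revenue_usd / investment_usd

video_tokens = 1_000_000             # upper bound cited for one hour of video
seconds_per_hour = 60 * 60
tokens_per_video_second = video_tokens / seconds_per_hour

print(f"Revenue per dollar invested: {revenue_multiple:.0f}x")        # 50x
print(f"Tokens per second of video: {tokens_per_video_second:.0f}")   # 278
```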

Deployment and Future Roadmap

The Rubin CPX will be available as part of Nvidia's Vera Rubin NVL144 CPX rack, which includes:

144 Rubin CPX GPUs
144 standard Rubin GPUs
36 Vera CPUs
100TB of high-speed memory
1.7PB/s of memory bandwidth [5]

The entire system is capable of delivering 8 exaFLOPs of NVFP4 compute power. Shipments are expected to begin in late 2026 [1].
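As a rough cross-check, the rack totals can be compared with the per-GPU specifications quoted earlier. The sketch assumes that the 30 petaFLOPs and 128GB figures scale linearly across all 144 Rubin CPX GPUs and that the remainder comes from the rest of the system. That split is an inference from the article's numbers, not a breakdown published by Nvidia.

```python
cpx_gpus = 144
cpx_pflops_each = 30      # NVFP4 petaFLOPs per Rubin CPX (from the spec list)
cpx_gddr7_gb_each = 128   # GB of GDDR7 per Rubin CPX (from the spec list)
rack_exaflops = 8         # NVFP4 exaFLOPs quoted for the full rack
rack_memory_tb = 100      # TB of high-speed memory quoted for the full rack

# Decimal units: 1 exaFLOP = 1000 petaFLOPs, 1 TB = 1000 GB.
cpx_exaflops = cpx_gpus * cpx_pflops_each / 1000
remaining_exaflops = rack_exaflops - cpx_exaflops
cpx_memory_tb = cpx_gpus * cpx_gddr7_gb_each / 1000

print(f"Rubin CPX compute: {cpx_exaflops:.2f} EF "
      f"({cpx_exaflops / rack_exaflops:.0%} of the rack)")   # 4.32 EF (54% of the rack)
print(f"Remaining compute: {remaining_exaflops:.2f} EF")      # 3.68 EF
print(f"Rubin CPX GDDR7:   {cpx_memory_tb:.1f} TB of {rack_memory_tb} TB")  # 18.4 TB of 100 TB
```

Under these assumptions, the Rubin CPX GPUs would account for a little over half of the rack's quoted compute but less than a fifth of its memory, which is consistent with the article's description of the chip as compute-optimized.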

Looking ahead, Nvidia's roadmap includes:

Rubin Ultra: Expected in 2027
Feynman: Slated for 2028

These future iterations promise even higher density modules, HBM4E memory, and faster networking capabilities [5].

References

[1] Nvidia unveils new GPU designed for long-context inference | TechCrunch

[2] Nvidia Rubin CPX forms one half of new, "disaggregated" AI inference architecture -- approach splits work between compute- and bandwidth-optimized chips for best performance

[3] Nvidia's context-optimized Rubin CPX GPUs were inevitable

[4] Nvidia unveils AI chips for video, software generation

[5] Nvidia unveils Rubin CPX GPU with 128GB memory for enterprise AI workloads
