Curated by THEOUTPOST
On Thu, 3 Apr, 12:05 AM UTC
3 Sources
[1]
Are Nvidia's Blackwell GPUs Truly Unstoppable in AI Inference? See How AMD's Instinct MI325 Stacks Up!
In the latest round of machine learning benchmark results from MLCommons, computers built around Nvidia's new Blackwell GPU architecture outperformed all others. But AMD's latest spin on its Instinct GPUs, the MI325, proved a match for the Nvidia H200, the product it was meant to counter. The comparable results came mostly on tests of Llama2 70B (70 billion parameters), one of the smaller-scale large language models in the suite. And in an effort to keep up with a rapidly changing AI landscape, MLPerf added three new benchmarks to better reflect where machine learning is headed.

MLPerf runs benchmarking for machine learning systems in an effort to provide an apples-to-apples comparison between computer systems. Submitters use their own software and hardware, but the underlying neural networks must be the same. There are now a total of 11 benchmarks for servers, with three added this year. It has been "hard to keep up with the rapid development of the field," says Miro Hodak, the co-chair of MLPerf Inference. ChatGPT only appeared in late 2022, OpenAI unveiled its first large language model (LLM) that can reason through tasks last September, and LLMs have grown exponentially -- GPT-3 had 175 billion parameters, while GPT-4 is thought to have nearly 2 trillion. As a result of the breakneck innovation, "we've increased the pace of getting new benchmarks into the field," says Hodak.

The new benchmarks include two LLMs. The popular and relatively compact Llama2-70B is already an established MLPerf benchmark, but the consortium wanted something that mimicked the responsiveness people expect of chatbots today. So the new benchmark, "Llama2-70B Interactive," tightens the requirements: computers must produce at least 25 tokens per second under any circumstance and cannot take more than 450 milliseconds to begin an answer.

Seeing the rise of "agentic AI" -- networks that can reason through complex tasks -- MLPerf sought to test an LLM with some of the characteristics needed for that. It chose Llama3.1 405B for the job. That LLM has what's called a wide context window: a measure of how much information (documents, samples of code, and so on) it can take in at once. For Llama3.1 405B that's 128,000 tokens, more than 30 times as much as Llama2 70B.

The final new benchmark, called RGAT, is based on what's called a graph attention network; it classifies information in a network. For example, the dataset used to test RGAT consists of scientific papers, which have relationships among authors, institutions, and fields of study, making up 2 terabytes of data. RGAT must classify the papers into just under 3,000 topics.

Nvidia continued its domination of MLPerf benchmarks through its own submissions and those of some 15 partners such as Dell, Google, and Supermicro. Both its first- and second-generation Hopper architecture GPUs -- the H100 and the memory-enhanced H200 -- made strong showings. "We were able to get another 60 percent performance over the last year" from Hopper, which went into production in 2022, says Dave Salvator, director of accelerated computing products at Nvidia. "It still has some headroom in terms of performance." But it was Nvidia's Blackwell architecture GPU, the B200, that really dominated. "The only thing faster than Hopper is Blackwell," says Salvator.
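To make the Llama2-70B Interactive limits concrete, here is a minimal Python sketch (not taken from any of the sources) that checks whether a single measured response satisfies the two constraints described above: a first token within 450 milliseconds and a sustained output rate of at least 25 tokens per second. The QueryTrace structure and the example timestamps are hypothetical illustrations, not MLPerf's actual test harness.

```python
# Hypothetical illustration of the Llama2-70B Interactive limits described above:
# time to first token (TTFT) <= 450 ms and sustained output >= 25 tokens/s.
# This is NOT the MLPerf LoadGen harness, just a sketch of the arithmetic.

from dataclasses import dataclass

TTFT_LIMIT_S = 0.450      # max seconds before the first token appears
MIN_TOKENS_PER_S = 25.0   # min sustained output rate after the first token

@dataclass
class QueryTrace:
    """Timestamps (seconds) for one query: when it was issued and when each token arrived."""
    issued_at: float
    token_times: list[float]

def meets_interactive_limits(trace: QueryTrace) -> bool:
    ttft = trace.token_times[0] - trace.issued_at
    if ttft > TTFT_LIMIT_S:
        return False
    decode_tokens = len(trace.token_times) - 1
    if decode_tokens == 0:
        return True  # single-token answer: only the TTFT limit applies
    # Rate over the decode phase: tokens after the first one / elapsed decode time.
    decode_time = trace.token_times[-1] - trace.token_times[0]
    return decode_tokens / decode_time >= MIN_TOKENS_PER_S

# Example: first token after 0.30 s, then one token every 30 ms (~33 tokens/s) -> passes.
trace = QueryTrace(issued_at=0.0, token_times=[0.30 + 0.03 * i for i in range(100)])
print(meets_interactive_limits(trace))  # True
```

In the real benchmark, MLPerf's load generator applies these kinds of limits across many concurrent queries, typically as percentile constraints rather than a simple per-query check.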
The B200 packs in 36 percent more high-bandwidth memory than the H200, but more importantly it can perform key machine-learning math using numbers with a precision as low as 4 bits, instead of the 8 bits Hopper pioneered. Lower-precision compute units are smaller, so more fit on the GPU, which leads to faster AI computing. In the Llama3.1 405B benchmark, an eight-B200 system from Supermicro delivered nearly four times the tokens per second of an eight-H200 system from Cisco. And the same Supermicro system was three times as fast as the quickest H200 computer at the interactive version of Llama2-70B. Nvidia used its combination of Blackwell GPUs and Grace CPUs, called GB200, to demonstrate how well its NVL72 data links can integrate multiple servers in a rack so that they perform as if they were one giant GPU. In an unverified result the company shared with reporters, a full rack of GB200-based computers delivered 869,200 tokens per second on Llama2 70B. The fastest system reported in this round of MLPerf was an Nvidia B200 server that delivered 98,443 tokens per second.

AMD is positioning its latest Instinct GPU, the MI325X, as providing competitive performance to Nvidia's H200. The MI325X has the same architecture as its predecessor, the MI300, but adds even more high-bandwidth memory and memory bandwidth -- 288 gigabytes and 6 terabytes per second (a 50 percent and 13 percent boost, respectively). Adding more memory is a play to handle larger and larger LLMs. "Larger models are able to take advantage of these GPUs because the model can fit in a single GPU or a single server," says Mahesh Balasubramanian, director of data center GPU marketing at AMD. "So you don't have to have that communication overhead of going from one GPU to another GPU or one server to another server. When you take out those communications your latency improves quite a bit." AMD was able to take advantage of the extra memory through software optimization to boost the inference speed of DeepSeek-R1 eightfold. On the Llama2-70B test, an eight-GPU MI325X computer came within 3 to 7 percent of the speed of a similarly tricked-out H200-based system. And on image generation, the MI325X system was within 10 percent of the Nvidia H200 computer. AMD's other noteworthy mark this round came from its partner Mangoboost, which showed nearly fourfold performance on the Llama2-70B test by doing the computation across four computers.

Intel has historically put forth CPU-only systems in the inference competition to show that for some workloads you don't really need a GPU. This round saw the first data from Intel's Xeon 6 chips, which were formerly known as Granite Rapids and are made using Intel's 3-nanometer process. At 40,285 samples per second, the best image-recognition result for a dual-Xeon 6 computer was about one-third the performance of a Cisco computer with two Nvidia H100s. Compared with Xeon 5 results from October 2024, the new CPU provides about an 80 percent boost on that benchmark and an even bigger boost on object detection and medical imaging. Since it first started submitting Xeon results in 2021 (with the Xeon 3), the company has achieved an 11-fold boost in performance on ResNet.

For now, it seems Intel has quit the field in the AI accelerator chip battle. Its alternative to the Nvidia H100, Gaudi 3, did not make an appearance in the new MLPerf results, nor in version 4.1, released last October. Gaudi 3 got a later-than-planned release because its software was not ready.
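A rough way to see why both lower-precision math and larger HBM capacity matter is to estimate how much memory a model's weights alone need at different precisions. The sketch below is an illustration, not something from the sources or from either vendor's sizing tools: it compares a 70-billion- and a 405-billion-parameter model at 16-, 8-, and 4-bit weights against the 288 GB of HBM the article cites for the MI325X. Real deployments also need memory for the KV cache and activations, so these figures are lower bounds.

```python
# Back-of-the-envelope weight-memory estimate at different precisions.
# Illustration only: real serving also needs KV cache, activations, and framework overhead.

GIB = 1024**3

def weight_memory_gib(params: float, bits_per_weight: int) -> float:
    """Memory (GiB) needed just to hold the model weights at the given precision."""
    return params * bits_per_weight / 8 / GIB

models = {"Llama2-70B": 70e9, "Llama3.1-405B": 405e9}
hbm_per_gpu_gib = 288  # MI325X HBM capacity cited in the article (treated here as GiB)

for name, params in models.items():
    for bits in (16, 8, 4):
        need = weight_memory_gib(params, bits)
        fits = "fits" if need <= hbm_per_gpu_gib else "needs multiple GPUs"
        print(f"{name} @ {bits}-bit weights: ~{need:,.0f} GiB ({fits} in {hbm_per_gpu_gib} GiB)")
```

At 4-bit weights, even the 405-billion-parameter model's weights would in principle fit on a single 288 GB accelerator, which is the kind of consolidation Balasubramanian describes when he talks about avoiding GPU-to-GPU communication overhead.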
In the opening remarks at Intel Vision 2025, the company's invite-only customer conference, newly minted CEO Lip-Bu Tan seemed to apologize for Intel's AI efforts. "I'm not happy with our current position," he told attendees. "You're not happy either. I hear you loud and clear. We are working toward a competitive system. It won't happen overnight, but we will get there for you."

Google's TPU v6e chip also made a showing, though the results were restricted to the image generation task. At 5.48 queries per second, the 4-TPU system saw a 2.5x boost over a similar computer using its predecessor, the TPU v5e, in the October 2024 results. Even so, 5.48 queries per second was roughly in line with a similarly sized Lenovo computer using Nvidia H100s.
[2]
Speed Demon: NVIDIA Blackwell Takes Pole Position in Latest MLPerf Inference Results
NVIDIA GB200 NVL72 system boosts AI factory profitability with outstanding throughput even as models get larger and more complex.

In the latest MLPerf Inference v5.0 benchmarks, which reflect some of the most challenging inference scenarios, the NVIDIA Blackwell platform set records -- and marked NVIDIA's first MLPerf submission using the NVIDIA GB200 NVL72 system, a rack-scale solution designed for AI reasoning.

Delivering on the promise of cutting-edge AI takes a new kind of compute infrastructure, called AI factories. Unlike traditional data centers, AI factories do more than store and process data -- they manufacture intelligence at scale by transforming raw data into real-time insights. The goal for AI factories is simple: deliver accurate answers to queries quickly, at the lowest cost and to as many users as possible. The complexity of pulling this off is significant and takes place behind the scenes. As AI models grow to billions and trillions of parameters to deliver smarter replies, the compute required to generate each token increases. This requirement reduces the number of tokens that an AI factory can generate and increases cost per token. Keeping inference throughput high and cost per token low requires rapid innovation across every layer of the technology stack, spanning silicon, network systems and software.

The latest updates to MLPerf Inference, a peer-reviewed industry benchmark of inference performance, include the addition of Llama 3.1 405B, one of the largest and most challenging-to-run open-weight models. The new Llama 2 70B Interactive benchmark features much stricter latency requirements compared with the original Llama 2 70B benchmark, better reflecting the constraints of production deployments in delivering the best possible user experiences. In addition to the Blackwell platform, the NVIDIA Hopper platform demonstrated exceptional performance across the board, with performance increasing significantly over the last year on Llama 2 70B thanks to full-stack optimizations.

The GB200 NVL72 system -- connecting 72 NVIDIA Blackwell GPUs to act as a single, massive GPU -- delivered up to 30x higher throughput on the Llama 3.1 405B benchmark over the NVIDIA H200 NVL8 submission this round. This feat was achieved through more than triple the performance per GPU and a 9x larger NVIDIA NVLink interconnect domain. While many companies run MLPerf benchmarks on their hardware to gauge performance, only NVIDIA and its partners submitted and published results on the Llama 3.1 405B benchmark.

Production inference deployments often have latency constraints on two key metrics. The first is time to first token (TTFT), or how long it takes for a user to begin seeing a response to a query given to a large language model. The second is time per output token (TPOT), or how quickly tokens are delivered to the user. The new Llama 2 70B Interactive benchmark has a 5x shorter TPOT and 4.4x lower TTFT -- modeling a more responsive user experience. On this test, NVIDIA's submission using an NVIDIA DGX B200 system with eight Blackwell GPUs tripled performance over using eight NVIDIA H200 GPUs, setting a high bar for this more challenging version of the Llama 2 70B benchmark.

Combining the Blackwell architecture and its optimized software stack delivers new levels of inference performance, paving the way for AI factories to deliver higher intelligence, increased throughput and faster token rates.
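The "cost per token" economics described above reduce to simple arithmetic: divide the hourly cost of running a system by the number of tokens it can generate in an hour. The sketch below is a hypothetical illustration of that calculation; the throughput and hourly-cost figures are made-up placeholders, not numbers from the MLPerf results or from either vendor.

```python
# Hypothetical cost-per-token arithmetic for an "AI factory".
# The inputs are placeholders, not figures from the MLPerf submissions.

def cost_per_million_tokens(tokens_per_second: float, cost_per_hour_usd: float) -> float:
    """Dollars per million generated tokens, given sustained throughput and hourly system cost."""
    tokens_per_hour = tokens_per_second * 3600
    return cost_per_hour_usd / tokens_per_hour * 1e6

# Example: the same hourly cost at 3x the throughput cuts cost per token by 3x.
baseline = cost_per_million_tokens(tokens_per_second=10_000, cost_per_hour_usd=60.0)
faster = cost_per_million_tokens(tokens_per_second=30_000, cost_per_hour_usd=60.0)
print(f"baseline: ${baseline:.2f} per 1M tokens; 3x throughput: ${faster:.2f} per 1M tokens")
```

This is why the benchmark's emphasis on throughput under latency constraints matters: a system that sustains more tokens per second at a given responsiveness directly lowers the cost of each answer.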
The NVIDIA Hopper architecture, introduced in 2022, powers many of today's AI inference factories and continues to power model training. Through ongoing software optimization, NVIDIA increases the throughput of Hopper-based AI factories, leading to greater value. On the Llama 2 70B benchmark, first introduced a year ago in MLPerf Inference v4.0, H100 GPU throughput has increased by 1.5x. The H200 GPU, based on the same Hopper GPU architecture with larger and faster GPU memory, extends that increase to 1.6x. Hopper also ran every benchmark, including the newly added Llama 3.1 405B, Llama 2 70B Interactive and graph neural network tests. This versatility means Hopper can run a wide range of workloads and keep pace as models and usage scenarios grow more challenging.

This MLPerf round, 15 partners submitted stellar results on the NVIDIA platform, including ASUS, Cisco, CoreWeave, Dell Technologies, Fujitsu, Giga Computing, Google Cloud, Hewlett Packard Enterprise, Lambda, Lenovo, Oracle Cloud Infrastructure, Quanta Cloud Technology, Supermicro, Sustainable Metal Cloud and VMware. The breadth of submissions reflects the reach of the NVIDIA platform, which is available across all cloud service providers and server makers worldwide.

MLCommons' work to continuously evolve the MLPerf Inference benchmark suite to keep pace with the latest AI developments and provide the ecosystem with rigorous, peer-reviewed performance data is vital to helping IT decision makers select optimal AI infrastructure. Learn more about MLPerf.
[3]
NVIDIA Blackwell & AMD MI325X Showdown In Latest MLPerf Inference Benchmarks: B200 Shatters Records, Instinct Fights Against Hopper
NVIDIA and AMD have just submitted the latest MLPerf Inference performance benchmarks of their newest GPUs, including the Blackwell B200 and the Instinct MI325X.

NVIDIA Blackwell B200, AMD Instinct MI325X & More Added To The Latest MLPerf Inference Benchmarks, Green Team Miles Ahead of The Competition In Raw Performance

MLPerf Inference v5.0 performance benchmarks are out, and the GPU giants have submitted their latest results powered by their newest chips. As we have seen in the past, it's not just about raw GPU horsepower; software optimizations and support for new AI ecosystems and workloads matter a lot too.

We start with the Green Giant, which has once again taken the lead and scored impressive records with its latest Blackwell GPUs such as the B200. The GB200 NVL72 rack, with a total of 72 B200 chips, takes the lead, offering 30x higher throughput on the Llama 3.1 405B benchmark versus the last-generation NVIDIA H200. NVIDIA also saw a tripling of performance on the Llama 2 70B Interactive benchmark when comparing an eight-GPU B200 system against an eight-GPU H200 system.

AMD is also submitting its newest Instinct MI325X 256 GB accelerator, which appears in an x8 configuration. The AMD results put it on par with the H200 system, and the larger memory capacity surely helps with massive LLMs, though it is still far behind the Blackwell B200. With the Ultra platform arriving later this year in the form of the B300, AMD will have to keep the pace up in both hardware and software; it does have the Instinct MI350 series on the way.

There are also benchmarks for the Hopper H200 series, which has seen continued optimizations. Compared with just last year, inference performance has been raised by 50 percent, a substantial gain for firms that continue to rely on the platform.
NVIDIA's new Blackwell GPUs set records in MLPerf Inference v5.0 benchmarks, while AMD's Instinct MI325X shows competitive performance against NVIDIA's H200 in specific tests.
In the recently released MLPerf Inference v5.0 benchmarks, NVIDIA's new Blackwell GPU architecture has demonstrated unprecedented performance, solidifying the company's leadership in AI computing. The benchmarks, which now include new tests to reflect the rapidly evolving AI landscape, showcase the capabilities of NVIDIA's latest offerings, particularly the B200 GPU [1][2].
The NVIDIA GB200 NVL72 system, featuring 72 Blackwell GPUs working as a single unit, delivered up to 30 times higher throughput on the new Llama 3.1 405B benchmark compared to the previous-generation H200 NVL8 system. This remarkable improvement came from more than triple the performance per GPU combined with a 9x larger NVIDIA NVLink interconnect domain (72 GPUs in one domain versus 8) [2].
In the new Llama 2 70B Interactive benchmark, which has stricter latency requirements, an NVIDIA DGX B200 system with eight Blackwell GPUs tripled the performance of a similar system using eight H200 GPUs. This test better reflects the demands of production deployments in delivering responsive user experiences [2].
While NVIDIA dominated the benchmarks, AMD's latest Instinct GPU, the MI325X, showed competitive performance against NVIDIA's H200 in specific tests. In the Llama2-70B test, an eight-GPU MI325X system came within 3 to 7 percent of the speed of a similarly configured H200-based system. For image generation tasks, the MI325X system was within 10 percent of the NVIDIA H200 computer's performance [1][3].
MLPerf has introduced three new benchmarks to keep pace with rapid developments in AI: Llama 2 70B Interactive, which adds strict latency requirements to the existing Llama 2 70B test; Llama 3.1 405B, a much larger model with a 128,000-token context window; and RGAT, a graph attention network benchmark that classifies scientific papers into just under 3,000 topics [1].
NVIDIA's Hopper architecture, introduced in 2022, continues to show improvements through software optimizations. The H100 GPU has seen a 1.5x increase in throughput on the Llama 2 70B benchmark over the past year, while the memory-enhanced H200 extends that increase to 1.6x [2].
The advancements demonstrated in these benchmarks have significant implications for AI factories – infrastructure designed to manufacture intelligence at scale. The increased performance of Blackwell and the optimizations in Hopper architecture contribute to higher throughput and faster token rates, potentially reducing the cost per token for large language model inference [2].
The MLPerf benchmarks saw participation from 15 NVIDIA partners, including major cloud service providers and server manufacturers. This broad involvement reflects the widespread adoption of NVIDIA's AI platforms across the industry [2][3].
As AI models continue to grow in size and complexity, the race for more efficient and powerful hardware intensifies. While NVIDIA maintains a clear lead with its Blackwell architecture, AMD's competitive showing in specific benchmarks indicates ongoing innovation in the field. The upcoming NVIDIA Ultra platform, featuring the B300 GPU, promises to push performance boundaries even further [3].
Reference
[1] IEEE Spectrum: Technology, Engineering, and Science News | Are Nvidia's Blackwell GPUs Truly Unstoppable in AI Inference? See How AMD's Instinct MI325 Stacks Up!
[2] The Official NVIDIA Blog | Speed Demon: NVIDIA Blackwell Takes Pole Position in Latest MLPerf Inference Results
[3] NVIDIA Blackwell & AMD MI325X Showdown In Latest MLPerf Inference Benchmarks: B200 Shatters Records, Instinct Fights Against Hopper