NVIDIA Overcomes Blackwell AI Chip Design Flaw, Resumes Production with TSMC's Help

NVIDIA CEO Jensen Huang admits to a design flaw in the company's latest Blackwell AI chips, which caused production delays. The issue has been resolved with TSMC's assistance, and mass production is back on schedule.

NVIDIA Acknowledges Blackwell Design Flaw

NVIDIA's CEO Jensen Huang has publicly addressed a significant design flaw in the company's latest Blackwell AI chips. "We had a design flaw in Blackwell," Huang stated at an event in Copenhagen. "It was functional, but the design flaw caused the yield to be low. It was 100% Nvidia's fault." [3] This admission comes after reports of production delays that potentially affected major customers such as Meta, Google, and Microsoft [1].

Technical Details of the Flaw

The Blackwell B100 and B200 GPUs use TSMC's CoWoS-L packaging technology, which employs an RDL interposer with local silicon interconnect bridges to achieve data transfer rates of about 10 TB/s. The flaw stemmed from a mismatch in thermal expansion properties between various components, causing system warping and failure [2].

TSMC's Role in Resolution

Contrary to initial reports suggesting tensions between NVIDIA and TSMC, Huang clarified that TSMC played a crucial role in resolving the issue. "What TSMC did was to help us recover from that yield difficulty and resume the manufacturing of Blackwell at an incredible pace," Huang explained [1]. He dismissed claims of friction between the two companies as "fake news" [4].

Complexity of the Blackwell Project

Huang highlighted the complexity of the Blackwell project, stating, "In order to make a Blackwell computer work, seven different types of chips were designed from scratch and had to be ramped into production at the same time." [3] This complexity may have contributed to the occurrence of the design flaw.

Resolution and Production Schedule

To address the issue, NVIDIA modified the top metal layers and bumps of the GPU silicon to improve production yields. While specific details of the fix remain undisclosed, the company confirmed that new masks were required [1]. The speed of the resolution was noteworthy, as such issues typically take around three months to address in the semiconductor industry [2].

Future Outlook and Market Demand

With the design flaw now resolved, mass production of the fixed Blackwell GPUs is set to begin in late October. Shipments are expected to start in early 2025, aligning with NVIDIA's fiscal year [1]. Demand for Blackwell chips remains high, with major tech companies placing significant orders. Google has ordered over 400,000 GB200 chips in a deal exceeding $10 billion, while Meta has placed a $10 billion order [1].

Impact on NVIDIA's Market Position

The successful resolution of this design flaw is crucial for NVIDIA as it aims to maintain its dominant position in the AI chip market. The Blackwell platform, described by Huang as "the world's most powerful chip," is set to be up to 30x faster than its Grace Hopper predecessor in AI inference tasks, while also reducing cost and energy consumption by up to 25x [3]. This advancement is expected to solidify NVIDIA's leadership in the rapidly growing AI computing solutions market.

TheOutpost.ai

© 2025 Triveous Technologies Private Limited