Amazon Web Services strikes dual AI chip deals with Nvidia and Cerebras to dominate inference

Reviewed by Nidhi Govil


Amazon Web Services has secured two major partnerships to strengthen its position in AI computing infrastructure. The cloud computing unit will receive 1 million Nvidia GPUs through 2027 while simultaneously deploying Cerebras Systems' chips alongside its own Trainium processors. Both deals target AI inference workloads, where trained models generate real-time responses to user queries.

Amazon Web Services Secures Dual AI Chip Partnerships

Amazon Web Services has locked in two significant partnerships that signal a strategic push to dominate the AI inference market. The cloud computing unit announced it will deploy 1 million Nvidia GPUs by the end of 2027, with deliveries beginning this year and extending through the window that Nvidia CEO Jensen Huang identified as a $1 trillion market opportunity for the company's Blackwell and Rubin chip families [2]. Simultaneously, AWS revealed a collaboration with Cerebras Systems, valued at $23.1 billion, to combine AI chips in a service launching in the second half of 2026 [1]. These moves address the voracious demand for AI computing infrastructure: inference workloads now account for roughly two-thirds of AI compute, up from about a third in 2023 [5].

Source: Reuters

Cerebras and AWS Deploy Divide-and-Conquer Strategy for AI Inference

The partnership between Cerebras Systems and Amazon Web Services introduces a distinctive approach to handling AI inference workloads. AWS will deploy Cerebras' Wafer Scale Engine chips inside its data centers, linked to Amazon's own Trainium 3 processors through custom networking technology [3]. The collaboration splits inference into two stages: Amazon's Trainium 3 silicon handles prefill, which transforms user prompts into tokens that AI systems understand, while Cerebras chips take over the decode stage to generate answers [1]. According to Cerebras CEO Andrew Feldman, this "divide and conquer strategy" makes it "easy as a click to get on Cerebras" for AWS customers ranging from individual developers to the largest banks [3]. AWS Vice President Nafea Bshara noted that while a Trainium-only service will likely remain cheaper, the combined chip offering will prove attractive "where time is money" [1].
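The two-stage handoff described above can be sketched in a few lines of Python. This is an illustrative toy only, not AWS, Trainium, or Cerebras code: the function names and the stand-in "model step" arithmetic are invented for the sketch, which shows only the shape of the split, where prefill builds a token context once and decode then generates output tokens step by step from it.

```python
# Toy sketch of disaggregated two-stage inference (prefill vs. decode).
# All names and logic are illustrative; no real model or hardware API is used.

def prefill(prompt: str) -> list[int]:
    """Stage 1 (the Trainium role in the article): turn the user prompt
    into tokens the generator will consume. Here, fake token IDs."""
    return [ord(c) for c in prompt]

def decode(context: list[int], max_tokens: int = 5) -> list[int]:
    """Stage 2 (the Cerebras role): emit output tokens one at a time,
    each step conditioned on the growing context."""
    out: list[int] = []
    for _ in range(max_tokens):
        nxt = (sum(context) + len(out)) % 256  # stand-in for a model step
        out.append(nxt)
        context = context + [nxt]  # decoded tokens feed back into context
    return out

def infer(prompt: str) -> list[int]:
    # The "divide and conquer" handoff: prefill output feeds decode.
    return decode(prefill(prompt))
```

The point of the split is that prefill is a one-shot, compute-heavy pass while decode is a long sequential loop; running each stage on hardware suited to it is what the article's "divide and conquer strategy" refers to.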

Source: SiliconANGLE

Nvidia Deal Extends Beyond GPUs to Full Infrastructure Stack

The Nvidia agreement encompasses far more than graphics processing units. Ian Buck, vice president of hyperscale and high-performance computing at Nvidia, confirmed the transaction includes Nvidia's Spectrum networking chips and the Groq chips Nvidia released following its $17 billion licensing deal with an AI chip startup late last year [2]. AWS plans to use Nvidia's Groq chips alongside six other Nvidia chips for more efficient AI inference. "Inference is hard. It's wickedly hard," Buck told Reuters. "To be the best at inference, it is not a one chip pony. We actually use all seven chips" [2]. The deal also includes deploying Nvidia's ConnectX and Spectrum-X networking gear in AWS data centers, a significant move considering AWS has spent years perfecting its own custom networking equipment [4].

Source: Decrypt

Competition Intensifies in the AI Chip Market

These partnerships underscore the fierce competition to run trained AI models at scale. The market for inference-focused chips is expected to exceed $50 billion by 2026, according to Deloitte estimates cited by industry observers [5]. Cerebras has pioneered a distinctive approach built on enormous wafer-scale chips that process massive amounts of data in a single pass, and it is seeking widespread adoption to challenge market leader Nvidia [1]. For Cerebras, which is planning an initial public offering, having Amazon as a customer raises its profile: AWS becomes the first hyperscaler to commit to using Cerebras [1]. AWS stated it expects its Trainium 3 and future Trainium 4 offerings "will continue to lead in price-performance versus merchant GPUs" [3].

Infrastructure Control Shifts as Cloud AI Infrastructure Evolves

Industry strategists observe that these deals represent more than simple supply agreements. "Nvidia is becoming the infrastructure layer underneath the cloud providers, not just a chip vendor to them," explained Dermot McGrath, co-founder at ZenGen Labs [5]. Berna Misa, deal partner at Boardy Ventures, described the shift as an "infrastructure flip," noting that Nvidia is "embedding its full stack across compute, networking, and inference inside AWS data centers that ran proprietary gear for years" [5]. Pichapen Prateepavanich, policy strategist and founder of Gather Beyond, noted that demand for inference is "driving long-term commitments" to more compute power and forging closer ties between cloud providers and chipmakers, a dynamic in which cooperation and competition happen simultaneously [5]. Through its large-scale GPU deployments, AWS can run both Nvidia's chips and its own in the same systems, giving customers more flexibility than rivals that keep their infrastructure closed [5]. The collaboration between AWS and multiple AI hardware vendors signals that the infrastructure powering agentic AI systems, which reason, plan, and act autonomously across complex workflows, requires diverse chip architectures working in concert [5].

TheOutpost.ai