Amazon Web Services Partners with Cerebras to Deliver Faster AI Inference by Combining Their AI Chips

Amazon Web Services announced a partnership with Cerebras Systems to combine their AI chips in a new service launching in the second half of 2026. The collaboration pairs AWS Trainium 3 processors with Cerebras' Wafer Scale Engine to accelerate AI inference computing, particularly for chatbots and coding tools. This marks the first time a major hyperscaler has committed to using Cerebras technology.

AWS and Cerebras Join Forces to Enhance AI Inference Computing

Amazon Web Services has struck a deal with Cerebras Systems to offer Cerebras AI chips on Amazon's cloud, marking a significant shift in the AI computing infrastructure landscape. The partnership will combine AWS Trainium chips with Cerebras' wafer-scale technology to deliver what the companies claim will be faster AI inference for large language models than currently available solutions [1]. AWS, the largest provider of cloud computing power, plans to begin offering the new service in the second half of 2026, though financial terms remain undisclosed [1].

Source: Market Screener

For Cerebras, valued at $23.1 billion and planning an initial public offering, securing Amazon as its first hyperscaler customer represents a major validation of its unique chip design [2]. The startup already made waves earlier this year by signing a $10 billion deal to supply AI chips to OpenAI, the maker of ChatGPT [3]. According to Cerebras CEO Andrew Feldman, the partnership will "bring the fastest inference to a global customer base," leveraging AWS's reach from individual developers to the largest banks in the world [2].

Source: Bloomberg

How the Divide and Conquer Strategy Works

The technical approach behind this collaboration centers on AI inference, the process by which trained AI systems respond to user queries. The companies will employ what Feldman describes as a "divide and conquer" strategy, splitting inference tasks into prefill and decode stages [2]. Trainium 3 processors will handle the prefill phase, transforming user requests from human language into tokens that AI systems understand. Cerebras' Wafer Scale Engine will then take over the decode stage, generating the actual responses users seek [1].
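To make the split concrete, here is a minimal Python sketch of disaggregated inference under this division of labor. The class names, the toy tokenizer, and the hand-off structure are illustrative assumptions, not AWS or Cerebras APIs; in a real system the prefill stage also produces the model's attention (KV) cache, which is what gets handed across the network to the decode hardware.

```python
# Illustrative sketch of "divide and conquer" (disaggregated) inference.
# PrefillBackend stands in for the Trainium 3 pool; DecodeBackend stands in
# for the Cerebras Wafer Scale Engine. All names here are hypothetical.

from dataclasses import dataclass


@dataclass
class PrefillState:
    tokens: list[int]   # the tokenized prompt
    kv_cache: dict      # attention state handed off to the decode stage


class PrefillBackend:
    """Prefill: one compute-heavy pass over the whole prompt (Trainium 3's job)."""

    def run(self, prompt: str) -> PrefillState:
        tokens = [ord(ch) for ch in prompt]  # toy tokenizer, for illustration only
        return PrefillState(tokens, {"prompt_len": len(tokens)})


class DecodeBackend:
    """Decode: latency-sensitive, token-by-token generation (the WSE's job)."""

    def run(self, state: PrefillState, max_new_tokens: int = 8) -> str:
        # Placeholder generation loop; a real model samples one token per step,
        # reusing the KV cache built during prefill.
        return "".join(f"<tok{state.kv_cache['prompt_len'] + i}>"
                       for i in range(max_new_tokens))


def disaggregated_inference(prompt: str) -> str:
    state = PrefillBackend().run(prompt)  # stage 1 on one chip family
    # ...network hand-off of the prefill state happens here...
    return DecodeBackend().run(state)     # stage 2 on the other


print(disaggregated_inference("Why is the sky blue?"))
```

That hand-off in the middle is exactly where disaggregated designs usually pay a communication penalty, which is the drawback described next.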

This approach, known as inference disaggregation, typically faces a significant challenge: communication between the separate components can slow down processing. The partnership aims to overcome that drawback by using specialized AI chips that handle each inference stage more responsively [1]. The improvement should be most noticeable in applications requiring back-and-forth interaction, such as chatbots and coding tools that work in multiple stages. AWS Vice President Nafea Bshara noted that while a Trainium-only service will likely remain cheaper, the combined chip offering will appeal to use cases "where time is money" [1].

Source: Reuters

Positioning to Compete with Nvidia

The collaboration arrives as competition in the AI chip market intensifies, with multiple players seeking to challenge Nvidia, which dominates the graphics processing unit (GPU) market. Cerebras has pioneered a fundamentally different approach: massive wafer-scale chips that do not rely on the expensive high-bandwidth memory used by Nvidia's flagship processors [2]. Its CS-3 systems can process enormous amounts of data in a single pass, a unique architecture in the AI hardware landscape [4].

Interestingly, analysts expect Nvidia to unveil a similar strategy next week, detailing how it plans to combine its own GPUs with chips from Groq, a startup Nvidia acquired for $17 billion in late December [5]. When asked about the comparison, Amazon said it could not assess Nvidia's unrevealed offering in detail, but emphasized that its Trainium 3 program is "just months away from running production workloads" while the timeline for the Nvidia-Groq pairing remains unclear. Amazon expects both Trainium 3 and the future Trainium 4 chips to "continue to lead in price-performance versus merchant GPUs" [2].

Access Through Amazon Bedrock and Data Centers

Cerebras chips will be deployed inside AWS data centers and connected to Amazon's Trainium 3 custom AI chips through proprietary networking technology [3]. Customers will gain access through Amazon Bedrock in the next couple of months, making AWS the first cloud provider to offer Cerebras' specialized hardware for disaggregated inference [4]. Later this year, AWS plans to add support for Amazon Nova and other open-source models on this infrastructure [4].
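For developers, access should look like any other Bedrock invocation. Below is a minimal sketch using boto3's Converse API; the model ID is a stand-in, since identifiers for the Cerebras-accelerated endpoints have not been published.

```python
# Sketch of calling a model via Amazon Bedrock using boto3's Converse API.
# The modelId is a placeholder (an existing Amazon Nova ID); the IDs for the
# Cerebras-accelerated offering are assumptions until AWS announces them.

import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="amazon.nova-pro-v1:0",  # swap in the real model ID once announced
    messages=[
        {"role": "user",
         "content": [{"text": "Explain prefill vs. decode in one paragraph."}]},
    ],
    inferenceConfig={"maxTokens": 256},
)

print(response["output"]["message"]["content"][0]["text"])
```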

According to AWS Vice President David Brown, "The result will be inference that's an order of magnitude faster and higher performance than what's available today" [4]. The partnership addresses voracious demand for AI computing infrastructure, and the two companies have been preparing the collaboration for several years [1]. AWS says it will deploy as many of the chips as demand requires, signaling confidence in the market's appetite for the combined solution. While Amazon remains a major Nvidia customer, the deal underscores its commitment to improving data center economics and offering distinctive services through its own chip designs and strategic collaborations [1].
