12 Sources
[1]
Amazon Will Use Cerebras' Giant Chips to Help Run AI Models
Amazon.com Inc. plans to use chips from startup Cerebras Systems Inc. alongside its own Trainium processors, a combination that the companies say will be able to better run AI software. Amazon Web Services, the biggest provider of cloud computing power, will begin offering a new service based on the arrangement in the second half of 2026, according to a statement Friday. Financial terms weren't disclosed. The partnership marks the latest attempt to satisfy the voracious demand for AI computing infrastructure. The two companies have been preparing for this collaboration for several years, according to Amazon Web Services Vice President Nafea Bshara. AWS will deploy as many of the chips as it gets demand for, he said. For Cerebras, which is planning an initial public offering, having Amazon as a customer helps raise its profile in a huge potential market. AWS is the first of the hyperscalers -- the largest data center operators -- to commit to using Cerebras. Chips from the two companies will work together in what's known as inference computing -- the process of running AI software and generating answers to incoming queries. In a unique arrangement, Amazon's Trainium 3 silicon will work on prefill, or making sense of user prompts. Cerebras' Wafer Scale Engine will then take over and generate the answers. This so-called disaggregated work typically has a drawback: Communicating between the different components slows down the process. In this case, the companies aim to have an edge by using specialized chips that can more responsively handle inference tasks. The improvement will be especially apparent in areas that require a back-and-forth with the user, such as work creating computer code that's done in multiple stages. While a Trainium-only service will likely still be cheaper, the new combined chip offering will be attractive "where time is money," Bshara said. Cerebras Chief Executive Officer Andrew Feldman has signed up some of the biggest users of AI hardware as customers, and he sees the inroads as validation of his company's unusual chip design. Cerebras has pioneered a unique approach to processing information using huge chips that can handle massive amounts of data in one go. It's seeking widespread adoption of its technology in a bid to challenge market leader Nvidia Corp. The startup also operates its own data centers, which showcase the capabilities of its components and bring in recurring revenue. The Amazon announcement "brings the fastest inference to a much wider audience," Feldman said. Amazon has an "enormous reach." Though Amazon is also a major Nvidia customer, it's made headway with its own chip designs. The effort is aimed at improving the economics of its data centers and giving the company the ability to provide unique services.
[2]
Nvidia to sell 1 million chips to Amazon by end of 2027 in cloud deal
SAN FRANCISCO, March 19 (Reuters) - Nvidia (NVDA.O) will sell 1 million of its graphics processing unit chips, along with a host of the AI giant's other offerings, to Amazon.com's (AMZN.O) cloud computing unit by 2027, a Nvidia executive told Reuters on Thursday. Nvidia and Amazon Web Services said this week that AWS had reached a deal to buy its 1 million GPUs but had not disclosed the precise timing of the deal. Ian Buck, vice president of hyperscale and high-performance computing at Nvidia, told Reuters on Thursday that the sales would start this year and extend through 2027. That is the same time frame through which Nvidia CEO Jensen Huang said the company sees an overall sales opportunity of $1 trillion for its Rubin and Blackwell families of chips. Nvidia and Amazon did not disclose the financial terms of their deal. But Buck told Reuters the transaction contains a broad mix of Nvidia chips beyond the 1 million GPUs, including Nvidia's Spectrum networking chips and the Groq chips that Nvidia released this week after its $17 billion licensing deal with an AI chip startup late last year. In particular, AWS plans to use a combination of Nvidia's Groq chips, along with six others from Nvidia, for more efficient inference, the name for the process by which AI systems generate answers and carry out tasks on behalf of users. "Inference is hard. It's wickedly hard," Buck told Reuters. "To be the best at inference, it is not a one chip pony. We actually use all seven chips." The deal also includes putting Nvidia's Connect X and Spectrum X networking gear in AWS data centers. That move is significant because AWS data centers use custom networking equipment that AWS has spent years perfecting. "They're still going to do that, of course," Buck said. "But we are collaborating now on deploying Connect X and Spectrum X for those important workloads and biggest customers across AI with AWS." Reporting by Stephen Nellis in San Francisco; Editing by David Gregorio
[3]
Cerebras Systems, Amazon strike deal to offer Cerebras AI chips on Amazon's cloud
SAN FRANCISCO, March 13 (Reuters) - Amazon.com (AMZN.O) and Cerebras Systems on Friday said they have reached a deal to combine the two companies' computing chips in a new service aimed at speeding up chatbots, coding tools and other artificial intelligence services. Valued at $23.1 billion, Cerebras is a chip startup aiming to take on Nvidia by building a fundamentally different kind of AI chip that does not rely on expensive high-bandwidth memory as Nvidia's flagship chips do. Earlier this year, Cerebras signed a $10 billion deal to supply chips to ChatGPT creator OpenAI. Under the deal announced Friday, Cerebras chips will sit inside Amazon Web Services (AWS) data centers and be linked to Amazon's own Trainium3 custom AI chips, connected with custom networking technology from Amazon. "Every customer large or small is on AWS, from individual developers to the largest banks in the world," Cerebras CEO Andrew Feldman told Reuters, saying the deal will "make it easy as a click to get on Cerebras." Both companies declined to disclose the size of the deal. Amazon and Cerebras will team up to tackle what is known as "inference," where previously trained AI systems take requests from users and spit out answers. The two companies will split up that task into two steps, one called "prefill" where the user's request is transformed from human words into the language of "tokens" that AI computers use, and a "decode" stage where the AI computer provides the answer the user is looking for. Amazon said its Trainium3 chips will handle prefill, while Cerebras chips handle decoding, what Feldman told Reuters is a "divide and conquer strategy." It is a similar strategy to the one that analysts expect Nvidia to unveil next week, when it details how it plans to combine its own graphics processing unit (GPU) chips with those from Groq, a startup it spent $17 billion on in late December. In a statement, Amazon said that it could not yet make a detailed comparison between its offering, which will come online in the second half of this year, and Nvidia's as-yet-unrevealed offering, but Amazon expects its service to be a better value. "The timeline for that (Nvidia-Groq) pairing remains unclear while our Trainium3 program is just months away from running production workloads," Amazon said in response to Reuters questions. "What we can say is that we believe (Trainium3) -- and future (Trainium4) -- will continue to lead in price-performance versus merchant GPUs." Reporting by Stephen Nellis in San Francisco, Editing by Franklin Paul
[4]
Nvidia and AWS strike massive GPU supply deal through 2027
On March 19, senior executives at Nvidia said the company has reached an agreement with the cloud computing division of Amazon to supply large-scale GPU infrastructure through 2027. According to Reuters, Nvidia and Amazon Web Services (AWS) confirmed a deal involving the purchase of one million GPUs, though the timeline was not initially disclosed. Ian Buck, vice president of hyperscale and high-performance computing at Nvidia, said on March 19 that deliveries under the agreement would begin in 2026 and extend through 2027. The schedule aligns with CEO Jensen Huang's broader forecast that demand for the company's Blackwell and Rubin chip families could ultimately represent a US$1 trillion market opportunity. While financial terms were not disclosed, Buck said the agreement spans far beyond GPUs. It also includes Nvidia's Spectrum networking chips, as well as Groq-related chips announced this week following Nvidia's recent licensing arrangement with the AI chip startup Groq. Under the plan, AWS will deploy a mix of Nvidia's Groq-related accelerators alongside six other Nvidia chip types to improve inference performance. Buck said inference workloads are highly complex and require coordinated use of all seven chip categories rather than reliance on a single processor. The agreement also covers the deployment of Nvidia's ConnectX and Spectrum-X networking hardware within AWS data centers. While AWS continues to develop its own custom silicon and infrastructure, the two companies will deepen collaboration on key AI workloads and major enterprise customers. Article translated by Elaine Chen and edited by Jack Wu
[5]
Nvidia Deepens Grip on Cloud AI With Major AWS Chip Deal - Decrypt
Observers say rising inference demand is reshaping infrastructure and competition. Nvidia will supply Amazon Web Services with a massive volume of GPUs through 2027 as the cloud provider ramps up its AI infrastructure and looks to meet growing demand. AWS announced earlier this week that it plans to deploy around 1 million Nvidia GPUs as part of its expanded AI infrastructure buildout. An Nvidia executive confirmed to Reuters on Thursday that the rollout is expected to run through the end of 2027. Commencing this year across AWS's global cloud regions, the rollout will proceed alongside expanded work with Nvidia on networking and other infrastructure to build systems "capable of reasoning, planning, and acting autonomously across complex workflows," AWS said, pointing to its work on agentic AI systems. AWS continues to develop AI chips for both training and inference. The collaboration suggests demand may be shifting across the AI stack, with a growing share of activity tied to running models in live services. The deal comes as U.S. prosecutors pursue a case alleging Nvidia chips were smuggled to China, placing the company's global supply and controls under renewed scrutiny. Since 2022, Nvidia's most advanced chips have been tightly controlled as part of a broader U.S. strategy to limit China's progress in advanced computing and AI. Thursday's development closer to home could widen that gap further. Observers say the deal structure offers clues about where demand is building and how the underlying infrastructure is changing at an increasingly rapid pace. "Nvidia is becoming the infrastructure layer underneath the cloud providers, not just a chip vendor to them," Dermot McGrath, co-founder at strategy and growth studio ZenGen Labs, told Decrypt. Chips in the deal are geared toward running AI models at scale, with a focus on lowering the cost of use, McGrath said, noting that inference now accounts for roughly two-thirds of AI compute, up from about a third in 2023. The market for inference-focused chips is expected to exceed $50 billion by 2026, he added, citing Deloitte estimates. AWS can use both Nvidia and its own chips in the same systems, giving customers more choice than rivals that keep theirs closed, McGrath explained, adding that this flexibility "is a differentiator." "Now Nvidia is doing the same thing one layer down, with networking and rack architecture instead of a programming model," he said. Inference chips are processors designed to run trained AI models in real time, rather than requiring retraining. Demand for inference is "driving long-term commitments" for more compute power, and is creating closer ties between cloud providers and chipmakers, Pichapen Prateepavanich, policy strategist and founder of infrastructure firm Gather Beyond, told Decrypt. "Cloud providers want independence over the long term, but in the near term they need Nvidia to remain competitive," she said, noting how this creates a dynamic where cooperation and competition happen at the same time. Still, control over AI infrastructure is also changing. What's happening is an "infrastructure flip," Berna Misa, deal partner at Boardy Ventures, an AI-led investment fund, told Decrypt. Nvidia is "embedding its full stack across compute, networking, and inference inside AWS data centers that ran proprietary gear for years," she said.
But while AWS is developing its own AI chips, this "doesn't change the math," she explained, noting that inference relies on multiple components across the stack, with Nvidia supplying most of them. "When you're that deep in your customer's stack, switching cost and the context layer that comes out of it becomes the moat," she said.
[6]
AWS will bring Cerebras' wafer-size WSE-3 chip to its cloud platform - SiliconANGLE
Amazon Web Services Inc. will make Cerebras Systems Inc.'s WSE-3 artificial intelligence chip available to its customers. The companies announced the initiative today. It's part of a multiyear partnership that will also see AWS and Cerebras develop a "disaggregated architecture" for AI inference workloads. The technology is expected to increase the speed at which AI models generate output by a factor of five. Cerebras' WSE-3 chip includes 900,000 cores and 44 gigabytes of on-chip SRAM. The company ships the processor as part of a water-cooled appliance called the CS-3. The system, which is about the size of a mini-fridge, combines one WSE-3 with external memory, network equipment and other auxiliary components. The newly announced partnership will see AWS deploy CS-3 appliances in its data centers. The systems will be made available to customers via the cloud giant's Amazon Bedrock service, which provides access to internally developed and third-party foundation models. The CS-3 enables neural networks to generate prompt responses at a rate of several thousand tokens per second. The disaggregated architecture that AWS and Cerebras are developing will combine the WSE-3 with AWS Trainium, the cloud giant's line of custom AI chips. The goal of the integration is to speed up customers' inference workloads. A large language model processes prompts by splitting them into small units of data called tokens. Each token contains a few letters or numbers. The LLM generates three mathematical objects called the key, value and query for every token in a prompt. Those objects help the model determine which parts of a prompt are important and which details can be deprioritized. This prompt-processing step is known as the prefill stage. It's followed by the decode phase, which is when the model generates its answer to the user's question. Prefill and decode tasks are usually performed by the same chip. In AWS' disaggregated architecture, Trainium processors will power the prefill stage while the WSE-3 will perform decoding. Decoding involves a similar set of calculations to the prefill stage, but requires significantly more data movement. Information regularly travels between the underlying chip's logic circuits and memory. The faster the chip can move that information, the faster prompt responses are generated. One of the WSE-3's main selling points is that it can move data between its logic and memory circuits faster than many other chips. According to Cerebras, the processor provides 27 petabytes per second of internal memory bandwidth. That's more than 200 times the amount offered by Nvidia Corp.'s NVLink graphics card interconnect. AWS will link the Trainium and WSE-3 chips in its data centers using an internally developed network device called the Elastic Fabric Adapter, or EFA. Packets usually pass through the host server's operating system when they move between chips. The EFA skips that step to speed up connections and automatically mitigates network congestion. "Disaggregated is ideal when you have large, stable workloads," Cerebras director of product marketing James Wang wrote in a blog post. "Most customers run a mix of workloads with different prefill/decode ratios, where the traditional aggregated approach is still ideal. We expect most customers will want access to both." The partnership comes a few weeks after Cerebras won another high-profile chip supply deal.
OpenAI Group PBC agreed to purchase 750 megawatts worth of computing infrastructure from the company through 2028. The deal, which is reportedly worth over $10 billion, was announced between two funding rounds that together netted Cerebras more than $2 billion.
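For intuition on why that memory bandwidth figure dominates decode speed, here is a rough back-of-envelope sketch in Python. Only the 27 PB/s number comes from the article above; the model size, numeric precision, comparison bandwidth and utilization factor are illustrative assumptions, and the sketch ignores batching and whether the weights actually fit in the WSE-3's 44 GB of SRAM.

# Rough estimate: for dense LLMs, each generated token must stream
# roughly the full weight set through the chip's memory system, so
# decode throughput is approximately bandwidth / bytes-per-token.

PB = 1e15  # bytes per petabyte

def decode_tokens_per_sec(mem_bw_bytes_per_s, params_billions,
                          bytes_per_param=2, utilization=0.3):
    # bytes_per_param=2 assumes FP16/BF16 weights; `utilization` is an
    # assumed discount for real-world overhead. Both are illustrative.
    bytes_per_token = params_billions * 1e9 * bytes_per_param
    return utilization * mem_bw_bytes_per_s / bytes_per_token

# Illustrative 8B-parameter model (small enough to fit on-chip):
print(f"WSE-3-class SRAM (27 PB/s): ~{decode_tokens_per_sec(27 * PB, 8):,.0f} tokens/s")
print(f"HBM-class GPU (assumed 8 TB/s): ~{decode_tokens_per_sec(8e12, 8):,.0f} tokens/s")

The absolute numbers are not meaningful, but the ratio illustrates why a bandwidth-heavy chip is attractive for the decode stage while prefill, which is compute-bound, can stay on a more conventional accelerator.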
[7]
Nvidia to sell 1 million chips to Amazon by end of 2027 in cloud deal - The Economic Times
Nvidia will sell 1 million of its graphics processing unit chips, along with a host of the AI giant's other offerings, to Amazon.com's cloud computing unit by 2027, a Nvidia executive told Reuters on Thursday. Nvidia and Amazon Web Services said this week that AWS had reached a deal to buy its 1 million GPUs but had not disclosed the precise timing of the deal. Ian Buck, vice president of hyperscale and high-performance computing at Nvidia, told Reuters on Thursday that the sales would start this year and extend through 2027. That is the same time frame through which Nvidia CEO Jensen Huang said the company sees an overall sales opportunity of $1 trillion for its Rubin and Blackwell families of chips. Nvidia and Amazon did not disclose the financial terms of their deal. But Buck told Reuters the transaction contains a broad mix of Nvidia chips beyond the 1 million GPUs, including Nvidia's Spectrum networking chips and the Groq chips that Nvidia released this week after its $17 billion licensing deal with an AI chip startup late last year. In particular, AWS plans to use a combination of Nvidia's Groq chips, along with six others from Nvidia, for more efficient inference, the name for the process by which AI systems generate answers and carry out tasks on behalf of users. "Inference is hard. It's wickedly hard," Buck told Reuters. "To be the best at inference, it is not a one chip pony. We actually use all seven chips." The deal also includes putting Nvidia's Connect X and Spectrum X networking gear in AWS data centers. That move is significant because AWS data centers use custom networking equipment that AWS has spent years perfecting. "They're still going to do that, of course," Buck said. "But we are collaborating now on deploying Connect X and Spectrum X for those important workloads and biggest customers across AI with AWS."
[8]
Nvidia To Deliver 1 Million AI Chips To Amazon Web Services By 2027 In Massive Multi-Chip Deal Set To Supercharge Inference And Cloud Computing - Amazon.com (NASDAQ:AMZN), NVIDIA (NASDAQ:NVDA)
AWS Locks In Massive Multi-Year GPU Deal Ian Buck, Nvidia's vice president of hyperscale and high-performance computing, told Reuters on Thursday that shipments will begin this year and continue through 2027. While both companies confirmed the agreement earlier, the timeline had not been disclosed. Amazon and Nvidia did not immediately respond to Benzinga's request for comment. Beyond GPUs: Networking And AI Inference Chips Included The agreement goes beyond GPUs, encompassing a broader suite of Nvidia technologies. This includes Spectrum networking chips and ConnectX systems designed to accelerate data transfer within data centers. AWS will also deploy a mix of Nvidia's newer chips, including its recently introduced Groq offerings, alongside several others to improve AI inference -- the process of generating real-time outputs from trained models. $1 Trillion Opportunity Signals Long-Term Growth The deal aligns with CEO Jensen Huang's projection of a $1 trillion revenue opportunity tied to Nvidia's next-generation Blackwell and Rubin chip platforms. Despite developing its own custom hardware, AWS's continued reliance on Nvidia underscores the chipmaker's dominance in the rapidly expanding AI ecosystem. Price Action: Shares of Amazon closed at $208.77 on Thursday, down 0.52% and edged up 0.13% to $209.04 in after-hours trading. Nvidia closed at $178.56 during the regular session, down 1.02% and rose 0.34% to $179.17 in after-hours trading, according to Benzinga Pro. Benzinga Edge Stock Rankings indicate Amazon is lagging across short, medium and long-term performance trends. Its Quality score remains relatively solid, ranking in the 95th percentile.
[9]
Cerebras Systems, Amazon strike deal to offer AI chips on AWS cloud
Amazon Web Services and Cerebras Systems have joined forces. They will combine their computing chips in a new service. This service aims to speed up AI applications like chatbots and coding tools. Cerebras chips will be integrated into Amazon data centres. They will work alongside Amazon's Trainium3 AI chips. This collaboration is set to enhance AI inference capabilities. Amazon.com and Cerebras Systems on Friday said they have reached a deal to combine the two companies' computing chips in a new service aimed at speeding up chatbots, coding tools and other artificial intelligence services. Valued at $23.1 billion, Cerebras is a chip startup aiming to take on Nvidia by building a fundamentally different kind of AI chip that does not rely on expensive high-bandwidth memory as Nvidia's flagship chips do. Earlier this year, Cerebras signed a $10 billion deal to supply chips to ChatGPT creator OpenAI. Under the deal announced Friday, Cerebras chips will sit inside Amazon Web Services (AWS) data centers and be linked to Amazon's own Trainium3 custom AI chips, connected with custom networking technology from Amazon. "Every customer large or small is on AWS, from individual developers to the largest banks in the world," Cerebras CEO Andrew Feldman told Reuters, saying the deal will "make it easy as a click to get on Cerebras." Both companies declined to disclose the size of the deal. Amazon and Cerebras will team up to tackle what is known as "inference," where previously trained AI systems take requests from users and spit out answers. The two companies will split up that task into two steps, one called "prefill" where the user's request is transformed from human words into the language of "tokens" that AI computers use, and a "decode" stage where the AI computer provides the answer the user is looking for. Amazon said its Trainium3 chips will handle prefill, while Cerebras chips handle decoding, what Feldman told Reuters is a "divide and conquer strategy." It is a similar strategy to the one that analysts expect Nvidia to unveil next week, when it details how it plans to combine its own graphics processing unit (GPU) chips with those from Groq, a startup it spent $17 billion on in late December. In a statement, Amazon said that it could not yet make a detailed comparison between its offering, which will come online in the second half of this year, and Nvidia's as-yet-unrevealed offering, but Amazon expects its service to be a better value. "The timeline for that (Nvidia-Groq) pairing remains unclear while our Trainium3 program is just months away from running production workloads," Amazon said in response to Reuters questions. "What we can say is that we believe (Trainium3)-and future (Trainium4)-will continue to lead in price-performance versus merchant GPUs."
[10]
Amazon's AWS Partners With Cerebras Systems To Deliver Faster AI Inference For LLMs - Amazon.com (NASDAQ:AMZN)
The partnership aims to deliver the world's fastest AI inference for large language models (LLMs). Unmatched Speed Through Disaggregation The new solution integrates AWS Trainium chips with Cerebras CS-3 systems. By using "inference disaggregation," the system splits workloads. Trainium handles "prefill" (input processing), while the CS-3 focuses on "decode" (output generation). David Brown, Vice President at AWS, stated, "The result will be inference that's an order of magnitude faster and higher performance than what's available today." Exclusive Access via Amazon Bedrock The technology will be deployed within AWS data centers. Customers can access these speeds through Amazon Bedrock starting in the next couple of months. AWS is the first cloud provider to offer Cerebras' specialized hardware for disaggregated inference. Later this year, AWS will add support for Amazon Nova and other open-source models using this infrastructure. "Partnering with AWS... will bring the fastest inference to a global customer base," noted Cerebras CEO Andrew Feldman. Cerebras is also powering OpenAI with massive computing capacity. AMZN Price Action: Amazon.com shares were down 0.98% at $207.48 at the time of publication on Friday, according to Benzinga Pro data.
[11]
Nvidia confirms 1 million GPU sale to AWS through 2027 By Investing.com
Investing.com -- NVIDIA Corporation (NASDAQ:NVDA) will deliver 1 million graphics processing units to Amazon.com's (NASDAQ:AMZN) AWS cloud computing division between 2026 and 2027, an executive confirmed to Reuters, marking the first official timeline for a major cloud partnership that includes a broad array of chips and networking equipment. Ian Buck, Nvidia's vice president of hyperscale and high-performance computing, told Reuters the GPU deliveries would begin this year and extend through 2027. The deal encompasses far more than GPUs, according to Buck. Amazon Web Services will also purchase Nvidia's Spectrum networking chips and newly released Groq chips, which Nvidia obtained through a $17 billion licensing deal with AI chip startup Groq in late 2025. AWS plans to deploy a combination of Groq chips alongside six other Nvidia chip types to optimize AI inference workloads -- the process by which AI systems generate responses and execute tasks. "Inference is hard. It's wickedly hard," Buck told Reuters. "To be the best at inference, it is not a one chip pony. We actually use all seven chips." The agreement also includes deploying Nvidia's Connect X and Spectrum X networking equipment in AWS data centers, a significant shift given that AWS has historically relied on custom-built networking gear perfected over years of internal development. "They're still going to do that, of course," Buck said of AWS's proprietary equipment. "But we are collaborating now on deploying Connect X and Spectrum X for those important workloads and biggest customers across AI with AWS." Neither company disclosed financial terms of the arrangement. Trillion-Dollar Opportunity The AWS deal timeline aligns with CEO Jensen Huang's projection that Nvidia faces a $1 trillion sales opportunity for its Rubin and Blackwell chip families through 2027. That estimate excludes CPUs, networking chips, Groq-based products, and a variant called Rubin Ultra, suggesting the total addressable market could extend significantly higher. Huang indicated the Groq integration could unlock $300 billion in annual revenue per gigawatt, with expectations that roughly 25% of GPU workloads will link up with Groq chips. The Nvidia-Groq system, dubbed the LPX, is positioned as an optional integration with Nvidia's Vera Rubin platform but is not yet being used at scale.
[12]
Cerebras Systems, Amazon strike deal to offer Cerebras AI chips on Amazon's cloud
SAN FRANCISCO, March 13 (Reuters) - Amazon.com and Cerebras Systems on Friday said they have reached a deal to combine the two companies' computing chips in a new service aimed at speeding up chatbots, coding tools and other artificial intelligence services. Valued at $23.1 billion, Cerebras is a chip startup aiming to take on Nvidia by building a fundamentally different kind of AI chip that does not rely on expensive high-bandwidth memory as Nvidia's flagship chips do. Earlier this year, Cerebras signed a $10 billion deal to supply chips to ChatGPT creator OpenAI. Under the deal announced Friday, Cerebras chips will sit inside Amazon Web Services (AWS) data centers and be linked to Amazon's own Trainium3 custom AI chips, connected with custom networking technology from Amazon. "Every customer large or small is on AWS, from individual developers to the largest banks in the world," Cerebras CEO Andrew Feldman told Reuters, saying the deal will "make it easy as a click to get on Cerebras." Both companies declined to disclose the size of the deal. Amazon and Cerebras will team up to tackle what is known as "inference," where previously trained AI systems take requests from users and spit out answers. The two companies will split up that task into two steps, one called "prefill" where the user's request is transformed from human words into the language of "tokens" that AI computers use, and a "decode" stage where the AI computer provides the answer the user is looking for. Amazon said its Trainium3 chips will handle prefill, while Cerebras chips handle decoding, what Feldman told Reuters is a "divide and conquer strategy." It is a similar strategy to the one that analysts expect Nvidia to unveil next week, when it details how it plans to combine its own graphics processing unit (GPU) chips with those from Groq, a startup it spent $17 billion on in late December. In a statement, Amazon said that it could not yet make a detailed comparison between its offering, which will come online in the second half of this year, and Nvidia's as-yet-unrevealed offering, but Amazon expects its service to be a better value. "The timeline for that (Nvidia-Groq) pairing remains unclear while our Trainium3 program is just months away from running production workloads," Amazon said in response to Reuters questions. "What we can say is that we believe (Trainium3)--and future (Trainium4)--will continue to lead in price-performance versus merchant GPUs." (Reporting by Stephen Nellis in San Francisco, Editing by Franklin Paul)
Amazon Web Services has secured two major partnerships to strengthen its position in AI computing infrastructure. The cloud computing unit will receive 1 million Nvidia GPUs through 2027 while simultaneously deploying Cerebras Systems' chips alongside its own Trainium processors. Both deals target AI inference workloads, where trained models generate real-time responses to user queries.
Amazon Web Services has locked in two significant partnerships that signal a strategic push to dominate the AI inference market. The cloud computing unit announced it will deploy 1 million Nvidia GPUs by the end of 2027, with deliveries beginning this year and extending through the window that Nvidia CEO Jensen Huang identified as a $1 trillion market opportunity for the company's Blackwell and Rubin chip families [2]. Simultaneously, AWS revealed a collaboration with Cerebras Systems, a startup valued at $23.1 billion, to combine the two companies' AI chips in a service launching in the second half of 2026 [1]. These moves address the voracious demand for AI computing infrastructure as inference workloads now account for roughly two-thirds of AI compute, up from about a third in 2023 [5].
Source: Reuters
The partnership between Cerebras Systems and Amazon Web Services introduces a unique approach to handling AI inference workloads. AWS will deploy Cerebras' Wafer Scale Engine chips inside its data centers, linked to Amazon's own Trainium 3 processors through custom networking technology [3]. The collaboration splits inference tasks into two stages: Amazon's Trainium 3 silicon handles prefill, which transforms user prompts into the tokens that AI systems understand, while Cerebras chips take over the decode stage to generate answers [1]. According to Cerebras CEO Andrew Feldman, this "divide and conquer strategy" makes it "easy as a click to get on Cerebras" for AWS customers ranging from individual developers to the largest banks [3]. AWS Vice President Nafea Bshara noted that while a Trainium-only service will likely remain cheaper, the combined chip offering will prove attractive "where time is money" [1].
Source: SiliconANGLE
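To make the prefill/decode split concrete, here is a minimal, purely illustrative Python sketch of a disaggregated inference loop. The worker classes, method names and toy tokenizer are hypothetical inventions for this sketch; neither AWS nor Cerebras has published such an API, and a real deployment would ship the attention key/value cache between the two stages over the network.

# Hypothetical sketch of disaggregated inference: one backend handles
# prefill (prompt ingestion), another handles decode (token generation).
# All names here are invented for illustration.

from dataclasses import dataclass, field

@dataclass
class KVCache:
    # Stand-in for the attention key/value state built during prefill.
    tokens: list[int]
    layers: dict = field(default_factory=dict)

class PrefillWorker:
    # Plays the role of Trainium 3: ingest the whole prompt in one pass.
    def prefill(self, prompt: str) -> KVCache:
        token_ids = [ord(c) % 32000 for c in prompt]  # toy stand-in tokenizer
        return KVCache(tokens=token_ids)  # a real system would also fill `layers`

class DecodeWorker:
    # Plays the role of the WSE-3: generate the answer token by token.
    def decode(self, cache: KVCache, max_new_tokens: int) -> list[int]:
        out = []
        for _ in range(max_new_tokens):
            nxt = (sum(cache.tokens) + len(out)) % 32000  # toy stand-in model
            out.append(nxt)
            cache.tokens.append(nxt)  # decode keeps extending the same cache
        return out

def run_inference(prompt: str, max_new_tokens: int = 8) -> list[int]:
    cache = PrefillWorker().prefill(prompt)  # stage 1: compute-bound
    # In a real disaggregated system the cache is shipped across the
    # network here (e.g., over a fabric like AWS's Elastic Fabric
    # Adapter); that transfer is the overhead disaggregation must beat.
    return DecodeWorker().decode(cache, max_new_tokens)  # stage 2: bandwidth-bound

print(run_inference("How do wafer-scale chips speed up decoding?"))

The design bet the companies are making is that the decode stage's gain from a bandwidth-rich chip outweighs the cost of handing the cache across the network, which is why Bshara frames the combined offering as attractive "where time is money."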
The Nvidia agreement encompasses far more than graphics processing units. Ian Buck, vice president of hyperscale and high-performance computing at Nvidia, confirmed the transaction includes Nvidia's Spectrum networking chips and the Groq chips that Nvidia released following its $17 billion licensing deal with an AI chip startup late last year [2]. AWS plans to use a combination of Nvidia's Groq chips alongside six others from Nvidia for more efficient AI inference. "Inference is hard. It's wickedly hard," Buck told Reuters. "To be the best at inference, it is not a one chip pony. We actually use all seven chips" [2]. The deal also includes deploying Nvidia's ConnectX and Spectrum-X networking gear in AWS data centers, a significant move considering AWS has spent years perfecting its own custom networking equipment [4].
Source: Decrypt
These partnerships underscore the fierce competition to run trained AI models at scale. The market for inference-focused chips is expected to exceed $50 billion by 2026, according to Deloitte estimates cited by industry observers [5]. Cerebras has pioneered a unique approach using huge chips that can handle massive amounts of data in one go, and it is seeking widespread adoption to challenge market leader Nvidia [1]. For Cerebras, which is planning an initial public offering, having Amazon as a customer raises its profile, as AWS becomes the first of the hyperscalers to commit to using Cerebras [1]. AWS stated it expects its Trainium 3 and future Trainium 4 offerings "will continue to lead in price-performance versus merchant GPUs" [3].
Industry strategists observe that these deals represent more than simple supply agreements. "Nvidia is becoming the infrastructure layer underneath the cloud providers, not just a chip vendor to them," Dermot McGrath, co-founder at ZenGen Labs, explained [5]. Berna Misa, deal partner at Boardy Ventures, described the shift as an "infrastructure flip," noting that Nvidia is "embedding its full stack across compute, networking, and inference inside AWS data centers that ran proprietary gear for years" [5]. Pichapen Prateepavanich, policy strategist and founder of Gather Beyond, noted that demand for inference is "driving long-term commitments" for more compute power and creating closer ties between cloud providers and chipmakers, resulting in a dynamic where cooperation and competition happen simultaneously [5]. AWS can use both Nvidia's chips and its own in the same systems, giving customers more flexibility than rivals that keep their hardware closed [5]. The collaboration between AWS and multiple AI hardware vendors signals that the infrastructure powering agentic AI systems, capable of reasoning, planning, and acting autonomously across complex workflows, requires diverse chip architectures working in concert [5].
Summarized by Navi