18 Sources
[1]
Amazon challenges competitors with on-premises Nvidia 'AI Factories' | TechCrunch
Amazon announced a new product Tuesday called "AI Factories" that allows big corporations and governments to run its AI systems in their own data centers. Or as AWS puts it: customers supply the power and the data center, and AWS plunks in the AI system, manages it, and can tie it into other AWS cloud services. The idea is to cater to companies and governments concerned with data sovereignty, or absolute control over their data so it can't wind up in a competitor's or foreign adversary's hands. An on-prem AI Factory means not sending their data to a model maker, and not even sharing the hardware. If that product name sounds familiar, it should. That's what Nvidia calls its hardware systems that are chock full of tools needed to run AI, from its GPU chips to its networking tech. This AWS AI Factory is, in fact, a collaboration with Nvidia, both companies say. In this case, the AWS Factory will use a combination of AWS and Nvidia technology. Companies that deploy these systems can opt for Nvidia's latest Blackwell GPUs or Amazon's new Trainium3 chip. It uses AWS' homegrown networking, storage, databases and security and can tap into Amazon Bedrock, the AI model selection and management service, and AWS SageMaker AI, the model building and training tool. Interestingly, AWS is far from the only giant cloud provider installing Nvidia AI Factories. In October, Microsoft showed off its first of many-to-come AI Factories rolling out into its global data centers to run OpenAI workloads. Microsoft didn't announce at the time that these extreme machines would be available for private clouds. Instead, Microsoft highlighted how it was leaning on a host of Nvidia AI Factory data center tech to build and connect its new "AI Superfactories," aka new state-of-the-art data centers being built in Wisconsin and Georgia. Last month, Microsoft also outlined the data centers and cloud services that would be built in local countries to address the data sovereignty issue. To be fair, its options also include "Azure Local," Microsoft's own managed hardware that could be installed on customer sites. Still, it is a bit ironic that AI is causing the biggest cloud providers to invest so heavily in corporate private data centers and hybrid clouds like it's 2009 all over again.
[2]
Amazon releases an impressive new AI chip and teases a Nvidia-friendly roadmap | TechCrunch
Amazon Web Services, which has been building its own AI training chips for years now, just introduced a new version known as Trainium3 that comes with some impressive specs. The cloud provider, which made the announcement Tuesday at its AWS re:Invent 2025 conference, also teased the next product on its AI training product roadmap: Trainium4, which is already in the works and will be able to work with Nvidia's chips. AWS used its annual tech conference to formally launch Trainium3 UltraServer, a system powered by the company's state-of-the-art, 3-nanometer Trainium3 chip, as well as its homegrown networking tech. As you might expect, the third-generation chip and system offer big bumps in performance for AI training and inference over the second generation, according to AWS. AWS says the systems are more than four times faster, with four times more memory, not just for training, but for delivering AI apps at peak demand. Additionally, thousands of UltraServers can be linked together to provide an app with up to 1 million Trainium3 chips -- 10 times the previous generation. Each UltraServer can host 144 chips, according to the company. Perhaps more importantly, AWS says the chips and systems are also 40% more energy efficient than the previous generation. While the world races to build bigger data centers powered by astronomical gigawatts of electricity, data center giant AWS is trying to make systems that drink less, not more. It is, obviously, in AWS's direct interests to do so. But in its classic, Amazon cost-conscious way, it promises that these systems save its AI cloud customers money, too. AWS customers like Anthropic (of which Amazon is also an investor), Japan's LLM company Karakuri, Splashmusic, and Decart have already been using the third-gen chip and system and significantly cut their inference costs, Amazon said. AWS also presented a bit of a roadmap for the next chip, Trainium4, which is already in development. AWS promised the chip will provide another big step-up in performance and support Nvidia's NVLink Fusion high-speed chip interconnect technology. This means the AWS Trainium4-powered systems will be able to interoperate and extend their performance with Nvidia GPUs while still using Amazon's homegrown, lower-cost server rack technology. It's worth noting, too, that Nvidia's CUDA (Compute Unified Device Architecture) has become the de facto standard that all AI apps support. The Trainium4-powered systems may make it easier to woo big AI apps built with Nvidia GPUs in mind to Amazon's cloud. Amazon did not announce a timeline for Trainium4. If Amazon follows previous rollout timelines, we'll likely hear more about Trainium4 at next year's conference.
[3]
AWS AI Factories: AI-in-a-box for enterprise datacenters
If sovereignty or on-prem AI matters, the new AI Factories could be for you

re:Invent Many businesses and government agencies require that all sensitive data stay on-premises for legal or security reasons. If those orgs want to work with AI, they can't rely on regular public clouds, but now they can let AWS build and manage AI hardware and software in their datacenters. Announced Tuesday at the company's re:Invent conference, AWS AI Factories is a fully managed solution where enterprises provide the datacenter and power while the house of Bezos installs its hardware and software under their roof. They operate like a private AWS Region, using customers' existing datacenter space, power, and network links while AWS brings in and manages its own racks of infrastructure, including compute, storage, database, and AI services, all running locally. Customers will not need to worry about acquiring hardware, installing it, or buying or building software platforms for their AI models. All they need is the physical space in a datacenter and enough power capacity to juice all those GPUs. That will spare orgs a lot of time and the need for specialist expertise. Customers can use AWS tools like the Amazon Bedrock foundation model builder or SageMaker machine-learning platform, as well as some high-end hardware, such as the company's Trainium3 AI accelerators and Nvidia GPUs like the current-gen B200 and GB200, or next-gen GB300 and B300. It will use a petabit-scale, non-blocking network to connect the GPUs and offer Amazon FSx for Lustre and Amazon S3 Express One Zone storage technology. AWS does not yet support NVLink Fusion, which is a high-speed, chip-to-chip interconnect, but says support will arrive in its future Trainium4 chips. Amazon didn't disclose how much this will cost, so we don't yet know whether it will be more expensive than installing one's own hardware and software from scratch. "The AI factories operate exclusively for each customer and it helps them with that separation, maintaining the security and reliability you get from AWS while also meeting stringent compliance and sovereignty requirements," AWS CEO Matt Garman said in his keynote. Garman said the work was inspired by the service's efforts to spin up private, secure AI capabilities for the Kingdom of Saudi Arabia's AI endeavors, working with the ironically named AI platform company Humain. The company is helping build an "AI Zone" in the country, which will have up to 150,000 AI chips and dedicated AWS infrastructure. According to Garman, the AWS partnership provides Saudi Arabia with high-performing infrastructure, models, and AI services like SageMaker and Bedrock, while meeting the kingdom's security, privacy, and responsible AI standards. "This type of work has sparked some interest in others: large government organizations in the public sector who are interested in a similar concept," Garman said. "We sat back and asked ourselves: Could we deliver this type of AI zone to a broader set of customers? Maybe even something that could leverage customers' existing datacenters?" In building its own AI factories, AWS faces some stiff competition. Dell's AI Factory with Nvidia was introduced in early 2024, and its promises of an edge-to-datacenter solution have captured billions in sales. In May, the company boasted 3,000 customers for its AI Factories, and last week, Dell said it had shipped $15.6 billion in AI servers year to date [PDF].
HPE's own private AI cloud product, which is also backed by Nvidia, won adoption by more than 300 new logos during the quarter ended July 31, it announced during its last earnings call [PDF]. Lenovo has also announced [PDF] a massive upswing in the sale of its infrastructure solutions, up 24 percent year over year in the quarter ended September 30, including "high double digit growth" of AI servers. Amazon's timing may be a bit of a mismatch, as polling from analysts at Forrester and Gartner indicates that the purse strings are tightening on AI spending as customers want a solid track to a return on their investment with these systems, which can cost millions of dollars to stand up, not to mention the price to power, maintain, and operate them. Naveen Chhabra, principal analyst at Forrester, speculated that AWS has likely been building this for several quarters, and suggested AWS is playing into the challenges that customers face when deploying AI infrastructure, including cooling, long lead times for products, the piecemeal approach to architecture, and data sovereignty. According to a November 21 report that Chhabra co-wrote, the revenue achieved as a result of AI spending is lagging and customers are taking notice. "With free cash flow tightening, interest rates still high, and even OpenAI's CEO warning of a dot-com-style bubble, the sector faces a reckoning," Forrester stated. The analyst group also posited that the looming chip shortage is causing limited and uncertain access to memory across devices, while pushing vendors to rethink their architecture. Chhabra sees many customers moving AI workloads to the cloud to combat these challenges, but that won't work if you are required to keep all of your data on-prem, which is the target market for AWS AI Factories. ®
[4]
Amazon announces new AI chips, closer Nvidia ties -- but it's cloud capacity that matters most
Amazon Web Services' two-track approach to artificial intelligence came into better focus Tuesday as the world's biggest cloud pushed forward with its own custom chips and got closer to Nvidia. During Amazon's annual AWS re:Invent 2025 conference in Las Vegas, Amazon Web Services CEO Matt Garman unveiled Trainium3 -- the latest version of the company's in-house custom chip. It has four times more compute performance, energy efficiency, and memory bandwidth than previous generations. AWS said that early results of customers testing Trainium3 are reducing AI training and inference costs by up to 50%. Custom chips, like Trainium, are becoming more and more popular for the big tech companies that can afford to make them. And their use cases are broadening. For example, Google's tensor processing units (TPUs), co-designed by Broadcom, have also been getting a lot of attention since last month's launch of the well-received Gemini 3 artificial intelligence model. It is powered by TPUs. There was even a report that Meta Platforms was considering TPUs in addition to Nvidia's graphics processing units (GPUs), which are the gold standard for all-purpose AI workloads. At the same time, Amazon also announced that it's deepening its work with Nvidia. In Tuesday's keynote, Garman introduced AWS AI Factories, which provide on-premises AI infrastructure for customers to use in their own data centers. The service combines Trainium accelerators and Nvidia graphics processing units, which allows customers to access Nvidia's accelerated computing platform, full-stack AI software, and GPU-accelerated applications. By offering both options, Amazon aims to keep accelerating AWS cloud capacity and, in turn, revenue growth to stay on top during a time of intense competition from Microsoft's Azure and Alphabet's Google Cloud, the second- and third-place horses in the AI race by revenue. Earlier this year, investors were concerned when second-quarter AWS revenue growth did not live up to its closest competitors. In late October's release of Q3 results, Amazon went a long way to putting those worries to rest. Amazon CEO Andy Jassy said at the time, "AWS is growing at a pace we haven't seen since 2022, re-accelerating to 20.2% YoY." He added, "We've been focused on accelerating capacity -- adding more than 3.8 gigawatts (GW) in the past 12 months." Tuesday's announcements come at a pivotal time for AWS as it tries to rapidly expand its computing capacity after a year of supply constraints that put a lid on cloud growth. As great as more efficient chips are, they don't make up for the capacity demand that the company is facing as AI adoption ramps up, which is why adding more gigawatts of capacity is what Wall Street is laser-focused on. Fortunately, Wall Street argues that the capacity headwind should flip to a tailwind. Wells Fargo said Trainium3 is "critical to supplementing Nvidia GPUs and CPUs in this capacity build" to close the gap with rivals. In a note to investors on Monday, the analysts estimate Amazon will add more than 12 gigawatts of compute by year-end 2027, boosting total AWS capacity to support as much as $150 billion in incremental annual AWS revenue if demand remains strong. In a separate note, Oppenheimer said Monday that AWS has already proven its ability to improve capacity, which has already doubled since 2022. Amazon plans to double it again by 2027. The analysts said that such an expansion could translate to 14% upside to 2026 AWS revenue and 22% upside in 2027.
Analysts said each incremental gigawatt of compute added in recent quarters translated to roughly $3 billion of annual cloud revenue. Bottom line: While new chips are welcome news that helps AWS step deeper into the AI chip race, Amazon's investment in capacity, and when that capacity will be unlocked, is what investors are most focused on, because that's how it will fulfill demand. The issue is not demand; it's supply. We are confident in AWS' ability to add the capacity. In fact, there's no one company in the world that could deal with this kind of logistics problem, at this scale, better than Amazon. Amazon shares surged nearly 14% to $254 each in the two sessions following the cloud and e-commerce giant's late Oct. 30 earnings print. The stock has since given back those gains and then some. As of Tuesday's close, shares were up 6.5% year to date, a laggard among its "Magnificent Seven" peers, and underperforming the S&P 500's roughly 16% advance in 2025.
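The capacity-to-revenue arithmetic the analysts cite is easy to sanity-check. Below is a minimal Python sketch that reproduces it from the figures quoted above; the only assumption is how Wells Fargo's $150 billion figure relates to the $3 billion-per-gigawatt ratio, which the note itself would have to confirm.

```python
# Back-of-envelope check of the analyst figures quoted above.
# Inputs come straight from the article; the rest is arithmetic.

REVENUE_PER_GW = 3e9        # ~$3B annual cloud revenue per incremental gigawatt
GW_ADDED_PAST_YEAR = 3.8    # gigawatts added in the 12 months before Q3 (Jassy)
GW_PLANNED_BY_2027 = 12.0   # incremental gigawatts by year-end 2027 (Wells Fargo)

recent = GW_ADDED_PAST_YEAR * REVENUE_PER_GW
planned = GW_PLANNED_BY_2027 * REVENUE_PER_GW

print(f"~${recent / 1e9:.1f}B/yr implied by the past year's capacity adds")
print(f"~${planned / 1e9:.0f}B/yr implied by 12 GW more at $3B/GW")

# Note: 12 GW x $3B/GW = $36B, well short of Wells Fargo's "$150B in
# incremental annual AWS revenue" -- that figure presumably describes total
# supported capacity, or assumes a different revenue-per-GW ratio.
```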
[5]
Amazon to fuse Nvidia's NVLink into Trainium4 accelerators
Meanwhile, Trainium3 makes its debut promising million-chip training clusters

re:Invent Amazon says that its next generation of homegrown silicon will deliver 6x higher performance thanks to a little help from its buddy Nvidia. At its re:Invent convention in Las Vegas on Tuesday, Amazon Web Services (AWS) teased its Trainium4 accelerators, which will be among the first to embrace Nvidia's NVLink Fusion interconnect tech for chip-to-chip communications. NVLink is a high-speed interconnect that allows multiple GPUs spanning multiple systems to pool resources and behave like a single accelerator. Previously, this technology has been limited to Nvidia CPUs and GPUs, but back in May, the AI infrastructure giant announced it was opening the tech to others with the introduction of NVLink Fusion at Computex. Amazon claims that the technology will allow its Trainium4 accelerators, Graviton CPUs, and EFA networking tech to communicate seamlessly across Nvidia's MGX racks. In its current form, Nvidia's fifth-gen NVLink fabrics support up to 1.8 TB/s of bandwidth (900 GB/s in each direction) per GPU, but the company is on track to double that to 3.6 TB/s by next year. Beyond Nvidia's interconnect tech, details are somewhat vague. We're told that the new chips will deliver 3x more FLOPS at FP8, 6x the performance at FP4, and 4x the memory bandwidth. Whether those claims pertain to the individual chips or its UltraServer rack systems, Amazon hasn't said. Assuming it's the rack systems, as was the case with Trainium3, that suggests AWS's Trainium4 UltraServers could deliver upwards of 2 exaFLOPS of dense FP4 performance and 2.8 petabytes a second of memory bandwidth. That latter point is likely to be a major boon for bandwidth-bound inference workloads. Despite a rather confusing naming convention, AWS actually employs Trainium for both internal and external training and inference. Of course, the devil is in the details and we simply don't have all of them yet. Amazon made similar claims about its Trainium3 UltraServers this time last year, boasting a 4.4x uplift in compute over its Trainium2 racks. But while technically true, what we didn't know at the time was that roughly half that performance would be achieved by more than doubling the number of chips from 64 to 144. Speaking of Trainium3, a year after first teasing the chips, Amazon is finally ready to bring its third generation of Trainium accelerators to the general market. According to AWS, each chip is equipped with 144 GB of HBM3E memory, good for 4.9 TB/s of memory bandwidth, and is capable of churning out just over 2.5 petaFLOPS of dense FP8 performance. However, for jobs that benefit from sparsity, like training, the chips are even more potent. Trainium3 features 16:4 structured sparsity, which effectively quadruples the chip's performance to 10 petaFLOPS for supported workloads. Amazon's Trainium3 UltraServers cram 144 of these chips connected in an all-to-all fabric using its NeuronSwitch-v1 interconnect tech, which Amazon says offers twice the chip-to-chip bandwidth. This is a marked change from Amazon's Trainium2 UltraServers, which featured 64 accelerators arranged in a 4x4x4 3D torus topology. Amazon declined to comment on how the 144 Trainium3 accelerators are connected to one another, but if we had to guess, it likely resembles the flat switched topology used in Nvidia's NVL72 or AMD's Helios rack systems.
Such a move should ease the transition to NVLink Fusion in the next generation, but it leaves Google as one of the few chip designers still using mesh topologies in large-scale AI training and inference clusters. In any case, Amazon seems confident that its new interconnect tech and EFA networking will enable it to support production deployments containing up to a million accelerators, compared to the 500,000 Trainium2 chips found in Project Rainier. Combined, each Trainium3 UltraServer features 20.7 TB of HBM3E, 706 TB/s of memory bandwidth, and between 363 and 1,452 petaFLOPS depending on whether your workload actually benefits from sparsity or not. This puts the systems roughly on par with Nvidia's latest Blackwell Ultra-based GB300 NVL72 systems - at least at FP8. At FP4, the gap widens considerably, with the Nvidia system delivering more than 3x the performance. With that said, FP4 is still primarily used in inference, while higher-precision datatypes like BF16 and FP8 are preferred for training. Despite Trainium's advancements in performance, some customers still aren't ready to abandon Nvidia just yet. Because of this, Amazon has also announced the availability of new compute offerings based on Nvidia's GB300 NVL72, which join the company's existing GB200 instances. ®
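The Register's rack-level totals can be rebuilt from the per-chip numbers it quotes, which makes for a quick arithmetic check. The Python sketch below does exactly that; the 2.52 petaFLOPS per-chip figure is an assumption read off the article's "just over 2.5 petaFLOPS" phrasing.

```python
# Reconstructing the Trainium3 UltraServer aggregates from the per-chip
# specs quoted above. Pure arithmetic, not an official AWS calculation.

CHIPS_PER_ULTRASERVER = 144
HBM_PER_CHIP_GB = 144            # HBM3E capacity per Trainium3 chip
BW_PER_CHIP_TBS = 4.9            # memory bandwidth per chip, TB/s
DENSE_FP8_PFLOPS = 2.52          # "just over 2.5 petaFLOPS" dense FP8 (assumed)
SPARSITY_UPLIFT = 4              # 16:4 structured sparsity quadruples throughput

hbm_tb = CHIPS_PER_ULTRASERVER * HBM_PER_CHIP_GB / 1000
bw_tbs = CHIPS_PER_ULTRASERVER * BW_PER_CHIP_TBS
dense = CHIPS_PER_ULTRASERVER * DENSE_FP8_PFLOPS
sparse = dense * SPARSITY_UPLIFT

print(f"HBM3E per UltraServer: {hbm_tb:.1f} TB")       # ~20.7 TB
print(f"Memory bandwidth:      {bw_tbs:,.0f} TB/s")    # ~706 TB/s
print(f"Dense FP8:             {dense:,.0f} PFLOPS")   # ~363
print(f"With 16:4 sparsity:    {sparse:,.0f} PFLOPS")  # ~1,452
```

The outputs line up with the 20.7 TB, 706 TB/s, and 363-to-1,452 petaFLOPS figures in the article, so the rack numbers appear to be straight multiples of the per-chip specs.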
[6]
Amazon introduces new Trainium3 chip offering 4x AI performance
AWS says these customers reported stronger inference performance, faster iteration, and reduced billable compute hours. AWS also offered a first look at Trainium4. The company has not announced a release window, but it confirmed that Trainium4 will deliver another large step in performance. The most notable shift is compatibility with NVIDIA NVLink Fusion, a high-speed chip-to-chip interconnect that supports large, tightly coupled AI compute clusters. This direction suggests AWS is no longer trying to position its hardware as a strict alternative to NVIDIA chips. Instead, it is building a hybrid ecosystem where AWS silicon can plug into, extend, or complement NVIDIA GPU systems. This could make it easier for companies building on CUDA, which remains the industry standard for AI development, to adopt AWS custom hardware without rebuilding their software stack. Alongside the chip announcement, AWS and NVIDIA expanded their strategic partnership. The collaboration covers interconnect technology, datacenter infrastructure, open model support, and rack-scale AI system deployment.
[7]
AWS wants to be a part of Nvidia's "AI Factories" - and it could change everything about how your business treats AI
Collaboration including some of the latest hardware around

AWS has announced it will be teaming up with Nvidia to develop and build so-called "AI Factories" - hyper-powered facilities bringing together some of the most powerful hardware around to power the next generation of AI. Revealed at AWS re:Invent 2025, the news will see the two tech giants work together on AI Factories going forward, including Nvidia AI hardware and AWS' latest Trainium chips, networking tools, storage and database technology. Bringing all this together will help organizations and governments alike develop AI Factory technology to scale up their workloads and take the next step forward. AWS says this new approach offers companies a much more streamlined and effective way to scale and develop their AI projects, which can involve huge expense and amounts of time and other resources. Instead, AI Factories will deploy dedicated AWS AI infrastructure directly into the customer's data center, ensuring high customization but also security and oversight - in effect creating something like a private AWS region for the customer, the company says. In addition, AWS and Nvidia's partnership will now mean customers can build and run LLMs faster at scale, giving AWS customers access to the full stack of Nvidia's AI software. This includes the latest NVIDIA Grace Blackwell and the next-generation NVIDIA Vera Rubin platforms, with support for Nvidia's NVLink Fusion high-speed chip interconnect technology coming soon to next generation Trainium4 chips. "Large-scale AI requires a full-stack approach -- from advanced GPUs and networking to software and services that optimize every layer of the data center. Together with AWS, we're delivering all of this directly into customers' environments," said Ian Buck, vice president and general manager of Hyperscale and HPC at Nvidia. "By combining NVIDIA's latest Grace Blackwell and Vera Rubin architectures with AWS's secure, high-performance infrastructure and AI software stack, AWS AI Factories allow organizations to stand up powerful AI capabilities in a fraction of the time and focus entirely on innovation instead of integration."
[8]
Amazon unveils new AI chip in battle against Nvidia
San Francisco (United States) (AFP) - Amazon Web Services launched its in-house-built Trainium3 AI chip on Tuesday, marking a significant push to compete with Nvidia in the lucrative market for artificial intelligence computing power. The move intensifies competition in the AI chip market, where Nvidia currently dominates with an estimated 80- to 90-percent market share for products used in training large language models that power the likes of ChatGPT. Google last week caused tremors in the industry when it was reported that Facebook-parent Meta would employ Google AI chips in data centers, signaling new competition for Nvidia, currently the world's most valuable company and a bellwether for the AI investment frenzy. This followed the release of Google's latest AI model last month that was trained using the company's own in-house chips, not Nvidia's. AWS, which will make the technology available to its cloud computing clients, said its new chip is lower cost than rivals and delivers over four times the computing performance of its predecessor while using 40 percent less energy. "Trainium3 offers the industry's best price performance for large scale AI training and inference," AWS CEO Matt Garman said at a launch event in Las Vegas. Inference is the execution phase of AI, where the model stops scouring the internet for training and starts performing tasks in real-world scenarios. Energy consumption is one of the major concerns about the AI revolution, with major tech companies having to scale back or pause their net-zero emissions commitments as they race to keep up on the technology. AWS said its chip can reduce the cost of training and operating AI models by up to 50 percent compared with systems that use equivalent graphics processing units, or GPUs, mainly from Nvidia. "Training cutting-edge models now requires infrastructure investments that only a handful of organizations can afford," AWS said, positioning Trainium3 as a way to democratize access to high-powered AI computing. AWS said several companies are already using the technology, including Anthropic, maker of the Claude AI assistant and a competitor to ChatGPT-maker OpenAI. AWS also announced it is already developing Trainium4, expected to deliver at least three times the performance of Trainium3 for standard AI workloads. The next-generation chip will support Nvidia's technology, allowing it to work alongside that company's servers and hardware. Amazon's in-house chip development reflects a broader trend among cloud providers seeking to reduce dependence on external suppliers while offering customers more cost-effective alternatives for AI workloads. Nvidia puzzled industry observers last week when it responded to Google's successes in an unusual post on X, saying the company was "delighted" by the competition before adding that Nvidia "is a generation ahead of the industry."
[9]
AWS brings sovereign AI on-prem with new AI Factories alongside Trainium3 and Nvidia GB300 launches - SiliconANGLE
Amazon Web Services Inc. today made a set of artificial intelligence infrastructure announcements spanning sovereign on-premises deployments, next-generation custom AI accelerators and the most advanced Nvidia Corp. GPU instances yet offered on AWS -- all part of a push to dominate both cloud and private AI at large scale. The announcements included the launch of AWS AI Factories, the general availability of Amazon EC2 Trn3 UltraServers powered by the new Trainium3 chip and the introduction of P6e-GB300 UltraServers featuring Nvidia's latest Blackwell-based GB300 NVL72 platform. Leading the announcements is AWS AI Factories, a new offering that delivers dedicated, full-stack AWS AI infrastructure directly inside customers' existing data centers. The platform combines Nvidia accelerated computing, AWS Trainium chips, high-speed low-latency networking, energy-efficient infrastructure and core AWS AI services, including Amazon Bedrock and Amazon SageMaker. AWS AI Factories have been built primarily for governments and regulated industries and operate similarly to a private AWS Region to provide secure, low-latency access to compute, storage and AI services while ensuring strict data sovereignty and regulatory compliance. With the offering, customers can leverage their own facilities, power and network connectivity, while AWS handles deployment, operations and lifecycle management. AWS says the result is to compress deployment timelines that would normally take years. As part of the AI Factories announcement, AWS also highlighted its deepening partnership with Nvidia around the platform, including support for Grace Blackwell and future Vera Rubin GPU architectures and future support for Nvidia NVLink Fusion interconnects in Trainium4. "Large-scale AI requires a full-stack approach -- from advanced GPUs and networking to software and services that optimize every layer of the data center," said Ian Buck, vice president and general manager of Hyperscale and HPC at Nvidia. "Together with AWS, we're delivering all of this directly into customers' environments." AWS also announced that its Amazon EC2 Trn3 UltraServers, powered by the new three-nanometer Trainium3 AI chip, are now generally available. Trn3 systems can scale up to 144 Trainium3 chips in a single UltraServer to deliver up to 4.4 times more compute performance, four times greater energy efficiency and nearly four times more memory bandwidth than Trainium2. The UltraServers are designed for next-generation workloads such as agentic AI, mixture-of-experts models and large-scale reinforcement learning, with AWS-engineered networking that delivers sub-10-microsecond chip-to-chip latency. In testing using OpenAI Group PBC's open-weight model GPT-OSS, AWS customers achieved three times higher throughput per chip and four times faster inference response times versus the previous generation. Customers including Anthropic PBC, Karakuri Ltd., Metagenomi Inc., Neto.ai Inc., Ricoh Company Ltd. and Splash Music Inc. are already reporting up to 50% reductions in training and inference costs. AWS also previewed Trainium4, which is expected to deliver major gains in FP4 and FP8 performance and memory bandwidth. Rounding out the AI infrastructure announcements, AWS introduced the new P6e-GB300 UltraServers, featuring Nvidia's GB300 NVL72 platform, making it the most advanced Nvidia GPU architecture available in Amazon EC2.
The instances deliver the highest GPU memory and compute density on AWS, targeting trillion-parameter AI inference and advanced reasoning models in production. The P6e-GB300 systems run on the AWS Nitro System and integrate tightly with services such as Amazon Elastic Kubernetes Service and, in doing so, allow customers to deploy large-scale inference workloads securely and efficiently.
[10]
AWS Launches Trainium3 UltraServers, Gives a Peek Into Trainium4 | AIM
At re:Invent 2025, AWS announced the general availability of its new Amazon EC2 Trn3 UltraServers, powered by the Trainium3 chip built on 3nm technology, to help customers train and deploy AI models faster and at lower cost. The company said the new servers deliver up to 4.4x more compute performance, 4x greater energy efficiency, and almost 4x more memory bandwidth compared to the previous Trainium2 generation. Each UltraServer can scale up to 144 Trainium3 chips, offering as much as 362 FP8 petaflops of compute. Trainium3 follows AWS's earlier deployment of 500,000 Trainium2 chips in Project Rainier, created with Anthropic and described as the world's largest AI compute cluster. AWS also revealed early details of Trainium4, expected to deliver at least 6x the processing performance in FP4, along with higher FP8 performance and memory bandwidth. The next-generation chip will support NVIDIA NVLink Fusion interconnects to operate alongside NVIDIA GPUs and AWS Graviton processors in MGX racks. AWS has already deployed more than 1 million Trainium chips to date. The company says the latest performance improvements translate to faster training and lower inference latency. In internal tests using OpenAI's GPT-OSS open-weight model, Trn3 UltraServers delivered three times higher throughput per chip and four times faster response times compared to Trn2 UltraServers. Companies including Anthropic, Karakuri, Metagenomi, NetoAI, Ricoh and Splash Music are already reporting reductions in training and inference costs of up to 50% in some cases. AWS said its Bedrock service is already running production workloads on Trainium3. Decart, which focuses on real-time generative video, said it has achieved 4x faster frame generation at half the cost of GPUs on Trainium3. AWS noted that such capabilities could support large-scale interactive applications. The UltraServers are supported by an upgraded networking stack, including the new NeuronSwitch-v1, which provides twice the internal bandwidth, and a revised Neuron Fabric that brings inter-chip latency below 10 microseconds. The company said this reduces bottlenecks in distributed training and inference, especially for workloads such as agentic systems, mixture-of-experts architectures and reinforcement learning. UltraClusters 3.0 can connect thousands of the new servers, scaling to as many as one million Trainium chips -- 10 times the previous generation. AWS said this level of scale enables training multimodal models on trillion-token datasets and serving millions of concurrent users.
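Taken at face value, those figures imply a striking theoretical ceiling for an UltraClusters 3.0 deployment. The Python sketch below works the napkin math from the article's own numbers; it ignores utilization, networking overhead, and failure domains, so treat it as an upper bound rather than an AWS claim.

```python
import math

# Implied scale of a maxed-out UltraClusters 3.0 deployment, using only
# the figures quoted above. Theoretical ceiling, not a real-world estimate.

MAX_CHIPS = 1_000_000        # UltraClusters 3.0 limit (10x the previous gen)
CHIPS_PER_SERVER = 144
FP8_PER_SERVER_PFLOPS = 362  # per-UltraServer dense FP8, per the article

servers = math.ceil(MAX_CHIPS / CHIPS_PER_SERVER)
aggregate_eflops = servers * FP8_PER_SERVER_PFLOPS / 1000

print(f"UltraServers for 1M chips: {servers:,}")                  # 6,945
print(f"Aggregate dense FP8:       ~{aggregate_eflops:,.0f} exaFLOPS")
```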
[11]
Nvidia and AWS expand partnership with specialized AI hardware and software - SiliconANGLE
Nvidia Corp. and Amazon Web Services Inc. announced the expansion of the two companies' collaboration on new chip technology, networking, cloud infrastructure, open models and physical AI. Fueling scale-up in infrastructure and custom silicon, AWS said at its re:Invent conference today that it would support Nvidia's NVLink Fusion, an interconnect technology for linking custom central processing units and accelerators at AI data center scale. It will be used in deploying custom silicon, including AWS's upcoming Trainium4 chips for AI inference and model training and Graviton central processing units. "GPU compute demand is skyrocketing -- more compute makes smarter AI, smarter AI drives broader use and broader use creates demand for even more compute," said Nvidia founder and Chief Executive Jensen Huang. "The virtuous cycle of AI has arrived." In conjunction, AWS is expanding its accelerated computing offerings with Nvidia Blackwell architecture, including Nvidia HGX B300 and GB300 NVL72 graphics processing units. The company said these GPUs will be added to the AWS infrastructure backbone for AI Factories, a new AI cloud offering for customers worldwide, providing secure, regionally sovereign AI infrastructure for globally situated companies. For the public sector, AI factories will help transform the federal supercomputing and AI landscape with a unified architecture. With these global datacenters, AWS plans to provide access to advanced AI services and capabilities to deploy and train massive models while maintaining absolute control of proprietary data. The partnership also expands the integration of Nvidia software with the AWS AI ecosystem. Nvidia Nemotron open models will now be available on Amazon Bedrock, the company's fully managed service providing access to a large number of foundation models. Developers can now use Nemotron Nano 2 and Nemotron Nano 2 VL to build specialized agent-based AI applications capable of processing text, code, images, and videos at scale. The two companies will also work together to co-engineer the software layer to accelerate data ingestion and processing for enterprise companies by combining technologies. Amazon OpenSearch Service, a managed, scalable search and analytics service, will now offer serverless GPU acceleration for vector index building, powered by Nvidia cuVS, an open-source library for GPU-accelerated vector search and data clustering. Production-ready agentic AI will gain from combining Strands Agents for agent development, Nvidia NeMo Agent Toolkit for deep profiling and performance tuning and Amazon Bedrock AgentCore providing scalable agent infrastructure. Advancing AI-powered robotics requires high-quality and diverse datasets for training foundation models for physical AI, as well as frameworks for testing and validating them in simulation before deploying to the real world. Physical AI refers to artificial intelligence systems and models designed to interact with the real world through sensing, reasoning and acting through physical machines. These machines can include robots, self-driving cars, smart buildings and intelligent assistants that can interact with the physical world. Nvidia Cosmos provides world foundation models used to simulate the real world virtually for training and to produce synthetic data that's difficult to gather.
The platform speeds up the process of turning small amounts of visual data into large training sets for a wide variety of scenarios. Cosmos world foundation models are now available as Nvidia NIM microservices on Amazon EKS, the company's managed Kubernetes service. This will enable real-time robotics control and simulation workloads in the cloud. The platform also includes models that comprehend real-world physics, object interactions and motion, enabling reasoning about complex situations and predicting outcomes. This capability allows for the development of AI agents that can perform tasks with a deeper understanding of the real world.
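For developers, the practical upshot of "Nemotron on Bedrock" is that Nvidia's open models become callable through Bedrock's standard runtime API rather than a separate Nvidia endpoint. Below is a minimal sketch of what that would look like using boto3's Converse API; the model ID is a placeholder assumption, since none of the coverage above gives the exact Nemotron identifier.

```python
# Minimal sketch: invoking a Nemotron model via Amazon Bedrock's Converse
# API with boto3. The modelId below is a PLACEHOLDER -- look up the real
# Nemotron identifier in the Bedrock model catalog before using this.

import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="nvidia.nemotron-nano-2",  # hypothetical ID, for illustration only
    messages=[
        {
            "role": "user",
            "content": [{"text": "Summarize NVLink Fusion in two sentences."}],
        }
    ],
    inferenceConfig={"maxTokens": 256, "temperature": 0.2},
)

print(response["output"]["message"]["content"][0]["text"])
```

The same call shape works for any Bedrock-hosted model, which is the point of the integration: swapping in Nemotron should require changing only the model identifier, not the application stack.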
[12]
ETtech Explainer: Amazon's latest Trainium chip and what it means for the race - The Economic Times
Amazon Web Services has launched Trainium3, its new AI chip, offering faster performance, more memory, and lower energy use. The next-generation Trainium4 will add Nvidia's NVLink Fusion for even bigger, faster AI servers. These chips give AWS more control, cut costs, and strengthen its position in the cloud AI market.

Amazon Web Services (AWS) has spent years creating its own AI training hardware, and its newest release, Trainium3, is touted to perform far better than earlier versions. At its annual tech conference, re:Invent 2025, AWS introduced the Trainium3 UltraServer. This machine runs on the company's advanced 3-nanometre Trainium3 chip and uses AWS's own networking systems to support large-scale AI workloads. What's special about Trainium3? AWS says the new system is over four times faster and offers four times more memory. It is built for both training and running demanding AI applications. Thousands of UltraServers can be connected to provide an application with as many as one million Trainium3 chips, ten times more than before. Each UltraServer contains 144 chips. The company also says the chips use around 40% less energy than the previous generation. As global data-centre power use continues to climb, AWS is aiming to build systems that need less electricity, not more. "AWS Trainium is a family of purpose-built AI accelerators -- Trn1, Trn2, and Trn3 -- designed to deliver scalable performance and cost efficiency for training and inference across a broad range of generative AI workloads," the company said in a blog post. Customers such as Anthropic, Japan's LLM Karakuri, SplashMusic, and Decart are already using Trainium3. According to TechCrunch, they have seen notable reductions in inference costs. Next up: Trainium4. At AWS' conference, the company also gave a first look at its next AI chip, Trainium4, which is already in development. AWS said the new chip will use a technology called "NVLink Fusion". This allows very fast communication between different chips and is one of Nvidia's key innovations. Nvidia has been encouraging other chip makers to use NVLink, and AWS now joins Intel and Qualcomm in adopting it. Trainium4 is expected to deliver a big performance boost and support NVLink Fusion, letting the system work smoothly with Nvidia GPUs. At the same time, it will continue to use Amazon's more affordable server rack design. The technology will also help AWS build larger AI servers that can communicate efficiently, which is important when training huge AI models. Through this partnership, customers will gain access to AWS's 'AI Factories', private AI infrastructure in their own data centres, for faster performance and readiness. However, Amazon has not announced a release date for Trainium4. If it follows previous patterns, more details may appear at next year's conference. What this means: These new chips could make AWS a more attractive option for large businesses seeking to train and deploy advanced models. Early tests suggest Trainium3 can cut costs by up to 50% compared to traditional GPU-based systems, making large-scale AI more affordable. Nvidia has long dominated top-tier AI hardware, but Trainium3 gives Amazon a strong alternative that is cheaper while still delivering high performance. By developing its own chips, AWS reduces its reliance on external supplies, such as Nvidia GPUs, for major AI projects.
This, in turn, means that Amazon gains more control over pricing and availability and can offer customers a lower-cost option without sacrificing capability, strengthening its position in the cloud AI market. Investors responded quickly as well -- the rise in Amazon's share price reflects confidence in its AI strategy and expectations of continued cloud growth. If Trainium3 sees wide adoption, analysts believe it will support stronger profits and reinforce Amazon's reputation as a leading technology company.
[13]
AWS Trainium3 AI Is 'The Best Inference Platform In The World,' CEO Says
Amazon Web Services CEO Matt Garman says the company's newly launched Trainium3 AI accelerator chips are the "best inference platform in the world," while its AI openness strategy is helping to win enterprise customer market share. "We make it so if you want to pull in Gemini models from Google, you can use those with our AgentCore," said Garman in an interview with CRN. "We think that innovation happens in lots of different places, not just inside the walls of Amazon. We want our customers to be able to benefit from that ecosystem of innovation," said Garman. "It is a difference in philosophies of how we operate, and our competitors don't always operate that same way." Thousands of AWS customers, developers and partners flocked to Las Vegas this week to attend AWS re:Invent 2025. The Seattle-based $132 billion cloud giant launched a slew of innovations at the conference, including the general availability of its new Trainium3 chips for AI training and inference. "Trainium3 is actually going to be the best inference platform in the world," Garman told CRN. AWS sees Bedrock with Trainium3 as the world's leading inference engine and a business that will one day possibly be as large as Amazon EC2. Garman and his team expect Trainium3 to be a major hit with enterprise customers as it will provide better price-performance options. "[Trainium3] will be the most efficient, most effective, the best cost performance, the lowest latency and the best throughput," Garman said. In an interview with CRN, Garman takes a deep dive into AWS' new Trainium3 AI chips, "supercharging" its agentic AI development offering Kiro, and why AWS' AI openness strategy will win versus competitors like Microsoft and Google Cloud.
[14]
Amazon Is 'All-In'
Amazon has ramped up the ASIC race by showcasing Trainium3 server configurations and next-gen Trainium4 ASICs, bringing massive performance and efficiency gains.

Amazon's Trainium3 UltraServers & Next-Gen Trainium4 Chips Showcase the Firm's 'All-In' Commitment in the AI Race

The race of custom silicon from Big Tech is indeed reaching newer levels, as we are seeing massive advancements from firms like Google, Meta, and now Amazon. At AWS re:Invent 2025, Amazon has given an insight into what customers should expect from the firm in the realm of ASICs, and one of the bigger developments from the company is the introduction of Trainium3 UltraServers, which are basically an AI system scaling up to 144 chips in a single cluster, bringing in "up to 4.4x more compute performance, 4x greater energy efficiency, and almost 4x more memory bandwidth" compared to the previous generation. Trn3 UltraServers pack up to 144 Trainium3 chips into a single integrated system, delivering up to 4.4x more compute performance than Trainium2 UltraServers. This allows you to tackle AI projects that were previously impractical or too expensive by training models faster, cutting time from months to weeks, serving more inference requests from users simultaneously, and reducing both time-to-market and operational costs. The Trainium3 UltraServer features the newer NeuronSwitch-v1 technology, which debuts with upgraded bandwidth and fabric networking. It is Amazon's alternative to NVIDIA's NVLink solution; the idea is to interconnect Trainium ASICs into a massive 1-million-chip cluster, which is claimed to bring the ability to train on "trillion-token datasets" for inferencing capabilities. The UltraServers are an indicator that ASIC manufacturers are aggressively expanding their compute portfolios, given the compute constraints companies are facing. Amazon has also given us a look into next-gen Trainium4 ASICs, which are said to feature 6x higher FP4 performance and a massive increase in memory specifications. More importantly, Trainium4 will now support NVIDIA's NVLink technology as well, which means that customers looking to scale up their existing infrastructure by adding the Trainium stack, combined with Team Green's compute portfolio, can do so easily. Amazon has reported massive 'external interest' around its custom AI chips, with companies like Anthropic reporting reduced training costs. It appears that Amazon is 'all in' when it comes to the race for ASICs, and following Google's recent TPU announcements, it seems the retail giant is not holding back when it comes to advancing its compute portfolio.
[15]
AWS says custom AI chip line becomes multibillion-dollar business, unveils plans for Trainium 4 - The Economic Times
Las Vegas: Amazon Web Services' custom AI chip line, Trainium, is now a multibillion-dollar business, with its Trainium 3 UltraServers available and work underway on the next-generation chip, Trainium 4. Speaking at the company's annual event, AWS re:Invent 2025 in Las Vegas, CEO Matt Garman said that Trainium is already a multibillion-dollar business, and that 1 million Trainium chips have been deployed so far. Trainium is AWS's in-house chip used for both training and inference, positioning the company to compete with Nvidia, the market leader in AI training. Customers using Trainium 3 include Anthropic, Karakuri, Metagenomi, Neto.ai, Ricoh, and Splashmusic. According to AWS, Trainium can reduce the cost of training and inference by up to 50%. In addition, the company is doubling down on AI agents. Garman said that the advent of agents represents a key inflection point, with the true value of AI still waiting to be unlocked. He added that reaching the age of agents requires reimagining every process -- and that demands powerful AI at the lowest possible cost. AWS's work on AI chips is a step in this direction. The company has also partnered with Nvidia, and says it is expanding its compute portfolio with P6e-GB300 UltraServers, featuring Nvidia's most advanced GPU architecture on AWS Cloud. The company has also expanded Nova, AWS's suite of AI models, introducing four new models focused on reasoning, multimodal processing, conversational AI, and code generation. (The reporter is attending AWS re:Invent 2025 in Las Vegas at the invitation of AWS)
[16]
Amazon to use Nvidia tech in AI chips, roll out new servers
Amazon's AWS cloud computing unit on Tuesday said it will adopt key Nvidia technology in future generations of its artificial intelligence computing chips as the firm ramps up efforts to get major AI customers using its services. AWS, or Amazon Web Services, said it will adopt a technology called "NVLink Fusion" in a future chip, with no specified release date, known as Trainium4. The NVLink technology creates speedy connections between different kinds of chips and is one of Nvidia's crown jewels. The companies made the announcement as part of AWS' annual week-long cloud computing conference in Las Vegas, which draws some 60,000 people. Amazon is expected to also show off new versions of its Nova AI model, initially unveiled last year. Nvidia has been pushing to get other chip firms to adopt its NVLink technology, with Intel, Qualcomm and now AWS on board. The technology will help AWS build bigger AI servers that can recognize and communicate with one another faster, a critical factor in training large AI models, in which thousands of machines must be strung together. As part of the Nvidia partnership, customers will have access to what AWS is calling AI Factories, exclusive AI infrastructure inside their own data centers for greater speed and readiness. "Together, Nvidia and AWS are creating the compute fabric for the AI industrial revolution - bringing advanced AI to every company, in every country, and accelerating the world's path to intelligence," Nvidia CEO Jensen Huang said in a statement. Separately, Amazon said it is rolling out new servers based on a chip called Trainium3. The new servers, available on Tuesday, each contain 144 chips and have more than four times more computing power than AWS' previous generation of AI chips, while using 40% less power, Dave Brown, vice president of AWS compute and machine learning services, told Reuters. Brown did not give absolute figures on power or performance, but said AWS aims to compete with rivals - including Nvidia - based on price. "We've got to prove to them that we have a product that gives them the performance that they need and get a right price point so they get that price-performance benefit," Brown said. "That means that they can say, 'Hey, yeah, that's the chip I want to go and use.'"
[17]
Amazon to use Nvidia tech in AI chips, roll out new servers
LAS VEGAS, Dec 2 (Reuters) - Amazon.com's AWS cloud computing unit on Tuesday said it will adopt key Nvidia technology in future generations of its artificial intelligence computing chips as the firm ramps up efforts to get major AI customers using its services. AWS, or Amazon Web Services, said it will adopt a technology called "NVLink Fusion" in a future chip, with no specified release date, known as Trainium4. The NVLink technology creates speedy connections between different kinds of chips and is one of Nvidia's crown jewels. The companies made the announcement as part of AWS' annual week-long cloud computing conference in Las Vegas, which draws some 60,000 people. Amazon is expected to also show off new versions of its Nova AI model, initially unveiled last year. Nvidia has been pushing to get other chip firms to adopt its NVLink technology, with Intel, Qualcomm and now AWS on board. The technology will help AWS build bigger AI servers that can recognize and communicate with one another faster, a critical factor in training large AI models, in which thousands of machines must be strung together. As part of the Nvidia partnership, customers will have access to what AWS is calling AI Factories, exclusive AI infrastructure inside their own data centers for greater speed and readiness. "Together, Nvidia and AWS are creating the compute fabric for the AI industrial revolution - bringing advanced AI to every company, in every country, and accelerating the world's path to intelligence," Nvidia CEO Jensen Huang said in a statement. Separately, Amazon said it is rolling out new servers based on a chip called Trainium3. The new servers, available on Tuesday, each contain 144 chips and have more than four times more computing power than AWS' previous generation of AI chips, while using 40% less power, Dave Brown, vice president of AWS compute and machine learning services, told Reuters. Brown did not give absolute figures on power or performance, but said AWS aims to compete with rivals - including Nvidia - based on price. "We've got to prove to them that we have a product that gives them the performance that they need and get a right price point so they get that price-performance benefit," Brown said. "That means that they can say, 'Hey, yeah, that's the chip I want to go and use.'" (Reporting by Stephen Nellis in San Francisco and Greg Bensinger in Las Vegas; Editing by Paul Simao)
[18]
Trainium 3 explained: Amazon's new AI chip and its NVIDIA-ready roadmap
AWS unveils Trainium 3 to deliver faster, more efficient frontier model training

Amazon's newest AI chip arrives at a moment when the global demand for compute is rising faster than the hardware ecosystem can keep pace. Trainium 3 is not just a faster successor. It is the centerpiece of a strategy that aims to make AWS a core destination for training frontier-scale models while reducing the industry's overreliance on GPUs. The announcement also teased something even more consequential: a roadmap that brings Amazon's hardware closer to Nvidia's world instead of competing against it from the sidelines. Model sizes are ballooning, data pipelines are scaling, and training runs now stretch into millions of GPU-hours. For most companies, access to the required hardware is the single biggest bottleneck. AWS wants to close the gap with a chip built specifically for AI training workloads, not adapted from general computing tasks. Trainium 3 is manufactured on a 3-nanometer process and delivers up to four times the performance of its predecessor while using significantly less power. In practice, this means faster iteration cycles for anyone building large models and lower energy costs for organizations running long multistage training jobs. AWS also introduced the UltraServer, a dense system that houses 144 of these chips and can be linked with thousands of others in massive clusters. This kind of scale is designed to support everything from enterprise models to experimental systems that push the limits of today's AI research. AWS has tried for years to establish itself as a viable alternative to Nvidia hardware, but the market reality is clear. Developers are deeply tied to GPU-optimized frameworks, toolchains, and workflows. Replacing Nvidia outright is neither easy nor realistic. With Trainium 3 and the roadmap behind it, AWS is shifting toward a hybrid approach. The next generation, Trainium 4, will support Nvidia's high-speed NVLink Fusion interconnect. That matters because it enables mixed clusters where Trainium chips and Nvidia GPUs work together instead of in separate environments. It also reduces the friction for teams that want to explore non-GPU accelerators but aren't ready to overhaul their entire stack. Compatibility becomes a bridge, not a threat. This move positions AWS differently in the AI infrastructure race. It signals that the company understands the importance of interoperability and wants to attract developers by meeting them halfway. Rather than building a walled garden, AWS is trying to expand the range of hardware choices for customers who want performance, flexibility, and lower costs. For cloud buyers, this opens up practical advantages. Workloads tuned for GPUs can continue running on familiar infrastructure, while exploratory or large-scale training tasks can shift to Trainium-based clusters that promise better efficiency. For enterprises, it offers a way to scale without fighting for scarce GPUs or paying premium prices in secondary markets. If Trainium 3 delivers on its claims, it could push other cloud providers to invest more aggressively in custom silicon. It also intensifies competition around energy efficiency, a metric that will be central as AI growth collides with sustainability concerns.
More significantly, the Nvidia-friendly roadmap hints at a future where cloud platforms become modular hardware ecosystems rather than single-vendor silos. The AI industry has spent years chasing raw power. The next phase will value flexibility just as much and AWS is betting that customers want both. Trainium 3 is the hardware expression of that bet, and Trainium 4's Nvidia compatibility shows how AWS intends to win developers without forcing them to abandon what already works. At a time when every major player is trying to secure its place in the AI supply chain, Amazon's newest chip positions AWS not as a challenger on the outskirts, but as a platform aiming to sit at the center of how frontier models are built.
Amazon Web Services announced AI Factories for on-premises deployment and launched its Trainium3 chip at re:Invent 2025. The cloud giant revealed Trainium4 will support Nvidia's NVLink Fusion interconnect technology. These moves address data sovereignty concerns while AWS races to expand cloud capacity, adding over 12 gigawatts by 2027 to maintain its lead against Microsoft Azure and Google Cloud.
Amazon Web Services unveiled AWS AI Factories at re:Invent 2025, a fully managed solution that brings AI infrastructure directly into corporate and government data centers [1].
The service addresses data sovereignty requirements by allowing organizations to maintain complete control over their data while AWS handles hardware installation and management [3]. Customers provide the physical space and power, while AWS deploys its hardware and software, creating what functions as a private AWS Region on-premises [3].

The on-premises AI solution emerged from AWS's work with Saudi Arabia's AI Zone, which will feature up to 150,000 AI chips and dedicated infrastructure [3]. AWS AI Factories combine Amazon's technology with Nvidia hardware, offering customers access to Blackwell GPUs or Amazon's Trainium3 chip, alongside AWS networking, storage, and services like Amazon Bedrock and SageMaker [1]. This approach mirrors similar offerings from Microsoft Azure, which has been deploying AI Factories in its global data centers for OpenAI workloads [1].
AWS formally launched its Trainium3 UltraServer, powered by 3-nanometer AI chips that deliver significant improvements over previous generations [2].
The AI training chip offers four times more compute performance, memory, and energy efficiency compared to Trainium2 [2]. Each UltraServer hosts 144 chips, and thousands can be linked to provide up to 1 million Trainium3 chips for a single application, 10 times the previous generation's capacity [2].

Early customers testing Trainium3 have reduced AI training and inference costs by up to 50% [4]. Companies like Anthropic, Japan's Karakuri, Splashmusic, and Decart are already using the third-generation chip to significantly cut their inference expenses [2]. Each chip features 144 GB of HBM3E memory with 4.9 TB/s of memory bandwidth, capable of delivering over 2.5 petaFLOPS of dense FP8 performance [5]. The 40% improvement in energy efficiency addresses growing concerns about data center power consumption as AI workloads expand [2].
AWS teased Trainium4, currently in development, which will support NVLink Fusion interconnect technology for seamless communication with Nvidia GPUs [2].
This integration marks a significant deepening of the AWS and Nvidia partnership, allowing Trainium4-powered systems to interoperate with Nvidia hardware while using Amazon's lower-cost server rack technology [2]. The move could make it easier to attract AI applications built with Nvidia's CUDA platform, which has become the de facto standard [2]. Amazon claims Trainium4 will deliver 3x more FLOPS at FP8, 6x the performance at FP4, and 4x the memory bandwidth [5]. The accelerators will work with Graviton CPUs and EFA networking technology across Nvidia's MGX racks [5]. While AWS hasn't announced a timeline, following previous patterns suggests more details will emerge at next year's re:Invent conference [2].
Beyond new AI chips, AWS's primary focus remains rapidly expanding cloud computing capacity to maintain its lead against Microsoft Azure and Google Cloud [4]. AWS accelerated to 20.2% year-over-year growth in Q3, adding more than 3.8 gigawatts in the past 12 months [4]. Wells Fargo analysts estimate AWS will add more than 12 gigawatts of compute by year-end 2027, potentially supporting up to $150 billion in incremental annual revenue if demand remains strong [4].

Oppenheimer analysts noted that each incremental gigawatt of compute added in recent quarters translated to roughly $3 billion of annual cloud revenue [4]. The capacity expansion could translate to 14% upside to 2026 AWS revenue and 22% upside in 2027 [4]. However, AWS faces stiff competition in the AI infrastructure market, with Dell's AI Factory boasting 3,000 customers and $15.6 billion in AI server shipments year to date, while HPE gained over 300 new customers for its private AI cloud product [3].

Forrester analysts warn that AI spending faces a reckoning as revenue from AI investments lags and customers demand clearer returns on investments that can cost millions to deploy and operate [3]. Despite these headwinds, AWS's dual strategy of custom AI chips and deepening Nvidia integration positions the company to serve diverse customer needs while managing capacity constraints that have limited growth.

Summarized by Navi