3 Sources
[1]
Amazon is building AI "factories" to run Trainium and GPU racks inside customer data centers
One of the biggest challenges in rolling out AI across enterprise environments is fully tapping into each organization's unique requirements, data sets, and existing infrastructure. If you're a startup with no legacy systems to plug into, every new AI product can look like a perfect fit. Established companies, however, don't have that luxury.

That's why I was intrigued by several of the announcements Amazon's AWS cloud division made at its annual re:Invent conference in Las Vegas. While other headlines may get louder buzz, the new AWS Factories initiative and the Nova Forge AI model customization platform feel far more consequential for the many organizations still wrestling with how to implement AI in a clearly beneficial, measurable way. Toss in new control and evaluation tools for creating and deploying agents - something most companies need in order to trust agent output before deploying agents internally - and AWS is offering a trio of features squarely aimed at meeting the customization needs of existing enterprises.

The most compelling news may be AWS Factories. What stood out is that AWS almost downplayed it, framing it as an expansion of a service that had previously been limited to a small set of customers. AWS Factories allows companies to stand up AWS-powered AI infrastructure within their own on-prem environments. Organizations can install AI racks built with either AWS Trainium accelerators or the latest Nvidia GPUs, alongside the full AWS AI software stack.

Practically speaking, this gives highly regulated industries a new and meaningful degree of flexibility. Philosophically, it's a bigger shift. Just last year, AWS signaled no interest in extending its custom AI stack into customer data centers. Market reality intervened.
On-prem AI workloads are common - as my recent study, "The Future of AI is Hybrid," confirmed - and interest goes well beyond regulated sectors. Meanwhile, major cloud and model providers have already introduced on-prem options, making AWS look slightly late. In truth, though, the movement toward hybrid AI that spans local data centers and public cloud resources is barely underway. AWS is still early enough to capture a very large opportunity. And by letting enterprises run workloads on AWS custom silicon in their own facilities, the company has leapfrogged even Google, which only recently announced plans to sell its TPU accelerators to third parties.

What makes AWS Factories even more interesting is that Amazon also unveiled plans to work first with Google Cloud (and next year with Microsoft Azure) to ease multi-cloud adoption. For decades, "multi-cloud" was practically a taboo term inside AWS, and the company was slow to embrace on-prem hybrid services when that trend began to accelerate. Seeing AWS actively smoothing the path to hybrid, multi-cloud, and hybrid-AI environments is remarkable.

The company's new Nova Forge offering is intriguing on many levels as well. After some early missteps with its own models, Amazon has continued to build out its Nova foundation model lineup (several new versions were also introduced at the show), underscoring its commitment. More importantly, Nova Forge offers a new pathway for enterprises to use their own data to produce highly customized AI models trained for their specific needs. Instead of simply fine-tuning an existing model, Nova Forge provides a mechanism to fully train a custom frontier model without the astronomical cost and complexity of starting from scratch. It lets organizations insert their own data into multiple early training stages through pre-written "recipes," adjusting open weights as the process runs.
The result goes far beyond typical RAG-style fine-tuning and unlocks advanced capabilities, including reinforcement learning as the model continues to evolve in real use.

Of course, we're not just in an AI era but an agentic AI era, so AWS also announced a slate of agent-focused tools. One of the most important updates extends the existing Bedrock AgentCore framework. The new AgentCore Evaluations continuously monitor agent behavior to ensure agents do what they're intended to do - and avoid what they shouldn't. Paired with a new security-centric AgentCore extension, these features should give enterprises more confidence that the billions of agents Amazon (and many others) expect to see in corporate environments can be trusted. Continuing the customization theme, organizations can tailor agent evaluation parameters so outputs and actions meet their own expectations and compliance standards.

Alongside those customization efforts, AWS rolled out its usual flood of news. On the silicon front, it formally launched the Trainium 3 accelerator; introduced a new rack design showcasing AWS's custom chip-to-chip, rack-to-rack, and datacenter-to-datacenter networking; and even teased Trainium 4. The company also showed off fully autonomous "frontier agents" meant to turbocharge software developer productivity, and more.

All told, it was yet another firehose of new announcements at re:Invent and yet another example of how quickly developments in AI and agents continue to occur. But at the broader level, Amazon signaled a subtle shift in posture. Several announcements made it clear that AWS recognizes its role within a larger technology ecosystem and is focused on making it easier for companies that rely on multiple vendors (as nearly every company does!) to integrate AWS solutions alongside them. And that is an important step forward.
Bob O'Donnell is the founder and chief analyst of TECHnalysis Research, LLC, a technology consulting firm that provides strategic consulting and market research services to the technology industry and professional financial community. You can follow him on Twitter @bobodtech.
[2]
Amazon is testing out private on-premises 'AI Factories'
They're designed to respond to strict data sovereignty and privacy requirements.

Amazon Web Services has revealed more information about its AI Factories - full-stack AI infrastructure that sits inside a customer's own data center. Customers provide the facility and the power, while Amazon's cloud division provides and manages the AI systems, making an AI Factory akin to a private AWS region. Besides giving organizations more control to meet data sovereignty, security, or regulatory requirements, it also ensures they have access to hardware options such as Nvidia's Blackwell GPUs or Amazon's Trainium3 accelerators.

Why would a customer want to take on the added responsibility of providing location and power? It's simple - certain enterprises and governments want access to advanced AI, but they're limited in terms of the data they can send off-prem. Building independent AI infrastructure is slow and costly, but AWS says it can deploy these systems in months, helping customers avoid large capex burdens. With AWS managing the entire AI environment exclusively for the one customer, data stays local and hardware will not be shared with others.

The shift to on-prem infrastructure is an interesting reversal of the cloud push we've seen in recent years, driven largely by enterprise worries about sensitive data, AI training, and national security.

"By combining NVIDIA's latest Grace Blackwell and Vera Rubin architectures with AWS's secure, high-performance infrastructure and AI software stack, AWS AI Factories allow organizations to stand up powerful AI capabilities in a fraction of the time and focus entirely on innovation instead of integration," Hyperscale and HPC VP and GM for Nvidia, Ian Buck, commented.

But Amazon isn't alone in pushing the concept of AI Factories. Microsoft has Azure Local to support sovereignty requirements, comprising Microsoft-managed hardware installed within a customer's facility.
[3]
AI infrastructure: AWS unveils AI Factories and Nova - SiliconANGLE
AWS showcases AI factories, models, chips and agents to drive its core infrastructure strategy.

Amazon Web Services Inc. is accelerating a new era of data-center modernization as enterprises reshape how they deploy artificial intelligence and AI infrastructure at scale. With its new AI Factories model, AWS is bringing full-stack systems directly into existing customer data centers, narrowing the divide between cloud innovation and on-prem control. The shift signals a market hungry for speed, sovereignty and hands-on oversight of high-value AI workloads - and a cloud provider intent on meeting that demand head-on.

"AI Factories is a big conversation," said Julia White (pictured), vice president and chief marketing officer at AWS. "What we're bringing forward is a very opinionated AWS AI factory that gives customers that ability to build out the AI infrastructure they need, particularly for highly regulated sovereign needs on a very large scale. We've had almost 20 years of know-how of how to do this at scale better than anybody. Fusing that with this kind of unique customer optimization and how we do it is the genesis of that idea."

White spoke with John Furrier at AWS re:Invent, during an exclusive broadcast on theCUBE, SiliconANGLE Media's livestreaming studio. They discussed how the company's various AI announcements this week fit with its infrastructure strategy for the enterprise.

Along with the news surrounding AI Factories, AWS also unveiled a major initiative to expand its Nova foundation model platform with the launch of Nova Forge, a "first-of-its-kind" service to train and build custom frontier AI models. The move was a key step into the world of frontier model reasoning - advanced capabilities that move AI from basic information retrieval to problem-solving and logical deduction. The focus was on helping customers fine-tune models for desired results, White explained.
"There's limitations to how much you can do with fine-tuning," she said. "How could we fundamentally change that? That was the invention, the invention of Nova Forge, which is the first-ever ability for customers to take foundation models, Nova, and bring their own data and mix it with Amazon-provided data and actually start training the model." Model training still requires strong compute. This week, AWS announced general availability for Amazon EC2 Trn3 UltraServers, powered by the new three-nanometer Trainium3 AI chip. AWS also previewed Trainium4, which was expected to deliver major gains in FP4 and FP8 performance and memory bandwidth. "We're seeing our third generation show up and...just absolutely crush that price/performance promise for our customers," White said. "Each one of these generations is big, it's not incremental, big step function changes. Obviously, that's a huge driver of what people can do. Because these are our chips, we can just absolutely optimize every aspect of what we do across the infrastructure from top to bottom." From AI factories to frontier models and chips, the central message from AWS is that it is building AI infrastructure to support any enterprise need. That includes the proliferating field of agents, which businesses are beginning to build and implement for key tasks throughout organizations, White noted. "Go back to the very beginning of AWS and cloud," she said. "When AWS was invented, a small business could have the same technology capabilities as a big enterprise. The same point of this agentic approach that we have is we're showing people what amazing looks like from an agent outcome. We're giving every single person, big, small, otherwise, the tools to build whatever they might need." Here's the complete video interview, part of SiliconANGLE's and theCUBE's coverage of AWS re:Invent:
AWS introduced AI Factories at its re:Invent conference, bringing complete AI infrastructure directly into customer facilities. The service deploys AWS-managed racks with Trainium3 accelerators or Nvidia Blackwell GPUs alongside Nova Forge for custom model training. This shift addresses data sovereignty and regulatory requirements while AWS also announced multi-cloud partnerships with Google Cloud and Microsoft Azure.
AWS launched AI Factories at its annual re:Invent conference in Las Vegas, marking a strategic shift toward deploying complete AI infrastructure inside customer data centers [1]. The service allows organizations to install AWS-powered AI racks built with either Trainium3 accelerators or the latest Nvidia GPUs, alongside the full AWS AI software stack, within their own on-premises environments [1]. This approach gives highly regulated industries meaningful flexibility while maintaining control over sensitive data that cannot leave their facilities [2].
The AI Factories model represents what Julia White, vice president and chief marketing officer at AWS, described as "a very opinionated AWS AI factory" that leverages nearly 20 years of know-how in building infrastructure at scale [3]. Customers provide the facility and power, while Amazon's cloud division provides and manages the AI systems exclusively for that one customer, ensuring data stays local and hardware is not shared with others [2].
The push for on-premises AI stems from enterprises and governments wanting access to advanced AI capabilities while facing strict data sovereignty, security, and regulatory requirements [2]. Building independent AI infrastructure is slow and costly, but AWS claims it can deploy these systems in months, helping customers avoid large capital expenditure burdens [2]. Ian Buck, Hyperscale and HPC VP and GM for Nvidia, noted that combining Nvidia's Grace Blackwell and Vera Rubin architectures with AWS's secure, high-performance infrastructure allows organizations to stand up powerful AI capabilities in a fraction of the time [2].

This shift marks a reversal from the cloud-first push seen in recent years, with enterprises increasingly worried about sensitive data, AI training, and national security [2]. The movement toward hybrid cloud AI that spans local data centers and public cloud resources is barely underway, positioning AWS early enough to capture a significant opportunity [1].
Alongside AI Factories, AWS unveiled Nova Forge, a service for training and building custom frontier AI models tailored to specific enterprise needs [3]. Instead of simply fine-tuning existing foundation models, Nova Forge provides a mechanism to fully train a custom model without the astronomical cost and complexity of starting from scratch [1]. The service lets organizations insert their own data into multiple early training stages through pre-written "recipes," adjusting open weights as model training progresses [1].

White explained that Nova Forge addresses fundamental limitations in fine-tuning: "There's limitations to how much you can do with fine-tuning. How could we fundamentally change that? That was the invention, the invention of Nova Forge, which is the first-ever ability for customers to take foundation models, Nova, and bring their own data and mix it with Amazon-provided data and actually start training the model" [3]. The result goes far beyond typical RAG-style fine-tuning and unlocks advanced capabilities, including reinforcement learning as the model continues to evolve in real use [1].
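AWS has not published the mechanics of its training recipes, but the core idea described here - blending customer data with provider-supplied data at controlled ratios during training - can be sketched generically. The following is a minimal, hypothetical illustration of that data-mixing step; the function names and ratio are invented for this example and are not the Nova Forge API.

```python
import random

def mix_corpora(base_docs, custom_docs, custom_fraction=0.3, n_samples=1000, seed=0):
    """Sample a training stream that interleaves provider base data with
    customer data at a fixed ratio.

    Hypothetical sketch of the data-mixing idea behind continued
    pre-training; NOT the Nova Forge API.
    """
    rng = random.Random(seed)  # seeded for reproducible mixing
    mixed = []
    for _ in range(n_samples):
        # With probability custom_fraction, draw from the customer corpus;
        # otherwise draw from the provider's base corpus.
        pool = custom_docs if rng.random() < custom_fraction else base_docs
        mixed.append(rng.choice(pool))
    return mixed

base = [f"base_doc_{i}" for i in range(100)]      # stand-in for provider data
custom = [f"custom_doc_{i}" for i in range(20)]   # stand-in for customer data
batch = mix_corpora(base, custom, custom_fraction=0.3)
share = sum(d.startswith("custom") for d in batch) / len(batch)
print(f"custom share: {share:.2f}")  # hovers near the requested 0.3
```

In a real pipeline this sampled stream would feed a training loop that updates the open model weights stage by stage, which is where a recipe's per-stage data choices would matter.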
AWS announced plans to work first with Google Cloud, and next year with Microsoft Azure, to ease multi-cloud adoption [1]. For decades, "multi-cloud" was practically taboo inside AWS, making this active smoothing of the path to hybrid and multi-cloud environments remarkable [1]. AWS is not alone in this space: Microsoft has Azure Local to support sovereignty requirements, comprising Microsoft-managed hardware installed within customer facilities [2].
On the hardware front, AWS formally launched the Trainium3 accelerator with general availability for Amazon EC2 Trn3 UltraServers, powered by the new three-nanometer chip [3]. White emphasized the price-performance advantage: "We're seeing our third generation show up and just absolutely crush that price/performance promise for our customers. Each one of these generations is big, it's not incremental, big step function changes" [3]. AWS also previewed Trainium4, expected to deliver major gains in FP4 and FP8 performance and memory bandwidth [3]. By letting enterprises run workloads on AWS custom silicon in their own facilities, the company has leapfrogged even Google, which only recently announced plans to sell its TPU AI accelerators to third parties [1].
AWS also announced a slate of agent-focused tools, extending the existing Bedrock AgentCore framework [1]. The new AgentCore Evaluations continuously monitor agent behavior to ensure agents perform intended tasks while avoiding unintended actions [1]. Paired with a new security-centric AgentCore extension, these features give enterprises more confidence that the billions of agents Amazon and others expect to see in corporate environments can be trusted [1]. Continuing the customization theme, organizations can tailor agent evaluation parameters so outputs and actions meet their own expectations and compliance standards [1].
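The general pattern behind this kind of agent evaluation - checking each proposed action against customizable policy rules before it runs - can be illustrated with a small sketch. This is a hypothetical example of the technique, not the AgentCore Evaluations API; the rule format and function names are invented.

```python
def evaluate_agent_action(action, policy):
    """Check a proposed agent action against simple policy rules.

    Hypothetical sketch of pre-execution agent evaluation;
    NOT the Bedrock AgentCore Evaluations API.
    """
    findings = []
    # Rule 1: the agent may only invoke tools on the allow-list.
    if action["tool"] not in policy["allowed_tools"]:
        findings.append(f"tool '{action['tool']}' not in allow-list")
    # Rule 2: the tool input must not contain blocked terms.
    for term in policy["blocked_terms"]:
        if term in action.get("input", ""):
            findings.append(f"blocked term '{term}' in input")
    return {"approved": not findings, "findings": findings}

# An organization-specific policy, standing in for tailored evaluation parameters.
policy = {
    "allowed_tools": {"search", "summarize"},
    "blocked_terms": ["ssn", "password"],
}

ok = evaluate_agent_action({"tool": "search", "input": "quarterly revenue"}, policy)
bad = evaluate_agent_action({"tool": "delete_db", "input": "password dump"}, policy)
print(ok["approved"], bad["approved"])  # True False
```

Running every action through a gate like this, and logging the findings, is one way continuous monitoring can build the trust in agent output that the article describes.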
Summarized by Navi