3 Sources
[1]
Amazon is building AI "factories" to run Trainium and GPU racks inside customer data centers
One of the biggest challenges in rolling out AI across enterprise environments is fully tapping into each organization's unique requirements, data sets, and existing infrastructure. If you're a startup with no legacy systems to plug into, every new AI product can look like a perfect fit. Established companies, however, don't have that luxury.

That's why I was intrigued by several of the announcements Amazon's AWS cloud division made at its annual re:Invent conference in Las Vegas. While other headlines may get louder buzz, the new AWS Factories initiative and the Nova Forge AI model customization platform feel far more consequential for the many organizations still wrestling with how to implement AI in a clearly beneficial, measurable way. Toss in new control and evaluation tools for creating and deploying agents - something most companies need in order to trust agent output before deploying agents internally - and AWS is offering a trio of features squarely aimed at meeting the customization needs of existing enterprises.

The most compelling news may be AWS Factories. What stood out is that AWS almost downplayed it, framing it as an expansion of a service that had previously been limited to a small set of customers. AWS Factories allows companies to stand up AWS-powered AI infrastructure within their own on-prem environments. Organizations can install AI racks built with either AWS Trainium accelerators or the latest Nvidia GPUs, alongside the full AWS AI software stack.

Practically speaking, this gives highly regulated industries a new and meaningful degree of flexibility. Philosophically, it's a bigger shift. Just last year, AWS signaled no interest in extending its custom AI stack into customer data centers. Market reality intervened.
On-prem AI workloads are common - as my recent study, "The Future of AI is Hybrid," confirmed - and interest goes well beyond regulated sectors. Meanwhile, major cloud and model providers have already introduced on-prem options, making AWS look slightly late. In truth, though, the movement toward hybrid AI that spans local data centers and public cloud resources is barely underway. AWS is still early enough to capture a very large opportunity. And by letting enterprises run workloads on AWS custom silicon in their own facilities, the company has leapfrogged even Google, which only recently announced plans to sell its TPU accelerators to third parties.

What makes AWS Factories even more interesting is that Amazon also unveiled plans to work first with Google Cloud (and next year with Microsoft Azure) to ease multi-cloud adoption. For decades, "multi-cloud" was practically a taboo term inside AWS, and the company was slow to embrace on-prem hybrid services when that trend began to accelerate. Seeing AWS actively smoothing the path to hybrid, multi-cloud, and hybrid-AI environments is remarkable.

The company's new Nova Forge offering is intriguing on many levels as well. After some early missteps with its own models, Amazon has continued to build out its Nova foundation model lineup (several new versions were also introduced at the show), underscoring its commitment. More importantly, Nova Forge offers a new pathway for enterprises to use their own data to produce highly customized AI models trained for their specific needs. Instead of simply fine-tuning an existing model, Nova Forge provides a mechanism to fully train a custom frontier model without the astronomical cost and complexity of starting from scratch. It lets organizations insert their own data into multiple early training stages through pre-written "recipes," adjusting open weights as the process runs.
The result goes far beyond typical RAG-style fine-tuning and unlocks advanced capabilities, including reinforcement learning as the model continues to evolve in real use.

Of course, we're not just in an AI era but an agentic AI era, so AWS also announced a slate of agent-focused tools. One of the most important updates extends the existing Bedrock AgentCore framework. The new AgentCore Evaluations continuously monitor agent behavior to ensure agents do what they're intended to do - and avoid what they shouldn't. Paired with a new security-centric AgentCore extension, these features should give enterprises more confidence that the billions of agents Amazon (and many others) expect to see in corporate environments can be trusted. Continuing the customization theme, organizations can tailor agent evaluation parameters so outputs and actions meet their own expectations and compliance standards.

Alongside those customization efforts, AWS rolled out its usual flood of news. On the silicon front, it formally launched the Trainium 3 accelerator; introduced a new rack design showcasing AWS's custom chip-to-chip, rack-to-rack, and datacenter-to-datacenter networking; and even teased Trainium 4. The company also showed off fully autonomous "frontier agents" meant to turbocharge software developer productivity, and more.

All told, it was yet another firehose of new announcements at re:Invent and yet another example of how quickly developments in AI and agents continue to occur. But at the broader level, Amazon signaled a subtle shift in posture. Several announcements made it clear that AWS recognizes its role within a larger technology ecosystem and is focused on making it easier for companies that rely on multiple vendors (as nearly every company does!) to integrate AWS solutions alongside them. And that is an important step forward.
Bob O'Donnell is the founder and chief analyst of TECHnalysis Research, LLC, a technology consulting firm that provides strategic consulting and market research services to the technology industry and professional financial community. You can follow him on Twitter @bobodtech.
[2]
Amazon is testing out private on-premises 'AI Factories'
They're designed to respond to strict data sovereignty and privacy requirements.

Amazon Web Services has revealed more information about its AI Factories - full-stack AI infrastructure that sits inside a customer's own data center. Customers provide the facility and the power, while Amazon's cloud division provides and manages the AI systems, making an AI Factory akin to a private AWS region. Besides giving organizations more control to meet data sovereignty, security, or regulatory requirements, it also ensures they have access to hardware options such as Nvidia's Blackwell GPUs or Amazon's Trainium3 accelerators.

Why would a customer want to take on the added responsibility of providing location and power? It's simple - certain enterprises and governments want access to advanced AI, but they're limited in terms of the data they can send off-prem. Building independent AI infrastructure is slow and costly, but AWS says it can deploy these systems in months, helping customers avoid large capex burdens. With AWS managing the entire AI environment exclusively for the one customer, data stays local and hardware will not be shared with others.

The shift to on-prem infrastructure is an interesting reversal of the cloud push we've seen in recent years, driven largely by enterprise worries about sensitive data, AI training, and national security.

"By combining NVIDIA's latest Grace Blackwell and Vera Rubin architectures with AWS's secure, high-performance infrastructure and AI software stack, AWS AI Factories allow organizations to stand up powerful AI capabilities in a fraction of the time and focus entirely on innovation instead of integration," Hyperscale and HPC VP and GM for Nvidia, Ian Buck, commented.

But Amazon isn't alone in pushing the concept of AI Factories. Microsoft has Azure Local to support sovereignty requirements, comprising Microsoft-managed hardware installed within a customer's facility.
[3]
AI infrastructure: AWS unveils AI Factories and Nova - SiliconANGLE
AWS showcases AI factories, models, chips and agents to drive its core infrastructure strategy.

Amazon Web Services Inc. is accelerating a new era of data-center modernization as enterprises reshape how they deploy artificial intelligence and AI infrastructure at scale. With its new AI Factories model, AWS is bringing full-stack systems directly into existing customer data centers, narrowing the divide between cloud innovation and on-prem control. The shift signals a market hungry for speed, sovereignty and hands-on oversight of high-value AI workloads - and a cloud provider intent on meeting that demand head-on.

"AI Factories is a big conversation," said Julia White (pictured), vice president and chief marketing officer at AWS. "What we're bringing forward is a very opinionated AWS AI factory that gives customers that ability to build out the AI infrastructure they need, particularly for highly regulated sovereign needs on a very large scale. We've had almost 20 years of know-how of how to do this at scale better than anybody. Fusing that with this kind of unique customer optimization and how we do it is the genesis of that idea."

White spoke with John Furrier at AWS re:Invent, during an exclusive broadcast on theCUBE, SiliconANGLE Media's livestreaming studio. They discussed how the company's various AI announcements this week fit with its infrastructure strategy for the enterprise.

Along with the news surrounding AI Factories, AWS also unveiled a major initiative to expand its Nova foundation model platform with the launch of Nova Forge, a "first-of-its-kind" service to train and build custom frontier AI models. The move was a key step into the world of frontier model reasoning - advanced capabilities that move AI from basic information retrieval to problem-solving and logical deduction. The focus was on helping customers fine-tune models for desired results, White explained.
"There's limitations to how much you can do with fine-tuning," she said. "How could we fundamentally change that? That was the invention, the invention of Nova Forge, which is the first-ever ability for customers to take foundation models, Nova, and bring their own data and mix it with Amazon-provided data and actually start training the model." Model training still requires strong compute. This week, AWS announced general availability for Amazon EC2 Trn3 UltraServers, powered by the new three-nanometer Trainium3 AI chip. AWS also previewed Trainium4, which was expected to deliver major gains in FP4 and FP8 performance and memory bandwidth. "We're seeing our third generation show up and...just absolutely crush that price/performance promise for our customers," White said. "Each one of these generations is big, it's not incremental, big step function changes. Obviously, that's a huge driver of what people can do. Because these are our chips, we can just absolutely optimize every aspect of what we do across the infrastructure from top to bottom." From AI factories to frontier models and chips, the central message from AWS is that it is building AI infrastructure to support any enterprise need. That includes the proliferating field of agents, which businesses are beginning to build and implement for key tasks throughout organizations, White noted. "Go back to the very beginning of AWS and cloud," she said. "When AWS was invented, a small business could have the same technology capabilities as a big enterprise. The same point of this agentic approach that we have is we're showing people what amazing looks like from an agent outcome. We're giving every single person, big, small, otherwise, the tools to build whatever they might need." Here's the complete video interview, part of SiliconANGLE's and theCUBE's coverage of AWS re:Invent:
AWS introduced AI Factories at its re:Invent conference, bringing complete AI infrastructure directly into customer facilities. The service deploys AWS-managed racks with Trainium3 accelerators or Nvidia Blackwell GPUs alongside Nova Forge for custom model training. This shift addresses data sovereignty and regulatory requirements while AWS also announced multi-cloud partnerships with Google Cloud and Microsoft Azure.
AWS launched AI Factories at its annual re:Invent conference in Las Vegas, marking a strategic shift toward deploying complete AI infrastructure inside customer data centers [1]. The service allows organizations to install AWS-powered AI racks built with either Trainium3 accelerators or the latest Nvidia GPUs, alongside the full AWS AI software stack, within their own on-premises environments [1]. This approach gives highly regulated industries meaningful flexibility while maintaining control over sensitive data that cannot leave their facilities [2].
The AI Factories model represents what Julia White, vice president and chief marketing officer at AWS, described as "a very opinionated AWS AI factory" that leverages nearly 20 years of know-how in building infrastructure at scale [3]. Customers provide the facility and power, while Amazon's cloud division provides and manages the AI systems exclusively for that one customer, ensuring data stays local and hardware is not shared with others [2].
The push for on-premises AI stems from enterprises and governments wanting access to advanced AI capabilities while facing strict data sovereignty, security, and regulatory requirements [2]. Building independent AI infrastructure is slow and costly, but AWS claims it can deploy these systems in months, helping customers avoid large capital expenditure burdens [2]. Ian Buck, Hyperscale and HPC VP and GM for Nvidia, noted that combining Nvidia's Grace Blackwell and Vera Rubin architectures with AWS's secure, high-performance infrastructure allows organizations to stand up powerful AI capabilities in a fraction of the time [2].

This shift marks a reversal from the cloud-first push seen in recent years, with enterprises increasingly worried about sensitive data, AI training, and national security [2]. The movement toward hybrid cloud AI that spans local data centers and public cloud resources is barely underway, positioning AWS early enough to capture a significant opportunity [1].
Alongside AI Factories, AWS unveiled Nova Forge, a service for training and building custom frontier AI models tailored to specific enterprise needs [3]. Instead of simply fine-tuning existing foundation models, Nova Forge provides a mechanism to fully train a custom model without the astronomical cost and complexity of starting from scratch [1]. The service lets organizations insert their own data into multiple early training stages through pre-written "recipes," adjusting open weights as model training progresses [1].

White explained that Nova Forge addresses fundamental limitations in fine-tuning: "There's limitations to how much you can do with fine-tuning. How could we fundamentally change that? That was the invention, the invention of Nova Forge, which is the first-ever ability for customers to take foundation models, Nova, and bring their own data and mix it with Amazon-provided data and actually start training the model" [3]. The result goes far beyond typical RAG-style fine-tuning and unlocks advanced capabilities, including reinforcement learning as the model continues to evolve in real use [1].
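AWS has not published the mechanics of its training recipes, but the core idea described here - blending customer data with provider-supplied data at controlled ratios during training - can be sketched generically. The following is a minimal, hypothetical illustration of that data-mixing step; the function names and ratio are invented for this example and are not the Nova Forge API.

```python
import random

def mix_corpora(base_docs, custom_docs, custom_fraction=0.3, n_samples=1000, seed=0):
    """Sample a training stream that interleaves provider base data with
    customer data at a fixed ratio.

    Hypothetical sketch of the data-mixing idea behind continued
    pre-training; NOT the Nova Forge API.
    """
    rng = random.Random(seed)  # seeded for reproducible mixing
    mixed = []
    for _ in range(n_samples):
        # With probability custom_fraction, draw from the customer corpus;
        # otherwise draw from the provider's base corpus.
        pool = custom_docs if rng.random() < custom_fraction else base_docs
        mixed.append(rng.choice(pool))
    return mixed

base = [f"base_doc_{i}" for i in range(100)]      # stand-in for provider data
custom = [f"custom_doc_{i}" for i in range(20)]   # stand-in for customer data
batch = mix_corpora(base, custom, custom_fraction=0.3)
share = sum(d.startswith("custom") for d in batch) / len(batch)
print(f"custom share: {share:.2f}")  # hovers near the requested 0.3
```

In a real pipeline this sampled stream would feed a training loop that updates the open model weights stage by stage, which is where a recipe's per-stage data choices would matter.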
AWS announced plans to work first with Google Cloud, and next year with Microsoft Azure, to ease multi-cloud adoption [1]. For decades, "multi-cloud" was practically taboo inside AWS, making this active smoothing of the path to hybrid and multi-cloud environments remarkable [1]. AWS is not alone in this space: Microsoft has Azure Local to support sovereignty requirements, comprising Microsoft-managed hardware installed within customer facilities [2].
On the hardware front, AWS formally launched the Trainium3 accelerator with general availability for Amazon EC2 Trn3 UltraServers, powered by the new three-nanometer chip [3]. White emphasized the price-performance advantage: "We're seeing our third generation show up and just absolutely crush that price/performance promise for our customers. Each one of these generations is big, it's not incremental, big step function changes" [3]. AWS also previewed Trainium4, expected to deliver major gains in FP4 and FP8 performance and memory bandwidth [3]. By letting enterprises run workloads on AWS custom silicon in their own facilities, the company has leapfrogged even Google, which only recently announced plans to sell its TPU AI accelerators to third parties [1].
AWS also announced a slate of agent-focused tools, extending the existing Bedrock AgentCore framework [1]. The new AgentCore Evaluations continuously monitor agent behavior to ensure agents perform intended tasks while avoiding unintended actions [1]. Paired with a new security-centric AgentCore extension, these features give enterprises more confidence that the billions of agents Amazon and others expect to see in corporate environments can be trusted [1]. Continuing the customization theme, organizations can tailor agent evaluation parameters so outputs and actions meet their own expectations and compliance standards [1].
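The general pattern behind this kind of agent evaluation - checking each proposed action against customizable policy rules before it runs - can be illustrated with a small sketch. This is a hypothetical example of the technique, not the AgentCore Evaluations API; the rule format and function names are invented.

```python
def evaluate_agent_action(action, policy):
    """Check a proposed agent action against simple policy rules.

    Hypothetical sketch of pre-execution agent evaluation;
    NOT the Bedrock AgentCore Evaluations API.
    """
    findings = []
    # Rule 1: the agent may only invoke tools on the allow-list.
    if action["tool"] not in policy["allowed_tools"]:
        findings.append(f"tool '{action['tool']}' not in allow-list")
    # Rule 2: the tool input must not contain blocked terms.
    for term in policy["blocked_terms"]:
        if term in action.get("input", ""):
            findings.append(f"blocked term '{term}' in input")
    return {"approved": not findings, "findings": findings}

# An organization-specific policy, standing in for tailored evaluation parameters.
policy = {
    "allowed_tools": {"search", "summarize"},
    "blocked_terms": ["ssn", "password"],
}

ok = evaluate_agent_action({"tool": "search", "input": "quarterly revenue"}, policy)
bad = evaluate_agent_action({"tool": "delete_db", "input": "password dump"}, policy)
print(ok["approved"], bad["approved"])  # True False
```

Running every action through a gate like this, and logging the findings, is one way continuous monitoring can build the trust in agent output that the article describes.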
Summarized by Navi