Microsoft unveils Maia 200 AI chip, claiming 3x the FP4 performance of Amazon Trainium and an edge over Google's TPU

Reviewed by Nidhi Govil


Microsoft has launched its second-generation custom AI chip, the Maia 200, claiming superior performance over competitors Amazon Trainium and Google's TPU. Built to optimize AI inference workloads, the chip packs over 100 billion transistors and delivers more than 10 petaflops of FP4 (4-bit) performance. Already deployed in Azure data centers, the Maia 200 powers OpenAI's GPT-5.2 models and Microsoft 365 Copilot while offering 30% better performance per dollar than the latest hardware in Microsoft's existing fleet.

Microsoft Launches Second-Generation Custom AI Chip for Inference Workloads

Microsoft has announced the Maia 200, its second-generation custom AI chip, designed specifically to optimize AI inference operations. The chip follows the Maia 100, released in 2023, and represents a substantial leap in performance [1]. Built on TSMC's advanced 3-nanometer process, the Maia 200 packs over 100 billion transistors and delivers more than 10 petaflops of FP4 (4-bit) performance and approximately 5 petaflops at FP8 [2][3]. Scott Guthrie, executive vice president of Microsoft's Cloud and AI division, emphasized that "one Maia 200 node can effortlessly run today's largest models, with plenty of headroom for even bigger models in the future" [1].

Source: TechCrunch
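
To put the precision figures in context: FP4 stores each value in four bits, so relative to FP8 it halves memory traffic and roughly doubles arithmetic throughput, which is the relationship behind the chip's roughly 10 petaflops at FP4 versus 5 petaflops at FP8. The Python sketch below is a minimal illustration of block-wise FP4 (E2M1-style) weight quantization of the general kind used for low-precision inference; the 32-element block size and the value grid are illustrative assumptions, not published Maia 200 details.

```python
import numpy as np

# E2M1 "FP4": 1 sign bit, 2 exponent bits, 1 mantissa bit. These are the
# eight non-negative magnitudes the format can represent.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4(weights: np.ndarray, block_size: int = 32):
    """Quantize a flat float array (length divisible by block_size) to
    FP4 grid indices, signs, and one float scale per block."""
    w = weights.reshape(-1, block_size)
    # Scale each block so its largest magnitude lands on the grid max (6.0).
    scales = np.abs(w).max(axis=1, keepdims=True) / FP4_GRID[-1]
    scales[scales == 0] = 1.0  # guard against all-zero blocks
    # Snap each scaled magnitude to the nearest representable FP4 value.
    scaled = np.abs(w) / scales
    idx = np.abs(scaled[..., None] - FP4_GRID).argmin(axis=-1)
    return idx.astype(np.uint8), np.sign(w), scales

def dequantize_fp4(idx, signs, scales):
    """Reconstruct approximate weights from the quantized form."""
    return (FP4_GRID[idx] * signs * scales).reshape(-1)

# Each weight now needs 4 bits instead of 32, plus a small per-block
# scale overhead -- roughly an 8x reduction in weight storage.
w = np.random.randn(1024).astype(np.float32)
w_hat = dequantize_fp4(*quantize_fp4(w))
print("mean abs quantization error:", np.abs(w - w_hat).mean())
```

The accuracy cost of rounding to such a coarse grid is what the per-block scaling mitigates, and it is one reason FP4 is typically reserved for inference rather than training.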

Performance Benchmarks Position Maia 200 Ahead of Amazon Trainium and Google's TPU

Microsoft is positioning the Maia 200 as a direct alternative to NVIDIA GPUs and to competing cloud providers' custom silicon. The company claims the chip delivers three times the FP4 performance of third-generation Amazon Trainium chips and FP8 performance above Google's seventh-generation TPU [2][5]. This marks a strategic shift from 2023, when Microsoft launched the Maia 100 and avoided direct comparisons with Amazon's and Google's AI cloud capabilities [2]. Each Maia 200 chip also packs more high-bandwidth memory than competing hardware from Amazon Web Services or Google [4]. For networking, the chip's architecture relies on standard Ethernet rather than the InfiniBand interconnect that NVIDIA has sold since its 2020 acquisition of Mellanox [4].

Reducing AI Operational Costs Through Efficient Inference Systems

As AI companies mature, inference costs have become an increasingly critical component of overall operating expenses, driving renewed focus on optimization strategies [1]. The Maia 200 addresses this challenge by offering 30% better performance per dollar than the latest-generation hardware in Microsoft's current fleet [2][5]. Microsoft can achieve high performance by wiring up to 6,144 Maia 200 chips together, a scale the company says reduces energy usage and total cost of ownership [4]. The chip also includes a memory system that helps keep an AI model's weights and data local, meaning less hardware is needed to run a model [3].
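
The claim that keeping weights local means less hardware reduces to capacity arithmetic: at b bits per parameter, a model needs params × b/8 bytes of weight storage, and dividing by per-chip HBM capacity gives a lower bound on the chips required just to hold the weights. The sketch below works that arithmetic through; the 216 GB-per-chip figure and the model sizes are placeholder assumptions, since the article does not state Maia 200's HBM capacity.

```python
import math

def min_chips_for_weights(params_billion: float,
                          bits_per_weight: int,
                          hbm_gb_per_chip: float) -> int:
    """Lower bound on chips whose combined HBM can hold a model's weights.

    Ignores activations, KV cache, and runtime buffers, so real
    deployments need headroom beyond this bound.
    """
    # 1e9 params * (bits / 8) bytes = params_billion * bits / 8 gigabytes.
    weight_gb = params_billion * bits_per_weight / 8
    return max(1, math.ceil(weight_gb / hbm_gb_per_chip))

# Placeholder assumption: 216 GB of HBM per chip. This is NOT a published
# Maia 200 spec; it is chosen only to make the arithmetic concrete.
HBM_GB = 216.0

for params in (70, 400, 1800):  # hypothetical model sizes, in billions
    fp8 = min_chips_for_weights(params, 8, HBM_GB)
    fp4 = min_chips_for_weights(params, 4, HBM_GB)
    print(f"{params:>5}B params: {fp8} chip(s) at FP8 -> {fp4} chip(s) at FP4")
```

By the same back-of-the-envelope math, the stated ceiling of 6,144 chips at more than 10 petaflops each works out to roughly 61 FP4 exaflops per cluster.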

Deployment in Azure Data Centers to Power Large AI Models

Microsoft is already deploying Maia 200 chips in its Azure US Central data center region near Des Moines, Iowa, with the US West 3 region near Phoenix planned next and additional locations to follow [2][4][5]. The chip is already powering OpenAI's GPT-5.2 models, Microsoft 365 Copilot, and Microsoft Foundry [2][5]. Microsoft's Superintelligence team, led by Mustafa Suleyman, will be the first to use the new chip for generating synthetic data and for reinforcement training to improve next-generation large language models (LLMs) [2][3].

Opening Access Through Software Development Kit for Developers

Microsoft is inviting developers, academics, AI labs, and contributors to open-source model projects to an early preview of the Maia 200 software development kit [2][4]. This represents a strategic opening for outside developers to optimize their models for Maia 200, with Guthrie noting there would be "wider customer availability in the future" [3][4]. While the Maia 100 was never made available for cloud clients to rent, Microsoft appears to be positioning the Maia 200 for broader adoption [4].

Cloud Providers Race to Build Custom Silicon and Reduce NVIDIA Dependence

The Maia 200 launch reflects a growing trend among tech giants of developing custom silicon to lessen dependence on NVIDIA, whose cutting-edge GPUs have become increasingly pivotal to AI companies' success [1]. Google has refined its tensor processing units for nearly a decade, while Amazon's Trainium line is now in its third generation, with Trainium4 already announced and set to integrate with NVIDIA's NVLink 6 and MGX rack architecture [2][5]. Cloud providers face surging demand from generative AI model developers such as Anthropic and OpenAI, as well as from companies building AI agents and other products on top of popular models [4]. Data center operators are trying to increase computing capacity while keeping power consumption in check, and the recurring cost of running AI models is becoming a bigger concern than the one-time expense of training them [4][5]. Microsoft makes the case that tight integration between its chips, AI models, and applications such as Copilot and GitHub Copilot gives it a competitive edge despite entering the custom silicon race later than its rivals [5].
