7 Sources
[1]
Microsoft announces powerful new chip for AI inference
Microsoft has announced the launch of its latest chip, the Maia 200, which the company describes as a silicon workhorse designed for scaling AI inference. The Maia 200, which follows the company's Maia 100 released in 2023, is built to run powerful AI models faster and more efficiently, the company said. It comes equipped with over 100 billion transistors, delivering over 10 petaflops in 4-bit precision and approximately 5 petaflops of 8-bit performance -- a substantial increase over its predecessor. Inference refers to the computing process of running a model, in contrast with the compute required to train it. As AI companies mature, inference costs have become an increasingly important part of their overall operating costs, leading to renewed interest in ways to optimize the process. Microsoft is hoping that the Maia 200 can be part of that optimization, making AI businesses run with less disruption and lower power use. "In practical terms, one Maia 200 node can effortlessly run today's largest models, with plenty of headroom for even bigger models in the future," the company said. Microsoft's new chip is also part of a growing trend of tech giants turning to self-designed chips as a way to lessen their dependence on NVIDIA, whose cutting-edge GPUs have become increasingly pivotal to AI companies' success. Google, for instance, has its TPUs, or tensor processing units, which aren't sold as chips but as compute power made accessible through its cloud. Then there's Amazon Trainium, the e-commerce giant's own AI accelerator chip, whose latest version, the Trainium3, launched in December. In each case, these custom chips can be used to offload some of the compute that would otherwise be assigned to NVIDIA GPUs, lessening the overall hardware cost. With Maia, Microsoft is positioning itself to compete with those alternatives. In its press release Monday, the company noted that the Maia 200 delivers 3x the FP4 performance of third-generation Amazon Trainium chips, and FP8 performance above Google's seventh-generation TPU. Microsoft says the Maia 200 is already hard at work fueling the company's AI models from its Superintelligence team. It has also been supporting the operations of Copilot, its chatbot. As of Monday, the company said it has invited a variety of parties -- including developers, academics, and frontier AI labs -- to use its Maia 200 software development kit in their workloads.
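The training-versus-inference distinction described above maps onto a very small amount of code. Below is a minimal PyTorch sketch (purely illustrative, not Microsoft code): training runs a forward pass, a backward pass and an optimizer update, while inference is the forward pass alone, which is the workload an accelerator like Maia 200 is built for.

import torch
import torch.nn as nn

model = nn.Linear(1024, 1024)          # stand-in for a real AI model
x = torch.randn(8, 1024)
target = torch.randn(8, 1024)

# Training: forward AND backward passes, plus optimizer state held in memory.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss = nn.functional.mse_loss(model(x), target)
loss.backward()                         # extra compute and memory for gradients
optimizer.step()

# Inference: forward pass only, no gradients tracked.
model.eval()
with torch.no_grad():
    prediction = model(x)               # the per-request work that inference chips accelerate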
[2]
Microsoft's latest AI chip goes head-to-head with Amazon and Google
Microsoft is announcing a successor to its first in-house AI chip today, the Maia 200. Built on TSMC's 3nm process, Microsoft says its Maia 200 AI accelerator "delivers 3 times the FP4 performance of the third generation Amazon Trainium, and FP8 performance above Google's seventh generation TPU." Each Maia 200 chip has more than 100 billion transistors and is designed to handle large-scale AI workloads. "Maia 200 can effortlessly run today's largest models, with plenty of headroom for even bigger models in the future," says Scott Guthrie, executive vice president of Microsoft's Cloud and AI division. Microsoft will use Maia 200 to host OpenAI's GPT-5.2 model and others for Microsoft Foundry and Microsoft 365 Copilot. "Maia 200 is also the most efficient inference system Microsoft has ever deployed, with 30 percent better performance per dollar than the latest generation hardware in our fleet today," says Guthrie. Microsoft's performance flex over its close Big Tech competitors is a departure from when it first launched the Maia 100 in 2023 and didn't want to be drawn into direct comparisons with Amazon's and Google's AI cloud capabilities. Both Google and Amazon are working on next-generation AI chips, though. Amazon is even working with Nvidia to integrate its upcoming Trainium4 chip with NVLink 6 and Nvidia's MGX rack architecture. Microsoft's Superintelligence team will be the first to use its Maia 200 chips, and the company is also inviting academics, developers, AI labs, and open-source model project contributors to an early preview of the Maia 200 software development kit. Microsoft is starting to deploy these new chips today in its Azure US Central data center region, with additional regions to follow.
[3]
Microsoft says its newest AI chip Maia 200 is 3 times more powerful than Google's TPU and Amazon's Trainium processor
Microsoft's Maia 200 chip is being integrated into its Azure cloud infrastructure (Image credit: Microsoft)
Microsoft has revealed its new Maia 200 accelerator chip for artificial intelligence (AI), which company representatives say is three times more powerful than hardware from rivals like Google and Amazon. This newest chip will be used in AI inference rather than training, powering systems and agents used to make predictions, provide answers to queries and generate outputs based on new data that's fed to them. Maia 200 is already being deployed in Microsoft's U.S. Central data center region, with the company set to use the chips to generate synthetic data and in reinforcement training to improve next-generation large language models (LLMs). The AI accelerator will also be used to power Microsoft Foundry and 365 Copilot AI, and be part of the infrastructure that the company can provide through its Azure cloud platform. The new chip delivers performance of more than 10 petaflops (a petaflop is one quadrillion floating-point operations per second), Scott Guthrie, cloud and AI executive vice president at Microsoft, said in a blog post. This is a measure of performance in supercomputing, where the most powerful supercomputers in the world can reach more than 1,000 petaflops. The new chip achieved this performance level in a data representation category known as "4-bit precision (FP4)" -- a highly compressed number format designed to accelerate AI performance. Maia 200 also delivers 5 PFLOPS of performance in 8-bit precision (FP8). The difference between the two is that FP4 is far more energy efficient but less accurate. "In practical terms, one Maia 200 node can effortlessly run today's largest models, with plenty of headroom for even bigger models in the future," Guthrie said in the blog post. "This means Maia 200 delivers 3 times the FP4 performance of the third generation Amazon Trainium, and FP8 performance above Google's seventh generation TPU."
Chips ahoy
Maia 200 could potentially be used for specialist AI workloads, such as running larger LLMs in the future. So far, Microsoft's Maia chips have only been used in the Azure cloud infrastructure to run large-scale workloads for Microsoft's own AI services, notably Copilot. However, Guthrie noted there would be "wider customer availability in the future," signaling other organizations could tap into Maia 200 via the Azure cloud, or the chips could potentially one day be deployed in standalone data centers or server stacks. Guthrie said that Microsoft boasts 30% better performance per dollar over existing systems thanks to the use of the 3-nanometer process made by the Taiwan Semiconductor Manufacturing Company (TSMC), the most important fabricator in the world, allowing for 100 billion transistors per chip. This essentially means that Maia 200 could be more cost-effective and efficient for the most demanding AI workloads than existing chips. Maia 200 has a few other features alongside better performance and efficiency. It includes a memory system, for instance, which can help keep an AI model's weights and data local, meaning less hardware is needed to run a model. It's also designed to be quickly integrated into existing data centers. Maia 200 should enable AI models to run faster and more efficiently. This means Azure OpenAI users, such as scientists, developers and corporations, could see better throughput and speeds when developing AI applications and using the likes of GPT-4 in their operations.
This next-generation AI hardware is unlikely to disrupt everyday AI and chatbot use for most people in the short term, as Maia 200 is designed for data centers rather than consumer-grade hardware. However, end users could see the impact of Maia 200 in the form of faster response times and potentially more advanced features from Copilot and other AI tools built into Windows and Microsoft products. Maia 200 could also provide a performance boost to developers and scientists who use AI inference via Microsoft's platforms. This, in turn, could lead to improvements in AI deployment on large-scale research projects and elements like advanced weather modeling, biological or chemical systems and compositions.
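To make the FP4-versus-FP8 trade-off described in the article concrete, here is a rough Python sketch. It uses simple uniform quantization as a stand-in for the real FP4/FP8 floating-point formats, so the error numbers are only illustrative; the point is that halving the bit width halves the memory footprint while increasing the approximation error.

import numpy as np

def quantize(weights, bits):
    # Uniformly quantize to 2**bits levels; a crude stand-in for FP4/FP8 formats.
    levels = 2 ** bits
    w_min, w_max = weights.min(), weights.max()
    scale = (w_max - w_min) / (levels - 1)
    return np.round((weights - w_min) / scale) * scale + w_min

rng = np.random.default_rng(0)
w = rng.standard_normal(1_000_000).astype(np.float32)   # one million fake model weights

for bits in (8, 4):
    err = np.abs(w - quantize(w, bits)).mean()
    size_mb = w.size * bits / 8 / 1e6
    print(f"{bits}-bit: ~{size_mb:.1f} MB of weights, mean absolute error {err:.4f}")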
[4]
Microsoft reveals second generation of its AI chip in effort to bolster cloud business
Microsoft announced the next generation of its artificial intelligence chip, a potential alternative to leading processors from Nvidia and to offerings from cloud rivals Amazon and Google. The Maia 200 comes two years after Microsoft said it had developed its first AI chip, the Maia 100, which was never made available for cloud clients to rent. Scott Guthrie, Microsoft's executive vice president for cloud and AI, said in a blog post Monday that, for the new chip, there will be "wider customer availability in the future." Guthrie called the Maia 200 "the most efficient inference system Microsoft has ever deployed." Developers, academics, AI labs and people contributing to open-source AI models can apply for a preview of a software development kit. Microsoft said its superintelligence team, led by Mustafa Suleyman, will use the new chip. The Microsoft 365 Copilot add-on for commercial productivity software bundles and the Microsoft Foundry service, for building on top of AI models, will use it as well. Cloud providers face surging demand from generative AI model developers such as Anthropic and OpenAI and from companies building AI agents and other products on top of the popular models. Data center operators and infrastructure providers are trying to increase their computing prowess while keeping power consumption in check. Microsoft is outfitting its U.S. Central region of data centers with Maia 200 chips, and they'll arrive at the U.S. West 3 region after that, with additional locations to follow. The chips use Taiwan Semiconductor Manufacturing Co.'s 3 nanometer process. Four are connected together inside each server. They rely on Ethernet cables, rather than the InfiniBand standard. Nvidia sells InfiniBand switches following its 2020 Mellanox acquisition. The chip offers 30% higher performance than alternatives for the same price, Guthrie wrote. Microsoft said each Maia 200 packs more high-bandwidth memory than Amazon Web Services' third-generation Trainium chip or Google's seventh-generation tensor processing unit. Microsoft can achieve high performance by wiring up to 6,144 of the Maia 200 chips together, reducing energy usage and total cost of ownership, Guthrie wrote. In 2023, Microsoft demonstrated that its GitHub Copilot coding assistant could run on Maia 100 processors.
[5]
Microsoft unveils Maia 200 AI chip, claiming performance edge over Amazon and Google
Microsoft on Monday announced Maia 200, the second generation of its custom AI chip, claiming it's the most powerful first-party silicon from any major cloud provider. The company says Maia 200 delivers three times the performance of Amazon's latest Trainium chip on certain benchmarks, and exceeds Google's most recent tensor processing unit (TPU) on others. The chip is already running workloads at Microsoft's data center near Des Moines, Iowa. Microsoft says Maia 200 is powering OpenAI's GPT-5.2 models, Microsoft 365 Copilot, and internal projects from its Superintelligence team. A second deployment at a data center near Phoenix is planned next. It's part of the larger trend among cloud giants to build their own custom silicon for AI rather than rely solely on Nvidia. Google has been refining its TPUs for nearly a decade, and Amazon's Trainium line is now in its third generation, with a fourth already announced. Microsoft first revealed its custom chip ambitions in late 2023, when it unveiled Maia 100 at its Ignite conference. Despite entering the race late, Microsoft makes the case that its tight integration between chips, AI models, and applications like Copilot gives it an edge. The company says Maia 200 offers 30% better performance-per-dollar than its current hardware. Maia 200 also builds on the first-generation chip with a more specific focus on inference, the process of running AI models after they've been trained. The chip competition among the cloud giants has intensified as the cost of running AI models becomes a bigger concern. Training a model is a one-time expense, but serving it to millions of users is a big ongoing expense. All three companies are betting that custom chips tuned for their own workloads will be cheaper than buying solely from Nvidia. Microsoft is also opening the door to outside developers. The company announced a software development kit that will let AI startups and researchers optimize their models for Maia 200. Developers and academics can sign up for an early preview starting today.
[6]
Microsoft unveils Maia 200, its 'powerhouse' accelerator looking to unlock the power of large-scale AI
* Microsoft unveils Maia 200 AI hardware
* Maia 200 reportedly offers more performance and efficiency than AWS and GCP rivals
* Microsoft will use it to help improve Copilot in-house, but it will be available for customers too
Microsoft has revealed Maia 200, its "next major milestone" in supporting the next generation of AI and inference technology. The company's new hardware, the successor to the Maia 100, will "dramatically shift the economics of large-scale AI," offering a significant upgrade in terms of performance and efficiency as the company looks to stake a claim in the market. The launch will also look to position Microsoft Azure as a place to run AI models faster and more efficiently, as it takes on its great rivals Amazon Web Services and Google Cloud.
Microsoft Maia 200
Microsoft says Maia 200 contains over 100 billion transistors built on TSMC's 3nm process with native FP8/FP4 tensor cores, a redesigned memory system with 216GB of HBM3e at 7TB/s, and 272MB of on-chip SRAM. This all contributes to the ability to deliver over 10 PFLOPS in 4-bit precision (FP4) and around 5 PFLOPS of 8-bit (FP8) performance - easily enough to run even the largest AI models around today, and with space to grow as the technology evolves. Microsoft says Maia 200 delivers 3x the FP4 performance of the third generation Amazon Trainium hardware, and FP8 performance above Google's seventh generation TPU, making it the company's most efficient inference system yet. And thanks to its optimized design, which sees the memory subsystem centered on narrow-precision datatypes, a specialized DMA engine, on-die SRAM, and a specialized NoC fabric for high-bandwidth data movement, Maia 200 is able to keep more of a model's weights and data local, meaning fewer devices are required to run a model. Microsoft is already using the new hardware to power its AI workloads in Microsoft Foundry and Microsoft 365 Copilot, with wider customer availability coming soon. It is also rolling out Maia 200 to its US Central datacenter region now, with further deployments coming to its US West 3 datacenter region near Phoenix, Arizona soon, and additional regions set to follow. For those looking to get an early look, Microsoft is inviting academics, developers, frontier AI labs and open source model project contributors to sign up for a preview of the new Maia 200 software development kit (SDK) now.
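The "fewer devices per model" claim above can be sanity-checked with back-of-the-envelope arithmetic using only the 216GB of HBM3e per chip quoted in the article. The model sizes below are illustrative assumptions rather than Microsoft figures, and the calculation counts weights only, ignoring KV cache and activation memory.

import math

HBM_PER_CHIP_GB = 216                     # per-chip HBM3e capacity quoted above

MODELS = [("70B-parameter model", 70e9), ("1T-parameter model (hypothetical)", 1e12)]
FORMATS = [("FP8", 8), ("FP4", 4)]

for name, params in MODELS:
    for fmt, bits in FORMATS:
        weight_gb = params * bits / 8 / 1e9                       # bits/8 = bytes per weight
        chips = max(1, math.ceil(weight_gb / HBM_PER_CHIP_GB))    # chips needed for weights alone
        print(f"{name} in {fmt}: ~{weight_gb:,.0f} GB of weights -> {chips} chip(s)")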
[7]
Microsoft unveils the Maia 200 chip and aims to take on Nvidia's software ecosystem
On Monday Microsoft introduced the Maia 200, the second generation of its in-house artificial intelligence chip, alongside a software suite designed to compete with Nvidia's CUDA environment. The chip, now up and running at a data center in Iowa, marks another step in Microsoft's bid to reduce its technological dependence on Nvidia while strengthening its control of the AI value chain. A second deployment site is planned in Arizona in the coming months. Built for compute-intensive cloud workloads, the Maia 200 is manufactured by TSMC using 3-nanometer technology and includes HBM memory, though in a less advanced version than what is expected in upcoming Nvidia chips. Microsoft is also betting on the integration of substantial SRAM, inspired by rivals such as Cerebras and Groq, to optimize performance for interactive AI uses, notably chatbots. Microsoft describes Maia 200 as the most powerful first-party silicon among all hyperscalers, with FP4 performance three times higher than Amazon Trainium's third generation and FP8 performance higher than Google's seventh-generation TPU. Alongside this, the company is rolling out new software tools, including Triton, an open-source project co-developed with OpenAI and presented as a direct alternative to CUDA, one of Nvidia's main competitive advantages. The announcement comes amid a rapidly reshaping AI semiconductor sector. While Cerebras has signed a $10bn agreement with OpenAI, Nvidia recently reached a partnership estimated at $20bn with the start-up Groq. Against this backdrop of strategic alliances, Microsoft is stepping up its vertical integration, aiming to position itself not only as a cloud infrastructure provider but also as a central player in chip design and software development in artificial intelligence.
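Because Triton is named above as the CUDA alternative in Microsoft's software stack, a minimal kernel gives a sense of the programming model. The vector-add example below follows the open-source Triton project's own tutorial style and runs on Nvidia GPUs today; whether identical code can target Maia 200 through Microsoft's SDK is not detailed in these reports, so treat it as an illustration of Triton itself rather than of the Maia toolchain.

import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                  # guard against the ragged last block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x, y):
    out = torch.empty_like(x)
    n = out.numel()
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

a = torch.randn(4096, device="cuda")
b = torch.randn(4096, device="cuda")
print(torch.allclose(add(a, b), a + b))          # expected: True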
Microsoft has launched its second-generation custom AI chip, the Maia 200, claiming superior performance over competitors Amazon Trainium and Google's TPU. Built to optimize AI inference workloads, the chip packs over 100 billion transistors and delivers 10 petaflops in 4-bit precision. Already deployed in Azure data centers, Maia 200 powers OpenAI's GPT-5.2 models and Microsoft 365 Copilot while offering 30% better performance per dollar.
Microsoft has announced the Maia 200, its second-generation custom AI chip designed specifically to optimize AI inference operations. The chip follows the Maia 100 released in 2023 and represents a substantial leap in performance capabilities [1]. Built on TSMC's advanced 3-nanometer process, the Maia 200 packs over 100 billion transistors and delivers more than 10 petaflops in FP4 precision and approximately 5 petaflops of 8-bit performance [2][3]. Scott Guthrie, executive vice president of Microsoft's Cloud and AI division, emphasized that "one Maia 200 node can effortlessly run today's largest models, with plenty of headroom for even bigger models in the future" [1].
Microsoft is positioning the Maia 200 as a direct alternative to NVIDIA and competing cloud providers' custom silicon. The company claims the chip delivers 3 times the FP4 performance of third-generation Amazon Trainium chips and FP8 performance above Google's seventh-generation TPU [2][5]. This marks a strategic shift from Microsoft's approach when it first launched the Maia 100 in 2023 and avoided direct comparisons with Amazon's and Google's AI cloud capabilities [2]. Each Maia 200 chip also packs more high-bandwidth memory than competing hardware from Amazon Web Services or Google [4]. The chip's architecture relies on Ethernet cables rather than the InfiniBand standard that NVIDIA sells following its 2020 Mellanox acquisition [4].
As AI companies mature, inference costs have become an increasingly critical component of overall operating expenses, driving renewed focus on optimization strategies [1]. The Maia 200 addresses this challenge by offering 30% better performance per dollar than the latest generation hardware in Microsoft's current fleet [2][5]. Microsoft can achieve high performance by wiring up to 6,144 Maia 200 chips together, which reduces energy usage and total cost of ownership [4]. The chip includes a memory system that helps keep an AI model's weights and data local, meaning less hardware is needed to run a model [3].
Microsoft is already deploying Maia 200 chips in its Azure US Central data center region near Des Moines, Iowa, with the US West 3 region near Phoenix planned next and additional locations to follow [2][4][5]. The chip is already powering OpenAI's GPT-5.2 models, Microsoft 365 Copilot, and Microsoft Foundry [2][5]. Microsoft's Superintelligence team, led by Mustafa Suleyman, will be the first to use the new chip for generating synthetic data and reinforcement training to improve next-generation large language models (LLMs) [2][3].
Microsoft is inviting developers, academics, AI labs, and open-source model project contributors to an early preview of the Maia 200 software development kit [2][4]. This represents a strategic opening for outside developers to optimize their models for Maia 200, with Guthrie noting there would be "wider customer availability in the future" [3][4]. While the Maia 100 was never made available for cloud clients to rent, Microsoft appears to be positioning the Maia 200 for broader adoption [4].
The Maia 200 launch reflects a growing trend of tech giants developing custom silicon to lessen dependence on NVIDIA, whose cutting-edge GPUs have become increasingly pivotal to AI companies' success [1]. Google has refined its tensor processing units for nearly a decade, while Amazon's Trainium line is now in its third generation, with Trainium4 already announced and set to integrate with NVIDIA's NVLink 6 and MGX rack architecture [2][5]. Cloud providers face surging demand from generative AI model developers such as Anthropic and OpenAI, as well as from companies building AI agents and other products on top of popular models [4]. Data center operators are trying to increase computing prowess while keeping power consumption in check, with the ongoing cost of serving AI models becoming a bigger concern than the one-time expense of training them [4][5]. Microsoft makes the case that its tight integration between chips, AI models, and applications like Copilot and GitHub Copilot gives it a competitive edge despite entering the custom silicon race later than rivals [5].
Summarized by Navi