7 Sources
[1]
Microsoft announces powerful new chip for AI inference
Microsoft has announced the launch of its latest chip, the Maia 200, which the company describes as a silicon workhorse designed for scaling AI inference. The Maia 200, which follows the company's Maia 100 released in 2023, is built to run powerful AI models faster and more efficiently, the company said. It comes equipped with over 100 billion transistors, delivering over 10 petaflops in 4-bit precision and approximately 5 petaflops of 8-bit performance -- a substantial increase over its predecessor. Inference refers to the computing process of running a model, in contrast with the compute required to train it. As AI companies mature, inference costs have become an increasingly important part of their overall operating costs, leading to renewed interest in ways to optimize the process. Microsoft is hoping that the Maia 200 can be part of that optimization, making AI businesses run with less disruption and lower power use. "In practical terms, one Maia 200 node can effortlessly run today's largest models, with plenty of headroom for even bigger models in the future," the company said. Microsoft's new chip is also part of a growing trend of tech giants turning to self-designed chips as a way to lessen their dependence on NVIDIA, whose cutting-edge GPUs have become increasingly pivotal to AI companies' success. Google, for instance, has its TPUs, or tensor processing units, which aren't sold as chips but as compute power made accessible through its cloud. Then there's Amazon Trainium, the e-commerce giant's own AI accelerator chip, whose latest version, the Trainium3, launched in December. In each case, these custom chips can be used to offload some of the compute that would otherwise be assigned to NVIDIA GPUs, lessening the overall hardware cost. With Maia, Microsoft is positioning itself to compete with those alternatives. In its press release Monday, the company noted that the Maia 200 delivers 3x the FP4 performance of third-generation Amazon Trainium chips, and FP8 performance above Google's seventh-generation TPU. Microsoft says the Maia 200 is already hard at work fueling the company's AI models from its Superintelligence team. It has also been supporting the operations of Copilot, its chatbot. As of Monday, the company said it has invited a variety of parties -- including developers, academics, and frontier AI labs -- to use its Maia 200 software development kit in their workloads.
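The training-versus-inference distinction described above maps onto a very small amount of code. Below is a minimal PyTorch sketch (purely illustrative, not Microsoft code): training runs a forward pass, a backward pass and an optimizer update, while inference is the forward pass alone, which is the workload an accelerator like Maia 200 is built for.

import torch
import torch.nn as nn

model = nn.Linear(1024, 1024)          # stand-in for a real AI model
x = torch.randn(8, 1024)
target = torch.randn(8, 1024)

# Training: forward AND backward passes, plus optimizer state held in memory.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss = nn.functional.mse_loss(model(x), target)
loss.backward()                         # extra compute and memory for gradients
optimizer.step()

# Inference: forward pass only, no gradients tracked.
model.eval()
with torch.no_grad():
    prediction = model(x)               # the per-request work that inference chips accelerate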
[2]
Microsoft's latest AI chip goes head-to-head with Amazon and Google
Microsoft is announcing a successor to its first in-house AI chip today, the Maia 200. Built on TSMC's 3nm process, Microsoft says its Maia 200 AI accelerator "delivers 3 times the FP4 performance of the third generation Amazon Trainium, and FP8 performance above Google's seventh generation TPU." Each Maia 200 chip has more than 100 billion transistors and is designed to handle large-scale AI workloads. "Maia 200 can effortlessly run today's largest models, with plenty of headroom for even bigger models in the future," says Scott Guthrie, executive vice president of Microsoft's Cloud and AI division. Microsoft will use Maia 200 to host OpenAI's GPT-5.2 model and others for Microsoft Foundry and Microsoft 365 Copilot. "Maia 200 is also the most efficient inference system Microsoft has ever deployed, with 30 percent better performance per dollar than the latest generation hardware in our fleet today," says Guthrie. Microsoft's performance flex over its close Big Tech competitors is a departure from when it first launched the Maia 100 in 2023 and didn't want to be drawn into direct comparisons with Amazon's and Google's AI cloud capabilities. Both Google and Amazon are working on next-generation AI chips, though. Amazon is even working with Nvidia to integrate its upcoming Trainium4 chip with NVLink 6 and Nvidia's MGX rack architecture. Microsoft's Superintelligence team will be the first to use its Maia 200 chips, and the company is also inviting academics, developers, AI labs, and open-source model project contributors to an early preview of the Maia 200 software development kit. Microsoft is starting to deploy these new chips today in its Azure US Central data center region, with additional regions to follow.
[3]
Microsoft says its newest AI chip Maia 200 is 3 times more powerful than Google's TPU and Amazon's Trainium processor
Microsoft's Maia 200 chip is being integrated into its Azure cloud infrastructure (Image credit: Microsoft)
Microsoft has revealed its new Maia 200 accelerator chip for artificial intelligence (AI), which company representatives say is three times more powerful than hardware from rivals like Google and Amazon. This newest chip will be used in AI inference rather than training, powering systems and agents used to make predictions, provide answers to queries and generate outputs based on new data that's fed to them. Maia 200 is already being deployed in Microsoft's U.S. Central data center region, with the company set to use the chips to generate synthetic data and in reinforcement training to improve next-generation large language models (LLMs). The AI accelerator will also be used to power Microsoft Foundry and 365 Copilot AI, and be part of the infrastructure that the company can provide through its Azure cloud platform. The new chip delivers performance of more than 10 petaflops (a petaflop is one quadrillion floating-point operations per second), Scott Guthrie, cloud and AI executive vice president at Microsoft, said in a blog post. This is a measure of performance in supercomputing, where the most powerful supercomputers in the world can reach more than 1,000 petaflops. The new chip achieved this performance level in a data representation category known as "4-bit precision (FP4)" -- a highly compressed number format designed to accelerate AI performance. Maia 200 also delivers 5 PFLOPS of performance in 8-bit precision (FP8). The difference between the two is that FP4 is far more energy efficient but less accurate. "In practical terms, one Maia 200 node can effortlessly run today's largest models, with plenty of headroom for even bigger models in the future," Guthrie said in the blog post. "This means Maia 200 delivers 3 times the FP4 performance of the third generation Amazon Trainium, and FP8 performance above Google's seventh generation TPU."
Chips ahoy
Maia 200 could potentially be used for specialist AI workloads, such as running larger LLMs in the future. So far, Microsoft's Maia chips have only been used in the Azure cloud infrastructure to run large-scale workloads for Microsoft's own AI services, notably Copilot. However, Guthrie noted there would be "wider customer availability in the future," signaling other organizations could tap into Maia 200 via the Azure cloud, or the chips could potentially one day be deployed in standalone data centers or server stacks. Guthrie said that Microsoft boasts 30% better performance per dollar over existing systems thanks to the use of the 3-nanometer process made by the Taiwan Semiconductor Manufacturing Company (TSMC), the most important fabricator in the world, allowing for 100 billion transistors per chip. This essentially means that Maia 200 could be more cost-effective and efficient for the most demanding AI workloads than existing chips. Maia 200 has a few other features alongside better performance and efficiency. It includes a memory system, for instance, which can help keep an AI model's weights and data local, meaning less hardware is needed to run a model. It's also designed to be quickly integrated into existing data centers. Maia 200 should enable AI models to run faster and more efficiently. This means Azure OpenAI users, such as scientists, developers and corporations, could see better throughput and speeds when developing AI applications and using the likes of GPT-4 in their operations.
This next-generation AI hardware is unlikely to disrupt everyday AI and chatbot use for most people in the short term, as Maia 200 is designed for data centers rather than consumer-grade hardware. However, end users could see the impact of Maia 200 in the form of faster response times and potentially more advanced features from Copilot and other AI tools built into Windows and Microsoft products. Maia 200 could also provide a performance boost to developers and scientists who use AI inference via Microsoft's platforms. This, in turn, could lead to improvements in AI deployment on large-scale research projects and elements like advanced weather modeling, biological or chemical systems and compositions.
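To make the FP4-versus-FP8 trade-off described in the article concrete, here is a rough Python sketch. It uses simple uniform quantization as a stand-in for the real FP4/FP8 floating-point formats, so the error numbers are only illustrative; the point is that halving the bit width halves the memory footprint while increasing the approximation error.

import numpy as np

def quantize(weights, bits):
    # Uniformly quantize to 2**bits levels; a crude stand-in for FP4/FP8 formats.
    levels = 2 ** bits
    w_min, w_max = weights.min(), weights.max()
    scale = (w_max - w_min) / (levels - 1)
    return np.round((weights - w_min) / scale) * scale + w_min

rng = np.random.default_rng(0)
w = rng.standard_normal(1_000_000).astype(np.float32)   # one million fake model weights

for bits in (8, 4):
    err = np.abs(w - quantize(w, bits)).mean()
    size_mb = w.size * bits / 8 / 1e6
    print(f"{bits}-bit: ~{size_mb:.1f} MB of weights, mean absolute error {err:.4f}")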
[4]
Microsoft reveals second generation of its AI chip in effort to bolster cloud business
Microsoft announced the next generation of its artificial intelligence chip, a potential alternative to leading processors from Nvidia and to offerings from cloud rivals Amazon and Google. The Maia 200 comes two years after Microsoft said it had developed its first AI chip, the Maia 100, which was never made available for cloud clients to rent. Scott Guthrie, Microsoft's executive vice president for cloud and AI, said in a blog post Monday that, for the new chip, there will be "wider customer availability in the future." Guthrie called the Maia 200 "the most efficient inference system Microsoft has ever deployed." Developers, academics, AI labs and people contributing to open-source AI models can apply for a preview of a software development kit. Microsoft said its superintelligence team, led by Mustafa Suleyman, will use the new chip. The Microsoft 365 Copilot add-on for commercial productivity software bundles and the Microsoft Foundry service, for building on top of AI models, will use it as well. Cloud providers face surging demand from generative AI model developers such as Anthropic and OpenAI and from companies building AI agents and other products on top of the popular models. Data center operators and infrastructure providers are trying to increase their computing prowess while keeping power consumption in check. Microsoft is outfitting its U.S. Central region of data centers with Maia 200 chips, and they'll arrive at the U.S. West 3 region after that, with additional locations to follow. The chips use Taiwan Semiconductor Manufacturing Co.'s 3 nanometer process. Four are connected together inside each server. They rely on Ethernet cables, rather than the InfiniBand standard. Nvidia sells InfiniBand switches following its 2020 Mellanox acquisition. The chip offers 30% higher performance than alternatives for the same price, Guthrie wrote. Microsoft said each Maia 200 packs more high-bandwidth memory than Amazon Web Services' third-generation Trainium chip or Google's seventh-generation tensor processing unit. Microsoft can achieve high performance by wiring up to 6,144 of the Maia 200 chips together, reducing energy usage and total cost of ownership, Guthrie wrote. In 2023, Microsoft demonstrated that its GitHub Copilot coding assistant could run on Maia 100 processors.
[5]
Microsoft unveils Maia 200 AI chip, claiming performance edge over Amazon and Google
Microsoft on Monday announced Maia 200, the second generation of its custom AI chip, claiming it's the most powerful first-party silicon from any major cloud provider. The company says Maia 200 delivers three times the performance of Amazon's latest Trainium chip on certain benchmarks, and exceeds Google's most recent tensor processing unit (TPU) on others. The chip is already running workloads at Microsoft's data center near Des Moines, Iowa. Microsoft says Maia 200 is powering OpenAI's GPT-5.2 models, Microsoft 365 Copilot, and internal projects from its Superintelligence team. A second deployment at a data center near Phoenix is planned next. It's part of the larger trend among cloud giants to build their own custom silicon for AI rather than rely solely on Nvidia. Google has been refining its TPUs for nearly a decade, and Amazon's Trainium line is now in its third generation, with a fourth already announced. Microsoft first revealed its custom chip ambitions in late 2023, when it unveiled Maia 100 at its Ignite conference. Despite entering the race late, Microsoft makes the case that its tight integration between chips, AI models, and applications like Copilot gives it an edge. The company says Maia 200 offers 30% better performance-per-dollar than its current hardware. Maia 200 also builds on the first-generation chip with a more specific focus on inference, the process of running AI models after they've been trained. The chip competition among the cloud giants has intensified as the cost of running AI models becomes a bigger concern. Training a model is a one-time expense, but serving it to millions of users is a big ongoing expense. All three companies are betting that custom chips tuned for their own workloads will be cheaper than buying solely from Nvidia. Microsoft is also opening the door to outside developers. The company announced a software development kit that will let AI startups and researchers optimize their models for Maia 200. Developers and academics can sign up for an early preview starting today.
[6]
Microsoft unveils Maia 200, its 'powerhouse' accelerator looking to unlock the power of large-scale AI
* Microsoft unveils Maia 200 AI hardware
* Maia 200 reportedly offers more performance and efficiency than AWS and GCP rivals
* Microsoft will use it to help improve Copilot in-house, but it will be available for customers too
Microsoft has revealed Maia 200, its "next major milestone" in supporting the next generation of AI and inference technology. The company's new hardware, the successor to the Maia 100, will "dramatically shift the economics of large-scale AI," offering a significant upgrade in terms of performance and efficiency as the company looks to stake a claim in the market. The launch will also look to position Microsoft Azure as a place to run AI models faster and more efficiently, as it takes on its great rivals Amazon Web Services and Google Cloud.
Microsoft Maia 200
Microsoft says Maia 200 contains over 100 billion transistors built on TSMC's 3nm process with native FP8/FP4 tensor cores, a redesigned memory system with 216GB of HBM3e at 7TB/s, and 272MB of on-chip SRAM. This all contributes to the ability to deliver over 10 PFLOPS in 4-bit precision (FP4) and around 5 PFLOPS of 8-bit (FP8) performance - easily enough to run even the largest AI models around today, and with space to grow as the technology evolves. Microsoft says Maia 200 delivers 3x the FP4 performance of the third generation Amazon Trainium hardware, and FP8 performance above Google's seventh generation TPU, making it the company's most efficient inference system yet. And thanks to its optimized design, which sees the memory subsystem centered on narrow-precision datatypes, a specialized DMA engine, on-die SRAM, and a specialized NoC fabric for high-bandwidth data movement, Maia 200 is able to keep more of a model's weights and data local, meaning fewer devices are required to run a model. Microsoft is already using the new hardware to power its AI workloads in Microsoft Foundry and Microsoft 365 Copilot, with wider customer availability coming soon. It is also rolling out Maia 200 to its US Central datacenter region now, with further deployments coming to its US West 3 datacenter region near Phoenix, Arizona soon, and additional regions set to follow. For those looking to get an early look, Microsoft is inviting academics, developers, frontier AI labs and open source model project contributors to sign up for a preview of the new Maia 200 software development kit (SDK) now.
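The "fewer devices per model" claim above can be sanity-checked with back-of-the-envelope arithmetic using only the 216GB of HBM3e per chip quoted in the article. The model sizes below are illustrative assumptions rather than Microsoft figures, and the calculation counts weights only, ignoring KV cache and activation memory.

import math

HBM_PER_CHIP_GB = 216                     # per-chip HBM3e capacity quoted above

MODELS = [("70B-parameter model", 70e9), ("1T-parameter model (hypothetical)", 1e12)]
FORMATS = [("FP8", 8), ("FP4", 4)]

for name, params in MODELS:
    for fmt, bits in FORMATS:
        weight_gb = params * bits / 8 / 1e9                       # bits/8 = bytes per weight
        chips = max(1, math.ceil(weight_gb / HBM_PER_CHIP_GB))    # chips needed for weights alone
        print(f"{name} in {fmt}: ~{weight_gb:,.0f} GB of weights -> {chips} chip(s)")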
[7]
Microsoft unveils the Maia 200 chip and aims to take on Nvidia's software ecosystem
On Monday Microsoft introduced the Maia 200, the second generation of its in-house artificial intelligence chip, alongside a software suite designed to compete with Nvidia's CUDA environment. The chip, now up and running at a data center in Iowa, marks another step in Microsoft's bid to reduce its technological dependence on Nvidia while strengthening its control of the AI value chain. A second deployment site is planned in Arizona in the coming months. Built for compute-intensive cloud workloads, the Maia 200 is manufactured by TSMC using 3-nanometer technology and includes HBM memory, though in a less advanced version than what is expected in upcoming Nvidia chips. Microsoft is also betting on the integration of substantial SRAM, inspired by rivals such as Cerebras and Groq, to optimize performance for interactive AI uses, notably chatbots. Microsoft describes Maia 200 as the most powerful first-party silicon among all hyperscalers, with FP4 performance three times higher than Amazon Trainium's third generation and FP8 performance higher than Google's seventh-generation TPU. Alongside this, the company is rolling out new software tools, including Triton, an open-source project co-developed with OpenAI and presented as a direct alternative to CUDA, one of Nvidia's main competitive advantages. The announcement comes amid a rapidly reshaping AI semiconductor sector. While Cerebras has signed a $10bn agreement with OpenAI, Nvidia recently reached a partnership estimated at $20bn with the start-up Groq. Against this backdrop of strategic alliances, Microsoft is stepping up its vertical integration, aiming to position itself not only as a cloud infrastructure provider but also as a central player in chip design and software development in artificial intelligence.
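Because Triton is named above as the CUDA alternative in Microsoft's software stack, a minimal kernel gives a sense of the programming model. The vector-add example below follows the open-source Triton project's own tutorial style and runs on Nvidia GPUs today; whether identical code can target Maia 200 through Microsoft's SDK is not detailed in these reports, so treat it as an illustration of Triton itself rather than of the Maia toolchain.

import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                  # guard against the ragged last block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x, y):
    out = torch.empty_like(x)
    n = out.numel()
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

a = torch.randn(4096, device="cuda")
b = torch.randn(4096, device="cuda")
print(torch.allclose(add(a, b), a + b))          # expected: True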
Microsoft has launched its second-generation custom AI chip, the Maia 200, claiming superior performance over competitors Amazon Trainium and Google's TPU. Built to optimize AI inference workloads, the chip packs over 100 billion transistors and delivers 10 petaflops in 4-bit precision. Already deployed in Azure data centers, Maia 200 powers OpenAI's GPT-5.2 models and Microsoft 365 Copilot while offering 30% better performance per dollar.
Microsoft has announced the Maia 200, its second-generation custom AI chip designed specifically to optimize AI inference operations. The chip follows the Maia 100 released in 2023 and represents a substantial leap in performance capabilities [1]. Built on TSMC's advanced 3-nanometer process, the Maia 200 packs over 100 billion transistors and delivers more than 10 petaflops in FP4 precision and approximately 5 petaflops of 8-bit performance [2][3]. Scott Guthrie, executive vice president of Microsoft's Cloud and AI division, emphasized that "one Maia 200 node can effortlessly run today's largest models, with plenty of headroom for even bigger models in the future" [1].
Microsoft is positioning the Maia 200 as a direct alternative to NVIDIA and competing cloud providers' custom silicon. The company claims the chip delivers 3 times the FP4 performance of third-generation Amazon Trainium chips and FP8 performance above Google's seventh-generation TPU [2][5]. This marks a strategic shift from Microsoft's approach when it first launched the Maia 100 in 2023 and avoided direct comparisons with Amazon's and Google's AI cloud capabilities [2]. Each Maia 200 chip also packs more high-bandwidth memory than competing hardware from Amazon Web Services or Google [4]. The chip's architecture relies on Ethernet cables rather than the InfiniBand standard that NVIDIA sells following its 2020 Mellanox acquisition [4].
As AI companies mature, inference costs have become an increasingly critical component of overall operating expenses, driving renewed focus on optimization strategies [1]. The Maia 200 addresses this challenge by offering 30% better performance per dollar than the latest generation hardware in Microsoft's current fleet [2][5]. Microsoft can achieve high performance by wiring up to 6,144 Maia 200 chips together, which reduces energy usage and total cost of ownership [4]. The chip includes a memory system that helps keep an AI model's weights and data local, meaning less hardware is needed to run a model [3].
Microsoft is already deploying Maia 200 chips in its Azure US Central data center region near Des Moines, Iowa, with the US West 3 region near Phoenix planned next and additional locations to follow [2][4][5]. The chip is already powering OpenAI's GPT-5.2 models, Microsoft 365 Copilot, and Microsoft Foundry [2][5]. Microsoft's Superintelligence team, led by Mustafa Suleyman, will be the first to use the new chip for generating synthetic data and reinforcement training to improve next-generation large language models (LLMs) [2][3].
Microsoft is inviting developers, academics, AI labs, and open-source model project contributors to an early preview of the Maia 200 software development kit [2][4]. This represents a strategic opening for outside developers to optimize their models for Maia 200, with Guthrie noting there would be "wider customer availability in the future" [3][4]. While the Maia 100 was never made available for cloud clients to rent, Microsoft appears to be positioning the Maia 200 for broader adoption [4].
The Maia 200 launch reflects a growing trend of tech giants developing custom silicon to lessen dependence on NVIDIA, whose cutting-edge GPUs have become increasingly pivotal to AI companies' success [1]. Google has refined its tensor processing units for nearly a decade, while Amazon's Trainium line is now in its third generation, with Trainium4 already announced and set to integrate with NVIDIA's NVLink 6 and MGX rack architecture [2][5]. Cloud providers face surging demand from generative AI model developers such as Anthropic and OpenAI, as well as from companies building AI agents and other products on top of popular models [4]. Data center operators are trying to increase computing prowess while keeping power consumption in check, with the ongoing cost of serving AI models becoming a bigger concern than the one-time expense of training them [4][5]. Microsoft makes the case that its tight integration between chips, AI models, and applications like Copilot and GitHub Copilot gives it a competitive edge despite entering the custom silicon race later than rivals [5].
Summarized by Navi