6 Sources
[1]
Meta reveals four new MTIA chips built for AI inference -- to be released on a six-month cadence
The chiplet-based accelerators are designed to run AI inference more efficiently than GPUs optimized for training workloads. Meta today announced four successive generations of its in-house Meta Training and Inference Accelerator (MTIA) chips, all developed in partnership with Broadcom and scheduled for deployment within the next two years. "We've developed a competitive strategy for MTIA by prioritizing rapid, iterative development," reads Meta's press release, "along with an inference-first focus and frictionless adoption by building natively on industry standards."
The four new chips are MTIA 300, 400, 450, and 500. MTIA 300 is already in production for ranking and recommendations training, while 400 is currently in lab testing ahead of data center deployment. MTIA 450 and 500 are targeted at AI inference and are scheduled for mass deployment in early 2027 and later in 2027, respectively. According to Meta's technical blog, from MTIA 300 through to MTIA 500, HBM bandwidth increases 4.5 times and compute FLOPs increase 25 times. Meta says MTIA 450 doubles the HBM bandwidth of MTIA 400, describing it as "much higher than that of existing leading commercial products," or, in other words, Nvidia's H100 and H200. MTIA 500 then adds another 50% HBM bandwidth on top of the 450, along with up to 80% more HBM capacity.
Indeed, it's HBM bandwidth, not raw FLOPs, that is the main bottleneck during the decode phase of transformer inference, while mainstream GPUs are architected to maximize FLOPs for large-scale pre-training. This means they carry a cost and power overhead that Meta says is unnecessary for inference workloads. Meta's approach also includes hardware acceleration for FlashAttention and mixture-of-experts feed-forward network computation, plus custom low-precision data types co-designed for inference. MTIA 450 supports MX4, delivering six times the MX4 FLOPs of FP16/BF16, with mixed low-precision computation that avoids the software overhead of data type conversion.
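Block-scaled "microscaling" (MX) formats such as MX4 store a group of low-bit integers alongside one shared scale factor, which is what makes mixed low-precision computation cheap. The sketch below illustrates the general idea only; the block size, rounding mode, and power-of-two scale encoding are assumptions for illustration, not Meta's actual MX4 format.

```python
import math

def mx_quantize(block, bits=4):
    """Quantize a block of floats to signed ints sharing one
    power-of-two scale (the core idea of microscaling formats)."""
    qmax = 2 ** (bits - 1) - 1          # 7 for 4-bit signed values
    amax = max(abs(x) for x in block) or 1.0
    # Shared exponent chosen so the largest value fits in range.
    scale = 2.0 ** math.ceil(math.log2(amax / qmax))
    q = [max(-qmax - 1, min(qmax, round(x / scale))) for x in block]
    return q, scale

def mx_dequantize(q, scale):
    """Recover approximate floats from the shared-scale integers."""
    return [v * scale for v in q]

block = [0.11, -0.52, 0.98, -0.07]
q, scale = mx_quantize(block)
print(q, scale)               # small ints plus one scale per block
print(mx_dequantize(q, scale))
```

The payoff is storage and bandwidth: a 32-value block costs 4 bits per value plus one shared scale, rather than 16 bits per value in BF16, which is why MX4 FLOPs can be multiplied without a matching rise in HBM traffic.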
In terms of eventual deployment, MTIA 400, 450, and 500 will all use the same chassis, rack, and network infrastructure, meaning each new chip generation drops into the existing physical footprint for easy interchange. It's this modularity, Meta says, that's behind MTIA's roughly six-month chip cadence, which itself is much faster than the industry's typical one-to-two year cycle. The software stack runs natively on PyTorch, vLLM, and Triton, with support for torch.compile and torch.export so that production models can be deployed simultaneously on both GPUs and MTIA without MTIA-specific rewrites. Meta said it has already deployed hundreds of thousands of MTIA chips across its apps for inference on organic content and ads. All this comes just two weeks after Meta disclosed a long-term, $100 billion AI infrastructure agreement with AMD, suggesting that there's a broader effort at play to reduce dependence on Nvidia across different parts of Meta's AI stack while keeping MTIA at the core of inference workloads.
[2]
Meta Preparing to Deploy Four New Homegrown Chips to Handle AI
Meta will continue buying chips from other companies, including Nvidia Corp. and Advanced Micro Devices Inc., as it pursues a dual approach of buying traditional hardware and investing in custom silicon for specialized tasks. Meta Platforms Inc. plans to deploy four new generations of its in-house artificial intelligence chips by the end of 2027 as the company turns to custom silicon to help power its rapidly expanding AI workloads. Meta on Wednesday announced plans for the new chips - MTIA 300, MTIA 400, MTIA 450 and MTIA 500 - as part of an effort to diversify its hardware sources, reduce reliance on outside chipmakers and bring down costs amid a fast-moving and expensive AI race. Meta will continue buying chips from other companies as well, and recently announced deals to spend billions on AI hardware from Nvidia Corp. and Advanced Micro Devices Inc.
The MTIA 300 is already in production for content ranking and recommendations training, the company said, and MTIA 400, also known as Iris, has completed lab testing and is moving toward deployment. The MTIA 450 and MTIA 500 chips -- code-named Arke and Astrid, respectively -- are scheduled for mass deployment in 2027. Yee Jiun Song, Meta's vice president of engineering, said that the products are being built in parallel, with the MTIA 450 model expected to arrive early in the year and the MTIA 500 slated six months later.
"If we look at overall AI development, I think even in the last two or three months things have accelerated at a pace that has kind of blown everyone's minds," Song said. "Silicon programs have to keep up with that evolution of workloads, so we're constantly looking at our road maps and making sure we're building what we think will be the most useful products." Meta is spending aggressively to build competitive AI models and products, which has led to unprecedented demand for computing power.
Meta has turned to Nvidia and AMD to power some of its AI efforts, but has also worked to build out its bench of talent focused on chip design in hopes of developing its own products. Last year, after Chief Executive Officer Mark Zuckerberg grew impatient with the company's in-house progress, Meta tried to acquire South Korean chip startup FuriosaAI. After FuriosaAI rejected an $800 million offer, Meta instead acquired Santa Clara, California-based startup Rivos Inc., along with more than 400 of its employees.
The additional headcount has helped Meta's in-house chips team, known as the Meta Training and Inference Accelerator, pursue several projects at once. MTIA is focused on building more efficient computing architecture for the company's internal needs, which range from ranking and recommendations systems used to determine what content appears on users' Instagram feeds to large-scale generative AI inference, where a trained model is used to generate text or images in response to a prompt. While Meta executives have emphasized the benefits of the company building its own chips, it's also one of the biggest buyers of graphics processing units, or GPUs, used for training and running AI models. Its recent agreements with Nvidia and AMD are worth tens of billions of dollars each, and mean Meta has locked in gigawatts of AI capacity over the coming years.
The strategy reflects the company's dual approach of buying more traditional hardware from industry partners while continuing to invest in custom silicon for tasks that are more specialized to Meta's platforms. "We're not building for the general market, so our chips don't need to be as general purpose," Song said. "We can cut out things that we don't need, which really allows us to drive down cost." Still, the economics of chipmaking are challenging. Taking a product from the design phase to manufacturing by a third party - usually Taiwan Semiconductor Manufacturing Co. - can cost billions of dollars and take precious time. Song said it typically takes his team two years to go from design to production. Custom chips usually only pay off at scale and with high utilization rates. Last month, the Information reported that Meta had scrapped its most advanced chip focused on training AI models, known by the code name Olympus, after struggling with its design, shifting instead to focus on a less complicated version. A Meta spokesperson declined to comment on the reporting, but said that the company regularly evaluates and evolves its silicon road map and learns from product deployments. Meta Chief Financial Officer Susan Li said at a conference hosted by Morgan Stanley earlier this month that the company still aims to develop processors that can train AI models.
[3]
Meta rolls out in-house AI chips weeks after massive Nvidia, AMD deals
The specialized silicon is part of the Meta Training and Inference Accelerator, or MTIA, family of chips, which it publicly revealed for the first time in 2023 before unveiling a second-generation version in 2024. Meta Vice President of Engineering Yee Jiun Song told CNBC that by designing custom chips, which are then manufactured by Taiwan Semiconductor, the social media giant can get better price-performance across its data center fleet rather than relying solely on outside vendors. "This also provides us with more diversity in terms of silicon supply, and insulates us from price changes to some extent," Song said. "This is a little bit more leverage." The first new chip, MTIA 300, was deployed a few weeks ago and is intended to help train smaller AI models that underpin Meta's core ranking and recommendation tasks, Song said. These kinds of tasks include showing people relevant content and online ads within the company's family of apps like Facebook and Instagram. The upcoming chips -- MTIA 400, MTIA 450 and MTIA 500 -- are intended for more cutting-edge generative AI-related inference tasks like creating images and videos based on people's written prompts. The chips will not be used for training giant large language models, Song said.
[4]
Meta's Race to Scale AI Chips for Billions: Four Chips in Two Years
Seamless model onboarding: MTIA supports both eager and graph modes. In graph mode, it integrates directly with PyTorch 2.0's compilation pipeline. Developers use familiar tools -- torch.compile and torch.export -- to capture and optimize model graphs. No MTIA-specific rewrites are required to enable models. This portability enables our production models to be deployed simultaneously on both GPUs and MTIA. Compilers: Beneath the PyTorch frontend, MTIA-specific compilers translate high-level graph representations into highly optimized device code. The graph compiler is built on Torch FX IR and TorchInductor. The kernel compiler and lower-level backends are based on Triton, MLIR, and LLVM, enhanced and optimized for MTIA. We improved and tailored TorchInductor's Triton code generations and kernel fusion for MTIA, and introduced MTIA-aware MLIR dialects and Triton DSL extensions. These extensions can be used optionally for performance-critical kernels. The compiler stack has autotuning capabilities that automatically optimize workloads using multiple compilation strategies. Kernel authoring: MTIA supports compiler-driven kernel generation and fusion, enables both auto-generated and user-driven manual kernel authoring using Triton and C++, and provides kernel auto-tuning and optimizations. Furthermore, we have built agentic AI systems to automate kernel generation; see our papers on TritorX and KernelEvolve for details. Communication and transport: MTIA's communication library, Hoot Collective Communications Library (HCCL), is similar to GPU communication libraries but offers several differentiators. It leverages the MTIA chips' built-in network chiplets for efficient communication, offloads collective operations to dedicated message engines, and uses near-memory compute to accelerate reduction-heavy collectives. HCCL also supports fusing compute and collective kernels to minimize latency. 
Finally, its transport stack is optimized for low-latency transactions and offloads the entire data path to reduce host-stack runtime overhead. Runtime and firmware: The MTIA runtime manages device memory, kernel scheduling, and execution coordination across multiple devices. It supports both eager and graph execution modes. Additionally, it orchestrates compute and collective operations in an Inductor-native, eager-style graph mode. This approach enables compute and communication to be captured and scheduled together, providing a GPU-like experience with minimal overhead. The runtime interfaces with a Rust-based user-space driver, rather than a traditional in-kernel Linux driver. The firmware is written in bare-metal Rust, delivering low latency and high performance, with built-in memory and thread safety. vLLM support : vLLM's plugin architecture allows easy integration with MTIA. Our MTIA plugin replaces important operators, such as FlashAttention and fused LayerNorm, with MTIA-specific kernels. Graph-mode execution is supported via a custom torch.compile backend. MTIA inherits and benefits from vLLM's features such as prefill-decode disaggregation and continuous batching. Production tools: To reliably operate hundreds of thousands of MTIA chips in production, MTIA offers production-grade monitoring, profiling, and debugging tools comparable to those available for mainstream GPUs, while providing unique capabilities such as full-stack, at-scale observability across both host and device, spanning software, firmware, and hardware. Its debugger enables fine-grained control, including breakpoints and coordinated stepping at the PE level.
[5]
Meta debuts internally-developed AI chips for inference workloads - SiliconANGLE
Meta Platforms Inc. today revealed that it has designed four custom chips to power its internal artificial intelligence workloads. The company last provided an update about its processor development efforts in 2024. In April of that year, it revealed a custom AI accelerator with an energy footprint of 90 watts. The most advanced of the four accelerators that Meta debuted today has a thermal design power of 1,700 watts.
The custom chip that the company revealed in April 2024, the MTIA 200, was designed solely to run ranking and recommendation models. Those are neural networks that Meta uses to determine what posts and ads to display in users' feeds. The first new chip that it unveiled today, the MTIA 300, is focused on the same use cases. It can provide 1.2 petaflops of performance when processing data in the MX8 format and features 216 gigabytes of HBM memory. "MTIA 300 comprises one compute chiplet, two network chiplets, and several HBM stacks," a group of Meta engineers wrote in a blog post today. "Each compute chiplet comprises a grid of processing elements (PEs), with some redundant PEs to improve yield." The MTIA 300 is the only one of the four newly revealed chips that Meta has already deployed in production.
The three other processors support a broader range of use cases. Besides ranking and recommendation workloads, they can also run generative AI software such as large language models. The most advanced chip in the lineup, the MTIA 500, can provide 10 petaflops of performance when processing MX8 data. It also supports a more efficient data format called MX4. The latter technology reduces the number of bytes that AI models must analyze to answer prompts, which speeds up processing. The MTIA 500 carries out calculations using four logic chiplets. The modules are surrounded by multiple stacks of HBM memory that can together store up to 516 gigabytes of data, more than twice the MTIA 300's 216 gigabytes.
Rounding out the processor's list of components is a so-called SoC chiplet that is responsible for moving information to and from the host server. The MTIA 500 is expected to enter production in 2027 alongside a similar, but less advanced chip called the MTIA 450. Both processors are optimized for generative AI inference workloads. They include circuits that are designed to accelerate specific, hardware-intensive elements of the inference workflow such as FlashAttention. That's a popular implementation of the attention mechanism with which LLMs analyze input data. "At the system level, MTIA 400, 450, and 500 all utilize the same chassis, rack, and network infrastructure," the Meta engineers wrote. "Therefore, each new chip generation can be dropped into the same physical footprint, accelerating the transition from silicon to production deployment. Our modular, reusable designs also minimize the resources needed to develop and deploy multiple chip generations." Meta uses custom compilers to optimize AI models for its MTIA chips. Another custom software module, the Hoot Collective Communications Library, manages the flow of data between the processors. It carries out certain calculations using transistors that are located near memory cells, which reduces data travel times and thereby boosts performance.
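For context on what the FlashAttention circuits accelerate: attention computes softmax(QK^T / sqrt(d))V, and the naive formulation materializes the full score matrix in memory. The reference implementation below shows the computation being fused; FlashAttention produces the same output while processing the scores in tiles so that large intermediate never round-trips through HBM.

```python
import math

def naive_attention(Q, K, V):
    """Reference attention: softmax(Q K^T / sqrt(d)) V, row by row.
    Materializes every row of scores, which is the memory traffic
    that fused kernels like FlashAttention avoid."""
    d = len(Q[0])
    out = []
    for q in Q:
        # Scaled dot-product scores against every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        # Numerically stable softmax over the scores.
        m = max(scores)
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        probs = [x / z for x in w]
        # Weighted sum of value vectors.
        out.append([sum(p * v[j] for p, v in zip(probs, V))
                    for j in range(len(V[0]))])
    return out

Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
print(naive_attention(Q, K, V))  # query attends mostly to the first key
```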
[6]
Why is Meta building its own AI chips, and at what cost?
The timing is awkward, even by Silicon Valley standards. Last week, reports emerged that Meta had quietly killed its most ambitious AI training chip, codenamed Olympus. According to The Information, Meta scrapped the chip after struggling with its design, pivoting instead to a less complicated approach. Meta declined to comment. Then suddenly the company announced four new generations of its homegrown MTIA chips - 300, 400, 450, and 500 - and said nothing about the cancellation. The four chips it announced are all inference-focused, designed to run AI models cheaply at scale, not to train them. That's a meaningful distinction. Training is where Nvidia's stranglehold on the industry is strongest, and it's the arena Meta just quietly retreated from. The official rationale for building custom silicon is straightforward enough. Meta's stated goal is to diversify its hardware sources, reduce reliance on outside chipmakers, and bring down costs amid a fast-moving and expensive AI race. Its MTIA inference chip reportedly reduced total cost of ownership by 40 to 44 percent across recommendation systems for Facebook and Instagram, real savings at a company serving billions of users daily. When you're running AI at that scale, shaving small margins off each inference request adds up astronomically. But the "at what cost" question has a literal answer that complicates the independence narrative. In January 2026, Meta announced a capital expenditure budget of between $115 billion and $135 billion for the year, nearly double the previous year's $72.2 billion, with the majority allocated to chips. And most of that is going straight to the companies Meta says it wants to depend on less.
Within ten days in February, Meta signed a multi-year strategic partnership with Nvidia to deploy millions of Blackwell and next-generation GPUs, followed by a chip agreement with AMD worth between $60 billion and $100 billion. There's also the question of what "homegrown" actually means here. Meta's MTIA chips are developed in close partnership with Broadcom - the same company that co-designs Google's TPUs. The press release calls them Meta's chips. The reality is considerably more collaborative. Meta's chip program has a history of setbacks. It scrapped an earlier inference chip after it underperformed in small-scale testing, and pivoted in 2022 to billions of dollars' worth of Nvidia GPUs. Olympus is just the latest addition to that pattern. Meta's own Chief Product Officer once described the company's chip journey as "a walk, crawl, run situation." Right now, it still looks like a crawl - an expensive, ambitious, necessary one, but a crawl nonetheless.
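The article's 40-to-44-percent TCO-reduction figure can be put in rough perspective with back-of-envelope amortization math. Every number below except that percentage is a hypothetical assumption for illustration; Meta does not disclose its development costs or fleet TCO.

```python
# Hypothetical break-even sketch for custom inference silicon.
# Only the 40% figure comes from the article; the dollar amounts
# are invented assumptions to show the shape of the calculation.

dev_cost = 2.0e9           # assumed design-to-production cost ("billions")
fleet_tco_per_year = 10e9  # assumed annual inference TCO on merchant GPUs
tco_reduction = 0.40       # lower end of the reported 40-44% savings

annual_savings = fleet_tco_per_year * tco_reduction
breakeven_years = dev_cost / annual_savings
print(breakeven_years)  # 0.5: at this assumed scale, payback is fast
```

The same arithmetic also shows why custom chips only pay off at scale: halve the fleet TCO base and the break-even time doubles, which is the utilization argument the Bloomberg piece makes.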
Meta announced four successive generations of its in-house Meta Training and Inference Accelerator chips, developed with Broadcom and scheduled for deployment through 2027. The MTIA 300 is already running in production, while the MTIA 500 promises 25 times more compute power and 4.5 times greater HBM bandwidth. This aggressive push into custom silicon development reflects Meta's strategy to diversify hardware sources and cut costs in the expensive AI race.
Meta has announced four successive generations of its in-house Meta AI chips, marking an aggressive expansion of its custom silicon development efforts to power AI workloads across its platforms. The Meta Training and Inference Accelerator (MTIA) family now includes MTIA 300, 400, 450, and 500, all developed in partnership with Broadcom and manufactured by Taiwan Semiconductor Manufacturing Co.[1][2] The MTIA 300 is already in production for ranking and recommendation models, while MTIA 400 has completed lab testing and is moving toward deployment.[2] The MTIA 450 and MTIA 500 chips are scheduled for mass deployment in early 2027 and later in 2027, respectively.[1]
Source: Meta AI
From MTIA 300 through to MTIA 500, HBM bandwidth increases 4.5 times, and compute FLOPs increase 25 times.[1] The MTIA 450 doubles the HBM bandwidth of MTIA 400, which Meta describes as "much higher than that of existing leading commercial products," a direct reference to Nvidia's H100 and H200 accelerators.[1] The MTIA 500 adds another 50% HBM bandwidth on top of MTIA 450, along with up to 80% more HBM capacity, reaching 516 gigabytes total.[5] These in-house AI chips are specifically architected for AI inference rather than training, addressing the reality that HBM bandwidth, not raw FLOPs, represents the main bottleneck during the decode phase of transformer inference.[1]
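Why decode is bandwidth-bound can be shown with back-of-envelope arithmetic: each generated token must stream roughly all of the model's weights from HBM once, so bandwidth, not FLOPs, caps throughput. The numbers below are illustrative ballparks (a 70B-parameter model in BF16 on an H100-class part), not published MTIA or GPU benchmarks.

```python
# Back-of-envelope: decode throughput ceiling set by HBM bandwidth.
# Illustrative numbers only; not vendor specifications.

params = 70e9             # 70B-parameter model
bytes_per_param = 2       # FP16/BF16 weights
weight_bytes = params * bytes_per_param   # 140 GB of weights

hbm_bw = 3.35e12          # ~3.35 TB/s, H100 SXM-class ballpark

# Each decode step streams roughly all weights once per token,
# so weight traffic alone caps single-stream tokens per second.
max_tokens_per_sec = hbm_bw / weight_bytes
print(round(max_tokens_per_sec, 1))  # ~23.9 tokens/s per device
```

Raising FLOPs does nothing to move this ceiling; only more bandwidth (or fewer bytes per weight, as with MX4) does, which is the design argument behind inference-first silicon.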
Source: Tom's Hardware
The MTIA chips utilize a chiplet-based design, with the MTIA 300 comprising one compute chiplet, two network chiplets, and several HBM stacks.[5] The more advanced MTIA 500 carries out calculations using four logic chiplets surrounded by multiple stacks of HBM memory.[5] Critically, MTIA 400, 450, and 500 all utilize the same chassis, rack, and network infrastructure, meaning each new chip generation drops into the existing physical footprint for easy interchange.[1] This modularity enables MTIA's roughly six-month chip cadence, which is much faster than the industry's typical one-to-two year cycle.[1] Yee Jiun Song, Meta's vice president of engineering, emphasized that silicon programs must keep pace with AI development, which has "accelerated at a pace that has kind of blown everyone's minds" even in recent months.[2]
Source: Bloomberg
The aggressive push into custom silicon reflects Meta's effort to diversify its hardware sources, reduce reliance on outside chipmakers, and bring down costs amid a fast-moving and expensive AI race.[2] Song told CNBC that by designing custom chips, Meta can get better price-performance across its data center fleet and gain "more diversity in terms of silicon supply," insulating it "from price changes to some extent."[3] Meta has already deployed hundreds of thousands of MTIA chips across its apps for inference on organic content and ads.[1] However, Meta will continue buying chips from other companies, including Nvidia and AMD, pursuing a dual approach of traditional hardware purchases alongside custom silicon for specialized tasks.[2] This announcement comes just two weeks after Meta disclosed a long-term, $100 billion AI infrastructure agreement with AMD.[1]
The software stack runs natively on PyTorch, vLLM, and Triton, with support for torch.compile and torch.export so that production models can be deployed simultaneously on both GPUs and MTIA without MTIA-specific rewrites.[1][4] MTIA-specific compilers translate high-level graph representations into optimized device code, with the graph compiler built on Torch FX IR and TorchInductor; the kernel compiler and lower-level backends are based on Triton, MLIR, and LLVM.[4] The MTIA chips include hardware acceleration for FlashAttention and mixture-of-experts feed-forward network computation, plus custom low-precision data types co-designed for inference.[1] MTIA 450 supports MX4, delivering six times the MX4 FLOPs of FP16/BF16.[1] The Hoot Collective Communications Library (HCCL) manages data flow between processors, leveraging built-in network chiplets for efficient communication and offloading collective operations to dedicated message engines.[4]
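The reduction-heavy collectives that HCCL offloads, such as all-reduce, have a well-known ring formulation: each rank's vector is split into chunks, partial sums circulate around the ring, and then the reduced chunks circulate again. The simulation below is a generic illustration of that standard algorithm in plain Python, not Meta's HCCL implementation.

```python
def ring_all_reduce(vectors):
    """Simulate a ring all-reduce: every rank ends up holding the
    elementwise sum of all ranks' vectors. For simplicity the vector
    has exactly one chunk per rank."""
    n = len(vectors)
    assert all(len(v) == n for v in vectors), "one chunk per rank"
    buf = [list(v) for v in vectors]   # buf[rank][chunk]
    # Phase 1: reduce-scatter. After n-1 steps, rank r holds the
    # fully reduced value of chunk (r + 1) % n.
    for step in range(n - 1):
        sends = [(r, (r - step) % n, buf[r][(r - step) % n])
                 for r in range(n)]            # capture before mutating
        for r, c, val in sends:
            buf[(r + 1) % n][c] += val         # neighbor accumulates
    # Phase 2: all-gather. Circulate the reduced chunks so every
    # rank receives all of them.
    for step in range(n - 1):
        sends = [(r, (r + 1 - step) % n, buf[r][(r + 1 - step) % n])
                 for r in range(n)]
        for r, c, val in sends:
            buf[(r + 1) % n][c] = val          # neighbor overwrites
    return buf

result = ring_all_reduce([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(result)  # every rank holds [12, 15, 18]
```

Each rank sends 2(n-1)/n of the vector in total, which is why ring all-reduce is bandwidth-optimal and why offloading it to message engines and near-memory compute pays off for reduction-heavy workloads.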
The MTIA 300 was deployed a few weeks ago and is intended to help train smaller AI models that underpin Meta's core ranking and recommendation tasks, including showing people relevant content and online ads within Facebook and Instagram.[3] The upcoming chips—MTIA 400, 450, and 500—are intended for more cutting-edge generative AI inference tasks like creating images and videos based on people's written prompts.[3] Song clarified that the chips will not be used for training giant large language models.[3] "We're not building for the general market, so our chips don't need to be as general purpose," Song explained. "We can cut out things that we don't need, which really allows us to drive down cost."[2] The economics of chipmaking remain challenging, with products typically taking two years to go from design to production and costing billions of dollars, but custom chips can pay off at scale with high utilization rates.[2]

Summarized by Navi