Meta unveils four new MTIA chips to power AI inference and reduce reliance on Nvidia

Reviewed by Nidhi Govil

Meta announced four successive generations of its in-house Meta Training and Inference Accelerator chips, developed with Broadcom and scheduled for deployment through 2027. The MTIA 300 is already running in production, while the MTIA 500 promises 25 times more compute power and 4.5 times greater HBM bandwidth. This aggressive push into custom silicon development reflects Meta's strategy to diversify hardware sources and cut costs in the expensive AI race.

Meta Accelerates Custom Silicon Development with Four New MTIA Chips

Meta has announced four successive generations of its in-house AI chips, marking an aggressive expansion of its custom silicon development efforts to power AI workloads across its platforms. The Meta Training and Inference Accelerator (MTIA) family now includes MTIA 300, 400, 450, and 500, all developed in partnership with Broadcom and manufactured by Taiwan Semiconductor Manufacturing Co.[1][2] The MTIA 300 is already in production for ranking and recommendation models, while MTIA 400 has completed lab testing and is moving toward deployment.[2] The MTIA 450 and MTIA 500 chips are scheduled for mass deployment in early 2027 and later in 2027, respectively.[1]

Source: Meta AI

Dramatic Performance Gains Target AI Inference Bottlenecks

From MTIA 300 to MTIA 500, HBM bandwidth increases 4.5 times and compute FLOPs increase 25 times.[1] The MTIA 450 doubles the HBM bandwidth of MTIA 400, which Meta describes as "much higher than that of existing leading commercial products", a pointed reference to Nvidia's H100 and H200 accelerators.[1] The MTIA 500 adds another 50% of HBM bandwidth on top of MTIA 450, along with up to 80% more HBM capacity, reaching 516 gigabytes in total.[5] These in-house AI chips are architected specifically for AI inference rather than training, reflecting the reality that HBM bandwidth, not raw FLOPs, is the main bottleneck during the decode phase of transformer inference.[1]
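To see why bandwidth rather than FLOPs caps decode speed, consider a back-of-envelope roofline: each generated token must stream the model's weights out of HBM, so token rate is bounded by bandwidth divided by bytes read per token. The model size and bandwidth figures below are illustrative assumptions, not MTIA specifications.

```python
def decode_tokens_per_second(params_billion: float,
                             bytes_per_param: float,
                             hbm_tb_per_s: float) -> float:
    """Upper bound on decode tokens/s when HBM bandwidth is the bottleneck."""
    bytes_per_token = params_billion * 1e9 * bytes_per_param
    return hbm_tb_per_s * 1e12 / bytes_per_token

# Hypothetical 70B-parameter model served with 8-bit (1-byte) weights:
base = decode_tokens_per_second(70, 1.0, 1.0)  # at 1.0 TB/s of HBM bandwidth
fast = decode_tokens_per_second(70, 1.0, 4.5)  # at 4.5x the bandwidth
print(f"{base:.1f} -> {fast:.1f} tokens/s")    # throughput scales linearly with bandwidth
```

Under this simple model, a 4.5x bandwidth increase yields a 4.5x ceiling on single-stream decode throughput, regardless of how many FLOPs the chip can deliver.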

Source: Tom's Hardware

Chiplet Architecture Enables Six-Month Development Cadence

The MTIA chips use a chiplet-based design: the MTIA 300 comprises one compute chiplet, two network chiplets, and several HBM stacks.[5] The more advanced MTIA 500 performs its calculations on four logic chiplets surrounded by multiple stacks of HBM memory.[5] Critically, MTIA 400, 450, and 500 all share the same chassis, rack, and network infrastructure, so each new chip generation drops into the existing physical footprint for easy interchange.[1] This modularity enables MTIA's roughly six-month chip cadence, much faster than the industry's typical one-to-two-year cycle.[1] Yee Jiun Song, Meta's vice president of engineering, emphasized that silicon programs must keep pace with AI development, which has "accelerated at a pace that has kind of blown everyone's minds" even in recent months.[2]

Source: Bloomberg

Strategic Move to Reduce Dependence on Nvidia and Control Costs

The aggressive push into custom silicon reflects Meta's effort to diversify its hardware sources, reduce reliance on outside chipmakers, and bring down costs amid a fast-moving and expensive AI race.[2] Song told CNBC that by designing custom chips, Meta can extract better price-performance across its data center fleet and gain "more diversity in terms of silicon supply, and insulates us from price changes to some extent".[3] Meta has already deployed hundreds of thousands of MTIA chips across its apps for inference on organic content and ads.[1] However, Meta will continue buying chips from other companies, including Nvidia and AMD, pursuing a dual approach of traditional hardware purchases alongside custom silicon for specialized tasks.[2] This announcement comes just two weeks after Meta disclosed a long-term, $100 billion AI infrastructure agreement with AMD.[1]

Software Integration Designed for Seamless Deployment

The software stack runs natively on PyTorch, vLLM, and Triton, with support for torch.compile and torch.export, so production models can be deployed simultaneously on GPUs and MTIA without MTIA-specific rewrites.[1][4] Meta-specific compilers translate high-level graph representations into optimized device code: the graph compiler is built on Torch FX IR and TorchInductor, while the kernel compiler and lower-level backends are based on Triton, MLIR, and LLVM.[4] The MTIA chips include hardware acceleration for FlashAttention and mixture-of-experts feed-forward network computation, plus custom low-precision data types co-designed for inference.[1] MTIA 450 supports the MX4 data type, delivering six times the FLOPs of FP16/BF16.[1] The Hoot Collective Communications Library (HCCL) manages data flow between processors, leveraging the built-in network chiplets for efficient communication and offloading collective operations to dedicated message engines.[4]
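MX formats get their density by pairing very low-bit elements with a shared per-block scale. MTIA's exact MX4 implementation is not public, so the sketch below is an illustration of the general technique as described in the OCP Microscaling convention: an FP4 (E2M1) element grid plus one power-of-two scale per block. Treat the grid, the block scaling rule, and the function names as assumptions, not Meta's design.

```python
import math

# Positive code points of FP4 E2M1 (sign is kept separately) - an assumption
# based on the OCP Microscaling convention, not MTIA's documented format.
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_EMAX = 2  # exponent of FP4's largest value (6.0 = 1.5 * 2**2)

def mx4_quantize(block):
    """Quantize one block to a shared power-of-two scale plus FP4 elements."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    # Shared scale chosen so the block's largest magnitude lands in FP4 range.
    scale = 2.0 ** (math.floor(math.log2(amax)) - FP4_EMAX)
    out = []
    for v in block:
        mag = min(abs(v) / scale, FP4_GRID[-1])          # clamp to FP4 max
        q = min(FP4_GRID, key=lambda g: abs(g - mag))    # round to nearest
        out.append(math.copysign(q, v))
    return scale, out

def mx4_dequantize(scale, elems):
    """Reconstruct approximate values from the shared scale and FP4 elements."""
    return [scale * e for e in elems]

scale, elems = mx4_quantize([1.0, -2.0, 3.0, 6.0])
print(scale, mx4_dequantize(scale, elems))  # these values are exactly representable
```

Because each element occupies only 4 bits, a multiplier array of fixed silicon area can process several times as many MX4 operands per cycle as FP16/BF16 operands, which is the kind of trade behind the quoted 6x FLOPs figure.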

Specialized Focus on Inference and Ranking Workloads

The MTIA 300 was deployed a few weeks ago and is intended to help train smaller AI models that underpin Meta's core ranking and recommendation tasks, including showing people relevant content and online ads on Facebook and Instagram.[3] The upcoming chips, MTIA 400, 450, and 500, are intended for more cutting-edge generative AI inference tasks such as creating images and videos from people's written prompts.[3] Song clarified that the chips will not be used to train giant large language models.[3] "We're not building for the general market, so our chips don't need to be as general purpose," Song explained. "We can cut out things that we don't need, which really allows us to drive down cost."[2] The economics of chipmaking remain challenging: products typically take two years to go from design to production and cost billions of dollars, but custom chips can pay off at scale with high utilization rates.[2]
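That scale argument can be made concrete with a toy amortization model: a multibillion-dollar design cost only disappears into the per-hour price when it is spread across a large, highly utilized fleet. Every number below is a hypothetical round figure for illustration, not Meta's actual cost data.

```python
def cost_per_chip_hour(design_cost: float, unit_cost: float,
                       fleet_size: int, lifetime_hours: float,
                       utilization: float) -> float:
    """Amortized dollars per productive chip-hour over the fleet's lifetime."""
    total_cost = design_cost + unit_cost * fleet_size
    productive_hours = fleet_size * lifetime_hours * utilization
    return total_cost / productive_hours

FOUR_YEARS = 4 * 8760  # assumed deployment lifetime, in hours

# Same hypothetical $2B design effort: small idle fleet vs. large busy fleet.
small = cost_per_chip_hour(2e9, 5_000, 10_000, FOUR_YEARS, 0.6)
large = cost_per_chip_hour(2e9, 5_000, 500_000, FOUR_YEARS, 0.9)
print(f"${small:.2f}/chip-hour vs ${large:.2f}/chip-hour")
```

At small scale the design cost dominates every hour of compute; at Meta's scale, with hundreds of thousands of chips kept busy, the same fixed cost shrinks to cents per chip-hour.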

TheOutpost.ai

© 2026 Triveous Technologies Private Limited