Microsoft releases three in-house AI models in direct challenge to OpenAI partnership

Reviewed by Nidhi Govil

16 Sources

Microsoft unveiled three foundational AI models—MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2—marking its first major independent release since renegotiating its OpenAI partnership. The models handle speech-to-text transcription, voice generation, and image creation, positioning Microsoft as a direct competitor to its $13 billion investment partner while expanding its proprietary capabilities in the crowded AI market.

Microsoft Launches In-House AI Models After OpenAI Contract Renegotiation

Microsoft has released three foundational AI models that transcribe speech, generate voice, and create images, signaling a strategic shift toward independence from its longstanding OpenAI partnership [1]. The announcement marks the first publicly released output from the MAI Superintelligence team, formed in November 2025 under Mustafa Suleyman, CEO of Microsoft AI [4]. Six months after renegotiating a contract that previously barred independent frontier AI development, Microsoft now competes directly with the partner it spent $13 billion cultivating [4].

Source: CXOToday

The three models (MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2) are available through Microsoft Foundry and MAI Playground, with plans to integrate MAI-Image-2 into Bing and PowerPoint. These in-house AI models do not carry OpenAI's name anywhere on the label, a clear departure from Microsoft's previous reliance on its partner's technology [4].

Source: Gadgets 360

MAI-Transcribe-1 Delivers Enterprise-Grade Speech Recognition

The speech-to-text model claims the lowest word error rate across 25 languages on the FLEURS benchmark, averaging 3.8 percent [4]. Microsoft reports that MAI-Transcribe-1 outperforms OpenAI's Whisper-large-v3 on all 25 languages, Google's Gemini 3.1 Flash on 22 of 25, and ElevenLabs' Scribe v2 on 15 of 25 [4]. The model runs 2.5 times faster than Microsoft's Azure Fast offering and operates at approximately 50 percent lower GPU cost than leading alternatives [3].
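For readers unfamiliar with the metric, word error rate is the number of word-level substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The sketch below illustrates that definition only. It assumes plain whitespace tokenization and no text normalization, which is not necessarily how Microsoft or the FLEURS benchmark scored the models, and the function name `word_error_rate` is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count.

    Assumption: words are split on whitespace with no normalization
    (casing and punctuation are compared as-is).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of six reference words.
example = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

On this definition, an average of 3.8 percent corresponds to roughly 38 word errors per 1,000 reference words.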

Pricing starts at $0.36 per hour of audio, positioning it competitively among transcription services [1]. Suleyman told The Verge that the transcription model runs at "half the GPU cost of the other state-of-the-art models" and was built by a team of just 10 people [5]. The model handles noisy real-world conditions such as call centers and conference rooms, and Microsoft is testing integrations with Copilot and Teams [5].

Voice and Image Models Complete Multimodal AI Stack

MAI-Voice-1 generates 60 seconds of natural-sounding audio in under one second on a single GPU and supports custom voice creation from a few seconds of sample audio [4]. The text-to-speech model is priced at $22 per 1 million characters [1]. Combined with MAI-Transcribe-1 and a large language model of the customer's choosing, it forms a complete voice pipeline that runs entirely on Microsoft infrastructure without any dependency on OpenAI's technology [4].
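The three-stage voice pipeline described above (speech in, text through a language model, speech out) can be sketched as a simple composition of functions. This is only an illustration of the architecture: every name below is hypothetical, the stand-in stages do not call any Microsoft service, and nothing here reflects a real SDK.

```python
from typing import Callable

# Hypothetical stage signatures, chosen for illustration only.
Transcriber = Callable[[bytes], str]   # speech-to-text stage (the role MAI-Transcribe-1 plays)
Responder = Callable[[str], str]       # a large language model of the customer's choosing
Synthesizer = Callable[[str], bytes]   # text-to-speech stage (the role MAI-Voice-1 plays)


def build_voice_pipeline(
    transcribe: Transcriber,
    respond: Responder,
    synthesize: Synthesizer,
) -> Callable[[bytes], bytes]:
    """Chain the three stages into one audio-in, audio-out function."""

    def run(audio_in: bytes) -> bytes:
        transcript = transcribe(audio_in)  # audio -> text
        reply = respond(transcript)        # text -> text
        return synthesize(reply)           # text -> audio

    return run


# Stand-in stages so the sketch runs without any external service.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(text: str) -> str:
    return f"You said: {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = build_voice_pipeline(fake_transcribe, fake_respond, fake_synthesize)
reply_audio = pipeline(b"hello")
```

Keeping each stage behind a plain function signature is what lets the middle step be swapped for any language model, as the article notes.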

MAI-Image-2, originally released on MAI Playground on March 19, debuted at number three on the Arena.ai text-to-image leaderboard, placing behind only Google's Gemini 3.1 Flash and OpenAI's GPT Image 1.5 [4]. The model was developed in collaboration with photographers, designers, and visual storytellers, with WPP, one of the world's largest marketing groups, among the first enterprise partners building with it at scale [4]. Pricing starts at $5 per 1 million tokens for text input and $33 per 1 million tokens for image output [1].
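To put the quoted starting prices in context, the following back-of-the-envelope estimator multiplies usage by the list prices cited in this article. It is a rough sketch, not a billing tool: it assumes the "starts at" figures apply uniformly with no tiers, minimums, or rounding, and the function and constant names are illustrative.

```python
# Starting prices as quoted in this article (USD).
TRANSCRIBE_USD_PER_AUDIO_HOUR = 0.36        # MAI-Transcribe-1
VOICE_USD_PER_MILLION_CHARS = 22.0          # MAI-Voice-1
IMAGE_USD_PER_MILLION_TEXT_TOKENS = 5.0     # MAI-Image-2, text input
IMAGE_USD_PER_MILLION_OUTPUT_TOKENS = 33.0  # MAI-Image-2, image output


def estimate_cost(
    audio_hours: float = 0.0,
    tts_chars: int = 0,
    image_text_tokens: int = 0,
    image_output_tokens: int = 0,
) -> float:
    """Estimate spend in USD, assuming flat list prices (an assumption)."""
    transcription = audio_hours * TRANSCRIBE_USD_PER_AUDIO_HOUR
    speech = tts_chars / 1_000_000 * VOICE_USD_PER_MILLION_CHARS
    image = (
        image_text_tokens / 1_000_000 * IMAGE_USD_PER_MILLION_TEXT_TOKENS
        + image_output_tokens / 1_000_000 * IMAGE_USD_PER_MILLION_OUTPUT_TOKENS
    )
    return transcription + speech + image


# Example workload: 1,000 hours of audio and 2 million characters of speech.
monthly_estimate = estimate_cost(audio_hours=1000, tts_chars=2_000_000)
```

At these rates, 1,000 hours of transcription would come to about $360 and 2 million characters of generated speech to about $44.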

Strategic Shift Follows Partnership Renegotiation

Until the September 2025 renegotiation, Microsoft's original partnership agreement with OpenAI contractually prevented the company from independently pursuing general AI development [4]. The revised memorandum of understanding changed that calculus fundamentally: Microsoft retained licensing rights to everything OpenAI builds through 2032, gained $250 billion in new Azure cloud business commitments, and crucially won the freedom to build competing models [4].

Source: GeekWire

Suleyman acknowledged the pivot directly, stating that the contract renegotiation enabled Microsoft to independently pursue its own superintelligence [4]. In a March internal memo first reported by Business Insider, Suleyman wrote that he intended to focus all of his energy on superintelligence and deliver world-class models for Microsoft over the next five years [4]. He told VentureBeat that Microsoft plans to eventually build a frontier large language model to be "completely independent" if needed [5].

Microsoft Positions Models as Cost-Effective Alternative

In an increasingly crowded market, Microsoft is pitching lower prices than Google and OpenAI as a key selling point for these models [1]. "At Microsoft AI, we're building Humanist AI. We have a distinct view when creating our AI models: putting humans at the center, optimizing for how people actually communicate, training for practical use," Suleyman wrote in a blog post [1].

Naomi Moneypenny, who leads the Microsoft Azure AI Foundry Models product team, noted that "these are the same models already powering our own products such as Copilot, Bing, PowerPoint, and Azure Speech" [3]. Copilot's Audio Expressions runs on MAI-Voice-1, while Copilot's Voice Mode transcription service uses MAI-Transcribe-1 [3].

Microsoft Foundry, the platform formerly known as Azure AI Foundry and before that Azure AI Studio, now serves developers at more than 80,000 enterprises, including 80 percent of Fortune 500 companies [4]. That distribution advantage makes the MAI model family strategically significant: Microsoft does not need to beat OpenAI on every benchmark to shift enterprise spending [4]. The company also recently hired former Allen Institute for AI CEO Ali Farhadi and other top AI researchers to further bolster Suleyman's team [5].

TheOutpost.ai

© 2026 Triveous Technologies Private Limited