3 Sources
[1]
Microsoft launches MAI-Image-2-Efficient, a cheaper and faster AI image model
Microsoft today launched MAI-Image-2-Efficient, a lower-cost, higher-speed variant of its flagship text-to-image model that the company says delivers production-ready quality at nearly half the price. The release, available immediately in Microsoft Foundry and MAI Playground with no waitlist, marks the fastest turnaround yet from Microsoft's in-house AI superintelligence team, and the clearest signal that Redmond is serious about building a self-sufficient AI stack that doesn't depend on OpenAI.

The new model is priced at $5 per million text input tokens and $19.50 per million image output tokens; input pricing is unchanged from MAI-Image-2, while output pricing is roughly 41% below the flagship's $33. Microsoft says the model runs 22% faster than its flagship sibling and achieves 4x greater throughput efficiency per GPU, as measured on NVIDIA H100 hardware at 1024×1024 resolution. The company also claims it outpaces competing hyperscaler models, specifically naming Google's Gemini 3.1 Flash, Gemini 3.1 Flash Image, and Gemini 3 Pro Image, by an average of 40% on p50 latency benchmarks. The model is also rolling out across Copilot and Bing, Microsoft said, with additional product surfaces to follow.

Microsoft's two-model strategy borrows a page from the AI pricing playbook

Microsoft is positioning MAI-Image-2-Efficient and its flagship MAI-Image-2 as complementary tools rather than replacements for each other: a tiered pairing designed to cover the full spectrum of enterprise image generation needs. MAI-Image-2-Efficient targets high-volume, cost-sensitive production workloads: product photography, marketing creative, UI mockups, branded asset pipelines, and real-time interactive applications. It handles short-form in-image text like headlines and labels cleanly, according to Microsoft, and is built to operate within the tight latency and budget constraints of batch processing environments.
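The per-token prices above only translate into a cost per image once you assume how many tokens a generated image consumes; Microsoft hasn't published that figure. A back-of-envelope sketch, with the token counts as illustrative assumptions:

```python
# Back-of-envelope cost per image at MAI-Image-2-Efficient's published rates
# ($5 / 1M text input tokens, $19.50 / 1M image output tokens). The
# tokens-per-image figures below are illustrative assumptions, not
# published numbers.

INPUT_PRICE_PER_M = 5.00      # USD per 1M text input tokens
OUTPUT_PRICE_PER_M = 19.50    # USD per 1M image output tokens

def cost_per_image(prompt_tokens: int, image_tokens: int) -> float:
    """Dollar cost of one generation at the published per-token rates."""
    return (prompt_tokens * INPUT_PRICE_PER_M
            + image_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Assume a 50-token prompt and ~4,000 output tokens per 1024x1024 image.
efficient = cost_per_image(50, 4_000)

# Same assumed volume at the flagship's $33/M output rate, for comparison.
flagship = (50 * 5.00 + 4_000 * 33.00) / 1_000_000

print(f"efficient: ${efficient:.4f}/image, flagship: ${flagship:.4f}/image")
print(f"output-token price cut: {1 - 19.50 / 33.00:.0%}")
```

Whatever the real token counts turn out to be, the output-token term dominates, which is why the headline reduction tracks the $33-to-$19.50 cut.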
MAI-Image-2, meanwhile, remains the company's precision instrument -- the model you reach for when the brief demands the highest photorealistic fidelity, complex stylization like anime or illustration, or longer, more intricate in-image typography. Microsoft is effectively telling enterprise customers: use the efficient model for your assembly line, and the flagship for your showcase. This approach mirrors pricing strategies that have worked across the AI industry -- OpenAI's GPT model tiers, Anthropic's Haiku-Sonnet-Opus lineup, Google's Flash-Pro distinction -- but applies it specifically to image generation, a domain where cost-per-image economics can make or break production deployment at scale.

How Microsoft shipped a production-optimized image model in under a month

The speed of this release deserves attention. MAI-Image-2 itself only debuted on MAI Playground on March 19, as VentureBeat previously reported, with broader availability through Microsoft Foundry arriving on April 2 alongside two other new foundation models: MAI-Transcribe-1 (a speech-to-text model supporting 25 languages) and MAI-Voice-1 (an audio generation model). Less than a month later, Microsoft has shipped an optimized production variant. That cadence suggests the MAI Superintelligence team -- the research group led by Mustafa Suleyman, CEO of Microsoft AI, that was formed in November 2025 -- is operating more like a startup shipping iterative products than a traditional corporate research lab publishing papers. When Suleyman wrote in his April 2 blog post that the team was "building Humanist AI" with a focus on "optimizing for how people actually communicate, training for practical use," he appears to have meant it literally: the models aren't just shipping, they're shipping fast enough to have product roadmaps.

The early reception for MAI-Image-2 has been notably positive. Decrypt reported in its hands-on review that the model had already reached the No. 3 position on the Arena.ai leaderboard for image generation, trailing only Google and OpenAI. Decrypt's reviewer noted that the model's photorealism was "a real strength" and that its text rendering was "a legitimate highlight" that "handled complex typography with far more consistency than we expected." The review also found that in some direct comparisons, MAI-Image-2 outperformed OpenAI's GPT-Image on image quality and text rendering despite sitting below it on the leaderboard -- an observation that underscores how benchmark rankings don't always capture real-world utility.

That said, the original model shipped with significant constraints that Decrypt flagged: a 30-second cooldown between generations, a 15-image daily cap in the native UI, only 1:1 aspect ratio output, no image-to-image capabilities, and aggressive content filtering that blocked even innocuous creative prompts. Whether MAI-Image-2-Efficient inherits or relaxes any of these limitations isn't addressed in today's announcement, and enterprise customers accessing the model through the Foundry API will likely face different constraints than playground users.

Inside the fraying Microsoft-OpenAI relationship that made in-house models inevitable

Today's launch cannot be understood in isolation. It arrives at a moment when the relationship between Microsoft and OpenAI -- once the defining partnership of the generative AI era -- is visibly fraying at the seams. Just yesterday, CNBC reported that OpenAI's newly appointed chief revenue officer, Denise Dresser, sent an internal memo to staff explicitly stating that the Microsoft partnership "has also limited our ability to meet enterprises where they are." The memo reportedly touted OpenAI's new alliance with Amazon Web Services and the Bedrock platform as a key growth driver, describing inbound customer demand as "frankly staggering" since the partnership was announced in late February.
Microsoft added OpenAI to its list of competitors in its annual report in mid-2024. OpenAI, meanwhile, has diversified its cloud infrastructure across CoreWeave, Google, and Oracle, reducing its dependence on Microsoft Azure. The MAI model family is the most tangible expression of Microsoft's side of that strategic uncoupling. When Microsoft can generate production-quality images with its own model at $19.50 per million output tokens, the calculus for continuing to license OpenAI's image models -- and paying OpenAI a share of the resulting revenue -- shifts dramatically. Every MAI model that reaches production quality is a line item that Microsoft can potentially move off OpenAI's balance sheet and onto its own.

The organizational infrastructure to support this shift is already in place. On March 17, as disclosed in communications posted on Microsoft's official blog, CEO Satya Nadella announced a sweeping reorganization that unified the company's consumer and commercial Copilot efforts under a single leadership team, with Jacob Andreou elevated to EVP of Copilot reporting directly to Nadella. Critically, the reorganization also refocused Suleyman's role. As Nadella wrote in his message to employees, the company is "doubling down on our superintelligence mission with the talent and compute to build models that have real product impact, in terms of evals, COGS reduction, as well as advancing the frontier." That phrase -- "COGS reduction" -- is corporate-speak for reducing the cost of goods sold, and it points directly to the economic motivation behind models like MAI-Image-2-Efficient. Every dollar Microsoft saves by using its own models instead of licensing from partners flows straight to gross margin.

Why cheap, fast image generation is the secret ingredient for Microsoft's agentic AI future

There's one more dimension that makes today's release strategically significant, and it may be the most important one: the rise of AI agents.
TechCrunch reported yesterday that Microsoft is testing ways to integrate OpenClaw-like features into Microsoft 365 Copilot, building toward an always-on agent that can execute multi-step tasks over extended periods. The company has also launched Copilot Cowork (an agent that takes actions within Microsoft 365 apps), Copilot Tasks (an agent for completing multi-step personal productivity tasks), and Agent 365 (referenced in Nadella's March reorganization memo). Microsoft is expected to showcase these agentic capabilities at its Build conference in June.

In an agentic world -- where AI systems don't just answer questions but execute complex workflows autonomously -- image generation becomes a primitive that agents call programmatically, not a standalone product that users interact with manually. An enterprise agent building a marketing campaign might need to generate dozens of product images, create social media assets, produce presentation graphics, and iterate on design concepts, all without human intervention at each step. The economics of that workflow are governed entirely by per-token pricing and latency, which is precisely what MAI-Image-2-Efficient optimizes for. If Microsoft's vision for Copilot involves agents that generate images as a routine subtask within larger workflows, those agents need image generation that's fast enough to not create bottlenecks and cheap enough to not blow up cost projections when called thousands of times per day. The 4x efficiency improvement and 41% price cut aren't just nice marketing numbers -- they're architectural requirements for the agentic future Microsoft is betting the company on.

What Microsoft still hasn't answered about its new image model

Several important questions remain unaddressed by today's announcement. Microsoft didn't disclose whether MAI-Image-2-Efficient resolves the aspect ratio limitations and aggressive content filtering that reviewers flagged in the original model.
The company also didn't specify whether the quality-to-speed tradeoffs involve visible degradation on complex prompts. The announcement describes "production-ready quality" and "flagship quality" interchangeably, but distilled models of any kind typically involve some quality concession.

The footnotes in the press release also reveal the narrow conditions under which the benchmark claims were tested: efficiency figures were measured on NVIDIA H100 at 1024×1024 with "optimized batch sizes and matched latency targets," and the latency comparisons against Google models were conducted at p50 (median) rather than p95 or p99, which would capture worst-case performance. Enterprise customers running diverse workloads at varying concurrency levels may see different results.

MAI Playground is currently available only in select markets, including the U.S., with EU availability listed as "coming soon." Copilot integration is underway but not complete. And the enterprise API through Foundry, while live, is still in early deployment.

But the trajectory is unmistakable. In less than five months since the MAI Superintelligence team was announced, Microsoft has shipped a flagship image model, three additional foundation models, and now a cost-optimized production variant -- all while reorganizing its entire Copilot organization, navigating a fracturing relationship with its most important AI partner, and laying the groundwork for agentic AI features that could redefine enterprise productivity. Whether all of that is fast enough to catch Anthropic's momentum, contain OpenAI's drift toward Amazon, and justify a $600 price target is the multi-hundred-billion-dollar question. But for a company that spent the first two years of the generative AI era mostly reselling someone else's technology, Microsoft is now doing something it hasn't done in a long time in AI: shipping its own work, on its own schedule, at its own price -- and daring the market to keep up.
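The p50 caveat flagged above is worth making concrete: a median latency figure says nothing about tail behavior, which p95/p99 capture and which enterprise SLAs usually care about. A small illustration using made-up latency samples and a simple nearest-rank percentile:

```python
# Illustrating why p50 vs p95/p99 matters for latency benchmarks.
# The latency samples below are invented purely for illustration.

def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples (ms)."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

# 18 fast generations plus 2 slow outliers (e.g., cold batches).
latencies_ms = [800] * 18 + [4_000, 6_000]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
print(p50, p95, p99)  # the median hides the outliers; p95/p99 expose them
```

A workload like this looks excellent in a p50 comparison while its tail latency is several times worse, which is exactly the gap the press-release footnotes leave unmeasured.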
[2]
Microsoft's MAI-Image-2-Efficient model accelerates company's move away from OpenAI - SiliconANGLE
Microsoft Corp.'s push for artificial intelligence independence is gaining traction with today's release of MAI-Image-2-Efficient, a lean and mean version of its flagship image generation model that debuted earlier this month. The new version is meant to deliver high-quality visuals faster, and at a fraction of the cost of its predecessor, the company said.

The original MAI-Image-2 model was released by Microsoft's new MAI superintelligence team earlier this month, and the team led by Mustafa Suleyman gets credit for launching a streamlined version of that model just weeks later. As the name suggests, MAI-Image-2-Efficient is all about efficiency, with Microsoft promising four times the throughput per graphics processing unit when running on Nvidia Corp.'s H100 processors. In terms of raw performance, it's around 22% faster than the main MAI-Image-2 model, outpacing Google LLC's Gemini 3.1 Flash with 40% lower latency, the company said, citing its own benchmarks.

Microsoft said MAI-Image-2-Efficient is more cost-effective too, with pricing starting at $5 per million input tokens, the same as the flagship model, and $19.50 per million output tokens, which is 41% cheaper.

This two-tier pricing strategy, with users able to choose between a higher-end model for high-fidelity creative work and an efficient version for volume production, mirrors the playbooks of Google and startups like OpenAI Group PBC and Anthropic PBC. The company said MAI-Image-2-Efficient is particularly suited to "assembly line" tasks such as user interface mockups, product photography and marketing assets, where cost-per-image is a key consideration.
Early feedback on the original MAI-Image-2 has been positive, with the model claiming the number three spot on Arena.ai's leaderboard and earning praise for its photorealism and ability to render complex typography - an area where many other image generation models struggle. But it's not clear yet if MAI-Image-2-Efficient suffers from the same irksome constraints as its sibling. For instance, the original received a lot of criticism for its 30-second "cooldown" period between prompts, the inability to generate images with anything other than a 1:1 aspect ratio, and its overly aggressive content filtering.

Microsoft's push to develop its own models is being driven by its desire to reduce its reliance on OpenAI, in which it has invested heavily over the years. There was a time when the Windows software maker was almost exclusively reliant on OpenAI's models for its AI products and capabilities. But it has seemingly become wary of being "locked in" to an increasingly powerful company that has shown a desire to assert its independence.

Just yesterday, CNBC reported that it had surfaced an internal memo from OpenAI's new Chief Revenue Officer Denise Dresser, which told staff that the company's partnership with Microsoft "has also limited our ability to meet enterprises where they are." The memo pointed to OpenAI's new partnership with Amazon Web Services Inc.'s Bedrock platform as a key growth driver, noting that inbound demand was "frankly, staggering" since the collaboration was announced in February. OpenAI has also looked to diversify its cloud infrastructure base to providers such as CoreWeave Inc., Google LLC and Oracle Corp. at Microsoft's expense. For its part, Microsoft added OpenAI to its official list of competitors in mid-2024.
Meanwhile, Valoir analyst Rebecca Wettemann told SiliconANGLE in January that Microsoft's reliance on OpenAI is becoming a major concern for investors, because OpenAI is facing increased competition from rivals like Google and Anthropic. "Investors are losing patience, largely because much of Microsoft's eventual potential payback is tied to money coming from OpenAI that is mostly hypothetical at this point," she explained.

Developing its own models can be much more profitable for Microsoft. The MAI models are now the default option whenever someone asks the company's Copilot tool to generate an image, replacing OpenAI's DALL-E model. That means Microsoft keeps those costs in-house, rather than paying licensing fees to OpenAI.

MAI-Image-2-Efficient also helps to accelerate Microsoft's agentic AI strategy. With the launch of services such as Copilot Cowork and Agent 365, Microsoft is moving towards a world where AI executes complex, multistep tasks and workflows on behalf of users. Should an autonomous agent be asked to spin up materials for a new marketing campaign, low latency and reduced costs are essential, enabling it to iterate at scale.
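The iterate-at-scale economics described above can be sketched quickly. The per-image cost, latency, and concurrency figures below are illustrative assumptions, not published numbers, but they show how an agent workload's daily budget and wall-clock load fall directly out of per-call pricing and latency:

```python
# Sketch of the scale economics for an agent that calls image generation
# as a subtask: daily dollar cost plus total generation time for N images.
# All input figures are illustrative assumptions.

def daily_workload(images_per_day: int,
                   cost_per_image_usd: float,
                   latency_s: float,
                   concurrency: int) -> tuple[float, float]:
    """Return (daily cost in USD, hours of generation time at the given
    level of parallelism)."""
    cost = images_per_day * cost_per_image_usd
    busy_hours = images_per_day * latency_s / concurrency / 3600
    return cost, busy_hours

# E.g., 5,000 images/day at ~$0.08/image, 4 s per call, 8 parallel calls.
cost, hours = daily_workload(5_000, 0.08, 4.0, 8)
print(f"${cost:.2f}/day, {hours:.1f} h of generation time")
```

Halving either the per-image price or the latency scales the corresponding output linearly, which is why the 41% price cut and 22% speedup matter more for agent pipelines than for one-off creative use.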
[3]
Microsoft unveils MAI Image 2 Efficient AI model, calls it production workhorse: How to access
The company says MAI Image 2 Efficient is 22 per cent faster and four times more efficient than the original model.

Microsoft introduced its MAI Image 2 AI model last month, which is said to create more realistic images and generate clearer text within visuals. Now, the tech giant has announced a new version called MAI Image 2 Efficient, which is faster and cheaper. Microsoft describes the new model as a 'production workhorse' which provides 'production-ready quality.' It is also priced lower: the new model costs $5 per one million text input tokens and $19.50 per one million image output tokens, roughly 41 per cent cheaper on output tokens than its earlier offering. The company also claims the new model is about 40 per cent faster on average than other leading text-to-image AI models.

Microsoft has positioned the two models for different types of work. The new MAI Image 2 Efficient is meant to handle large-scale tasks where speed and cost matter the most. It can generate assets such as product images, marketing visuals, UI mockups and other branded materials in large batches. The company says the model also performs well with short pieces of text placed inside images, such as headlines, captions, or labels.

On the other hand, the original MAI Image 2 model is still available for projects where image quality and detail are the top priority. Microsoft says this version works better for portraits, photorealistic scenes, stylised artwork like anime or illustrations, and images that need longer or more complex texts.

Microsoft also said that some partners have already started experimenting with the new model.
Shutterstock is among the early testers, and the company says initial results look promising. MAI Image 2 Efficient is available through Microsoft Foundry and the MAI Playground. The model is also rolling out across Copilot and Bing, with integration in more platforms like PowerPoint coming soon.
Microsoft unveiled MAI-Image-2-Efficient, a cost-effective image generation model priced at $19.50 per million image output tokens, 41% cheaper than its flagship. The new model delivers 4x greater throughput efficiency per GPU and ships just weeks after the original, marking the company's rapid push to build an independent AI stack beyond OpenAI.
Microsoft launched MAI-Image-2-Efficient, a streamlined variant of its flagship text-to-image AI model that delivers production-ready quality at nearly half the price [1](https://venturebeat.com/technology/microsoft-launches-mai-image-2-efficient-a-cheaper-and-faster-ai-image-model). Available immediately through Microsoft Foundry and MAI Playground with no waitlist, the new model costs $5 per million text input tokens and $19.50 per million image output tokens; input pricing matches MAI-Image-2's, while output pricing is roughly 41% below the flagship's $33 [1]. The release represents the fastest turnaround yet from Microsoft's in-house MAI Superintelligence team led by Mustafa Suleyman, and signals Redmond's determination to build a self-sufficient AI stack that doesn't rely on OpenAI [1].
Source: VentureBeat
The new model runs 22% faster than its flagship sibling and achieves 4x greater throughput efficiency per GPU, as measured on NVIDIA H100 hardware at 1024×1024 resolution [1][2]. Microsoft claims it outpaces competing hyperscaler models, specifically Google Gemini 3.1 Flash, Gemini 3.1 Flash Image, and Gemini 3 Pro Image, by an average of 40% on p50 latency benchmarks [1]. The model is rolling out across Copilot and Bing, with additional integrations like PowerPoint coming soon [3]. MAI-Image-2-Efficient has already replaced OpenAI's DALL-E as the default option when users ask Copilot to generate an image, allowing Microsoft to keep those costs in-house rather than paying licensing fees [2].
Microsoft positions MAI-Image-2-Efficient and its flagship MAI-Image-2 as complementary tools covering the full spectrum of enterprise image generation needs [1]. The efficient variant targets high-volume, cost-sensitive production workloads: product photography, marketing creative, UI mockups, branded asset pipelines, and real-time interactive applications [1]. It handles short-form in-image text like headlines and labels cleanly, operating within the tight latency and budget constraints of batch processing environments [3]. Meanwhile, MAI-Image-2 remains the precision instrument for projects demanding the highest photorealism, complex stylization like anime or illustration, or longer, more intricate in-image typography [3]. This tiered approach mirrors pricing strategies that have worked across the AI industry: OpenAI's GPT model tiers, Anthropic's Haiku-Sonnet-Opus lineup, and Google's Flash-Pro distinction [1].
Source: SiliconANGLE
The speed of this release deserves attention. MAI-Image-2 itself only debuted on MAI Playground on March 19, with broader availability through Microsoft Foundry arriving on April 2 alongside MAI-Transcribe-1 and MAI-Voice-1 [1]. Less than a month later, Microsoft shipped an optimized production variant, a cadence suggesting the MAI Superintelligence team operates more like a startup shipping iterative products than a traditional corporate research lab [1]. Microsoft's push to develop its own models stems from its desire to reduce reliance on OpenAI, in which it has invested heavily over the years [2]. An internal memo from OpenAI's Chief Revenue Officer revealed that the partnership with Microsoft "has also limited our ability to meet enterprises where they are," pointing to OpenAI's new AWS Bedrock partnership as a key growth driver with "frankly, staggering" inbound demand [2]. Microsoft added OpenAI to its official list of competitors in mid-2024 [2].
Early reception for MAI-Image-2 has been notably positive, with the model reaching the No. 3 position on the Arena.ai leaderboard for image generation, trailing only Google and OpenAI [1]. Reviewers noted that its photorealism was "a real strength" and its text rendering was "a legitimate highlight" that handled complex typography with far more consistency than expected [1]. Shutterstock is among the early testers experimenting with the new model, with initial results looking promising [3]. MAI-Image-2-Efficient also accelerates Microsoft's agentic AI strategy: with services like Copilot Cowork and Agent 365, the company is moving toward a world where AI executes complex, multistep tasks autonomously [2]. Should an autonomous agent be asked to spin up materials for a new marketing campaign, low latency and reduced costs become essential for iterating at scale [2].
Summarized by Navi