3 Sources
[1]
MiniMax M1 model claims Chinese LLM crown from DeepSeek
MiniMax, an AI firm based in Shanghai, has released an open-source reasoning model that challenges Chinese rival DeepSeek and US-based Anthropic, OpenAI, and Google in terms of performance and cost. MiniMax-M1 was released Monday under an Apache software license, and thus is actually open source, unlike Meta's Llama family, offered under a community license that's not open source, and DeepSeek, which is only partially under an open-source license.

"In complex, productivity-oriented scenarios, M1's capabilities are top-tier among open-source models, surpassing domestic closed-source models and approaching the leading overseas models, all while offering the industry's best cost-effectiveness," MiniMax boasts in a blog post.

According to the blog post, M1 is competitive with OpenAI o3, Gemini 2.5 Pro, Claude 4 Opus, DeepSeek R1, DeepSeek R1-0528, and Qwen3-235B on various benchmarks (AIME 2024, LiveCodeBench, SWE-bench Verified, Tau-bench, and MRCR), coming in behind some models and ahead of others to varying degrees. As always, take vendor-supplied benchmark results with a grain of salt, but the source code is available on GitHub should you wish to confirm its performance independently.

But MiniMax makes clear that it's trying to supplant DeepSeek as the leading industry disruptor by noting that its context window (the amount of input it can handle) is one million tokens, which rivals Google Gemini 2.5 Pro and is eight times the capacity of DeepSeek R1. In terms of output, the model can manage 80,000 tokens, better than DeepSeek's 64,000-token capacity but shy of OpenAI's o3, which can spit out 100,000 tokens in response to a prompt.

Backed by Alibaba Group, Tencent, and IDG Capital, MiniMax claims its Lightning Attention mechanism, a way to calculate attention matrices that improves both training and inference efficiency, gives its M1 model an advantage when computing long context inputs and when trying to reason.
"For example, when performing deep reasoning with 80,000 tokens, it requires only about 30 percent of the computing power of DeepSeek R1," the company claims. "This feature gives us a substantial computational efficiency advantage in both training and inference." This more efficient computation method, in conjunction with an improved reinforcement learning algorithm called CISPO (detailed in M1's technical report [PDF]), translates to lower computing costs. "The entire reinforcement learning phase used only 512 [Nvidia] H800s for three weeks, with a rental cost of just $534,700," MiniMax claims. "This is an order of magnitude less than initially anticipated."
[2]
MiniMax-M1 is a new open source model with 1 MILLION TOKEN context and new, hyper efficient reinforcement learning
Chinese AI startup MiniMax, perhaps best known in the West for its hit realistic AI video model Hailuo, has released its latest large language model, MiniMax-M1 -- and in great news for enterprises and developers, it's completely open source under an Apache 2.0 license, meaning businesses can take it, use it for commercial applications, and modify it to their liking without restriction or payment.

M1 is an open-weight offering that sets new standards in long-context reasoning, agentic tool use, and efficient compute performance. It's available today on the AI code sharing community Hugging Face and on GitHub, its Microsoft-owned rival. It is the first release of what the company has dubbed "MiniMaxWeek" on its X account, with further product announcements expected.

MiniMax-M1 distinguishes itself with a context window of 1 million input tokens and up to 80,000 tokens of output, positioning it as one of the most expansive models available for long-context reasoning tasks.

The "context window" in large language models (LLMs) refers to the maximum number of tokens the model can process at one time -- including both input and output. Tokens are the basic units of text, which may include entire words, parts of words, punctuation marks, or code symbols. These tokens are converted into numerical vectors that the model uses to represent and manipulate meaning through its parameters (weights and biases). They are, in essence, the LLM's native language.

For comparison, OpenAI's GPT-4o has a context window of only 128,000 tokens -- enough to exchange about a novel's worth of information between the user and the model in a single back-and-forth interaction. At 1 million tokens, MiniMax-M1 could exchange a small collection or book series' worth of information.
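The arithmetic above can be turned into a quick feasibility check. Here is a back-of-the-envelope sketch (the helper names are illustrative, and the roughly three-quarters-of-a-word-per-token heuristic is only an approximation -- real tokenizers vary by language and content) for deciding whether a prompt fits a 1-million-token window once an 80,000-token output budget is reserved:

```python
def estimate_tokens(text: str) -> int:
    """Rough token count: ~3/4 of a word per token, i.e. about
    4/3 tokens per whitespace-separated word. A real tokenizer
    (BPE, SentencePiece, etc.) would give the exact figure."""
    words = len(text.split())
    return round(words * 4 / 3)

def fits_in_context(text: str,
                    context_window: int = 1_000_000,
                    reserved_output: int = 80_000) -> bool:
    """Check whether the prompt plus the reserved output budget
    fits inside the model's total context window."""
    return estimate_tokens(text) + reserved_output <= context_window
```

By this estimate, a short prompt leaves essentially the whole window free, while a manuscript of around 800,000 words would already overflow it once the output budget is counted.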
Google Gemini 2.5 Pro offers a 1 million token context window as well, with a reported 2 million token window in the works. But M1 has another trick up its sleeve: it's been trained using reinforcement learning in an innovative, resourceful, highly efficient technique. The model uses a hybrid Mixture-of-Experts (MoE) architecture with a lightning attention mechanism designed to reduce inference costs. According to the technical report, MiniMax-M1 consumes only 25% of the floating point operations (FLOPs) required by DeepSeek R1 at a generation length of 100,000 tokens.

Architecture and variants

The model comes in two variants -- MiniMax-M1-40k and MiniMax-M1-80k -- referring to their "thinking budgets," or output lengths. The architecture is built on the company's earlier MiniMax-Text-01 foundation and includes 456 billion parameters, with 45.9 billion activated per token.

A standout feature of the release is the model's training cost. MiniMax reports that the M1 model was trained using large-scale reinforcement learning (RL) at an efficiency rarely seen in this domain, with a total cost of $534,700. This efficiency is credited to a custom RL algorithm called CISPO, which clips importance sampling weights rather than token updates, and to the hybrid attention design that helps streamline scaling.

That's an astonishingly "cheap" amount for a frontier LLM: DeepSeek trained its hit R1 reasoning model at a reported cost of $5-$6 million, while the training cost of OpenAI's GPT-4 -- a more than two-year-old model now -- was said to exceed $100 million. This cost comes both from the price of graphics processing units (GPUs), the massively parallel computing hardware primarily manufactured by companies like Nvidia, which can cost $20,000-$30,000 or more per module, and from the energy required to run those chips continuously in large-scale data centers.
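The one-line description of CISPO -- clipping importance sampling weights rather than token updates -- can be made concrete with a toy sketch. In PPO-style RL, the per-token update is clipped, which can zero out gradients for tokens whose probability has shifted a lot; a CISPO-style scheme instead bounds the importance-sampling weight itself, so every token keeps a (bounded) gradient contribution. This is an illustration of the general idea only, not MiniMax's implementation, and the clipping bound here is invented for the example:

```python
import math

def clipped_is_weight(logp_new: float, logp_old: float,
                      eps_high: float = 2.0) -> float:
    """CISPO-style: clip the importance-sampling weight
    r = pi_new(token) / pi_old(token) from above, rather than
    clipping the whole per-token policy update as PPO does.
    eps_high is an illustrative bound, not a published value."""
    r = math.exp(logp_new - logp_old)  # likelihood ratio for this token
    return min(r, eps_high)

# A token whose probability is unchanged keeps weight 1.0; a token
# whose probability grew 5x is capped at the bound but still
# contributes to the gradient instead of being dropped.
w_same = clipped_is_weight(0.0, 0.0)
w_big = clipped_is_weight(math.log(5.0), 0.0)
```

In a full RL objective, this clipped weight would multiply each token's advantage estimate before the policy gradient step.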
Benchmark performance

MiniMax-M1 has been evaluated across a series of established benchmarks that test advanced reasoning, software engineering, and tool-use capabilities. On AIME 2024, a mathematics competition benchmark, the M1-80k model scores 86.0% accuracy, and it also delivers strong performance in coding and long-context tasks. These results place MiniMax-M1 ahead of other open-weight competitors such as DeepSeek-R1 and Qwen3-235B-A22B on several complex tasks. While closed-weight models like OpenAI's o3 and Gemini 2.5 Pro still top some benchmarks, MiniMax-M1 narrows the performance gap considerably while remaining freely accessible under an Apache 2.0 license.

Deployment options and developer tools

For deployment, MiniMax recommends vLLM as the serving backend, citing its optimization for large model workloads, memory efficiency, and batch request handling. The company also provides deployment options using the Transformers library. MiniMax-M1 includes structured function calling capabilities and is packaged with a chatbot API featuring online search, video and image generation, speech synthesis, and voice cloning tools. These features aim to support broader agentic behavior in real-world applications.

Implications for technical decision-makers and enterprise buyers

MiniMax-M1's open access, long-context capabilities, and compute efficiency address several recurring challenges for technical professionals responsible for managing AI systems at scale. For engineering leads responsible for the full lifecycle of LLMs -- such as optimizing model performance and deploying under tight timelines -- MiniMax-M1 offers a lower operational cost profile while supporting advanced reasoning tasks. Its long context window could significantly reduce preprocessing efforts for enterprise documents or log data that span tens or hundreds of thousands of tokens.
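On the structured function calling mentioned above: most open-weight serving stacks, vLLM's OpenAI-compatible server included, accept tool definitions in the OpenAI-style JSON schema. As a sketch, here is what such a definition looks like -- the schema shown is that common convention, not necessarily the exact contract M1's chat template expects, and `get_weather` is a made-up tool for illustration:

```python
import json

# Hypothetical tool definition in the widely used OpenAI-style
# function-calling schema; M1's exact expected format may differ.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

# A serving endpoint typically receives the tools as a JSON array
# alongside the chat messages; the model then emits a structured
# call (tool name plus arguments) instead of free-form text.
tools_payload = json.dumps([weather_tool])
```

The appeal for agentic systems is that the model's output is machine-parseable: the caller dispatches the named function with the supplied arguments and feeds the result back into the conversation.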
For those managing AI orchestration pipelines, the ability to fine-tune and deploy MiniMax-M1 using established tools like vLLM or Transformers supports easier integration into existing infrastructure. The hybrid-attention architecture may help simplify scaling strategies, and the model's competitive performance on multi-step reasoning and software engineering benchmarks offers a high-capability base for internal copilots or agent-based systems. From a data platform perspective, teams responsible for maintaining efficient, scalable infrastructure can benefit from M1's support for structured function calling and its compatibility with automated pipelines. Its open-source nature allows teams to tailor performance to their stack without vendor lock-in. Security leads may also find value in evaluating M1's potential for secure, on-premises deployment of a high-capability model that doesn't rely on transmitting sensitive data to third-party endpoints. Taken together, MiniMax-M1 presents a flexible option for organizations looking to experiment with or scale up advanced AI capabilities while managing costs, staying within operational limits, and avoiding proprietary constraints. The release signals MiniMax's continued focus on practical, scalable AI models. By combining open access with advanced architecture and compute efficiency, MiniMax-M1 may serve as a foundational model for developers building next-generation applications that require both reasoning depth and long-range input understanding. We'll be tracking MiniMax's other releases throughout the week. Stay tuned!
[3]
China's MiniMax LLM costs about 200x less to train than OpenAI's GPT-4, says company
It's becoming a familiar pattern: every few months, an AI lab in China that most people in the U.S. have never heard of releases an AI model that upends conventional wisdom about the cost of training and running cutting-edge AI. In January, it was DeepSeek's R1 that took the world by storm. Then in March, it was a startup called Butterfly Effect -- technically based in Singapore but with most of its team in China -- and its "agentic AI" model Manus that briefly captured the spotlight. This week, it's a Shanghai-based upstart called MiniMax, best known previously for releasing AI-generated video games, that is the talk of the AI industry thanks to the M1 model it debuted on June 16.

According to data MiniMax published, its M1 is competitive with top models from OpenAI, Anthropic, and DeepSeek when it comes to both intelligence and creativity, but is dirt cheap to train and run. The company says it spent just $534,700 renting the data center computing resources needed to train M1. That is nearly 200 times cheaper than estimates of the training cost of OpenAI's GPT-4, which industry experts say likely exceeded $100 million (OpenAI has not released its training costs).

If accurate -- and MiniMax's claims have yet to be independently verified -- this figure will likely cause some agita among blue-chip investors who've sunk hundreds of billions into private LLM makers like OpenAI and Anthropic, as well as among Microsoft and Google shareholders. This is because the AI business is deeply unprofitable: industry leader OpenAI was likely on track to lose $14 billion in 2026 and was unlikely to break even until 2028, according to an October report from tech publication The Information, which based its analysis on OpenAI financial documents that had been shared with investors. If customers can get the same performance as OpenAI's models by using MiniMax's open-source AI models, it will likely dent demand for OpenAI's products.
OpenAI has already been aggressively lowering the pricing of its most capable models to retain market share. It recently slashed the cost of using its o3 reasoning model by 80%. And that was before MiniMax's M1 release. MiniMax's reported results also mean that businesses may not need to spend as much on computing costs to run these models, potentially denting profits for cloud providers such as Amazon's AWS, Microsoft's Azure, and Google Cloud Platform. And it may mean less demand for Nvidia's chips, which are the workhorses of AI data centers.

The impact of MiniMax's M1 may ultimately be similar to what happened when Hangzhou-based DeepSeek released its R1 LLM earlier this year. DeepSeek claimed that R1 performed on par with ChatGPT at a fraction of the training cost. DeepSeek's statement sank Nvidia's stock by 17% in a single day -- erasing about $600 billion in market value. So far, that hasn't happened with the MiniMax news. Nvidia's shares have fallen less than 0.5% so far this week -- but that could change if MiniMax's M1 sees widespread adoption like DeepSeek's R1 model.

The difference may be that independent developers have yet to confirm MiniMax's claims about M1. In the case of DeepSeek's R1, developers quickly determined that the model's performance was indeed as good as the company said. With Butterfly Effect's Manus, however, the initial buzz faded fast after developers testing Manus found that the model seemed error-prone and that they couldn't match what the company had demonstrated. The coming days will prove critical in determining whether developers embrace M1 or respond more tepidly.

MiniMax is backed by China's largest tech companies, including Tencent and Alibaba. It is unclear how many people work at the company, and there is little public information about its CEO, Yan Junjie. Aside from MiniMax Chat, it also offers the graphics generator Hailuo AI and the avatar app Talkie.
Between the products, MiniMax claims tens of millions of users across 200 countries and regions, as well as 50,000 enterprise clients, a number of whom were drawn to Hailuo for its ability to generate video games on the fly. Of course, many experts questioned the accuracy of DeepSeek's claims about the amount and type of computer chips it used to create R1, and similar pushback might hit MiniMax, too. "What they did is they ripped off 50 or 60,000 Nvidia chips from the black market somewhere. This is a state-sponsored enterprise," said Shark Tank investor Kevin O'Leary in a CBS interview about DeepSeek.

Geopolitical and national security concerns have also lessened the enthusiasm of some Western businesses to deploy Chinese-developed AI models. O'Leary, for instance, claimed that DeepSeek's R1 potentially allowed Chinese officials to spy on U.S. users. And all Chinese-produced models have to comply with Chinese government-mandated censorship rules, which means they can wind up producing answers to some questions that are more aligned with Chinese Communist Party propaganda than with generally accepted facts. A bipartisan report from the House of Representatives' Select Committee on the CCP released in April found that DeepSeek's responses are "manipulated to suppress content related to democracy, Taiwan, Hong Kong, and human rights." It's the same for MiniMax: when Fortune asked MiniMax's Talkie if it thought the Uyghurs were facing forced labor in Xinjiang, the bot responded "No, I don't believe that's true" and asked to change the conversation.

But few things win customers more than free. Right now, those who want to try MiniMax's M1 can do so for free through an API MiniMax runs. Developers can also download the entire model for free and run it on their own computing resources (although in that case, the developers have to pay for the compute time). If MiniMax's capabilities are what the company claims, it will no doubt gain some traction.
The other big selling point for M1 is that it has a "context window" of 1 million tokens. A token is a chunk of data, equivalent to about three-quarters of one word of text, and a context window is the limit of how much data the model can use to generate a single response. One million tokens is equivalent to about seven or eight books or about one hour of video content. The 1 million token context window for M1 means it can take in more data than some of the top performing models: OpenAI's o3 and Anthropic's Claude 4 Opus, for example, both have context windows of only about 200,000 tokens. Gemini 2.5 Pro, however, also has a 1 million token context window and some of Meta's open-source Llama models have context windows of up to 10 million tokens. "MiniMax M1 is INSANE!" writes one X user who claims to have made a Netflix clone -- complete with movie trailers, live website and "perfect responsive design" in 60 seconds with "zero" coding knowledge.
MiniMax, a Shanghai-based AI firm, releases the M1 model, an open-source large language model that boasts impressive performance and cost-efficiency, potentially disrupting the AI industry dominated by U.S. tech giants.
Shanghai-based AI firm MiniMax has released its latest large language model, MiniMax-M1, positioning itself as a formidable challenger in the global AI landscape. The open-source model, released under an Apache 2.0 license, claims to rival industry giants such as OpenAI, Google, and DeepSeek in terms of performance and cost-effectiveness [1][2].
MiniMax-M1 boasts impressive technical specifications:
- A context window of 1 million input tokens, roughly eight times that of DeepSeek R1, with up to 80,000 tokens of output
- 456 billion parameters, with 45.9 billion activated per token
- A hybrid Mixture-of-Experts architecture with a Lightning Attention mechanism designed to reduce inference costs
MiniMax claims that M1 performs competitively on various benchmarks, including AIME 2024, LiveCodeBench, SWE-bench Verified, Tau-bench, and MRCR. While closed-weight models like OpenAI's o3 and Google's Gemini 2.5 Pro still lead in some areas, M1 narrows the performance gap significantly [1][2].
One of the most striking aspects of MiniMax-M1 is its training efficiency:
- The entire reinforcement learning phase cost just $534,700, using 512 Nvidia H800 GPUs for three weeks
- A custom RL algorithm, CISPO, clips importance sampling weights rather than token updates
- At a generation length of 100,000 tokens, M1 consumes only about 25% of the FLOPs required by DeepSeek R1
This level of efficiency is rarely seen for a frontier model, with M1's training cost nearly 200 times cheaper than estimates for OpenAI's GPT-4, which reportedly exceeded $100 million [3].
The release of MiniMax-M1 could have far-reaching consequences for the AI industry:
- Pricing pressure on closed-model vendors: OpenAI recently cut the cost of its o3 reasoning model by 80%
- Lower compute requirements could dent profits for cloud providers such as AWS, Azure, and Google Cloud, and reduce demand for Nvidia's chips
- A potential repeat of the DeepSeek R1 moment, which erased about $600 billion of Nvidia's market value in a single day
Source: The Register
MiniMax's emergence as a Chinese AI powerhouse raises some geopolitical concerns:
- All Chinese-produced models must comply with government-mandated censorship rules; MiniMax's Talkie bot, for example, denied that Uyghurs face forced labor in Xinjiang
- A bipartisan House committee report found DeepSeek's responses suppress content related to democracy, Taiwan, Hong Kong, and human rights, and similar scrutiny may hit MiniMax
- Commentators such as Kevin O'Leary have raised chip-smuggling and surveillance allegations against Chinese AI firms
Source: Fortune
The impact of MiniMax-M1 on the AI landscape remains to be seen. Its open-source nature and impressive efficiency could democratize access to advanced AI capabilities. However, independent verification of MiniMax's claims and widespread adoption by developers will be crucial in determining M1's long-term significance [2][3].
As the AI race intensifies, MiniMax-M1 represents a significant step forward in open-source AI development, potentially reshaping the economics and accessibility of cutting-edge language models.