2 Sources
[1]
Xiaomi's MiMo AI Models Arrive With Efficient Reasoning, Small Size
- MiMo is a seven-billion-parameter AI model
- It is said to be built entirely in-house from scratch
- MiMo is said to match the performance of OpenAI's o1-mini

Xiaomi on Tuesday released an open-source reasoning-focused artificial intelligence (AI) model. Dubbed MiMo, the family of reasoning models innovates on optimising reasoning capability at a relatively small parameter size. It is also the tech giant's first open-source reasoning model, and it competes with Chinese models such as DeepSeek R1 and Alibaba's Qwen QwQ-32B, as well as global reasoning models including OpenAI's o1 and Google's Gemini 2.0 Flash Thinking. The MiMo family comprises four different models, each with unique use cases.

With the MiMo series, Xiaomi's researchers aimed to solve the size problem in reasoning AI models. Most effective reasoning models (at least those whose performance can be measured) have around 24 billion or more parameters. The large size is kept to achieve uniform, simultaneous improvements in both the coding and mathematical capabilities of large language models, something considered difficult to achieve with smaller models. In comparison, MiMo features seven billion parameters, and Xiaomi claims its performance matches OpenAI's o1-mini and outperforms several reasoning models with 32 billion parameters.

The researchers said the base model was pre-trained on 25 trillion tokens, and that this efficiency was achieved by optimising data preprocessing pipelines, enhancing text extraction toolkits, and applying multidimensional data filtering. MiMo's pre-training also included a three-stage data mixture strategy. Based on internal testing, the Xiaomi researchers claim that MiMo-7B-Base scores 75.2 on the BIG-Bench Hard (BBH) benchmark for reasoning capabilities.
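The filtering and staged-mixture ideas above can be sketched in a few lines. The quality dimensions, thresholds, and mixing ratios below are illustrative assumptions, not Xiaomi's published recipe.

```python
import random

# Hypothetical multidimensional filter: a document is kept only if it
# passes every quality dimension. Dimension names and thresholds are
# made up for illustration.
FILTERS = {
    "min_length": lambda doc: len(doc["text"]) >= 20,
    "reasoning_density": lambda doc: doc["reasoning_score"] >= 0.5,
    "dedup": lambda doc: not doc["is_duplicate"],
}

def passes_all_filters(doc):
    return all(check(doc) for check in FILTERS.values())

# Hypothetical three-stage mixture: later stages up-weight
# reasoning-heavy sources (ratios are invented for the sketch).
STAGE_MIX = [
    {"web": 0.70, "code": 0.20, "math": 0.10},  # stage 1: broad coverage
    {"web": 0.50, "code": 0.30, "math": 0.20},  # stage 2: shift to reasoning
    {"web": 0.30, "code": 0.35, "math": 0.35},  # stage 3: reasoning-dense
]

def sample_source(stage, rng=random):
    """Draw the source for the next document according to the stage mix."""
    mix = STAGE_MIX[stage]
    return rng.choices(list(mix), weights=list(mix.values()), k=1)[0]
```

A training loop would advance `stage` as token counts cross the stage boundaries, so the data distribution drifts toward reasoning-dense material late in pre-training.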
The zero-shot reinforcement learning (RL)-based MiMo-7B-RL-Zero is claimed to excel at mathematics and coding tasks, scoring 55.4 on the AIME benchmark and outperforming o1-mini by 4.7 points. As MiMo is an open-source AI model, it can be downloaded from Xiaomi's listings on GitHub and Hugging Face. The technical paper details the model's architecture as well as the pre-training and post-training processes. MiMo is a text-only model and does not have multimodal capabilities. As with most open-source releases, details about the model's training dataset are not known.
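Since the weights are listed on Hugging Face, a typical `transformers` quickstart would look roughly like the sketch below. The repo id, prompt format, and generation settings are assumptions to be checked against Xiaomi's actual listing.

```python
# Sketch of pulling MiMo from Hugging Face with the `transformers`
# library. The repo id is an assumption based on the article's mention
# of Xiaomi's Hugging Face listing; verify it before use.
MODEL_ID = "XiaomiMiMo/MiMo-7B-RL"  # assumed repo id

def build_prompt(question: str) -> str:
    # Simple single-turn prompt; in practice the released tokenizer's
    # chat template (tokenizer.apply_chat_template) should be preferred.
    return f"User: {question}\nAssistant:"

def load_and_generate(question: str, max_new_tokens: int = 256) -> str:
    # Imported lazily so the sketch can be read and partially run
    # without downloading the 7B-parameter weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens,
                            temperature=0.6, do_sample=True)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```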
[2]
Xiaomi unveils MiMo open-source AI model for reasoning
Xiaomi has introduced MiMo, its first open-source artificial intelligence large language model, designed for reasoning tasks. Developed by the newly formed Xiaomi Big Model Core Team, the 7-billion-parameter model excels in mathematical reasoning and code generation, matching the performance of larger models like OpenAI's o1-mini and Alibaba's Qwen-32B-Preview. Xiaomi noted that achieving such capabilities in a smaller model is challenging, as most successful reinforcement learning (RL) models rely on larger architectures, such as 32-billion-parameter models. The company believes MiMo's effectiveness in reasoning is driven by the base model's potential, enabled through focused pre-training and post-training strategies. Its smaller size may make it suitable for enterprise use and edge devices with limited resources.

MiMo's reasoning ability is built on an optimized pre-training process. Xiaomi improved its data preprocessing pipeline, enhanced text extraction tools, and used multi-layered filtering to increase the density of reasoning patterns. The team compiled a dataset of 200 billion reasoning tokens and applied a three-stage data mixture strategy. The model was trained on 25 trillion tokens over three progressive training phases. Xiaomi also used Multiple-Token Prediction as a training objective to boost performance and reduce inference time.

In the post-training stage, Xiaomi applied reinforcement learning using 130,000 mathematics and coding problems, verified by rule-based systems for accuracy and difficulty. To address sparse rewards in complex tasks, the team implemented a Test Difficulty Driven Reward system and used Easy Data Re-Sampling for stable RL training on easier problems.

To improve training and validation speed, Xiaomi introduced a Seamless Rollout Engine that cuts down GPU downtime. This system delivered a 2.29× increase in training speed and a 1.96× boost in validation speed.
It also supports Multiple-Token Prediction in vLLM and enhances the RL system's inference stability. MiMo-7B-RL delivered strong performance across various evaluations (temperature = 0.6). The MiMo-7B model series is open source and accessible on Hugging Face. The full technical report and model checkpoints are also available on GitHub.
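The evaluation note mentions sampling at temperature 0.6. Temperature scaling itself is standard: divide the logits by T before the softmax, so T < 1 sharpens the distribution toward the top token. A pure-Python sketch:

```python
import math
import random

def sample_with_temperature(logits, temperature=0.6, rng=random):
    """Temperature sampling as used in the reported evaluations
    (temperature = 0.6). Returns the sampled token index and the
    full probability distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                          # subtract max for stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    idx = rng.choices(range(len(logits)), weights=probs, k=1)[0]
    return idx, probs
```

At T = 0.6 the top logit captures noticeably more probability mass than at higher temperatures, which is why benchmark runs often fix a moderate temperature like this for reproducibly strong (but not fully greedy) decoding.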
Xiaomi has introduced MiMo, a 7-billion-parameter AI model designed for efficient reasoning. Despite its smaller size, MiMo matches or outperforms larger models in mathematical and coding tasks, marking a significant advancement in AI efficiency.
Xiaomi has unveiled MiMo, its first open-source artificial intelligence (AI) model family, designed to excel in reasoning tasks while maintaining a relatively small size. With just 7 billion parameters, MiMo represents a significant advancement in AI efficiency, challenging the notion that larger models are necessary for complex reasoning capabilities [1][2].
MiMo's development focused on solving the size problem in reasoning AI models. While most effective reasoning models typically feature 24 billion or more parameters, Xiaomi's researchers have achieved comparable performance with a much smaller architecture [1].
Key performance highlights include:
- MiMo-7B-Base scores 75.2 on the BIG-Bench Hard (BBH) reasoning benchmark.
- MiMo-7B-RL-Zero scores 55.4 on the AIME mathematics benchmark, outperforming OpenAI's o1-mini by 4.7 points.
- Performance reportedly matches o1-mini and surpasses several 32-billion-parameter reasoning models.
Xiaomi's team employed several innovative strategies to optimize MiMo's performance:
- An improved data preprocessing pipeline and enhanced text extraction tools.
- Multi-layered filtering to increase the density of reasoning patterns, yielding a corpus of 200 billion reasoning tokens.
- A three-stage data mixture strategy applied across 25 trillion training tokens.
- Multiple-Token Prediction as a training objective, boosting performance and reducing inference time.
To further refine MiMo's capabilities, Xiaomi applied advanced post-training techniques:
- Reinforcement learning on 130,000 mathematics and coding problems, verified by rule-based systems for accuracy and difficulty.
- A Test Difficulty Driven Reward system to mitigate sparse rewards on complex tasks.
- Easy Data Re-Sampling for stable RL training on easier problems.
Xiaomi introduced a Seamless Rollout Engine to enhance training and validation speed:
- Cuts down GPU downtime during rollouts.
- Delivers a reported 2.29× speedup in training and a 1.96× speedup in validation.
- Supports Multiple-Token Prediction in vLLM and improves the RL system's inference stability.
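The rollout-engine idea, keeping the accelerator saturated by refilling finished batch slots instead of waiting for stragglers, can be shown with a toy scheduler. The scheduling policy below is an illustration of the general principle, not Xiaomi's actual engine.

```python
from collections import deque

def seamless_rollout(requests, batch_size=4):
    """Toy sketch of a seamless rollout loop: rather than letting the
    'GPU' sit idle while the slowest request in a batch finishes,
    completed slots are refilled immediately from a pending queue.
    Each request is a dict with an 'id' and a 'remaining' decode-step
    count (a stand-in for generation length)."""
    pending = deque(requests)
    active, completed, steps = [], [], 0
    while pending or active:
        # Refill free slots so every decode step runs as full a batch
        # as possible.
        while pending and len(active) < batch_size:
            active.append(pending.popleft())
        steps += 1                       # one decode step for the batch
        still_running = []
        for req in active:
            req["remaining"] -= 1
            (completed if req["remaining"] <= 0 else still_running).append(req)
        active = still_running
    return completed, steps
```

A naive scheduler would wait for the longest request in each fixed batch before starting the next one; refilling slots per step removes that idle time, which is the kind of GPU-downtime reduction the reported 2.29×/1.96× speedups point at.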
MiMo is now available as an open-source project, allowing researchers and developers to access and build upon Xiaomi's work:
- Model weights for the MiMo-7B series are hosted on Hugging Face.
- The full technical report and model checkpoints are available on GitHub.
MiMo's compact size and impressive performance have significant implications for the AI industry:
- It challenges the assumption that strong reasoning requires models of 24 billion parameters or more.
- Its small footprint makes it a candidate for enterprise deployments and resource-constrained edge devices.
As AI continues to evolve, MiMo represents a step towards more efficient and accessible AI models, potentially reshaping the landscape of AI research and applications.
Summarized by Navi