2 Sources
[1]
Xiaomi's MiMo AI Models Arrive With Efficient Reasoning, Small Size
- MiMo is a seven-billion-parameter AI model
- It is said to be built entirely in-house from scratch
- MiMo is said to match the performance of OpenAI's o1-mini

Xiaomi on Tuesday released an open-source reasoning-focused artificial intelligence (AI) model. Dubbed MiMo, the family of reasoning models innovates on optimising reasoning capability at a relatively small parameter size. It is also the tech giant's first open-source reasoning model, and it competes with Chinese models such as DeepSeek R1 and Alibaba's Qwen QwQ-32B, as well as global reasoning models including OpenAI's o1 and Google's Gemini 2.0 Flash Thinking. The MiMo family comprises four different models, each with unique use cases.

With the MiMo series, Xiaomi's researchers aimed to solve the size problem in reasoning AI models. Most effective reasoning models (at least those whose performance can be measured) have around 24 billion or more parameters. The large size is kept to achieve uniform, simultaneous improvements in both the coding and mathematical capabilities of large language models, something considered difficult to achieve with smaller models. In comparison, MiMo features seven billion parameters, and Xiaomi claims its performance matches OpenAI's o1-mini and outperforms several reasoning models with 32 billion parameters.

The researchers said the base model was pre-trained on 25 trillion tokens, and that this efficiency was achieved by optimising data preprocessing pipelines, enhancing text extraction toolkits, and applying multidimensional data filtering. MiMo's pre-training also included a three-stage data mixture strategy. Based on internal testing, the Xiaomi researchers claim that MiMo-7B-Base scores 75.2 on the BIG-Bench Hard (BBH) benchmark for reasoning capabilities.
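The filtering and staged-mixture ideas above can be sketched in a few lines. The quality dimensions, thresholds, and mixing ratios below are illustrative assumptions, not Xiaomi's published recipe.

```python
import random

# Hypothetical multidimensional filter: a document is kept only if it
# passes every quality dimension. Dimension names and thresholds are
# made up for illustration.
FILTERS = {
    "min_length": lambda doc: len(doc["text"]) >= 20,
    "reasoning_density": lambda doc: doc["reasoning_score"] >= 0.5,
    "dedup": lambda doc: not doc["is_duplicate"],
}

def passes_all_filters(doc):
    return all(check(doc) for check in FILTERS.values())

# Hypothetical three-stage mixture: later stages up-weight
# reasoning-heavy sources (ratios are invented for the sketch).
STAGE_MIX = [
    {"web": 0.70, "code": 0.20, "math": 0.10},  # stage 1: broad coverage
    {"web": 0.50, "code": 0.30, "math": 0.20},  # stage 2: shift to reasoning
    {"web": 0.30, "code": 0.35, "math": 0.35},  # stage 3: reasoning-dense
]

def sample_source(stage, rng=random):
    """Draw the source for the next document according to the stage mix."""
    mix = STAGE_MIX[stage]
    return rng.choices(list(mix), weights=list(mix.values()), k=1)[0]
```

A training loop would advance `stage` as token counts cross the stage boundaries, so the data distribution drifts toward reasoning-dense material late in pre-training.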
The zero-shot reinforcement learning (RL)-based MiMo-7B-RL-Zero is claimed to excel at mathematics and coding tasks, scoring 55.4 on the AIME benchmark and outperforming o1-mini by 4.7 points. As MiMo is an open-source AI model, it can be downloaded from Xiaomi's listings on GitHub and Hugging Face. The technical paper details the model's architecture as well as the pre-training and post-training processes. MiMo is a text-only model and does not have multimodal capabilities. As with most open-source releases, details about the model's training dataset are not known.
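Since the weights are listed on Hugging Face, a typical `transformers` quickstart would look roughly like the sketch below. The repo id, prompt format, and generation settings are assumptions to be checked against Xiaomi's actual listing.

```python
# Sketch of pulling MiMo from Hugging Face with the `transformers`
# library. The repo id is an assumption based on the article's mention
# of Xiaomi's Hugging Face listing; verify it before use.
MODEL_ID = "XiaomiMiMo/MiMo-7B-RL"  # assumed repo id

def build_prompt(question: str) -> str:
    # Simple single-turn prompt; in practice the released tokenizer's
    # chat template (tokenizer.apply_chat_template) should be preferred.
    return f"User: {question}\nAssistant:"

def load_and_generate(question: str, max_new_tokens: int = 256) -> str:
    # Imported lazily so the sketch can be read and partially run
    # without downloading the 7B-parameter weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens,
                            temperature=0.6, do_sample=True)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```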
[2]
Xiaomi unveils MiMo open-source AI model for reasoning
Xiaomi has introduced MiMo, its first open-source artificial intelligence large language model, designed for reasoning tasks. Developed by the newly formed Xiaomi Big Model Core Team, the 7-billion-parameter model excels in mathematical reasoning and code generation, matching the performance of larger models like OpenAI's o1-mini and Alibaba's Qwen-32B-Preview. Xiaomi noted that achieving such capabilities in a smaller model is challenging, as most successful reinforcement learning (RL) models rely on larger architectures, such as 32-billion-parameter models. The company believes MiMo's effectiveness in reasoning is driven by the base model's potential, enabled through focused pre-training and post-training strategies. Its smaller size may make it suitable for enterprise use and edge devices with limited resources.

MiMo's reasoning ability is built on an optimized pre-training process. Xiaomi improved its data preprocessing pipeline, enhanced text extraction tools, and used multi-layered filtering to increase the density of reasoning patterns. The team compiled a dataset of 200 billion reasoning tokens and applied a three-stage data mixture strategy. The model was trained on 25 trillion tokens over three progressive training phases. Xiaomi also used Multiple-Token Prediction as a training objective to boost performance and reduce inference time.

In the post-training stage, Xiaomi applied reinforcement learning using 130,000 mathematics and coding problems, verified by rule-based systems for accuracy and difficulty. To address sparse rewards in complex tasks, the team implemented a Test Difficulty Driven Reward system and used Easy Data Re-Sampling for stable RL training on easier problems.

To improve training and validation speed, Xiaomi introduced a Seamless Rollout Engine that cuts down GPU downtime. This system delivered a 2.29× increase in training speed and a 1.96× boost in validation speed.
It also supports Multiple-Token Prediction in vLLM and enhances the RL system's inference stability. MiMo-7B-RL delivered strong performance across various evaluations (temperature = 0.6). The MiMo-7B model series is open source and accessible on Hugging Face. The full technical report and model checkpoints are also available on GitHub.
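The evaluation note mentions sampling at temperature 0.6. Temperature scaling itself is standard: divide the logits by T before the softmax, so T < 1 sharpens the distribution toward the top token. A pure-Python sketch:

```python
import math
import random

def sample_with_temperature(logits, temperature=0.6, rng=random):
    """Temperature sampling as used in the reported evaluations
    (temperature = 0.6). Returns the sampled token index and the
    full probability distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                          # subtract max for stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    idx = rng.choices(range(len(logits)), weights=probs, k=1)[0]
    return idx, probs
```

At T = 0.6 the top logit captures noticeably more probability mass than at higher temperatures, which is why benchmark runs often fix a moderate temperature like this for reproducibly strong (but not fully greedy) decoding.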
Xiaomi has introduced MiMo, a 7-billion-parameter AI model designed for efficient reasoning. Despite its smaller size, MiMo matches or outperforms larger models in mathematical and coding tasks, marking a significant advancement in AI efficiency.
Xiaomi has unveiled MiMo, its first open-source artificial intelligence (AI) model family, designed to excel in reasoning tasks while maintaining a relatively small size. With just 7 billion parameters, MiMo represents a significant advancement in AI efficiency, challenging the notion that larger models are necessary for complex reasoning capabilities [1][2].
MiMo's development focused on solving the size problem in reasoning AI models. While most effective reasoning models typically feature 24 billion or more parameters, Xiaomi's researchers have achieved comparable performance with a much smaller architecture [1].
Key performance highlights include:
- MiMo-7B-Base scores 75.2 on the BIG-Bench Hard (BBH) reasoning benchmark.
- MiMo-7B-RL-Zero scores 55.4 on the AIME mathematics benchmark, outperforming OpenAI's o1-mini by 4.7 points.
- Performance reportedly matches o1-mini and surpasses several 32-billion-parameter reasoning models.
Xiaomi's team employed several innovative strategies to optimize MiMo's performance:
- An improved data preprocessing pipeline and enhanced text extraction tools.
- Multi-layered filtering to increase the density of reasoning patterns, yielding a corpus of 200 billion reasoning tokens.
- A three-stage data mixture strategy applied across 25 trillion training tokens.
- Multiple-Token Prediction as a training objective, boosting performance and reducing inference time.
To further refine MiMo's capabilities, Xiaomi applied advanced post-training techniques:
- Reinforcement learning on 130,000 mathematics and coding problems, verified by rule-based systems for accuracy and difficulty.
- A Test Difficulty Driven Reward system to mitigate sparse rewards on complex tasks.
- Easy Data Re-Sampling for stable RL training on easier problems.
Xiaomi introduced a Seamless Rollout Engine to enhance training and validation speed:
- Cuts down GPU downtime during rollouts.
- Delivers a reported 2.29× speedup in training and a 1.96× speedup in validation.
- Supports Multiple-Token Prediction in vLLM and improves the RL system's inference stability.
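The rollout-engine idea, keeping the accelerator saturated by refilling finished batch slots instead of waiting for stragglers, can be shown with a toy scheduler. The scheduling policy below is an illustration of the general principle, not Xiaomi's actual engine.

```python
from collections import deque

def seamless_rollout(requests, batch_size=4):
    """Toy sketch of a seamless rollout loop: rather than letting the
    'GPU' sit idle while the slowest request in a batch finishes,
    completed slots are refilled immediately from a pending queue.
    Each request is a dict with an 'id' and a 'remaining' decode-step
    count (a stand-in for generation length)."""
    pending = deque(requests)
    active, completed, steps = [], [], 0
    while pending or active:
        # Refill free slots so every decode step runs as full a batch
        # as possible.
        while pending and len(active) < batch_size:
            active.append(pending.popleft())
        steps += 1                       # one decode step for the batch
        still_running = []
        for req in active:
            req["remaining"] -= 1
            (completed if req["remaining"] <= 0 else still_running).append(req)
        active = still_running
    return completed, steps
```

A naive scheduler would wait for the longest request in each fixed batch before starting the next one; refilling slots per step removes that idle time, which is the kind of GPU-downtime reduction the reported 2.29×/1.96× speedups point at.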
MiMo is now available as an open-source project, allowing researchers and developers to access and build upon Xiaomi's work:
- Model weights for the MiMo-7B series are hosted on Hugging Face.
- The full technical report and model checkpoints are available on GitHub.
MiMo's compact size and impressive performance have significant implications for the AI industry:
- It challenges the assumption that strong reasoning requires models of 24 billion parameters or more.
- Its small footprint makes it a candidate for enterprise deployments and resource-constrained edge devices.
As AI continues to evolve, MiMo represents a step towards more efficient and accessible AI models, potentially reshaping the landscape of AI research and applications.
Summarized by Navi