Mistral AI launches Devstral 2 vibe coding model with 72.2% SWE-bench score and CLI tool

Reviewed by Nidhi Govil

French AI startup Mistral AI released Devstral 2, a 123 billion parameter open-weights coding model designed for autonomous software engineering. The model achieves 72.2% on SWE-bench Verified, positioning it among top performers. Alongside the release, Mistral introduced Mistral Vibe CLI, a command-line interface enabling developers to automate coding tasks through natural language while maintaining project-aware context across entire codebases.

Mistral AI Bets on Vibe Coding with Devstral 2 Release

French AI startup Mistral AI released Devstral 2 on Tuesday, marking its ambitious entry into the vibe coding market with a 123 billion parameter open-weights coding model built for autonomous software engineering [1]. The AI coding model achieves a 72.2% score on SWE-bench Verified, a benchmark testing whether AI systems can solve real GitHub issues, placing it among the top-performing models in its category [3]. This release follows closely on the heels of the Mistral 3 family launch, confirming the company's intent to close the gap with bigger AI labs like Anthropic, OpenAI, and Google [2].

Source: Geeky Gadgets

The term vibe coding, coined by AI researcher Andrej Karpathy in February 2025, describes a programming style where developers describe desired outcomes in natural language and accept AI-generated code without extensive review [1]. Collins Dictionary named it Word of the Year for 2025, reflecting its growing influence on software development practices.

Mistral Vibe CLI Brings Natural Language-Based Code Automation to Terminals

Alongside Devstral 2, Mistral AI introduced Mistral Vibe CLI, a command-line interface that enables developers to interact with Devstral models directly within their terminal environments [4]. The tool transforms natural language prompts into real-world code changes across entire codebases, rather than merely generating isolated code snippets. Released under the Apache 2.0 license, the CLI features persistent history and can scan file structures and Git status to maintain project-aware context [2].

Source: TechCrunch

The Mistral Vibe CLI allows developers to reference files with @, run shell commands with !, and toggle behavior with slash commands [3]. It can orchestrate changes across multiple files, track framework dependencies, detect failures, retry with corrections, and handle tasks like bug fixing and modernizing legacy systems at repository scale [1]. Unlike most developer agents that simulate interfaces from within chat UIs, Vibe starts with the shell and integrates intelligence directly into actual developer workflows.

Devstral Small 2 Brings Autonomous Software Engineering to Consumer Hardware

Mistral AI simultaneously released Devstral Small 2, a 24 billion parameter version that scores 68% on SWE-bench Verified while running locally on consumer hardware such as laptops, with no Internet connection required [1]. The smaller model can be deployed on an RTX 4090 GPU or a Mac with 32GB of RAM, making it accessible to individual developers and organizations without cloud infrastructure [5]. Both models support a 256,000 token context window, allowing them to process moderately large codebases [1].
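
For a rough sense of what a 256,000-token window means in practice, a back-of-envelope estimate (using the common heuristics of roughly four characters per token and about 40 characters per line of code, not figures published by Mistral) suggests the following:

```python
# Back-of-envelope estimate of how much source code fits in a 256K-token context window.
# The characters-per-token and characters-per-line figures are common heuristics,
# not numbers from Mistral; actual tokenization varies by language and codebase.
CONTEXT_TOKENS = 256_000
CHARS_PER_TOKEN = 4      # rough average for code
CHARS_PER_LINE = 40      # rough average line length

approx_chars = CONTEXT_TOKENS * CHARS_PER_TOKEN   # ~1,024,000 characters
approx_lines = approx_chars // CHARS_PER_LINE     # ~25,600 lines of code

print(f"~{approx_chars:,} characters, roughly {approx_lines:,} lines of code")
```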

Despite its smaller footprint, Devstral Small 2 competes with models five times its size, making it the strongest open-weight model of its size [3]. Running locally also eliminates the latency associated with cloud-based AI systems while maintaining strong capabilities for production-grade workflows.

API Pricing Undercuts Competitors by Seven Times

Devstral 2 is currently free to use through Mistral AI's API during an initial period [1]. After the free period ends, API pricing will be $0.40 per million input tokens and $2.00 per million output tokens; Devstral Small 2 will cost $0.10 per million input tokens and $0.30 per million output tokens [2]. Mistral AI claims this makes Devstral 2 approximately seven times more cost-efficient than Claude Sonnet at real-world tasks, with Anthropic's Sonnet 4.5 priced at $3 per million input tokens and $15 per million output tokens through the API [1].
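
A quick worked comparison using the listed prices shows where the roughly sevenfold figure comes from; the assumed token mix per task is illustrative only, since real agentic coding tasks vary widely in how much context they read and how much code they write:

```python
# Illustrative cost comparison using the per-million-token API prices quoted above.
# The 200K-input / 20K-output task mix is a hypothetical example, not a Mistral benchmark.
def task_cost(input_tokens, output_tokens, in_price, out_price):
    """Cost in dollars for one task, given per-million-token prices."""
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

devstral_2 = task_cost(200_000, 20_000, 0.40, 2.00)   # Devstral 2 list pricing
sonnet_45 = task_cost(200_000, 20_000, 3.00, 15.00)   # Claude Sonnet 4.5 list pricing

print(f"Devstral 2: ${devstral_2:.2f}  Sonnet 4.5: ${sonnet_45:.2f}  "
      f"ratio: {sonnet_45 / devstral_2:.1f}x")   # ~7.5x at this token mix
```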

Free access is also available through platforms like Kilo Code and OpenRouter, ensuring developers with limited financial resources can leverage the model's capabilities [5]. Mistral AI has partnered with the agent tools Kilo Code and Cline to bring Devstral 2 to users, while Mistral Vibe CLI is available as an extension in Zed for use inside the IDE [2].
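
For developers who want to try the model through Mistral's API directly, a minimal sketch with the official mistralai Python SDK might look like the following; the model identifier shown is a placeholder assumption, so check Mistral's model listing for the name actually published for Devstral 2:

```python
import os
from mistralai import Mistral  # official Mistral Python SDK (pip install mistralai)

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# "devstral-2" is a placeholder model ID, not a confirmed identifier;
# consult Mistral's documentation for the published Devstral 2 model name.
response = client.chat.complete(
    model="devstral-2",
    messages=[{
        "role": "user",
        "content": "Write a Python function that parses a `git status --porcelain` line "
                   "into a (status, path) tuple.",
    }],
)

print(response.choices[0].message.content)
```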

Open-Source Licensing Strategy Balances Access with Revenue Restrictions

The two models ship under different open-source licenses reflecting distinct use cases. Devstral Small 2 uses the Apache 2.0 license, a standard that imposes no revenue restrictions and allows enterprises to use it in production, embed it into products, and redistribute fine-tuned versions without requiring permission [3]. Devstral 2 operates under a modified MIT license that introduces a critical limitation: companies generating more than $20 million in monthly revenue cannot use the model without securing a separate commercial license from Mistral AI [3].

This licensing structure allows enterprises to adopt a hybrid workflow, using the larger 123 billion parameter model for complex architectural planning tasks while deploying the smaller 24 billion parameter version for rapid, private code changes within their network boundaries [4].

Performance Benchmarks Position Devstral 2 Among Top Open-Weights Models

SWE-bench Verified presents AI systems with 500 real software engineering problems pulled from GitHub issues in popular Python repositories, requiring models to read issue descriptions, navigate codebases, and generate working patches that pass unit tests [1]. While some researchers note that approximately 90% of benchmark tasks test relatively simple bug fixes that experienced engineers could complete in under an hour, employees at major AI companies pay close attention to SWE-bench Verified performance as one of the few standardized ways to compare coding models.

Source: Digit

In human evaluations, Devstral 2 beat DeepSeek V3.2 on 42.8% of tasks and lost on only 28.6%, despite being five times smaller [3]. Against Claude Sonnet 4.5, it lost 53.1% of the time, indicating that closed proprietary models still lead in overall preference [3]. The model is also eight times smaller than Kimi K2 while matching or surpassing it on key software reasoning benchmarks.

Strategic Timing as Competition Intensifies in AI Coding Market

Mistral AI launches Devstral 2 as competition in the vibe coding segment intensifies, with major players building increasingly closed ecosystems. Google recently announced a partnership with Replit Inc., bundling its Gemini 3 Pro model with the new Antigravity IDE and cloud compute into a single stack designed to lock developers into its ecosystem. Anthropic is reportedly building a dedicated runtime for its coding agents following its acquisition of the JavaScript toolkit Bun.

Developer Simon Willison, speaking to Ars Technica in March, offered a balanced perspective on vibe coding: "I really enjoy vibe coding. It's a fun way to try out an idea and prove if it can work." However, he warned that "vibe coding your way to a production codebase is clearly risky. Most of the work we do as software engineers involves evolving existing systems, where the quality and understandability of the underlying code is crucial" [1].

Mistral AI, currently valued at €11.7 billion (approximately $13.8 billion) following a Series C funding round led by Dutch semiconductor company ASML, which invested €1.3 billion (approximately $1.5 billion) in September, positions itself as Europe's champion AI lab [2]. With Devstral 2, the company provides developers with an alternative that combines open weights with local deployment options, sitting firmly outside the walled gardens being constructed by many rivals. Both models are available now through Mistral's console, chatbot, or third-party integrations, with the full Devstral 2 model supported in the community inference provider vLLM and on the open-source agentic coding platform Kilo Code [3].
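
For teams that want to run the weights themselves, a minimal local-inference sketch with vLLM might look like the following; the Hugging Face model ID is an assumption, and the model card should be checked for the published weights and any tokenizer or config flags Mistral models require:

```python
from vllm import LLM, SamplingParams  # pip install vllm

# Placeholder Hugging Face model ID, not a confirmed repository name;
# check Mistral's model card for the actual Devstral Small 2 weights and settings.
llm = LLM(model="mistralai/Devstral-Small-2-24B")

params = SamplingParams(temperature=0.2, max_tokens=512)
outputs = llm.generate(
    ["Fix the off-by-one error in this loop and explain the change:\n"
     "for i in range(1, len(items)):\n"
     "    print(items[i])"],
    params,
)

print(outputs[0].outputs[0].text)
```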
