OpenAI releases GPT-5.4 mini and nano, built for speed over size in AI software engineering

Reviewed by Nidhi Govil

5 Sources

OpenAI launched GPT-5.4 mini and GPT-5.4 nano on Tuesday, marking a strategic shift toward smaller and faster AI models designed for high-volume workloads. The new models run more than twice as fast as their predecessors while approaching flagship performance on coding tasks. GPT-5.4 mini is now available to ChatGPT Free and Go users, while nano targets developers building cost-efficient AI systems.

OpenAI Launches Smaller and Faster AI Models for High-Volume Workloads

OpenAI unveiled GPT-5.4 mini and GPT-5.4 nano on Tuesday, less than two weeks after releasing its flagship AI model, GPT-5.4 [1][4]. These cost-efficient AI models represent a strategic pivot toward speed and efficiency rather than raw computational power, targeting workloads where latency directly shapes the user experience [2]. The company describes them as "our most capable small models yet," designed specifically for AI software engineering tasks where waiting 45 seconds for a response isn't practical [4][5].

Source: Inc.

GPT-5.4 mini runs more than twice as fast as GPT-5 mini while delivering performance that approaches the flagship AI model on key benchmarks [1][3]. On SWE-Bench Pro, a test measuring a model's ability to fix real GitHub issues, GPT-5.4 mini scored 54.4%, compared with 45.7% for GPT-5 mini and 57.7% for the full GPT-5.4 [4]. The model also achieved 72.1% on OSWorld-Verified, which tests how well an AI can operate a desktop computer by reading screenshots; that is just below the flagship's 75.0% and narrowly under the human baseline of 72.4% [4].

Developers Gain Access to Budget Models with Near-Flagship Performance

The release enables developers to build hybrid AI systems where a flagship model plans and coordinates while smaller models handle execution tasks in parallel [4]. GPT-5.4 mini is priced at $0.75 per million input tokens and $4.50 per million output tokens via the API, while GPT-5.4 nano costs just $0.20 per million input tokens and $1.25 per million output tokens, roughly four times cheaper than mini [4]. This pricing makes running high-volume AI workloads financially realistic for startups and enterprises alike.
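To make the price gap concrete, here is a back-of-envelope cost calculator using the per-million-token figures quoted above. The prices are as reported in this article and should be checked against OpenAI's current pricing page before budgeting; the function name is illustrative.

```python
# USD per 1M tokens, (input, output), as quoted in the article.
PRICES = {
    "gpt-5.4-mini": (0.75, 4.50),
    "gpt-5.4-nano": (0.20, 1.25),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Total cost in USD for a workload on the given model."""
    in_price, out_price = PRICES[model]
    return (input_tokens / 1e6) * in_price + (output_tokens / 1e6) * out_price

# Example workload: 10M input tokens, 2M output tokens.
print(cost_usd("gpt-5.4-mini", 10_000_000, 2_000_000))  # 16.5
print(cost_usd("gpt-5.4-nano", 10_000_000, 2_000_000))  # 4.5
```

At this workload size, nano comes in at roughly a quarter of mini's cost, matching the "roughly four times cheaper" figure for input tokens (0.75 / 0.20 = 3.75x).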

OpenAI suggests GPT-5.4 mini excels at coding tasks like editing and debugging code, with improved reasoning and tool-use capabilities [1][3]. The model also demonstrates better multimodal understanding, meaning it can parse non-text inputs such as images and audio more effectively than previous versions [3]. GPT-5.4 nano, meanwhile, targets grunt work like data classification and extraction, ranking, and simpler coding-support tasks [1][2]. It scored 52.39% on SWE-Bench Pro and 46.30% on Terminal Bench 2.0; not as high as mini, but still considerably better than GPT-5 mini [2].

Source: ZDNet

ChatGPT Users Get Access to Advanced Capabilities at Lower Tiers

For ChatGPT Free and Go users, GPT-5.4 mini is now accessible through the "Thinking" feature in the plus menu [1][3]. Paid subscribers will find it as the fallback model when they hit rate limits for GPT-5.4 Thinking [1]. GPT-5.4 nano remains API-only, positioning it as a developer tool rather than a consumer-facing option [3][4].

Source: Engadget

The launch intensifies OpenAI's competition with Anthropic in the AI software engineering market, particularly as the company battles for dominance in coding assistants [1]. Early feedback from technology companies suggests strong performance. Aabhas Sharma, CTO at document analysis platform Hebbia, noted that "GPT-5.4 mini delivers strong end-to-end performance for a model in this class" while achieving "higher end-to-end pass rates and stronger source attribution than the larger GPT-5.4 model" [2]. Abhisek Modi, AI engineering lead at Notion, said the model "matched and often exceeded GPT-5.2 on handling complex formatting at a fraction of the compute" [2].

Cost-Efficiency Enables New Agentic Workflows and Real-Time Applications

The models are built for scenarios where cost-efficiency and low latency matter more than maximum capability: coding assistants that need to feel responsive, subagents that quickly complete supporting tasks, computer-using systems that capture and interpret screenshots, and applications requiring real-time reasoning [2]. According to OpenAI, "the best model is often not the largest one—it's the one that can respond quickly, use tools reliably, and still perform well on complex professional tasks" [2].

This architecture mirrors real-world human operations, where managers delegate specific tasks to specialists rather than handling everything themselves [2]. Developers can now route tasks intelligently: a flagship model might plan the overall approach while mini handles code editing and nano processes data extraction, all running in parallel [4]. The pricing structure makes this practical even at scale, with nano's $0.20 per million input tokens enabling massive query volumes without breaking budgets [4].
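The plan-then-delegate pattern described above can be sketched in a few lines. This is a minimal illustration, not OpenAI's API: `call_model` is a hypothetical stand-in for a real chat-completions client, and the task-to-model routing here is a hard-coded assumption rather than a published heuristic.

```python
# Sketch of a hybrid workflow: one flagship call to plan, then cheaper
# models executing subtasks in parallel threads.
from concurrent.futures import ThreadPoolExecutor

def call_model(model: str, prompt: str) -> str:
    # Placeholder: substitute a real API call here.
    return f"[{model}] {prompt}"

def run_workload(goal: str) -> list[str]:
    # 1. Flagship model plans the overall approach (one expensive call).
    plan = call_model("gpt-5.4", f"Break into subtasks: {goal}")
    # 2. Route subtasks by kind (assumed mapping: code edits -> mini,
    #    extraction-style grunt work -> nano).
    subtasks = [
        ("gpt-5.4-mini", "edit the failing unit test"),
        ("gpt-5.4-nano", "extract error fields from the log"),
    ]
    # 3. Execute subtasks concurrently; the small models' low latency
    #    is what makes this fan-out step fast.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda t: call_model(*t), subtasks))
    return [plan] + results

for line in run_workload("fix issue in the parser"):
    print(line)
```

The design choice mirrors the delegation analogy: the expensive call happens once, while the latency-sensitive fan-out runs on the cheap, fast models.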

TheOutpost.ai

Β© 2026 Triveous Technologies Private Limited