DeepSeek makes 75% price cut permanent, intensifying AI price war with OpenAI and Anthropic

Reviewed byNidhi Govil

5 Sources

Share

Chinese AI startup DeepSeek has permanently slashed prices on its flagship V4 Pro model by 75%, bringing costs down to $0.87 per million output tokens. The aggressive move undercuts OpenAI's GPT-5 and Anthropic's Claude by up to 34x, forcing a reckoning in the enterprise AI market as token costs become the binding constraint on adoption.

DeepSeek V4 Pro Gets Permanent Price Cut

DeepSeek has made its 75% price discount on the flagship V4 Pro model permanent, a move that was originally scheduled to expire on May 31, 2026

1

. The Chinese AI startup now offers pricing ranging from $0.003625 to $0.87 per million tokens, down from the previous range of $0.0145 to $3.48

2

. This permanent price cut arrives just one month after DeepSeek released its V4 models, which the company claimed would usher in the "era of cost-effective 1M context length"

1

.

Source: ET

Source: ET

The decision to lock in the discount signals that DeepSeek is prioritizing market share over per-unit revenue, positioning itself as the default choice for applications processing large documents, codebases, or conversational histories where token costs compound quickly

2

. For enterprise accounts consuming millions of tokens daily, the savings are material. Salesforce projects $300 million in Anthropic token spending this year, and at DeepSeek's new pricing, an equivalent volume would cost a fraction of that figure

2

.

AI Price War Escalates as DeepSeek Undercuts Competitors

The pricing gap between DeepSeek and Western competitors is striking. OpenAI's GPT-5 charges $2.50 per million input tokens and $10 per million output tokens, while Anthropic's Claude Opus 4.7 is priced at $5 input and $25 output

2

. Google's Gemini 3.5 Flash, its cost-optimized model, charges $0.15 input and $0.60 output per million tokens

2

. DeepSeek V4 Pro now sits below all of them, creating a 34x pricing gap on outputs compared to models with essentially the same coding performance

4

.

This API price reduction directly undercuts comparable Western models used as workhorses for enterprise production. DeepSeek is 7x cheaper on inputs and 17x cheaper on outputs than Claude Sonnet or OpenAI's GPT 5.5-Med

3

. The lightweight DeepSeek V4 Flash undercuts entry-tier alternatives like Claude Haiku by 10x to 25x

3

. Meanwhile, OpenAI's GPT-5.5 doubled output prices to $30 per million tokens at launch, and Anthropic's Claude Opus 4.7 shipped with an updated tokenizer that can inflate actual costs by up to 35%

4

.

Cost-Effective AI Model Architecture Enables Aggressive Pricing

The price cuts are enabled by hardware-software innovations, especially around cache, that make DeepSeek's models radically more efficient to run

3

. When hosted natively in China, DeepSeek's cache-read pricing is 87x cheaper than Western clouds—a deflationary floor so aggressive that handset giant Xiaomi just moved to match the exact pricing tier for its newly deployed MiMo architecture

3

.

Source: VentureBeat

Source: VentureBeat

DeepSeek V4 uses two interleaved attention types that compress tokens for selective attention and collapse every 128 tokens for global context at minimal compute. At one million tokens of context, V4 Pro's KV cache is 10% the size of its predecessor's, and single-token inference runs at 27% of the previous compute cost

4

. DeepSeek did not disclose whether the permanent price cut was due to increased supply of Huawei Ascend 950 chips, which it used to maximize V4's performance

5

.

Enterprise AI Market Faces Token Cost Crisis

Companies are already feeling the pressure of high API costs. Uber burned through its entire 2026 budget for Claude Code and Cursor in just the first four months of the year, with its COO saying the cost related to high token usage was getting "harder to justify" without better products to show for it

3

. Pinterest CTO Matt Madrigal confirmed the company went all-in on an open-source AI strategy, post-training Alibaba's Qwen model to achieve frontier-like quality at a 90% reduction in costs

3

.

DeepSeek V4 Flash has captured the No. 1 position on the OpenRouter leaderboard over the past week, surging 48% in token usage. Its advanced counterpart, V4 Pro, sits at No. 6. DeepSeek's top three models processed nearly 6 trillion tokens on OpenRouter over the past week

3

. This structural migration confirms that developers are actively routing workloads to the cheapest capable models.

Geopolitical Considerations and Competitive Dynamics

The question for enterprise buyers is whether DeepSeek's model quality, reliability, and compliance posture justify the switch. The price advantage may be offset by geopolitical considerations of routing sensitive workloads through a Chinese AI provider

2

. For highly regulated U.S. giants in finance, healthcare, and defense, getting comfortable with DeepSeek will take time, despite an open-weights architecture under an MIT license that allows companies to self-host the model locally

3

.

Source: Decrypt

Source: Decrypt

The competitive dynamics are complicated by Anthropic's public accusation that DeepSeek has engaged in "distillation attacks," improperly training on Claude's responses to improve its own models

1

2

. If substantiated, it would mean some of DeepSeek's capability advantage was built on Anthropic's research investment, and the price differential would reflect intellectual property arbitrage rather than engineering efficiency

2

.

Market Bifurcation and the Future of AI Pricing

Anthropic's annualized revenue surged from $9 billion to $30 billion between the end of 2025 and early April 2026, driven largely by enterprise adoption of Claude Code

2

. DeepSeek's pricing pressure threatens the revenue-per-token economics that support Anthropic's valuation trajectory. If enterprise customers begin routing lower-complexity tasks to DeepSeek while reserving Claude for high-stakes reasoning, Anthropic's token volume could hold while revenue per token declines

2

.

The broader AI pricing landscape has been moving toward commoditization throughout 2026. Google has repeatedly cut Gemini prices to compete with open-weight models, and OpenAI's pivot toward consumer platform features reflects recognition that API token revenue alone may not sustain its $852 billion valuation

2

. DeepSeek's strategy appears to be that price will win, with enough volume flowing to the cheapest capable model regardless of origin. Whether that bet is correct depends on whether Western AI companies can close the price gap before DeepSeek closes the capability gap, or whether the market bifurcates into a Western tier and a Chinese tier with fundamentally different economics

2

.

Today's Top Stories

© 2026 TheOutpost.AI All rights reserved