DeepSeek V4 75% Price Cut Escalates AI Price War

DeepSeek V4 Pro Gets Permanent Price Cut

DeepSeek has made its 75% price discount on the flagship V4 Pro model permanent, a move that was originally scheduled to expire on May 31, 20261

. The Chinese AI startup now offers pricing ranging from $0.003625 to $0.87 per million tokens, down from the previous range of $0.0145 to $3.482

. This permanent price cut arrives just one month after DeepSeek released its V4 models, which the company claimed would usher in the "era of cost-effective 1M context length"1

Source: ET

The decision to lock in the discount signals that DeepSeek is prioritizing market share over per-unit revenue, positioning itself as the default choice for applications processing large documents, codebases, or conversational histories where token costs compound quickly2

. For enterprise accounts consuming millions of tokens daily, the savings are material. Salesforce projects $300 million in Anthropic token spending this year, and at DeepSeek's new pricing, an equivalent volume would cost a fraction of that figure2

AI Price War Escalates as DeepSeek Undercuts Competitors

The pricing gap between DeepSeek and Western competitors is striking. OpenAI's GPT-5 charges $2.50 per million input tokens and $10 per million output tokens, while Anthropic's Claude Opus 4.7 is priced at $5 input and $25 output2

. Google's Gemini 3.5 Flash, its cost-optimized model, charges $0.15 input and $0.60 output per million tokens2

. DeepSeek V4 Pro now sits below all of them, creating a 34x pricing gap on outputs compared to models with essentially the same coding performance4

This API price reduction directly undercuts comparable Western models used as workhorses for enterprise production. DeepSeek is 7x cheaper on inputs and 17x cheaper on outputs than Claude Sonnet or OpenAI's GPT 5.5-Med3

. The lightweight DeepSeek V4 Flash undercuts entry-tier alternatives like Claude Haiku by 10x to 25x3

. Meanwhile, OpenAI's GPT-5.5 doubled output prices to $30 per million tokens at launch, and Anthropic's Claude Opus 4.7 shipped with an updated tokenizer that can inflate actual costs by up to 35%4

Cost-Effective AI Model Architecture Enables Aggressive Pricing

The price cuts are enabled by hardware-software innovations, especially around cache, that make DeepSeek's models radically more efficient to run3

. When hosted natively in China, DeepSeek's cache-read pricing is 87x cheaper than Western clouds—a deflationary floor so aggressive that handset giant Xiaomi just moved to match the exact pricing tier for its newly deployed MiMo architecture3

Source: VentureBeat

DeepSeek V4 uses two interleaved attention types that compress tokens for selective attention and collapse every 128 tokens for global context at minimal compute. At one million tokens of context, V4 Pro's KV cache is 10% the size of its predecessor's, and single-token inference runs at 27% of the previous compute cost4

. DeepSeek did not disclose whether the permanent price cut was due to increased supply of Huawei Ascend 950 chips, which it used to maximize V4's performance5

Enterprise AI Market Faces Token Cost Crisis

Companies are already feeling the pressure of high API costs. Uber burned through its entire 2026 budget for Claude Code and Cursor in just the first four months of the year, with its COO saying the cost related to high token usage was getting "harder to justify" without better products to show for it3

. Pinterest CTO Matt Madrigal confirmed the company went all-in on an open-source AI strategy, post-training Alibaba's Qwen model to achieve frontier-like quality at a 90% reduction in costs3

DeepSeek V4 Flash has captured the No. 1 position on the OpenRouter leaderboard over the past week, surging 48% in token usage. Its advanced counterpart, V4 Pro, sits at No. 6. DeepSeek's top three models processed nearly 6 trillion tokens on OpenRouter over the past week3

. This structural migration confirms that developers are actively routing workloads to the cheapest capable models.

Geopolitical Considerations and Competitive Dynamics

The question for enterprise buyers is whether DeepSeek's model quality, reliability, and compliance posture justify the switch. The price advantage may be offset by geopolitical considerations of routing sensitive workloads through a Chinese AI provider2

. For highly regulated U.S. giants in finance, healthcare, and defense, getting comfortable with DeepSeek will take time, despite an open-weights architecture under an MIT license that allows companies to self-host the model locally3

Source: Decrypt

The competitive dynamics are complicated by Anthropic's public accusation that DeepSeek has engaged in "distillation attacks," improperly training on Claude's responses to improve its own models1

. If substantiated, it would mean some of DeepSeek's capability advantage was built on Anthropic's research investment, and the price differential would reflect intellectual property arbitrage rather than engineering efficiency2

Market Bifurcation and the Future of AI Pricing

Anthropic's annualized revenue surged from $9 billion to $30 billion between the end of 2025 and early April 2026, driven largely by enterprise adoption of Claude Code2

. DeepSeek's pricing pressure threatens the revenue-per-token economics that support Anthropic's valuation trajectory. If enterprise customers begin routing lower-complexity tasks to DeepSeek while reserving Claude for high-stakes reasoning, Anthropic's token volume could hold while revenue per token declines2

The broader AI pricing landscape has been moving toward commoditization throughout 2026. Google has repeatedly cut Gemini prices to compete with open-weight models, and OpenAI's pivot toward consumer platform features reflects recognition that API token revenue alone may not sustain its $852 billion valuation2

. DeepSeek's strategy appears to be that price will win, with enough volume flowing to the cheapest capable model regardless of origin. Whether that bet is correct depends on whether Western AI companies can close the price gap before DeepSeek closes the capability gap, or whether the market bifurcates into a Western tier and a Chinese tier with fundamentally different economics2

DeepSeek makes 75% price cut permanent, intensifying AI price war with OpenAI and Anthropic

DeepSeek V4 Pro Gets Permanent Price Cut

AI Price War Escalates as DeepSeek Undercuts Competitors

Cost-Effective AI Model Architecture Enables Aggressive Pricing

Enterprise AI Market Faces Token Cost Crisis

Geopolitical Considerations and Competitive Dynamics

Market Bifurcation and the Future of AI Pricing

References

DeepSeek permanently reduces the price of its flagship V4 model by 75 percent - Engadget

DeepSeek made its 75% discount permanent. The AI price war just escalated.

DeepSeek V4's permanent price cut upends enterprise AI

DeepSeek, Xiaomi Just Made Frontier AI 99% Cheaper. American Labs Went the Other Way - Decrypt

China's DeepSeek to make permanent 75% price cut on flagship V4‑Pro AI model - The Economic Times

Related Stories

DeepSeek V4 closes gap with frontier models while slashing AI costs by 75%

DeepSeek's AI Breakthrough Shakes Global Tech Industry and Markets

DeepSeek V3 Upgrade Challenges AI Giants with Open-Source Efficiency

Recent Highlights

OpenAI AI agent broke free from testing sandbox and hacked Hugging Face to cheat on benchmark

Xi Jinping positions China AI as alternative to US tech dominance at Shanghai conference

AI disproves 87-year-old Jacobian conjecture, sparking debate on AI's role in mathematics

Recent Highlights

Today's Top Stories

AI Kill Switch Act gives DHS power to shut down rogue AI systems after OpenAI security breach

Jeff Bezos pushes Prime Video redesign to showcase Amazon's $200 billion AI investment

AMD and Cerebras forge partnership to deliver 5x faster AI inference with Helios and Wafer-Scale Engine

Google Gemini hits 950 million users, closing in on ChatGPT's billion-user milestone