42 Sources
[1]
DeepSeek previews new AI model that 'closes the gap' with frontier models | TechCrunch
The company says both DeepSeek V4 Flash and V4 Pro are mixture-of-experts models with context windows of 1 million tokens each -- enough to allow large codebases or documents to be used in prompts. The mixture-of-experts approach involves activating only a certain number of parameters per task to lower inference costs. The Pro model has a total of 1.6 trillion parameters (49 billion active), which makes it the biggest open-weight model available, outstripping Moonshot AI's Kimi K2.6 (1.1 trillion) and MiniMax's M1 (456 billion), and more than double the size of DeepSeek V3.2 (671 billion). The smaller V4 Flash has 284 billion parameters (13 billion active). DeepSeek says both models are more efficient and performant than DeepSeek V3.2 due to architectural improvements, and have almost "closed the gap" with current leading models, both open and closed, on reasoning benchmarks. The company claims its new V4-Pro-Max model outperforms its open-source peers across reasoning benchmarks, and outstrips OpenAI's GPT-5.2 and Gemini 3.0 Pro on some tasks. In coding competition benchmarks, DeepSeek said both V4 models' performance is "comparable to GPT-5.4." However, the models seem to fall slightly behind frontier models in knowledge tests, specifically OpenAI's GPT-5.4 and Google's latest Gemini 3.1 Pro. This lag suggests a "developmental trajectory that trails state-of-the-art frontier models by approximately 3 to 6 months," the lab wrote. Both V4 Flash and V4 Pro support text only, unlike many of their closed-source peers, which offer support for understanding and generating audio, video, and images. Notably, DeepSeek V4 is much more affordable than any frontier model available today. The smaller V4 Flash model costs $0.14 per million input tokens and $0.28 per million output tokens, undercutting GPT-5.4 Nano, Gemini 3.1 Flash, GPT-5.4 Mini, and Claude Haiku 4.5. The larger V4 Pro model, meanwhile, costs $0.145 per million input tokens and $3.48 per million output tokens, also undercutting Gemini 3.1 Pro, GPT-5.5, Claude Opus 4.7, and GPT-5.4. The launch comes a day after the U.S. accused China of stealing American AI labs' IP on an industrial scale using thousands of proxy accounts. DeepSeek itself has been accused by Anthropic and OpenAI of "distilling," essentially copying, their AI models.
[2]
Three reasons why DeepSeek's new model V4 matters
For V4-Pro, DeepSeek charges $1.74 per million input tokens and $3.48 per million output tokens, a fraction of the cost of comparable models from OpenAI and Anthropic. V4-Flash is even cheaper, at about $0.14 per million input tokens and about $0.28 per million output tokens, making it one of the cheapest top-tier models available. This would make it a very appealing model to build applications on. In terms of performance, V4 is, perhaps unsurprisingly, a huge jump from R1 -- and it seems to be a strong alternative to just about all the latest big AI models. On the major benchmarks, according to results shared by the company, DeepSeek V4-Pro competes with leading closed-source models, matching the performance of Anthropic's Claude-Opus-4.6, OpenAI's GPT-5.4, and Google's Gemini-3.1. And compared to other open-source models, such as Alibaba's Qwen-3.5 or Z.ai's GLM-5.1, DeepSeek V4 exceeds them all on coding, math, and STEM problems, making it one of the strongest open-source models ever released. DeepSeek also says that V4-Pro now ranks among the strongest open-source models on benchmarks for agentic coding tasks and performs well on other tests that measure the ability to carry out multistep problems. Its writing ability and world knowledge also lead the field, according to benchmarking results shared by the company. In a technical report released alongside the model, DeepSeek shared results from an internal survey of 85 experienced developers: More than 90% included V4-Pro among their top model choices for coding tasks. DeepSeek says it has specifically optimized V4 for popular agent frameworks such as Claude Code, OpenClaw, and CodeBuddy. One of the key innovations of V4 is its long context window -- the amount of text the model can process at once. Both versions can handle 1 million tokens, which is large enough to fit all three volumes of The Lord of the Rings and The Hobbit combined. The company says this context window size is now the default across all DeepSeek services and it matches what is offered by cutting-edge versions of models like Gemini and Claude. But it's important to know not just that DeepSeek has made this leap, but how it did so. V4 makes significant architectural changes compared with the company's earlier models -- especially in the attention mechanism, which is the feature of AI models that helps them understand each part of a prompt in relation to the rest. As the prompt text gets longer, these comparisons become much more costly, making attention one of the main bottlenecks for long-context models.
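To make that bottleneck concrete, here is a minimal back-of-the-envelope sketch of how the cost of naive full attention grows with context length. The head dimension and head count below are illustrative assumptions, and the sketch says nothing about DeepSeek's actual hybrid attention design; it only shows why the quadratic term dominates at 1 million tokens.

# Illustrative only: rough FLOP count for naive full attention.
# head_dim and num_heads are assumed values, not DeepSeek V4's configuration.
def naive_attention_cost(context_tokens, head_dim=128, num_heads=64):
    # Every token attends to every other token, so the number of
    # query-key comparisons grows with the square of the context length.
    comparisons = context_tokens * context_tokens
    # Each comparison is a dot product over head_dim, repeated per head.
    return 2 * comparisons * head_dim * num_heads

for n in (8_000, 128_000, 1_000_000):
    print(f"{n:>9} tokens -> ~{naive_attention_cost(n):.2e} attention FLOPs")

Moving from a 128,000-token to a 1-million-token context multiplies that quadratic term by roughly 61x, which is why sparse or compressed attention schemes of the kind DeepSeek describes matter so much at this scale.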
[3]
China's DeepSeek previews new AI model a year after jolting US rivals
Chinese AI company DeepSeek released a preview of its hotly anticipated next-generation AI model V4 on Friday, saying that the open-source model can compete with leading closed-source systems from US rivals including Anthropic, Google, and OpenAI. DeepSeek says V4 marks a major improvement over prior models, especially in coding, a capability that has become central to AI agents and helped drive the success of tools like ChatGPT Codex and Claude Code. The release is also a milestone for China's chip industry, with DeepSeek explicitly highlighting compatibility with domestic Huawei technology. The release comes a year after DeepSeek rattled the US AI industry with R1, a model it claimed was trained at a fraction of the cost of leading US systems. DeepSeek has not disclosed V4's training costs or what hardware it was trained on. US officials have accused the company of using banned Nvidia chips and Anthropic claims DeepSeek misused Claude to improve its own products.
[4]
DeepSeek Slashes Fees for New AI Model in Chinese Price War
DeepSeek is aggressively pitching low-priced plans for its just-released flagship model, intensifying competition across a Chinese artificial intelligence industry trying to take on Silicon Valley's best. The Hangzhou-headquartered AI lab is offering a 75% discount to developers using the DeepSeek-V4-Pro, released last week after months of anticipation. It's also reducing fees for input cache hits across its family of AI platforms to a 10th of their original pricing, dramatically lowering costs for frequent users sending similar or repeated requests. The move threatens to re-ignite a price-based war that erupted after DeepSeek up-ended the industry with the R1 last year. Its latest promotion comes at a time when OpenAI Inc., Anthropic PBC and Alphabet Inc.'s Google are rapidly pushing new releases -- but access to those can be expensive. Chinese AI firms are discounting to incentivize users to switch, accelerating adoption in a crowded global AI field in an attempt to reshape the US-China AI race. DeepSeek is hoping pricing, accessibility and sophisticated features will differentiate its models for next-generation developers and enterprise users. DeepSeek-V4's context window -- the maximum amount of data a model can consider at once -- is significant as it enables the processing of complex code bases and long documents. The model allows easy integration with Claude Code, OpenClaw and OpenCode, making it easier to work within the broader AI ecosystem. In an accompanying video, "Why DeepSeek V4 Impresses Despite Lack of 'Wow' Factor," Bloomberg Intelligence's Robert Lea explains how the new flagship, which DeepSeek calls its most powerful open-source platform, is hampered in some areas by a lack of access to Nvidia tech but plays to its strengths in others. "The pricing, open source availability and 1 million context window features all lower barriers for developers, startups and small enterprises," said Akshar Keremane, co-founder of Bangalore-based AI startup O-Health. "It allows users to experiment at a model capability and scale that wasn't available earlier," said the entrepreneur, whose Gates Foundation-backed startup deploys AI that runs in large hospitals as well as rural clinics.
[5]
DeepSeek's new models offer big inference cost savings
Now available in preview, DeepSeek V4 cuts inference costs to a fraction of R1's. Chinese AI darling DeepSeek is back with a new open weights large language model that promises performance to rival the best proprietary American LLMs. Perhaps more importantly, it claims to dramatically reduce inference costs and it extends support for Huawei's Ascend family of AI accelerators. Unveiled on Friday, DeepSeek V4 is available for download on popular model repos like Hugging Face, the company's API, and web service in two new flavors. The first is a smaller 284 billion parameter Flash mixture-of-experts (MoE) model with 13 billion active parameters, while the larger of the two is a 1.6 trillion parameter model, 49 billion of which are in use at any given moment. V4-Pro was trained on 33 trillion tokens and, if DeepSeek is to be believed, beats out every open weight LLM while rivaling the West's best proprietary models across its benchmark suite. Of course, these claims should be taken with a grain of salt. While DeepSeek has had a strong track record with its V3 and R1 family of models that made the Chinese dev a household name, just because it performs well in canned benchmarks doesn't mean it'll hold up in real world applications. We would expect DeepSeek V4-Pro to be much better than the company's prior efforts. The new model is nearly a trillion parameters larger and uses more active parameters during inference. But as was the case with DeepSeek V3, which showed that large frontier models could be trained using less compute than previously thought, benchmarks don't tell the full story. Under the hood, DeepSeek V4 introduces several novel architectural changes that, according to developers, should make the model much less expensive to serve. The first is rather simple. This time around, DeepSeek is releasing a second smaller Flash model, which requires less infrastructure to run and will deliver a more interactive user experience at a lower cost. Smaller models are simply cheaper to serve. This in itself isn't a new strategy, but it's one that DeepSeek is only now embracing, at least as far as its in-house models are concerned. The bigger and more meaningful change comes to how DeepSeek is calculating attention. A model's attention mechanism governs how it relates each part of a prompt to the rest, producing the key-value pairs that are used to generate the output tokens. In a paper published alongside the new models, DeepSeek researchers describe a hybrid attention mechanism that combines two technologies: Compressed Sparse Attention and Heavy Compressed Attention to reduce the amount of compute required during inference and the memory required by the KV caches used to track model state. The latter element is key to DeepSeek V4's efficiency, as these caches can be quite large. Inference providers also tend to offload these to system memory or flash to avoid cold start penalties. More heavily compressed KV caches mean less memory and storage is required for large-scale inference deployments. Combined, these technologies mean the model can support a million-token context window while using 9.5x-13.7x less memory than DeepSeek V3.2. To further reduce the model's memory footprint, DeepSeek is continuing its tradition of using lower precision datatypes. DeepSeek V3 was among the first open weights models trained at FP8. Now, both V4 models are using a mixture of FP8 and FP4 precision. Specifically, the model devs used quantization-aware training for the MoE expert weights.
As we've previously discussed, FP4 effectively halves the memory required to store model weights compared to FP8, making it a significant saving, if you can stomach the loss of precision. DeepSeek's architectural improvements aren't limited to inference either. In V4, the model devs introduced a new optimizer called Muon, designed to speed up convergence and improve training stability. Perhaps the most interesting, least detailed element of the new models relates to the hardware they're running on. While DeepSeek V3 was heavily optimized for Hopper GPUs, V4 has been validated to run on both Nvidia and Huawei accelerators. The DeepSeek V4 paper only mentions the chips in passing, noting that the company validated its "fine-grained EP [Expert Parallel] scheme on both Nvidia GPUs and Ascend NPU platforms." To be clear, this does not mean the model was trained entirely on Huawei hardware, only that DeepSeek has validated the Chinese telecommunications giant's AI accelerators to serve the model. It is possible DeepSeek used a combination of Nvidia GPUs for pre-training and Huawei accelerators for reinforcement learning. The latter is an inference-adjacent post-training step used to teach models new skills, behavior, and chain of thought reasoning. However, the paper doesn't directly address this. Inference generally has a lower barrier to entry for new chipmakers. However, at one point, DeepSeek was trying to train its models on Huawei's chips as well. This effort was reportedly derailed by dodgy chips, glacial interconnects, and an immature software stack that ultimately drove DeepSeek back into Nvidia's embrace. Finally, the use of 4-bit precision data types in V4 could lead some to assume DeepSeek got its hands on Nvidia's Blackwell accelerators, which the AI arms dealer isn't allowed to sell in China, but this isn't strictly necessary. Hopper GPUs don't support FP4 hardware acceleration but can work with the data type in a weights-only fashion. This approach doesn't benefit floating point performance, but reduces the memory footprint and bandwidth required during both training and inference, making it a worthwhile trade-off in many use cases. DeepSeek V4 is currently in preview with both base and instruct tuned versions of the model available for download or via its API. The company is unsurprisingly offering API access to the smaller model at a reduced rate of $0.14 per million input tokens (uncached) and $0.28 per million output tokens. The larger Pro model is much more expensive at $1.74 per million input tokens and $3.48 per million output tokens, but that's still a fraction of what Western AI vendors are charging for access to their top models. For reference, OpenAI charges $5 per million input tokens and $30 per million output tokens for GPT-5.5. ®
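The memory arithmetic behind those claims is easy to sketch. The rough estimate below uses the 1.6-trillion-parameter figure and the 9.5x-13.7x KV-cache reduction quoted above; the bytes-per-parameter values follow directly from the data types, while the baseline cache size is a made-up placeholder used purely to illustrate the ratio, not a published DeepSeek number.

# Rough memory sketch based on figures quoted in the article above.
# The V3.2 cache baseline is a hypothetical placeholder for illustration.
GB = 1024 ** 3

def weight_footprint_gb(total_params, bytes_per_param):
    return total_params * bytes_per_param / GB

total_params = 1.6e12  # V4-Pro total parameter count, per the article
print(f"FP8 weights: ~{weight_footprint_gb(total_params, 1.0):.0f} GB")  # 1 byte per value
print(f"FP4 weights: ~{weight_footprint_gb(total_params, 0.5):.0f} GB")  # 2 values per byte

# KV cache: the article quotes a 9.5x-13.7x reduction versus DeepSeek V3.2
# at the same context length. Applied to a hypothetical 400 GB baseline:
v32_cache_gb = 400.0
print(f"V4 cache estimate: ~{v32_cache_gb / 13.7:.0f}-{v32_cache_gb / 9.5:.0f} GB")

In practice a served model mixes precisions and shards both weights and cache across accelerators, so real deployments will differ, but the halving from FP8 to FP4 and the order-of-magnitude cache reduction are what make million-token serving economically plausible.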
[6]
China's DeepSeek slashes prices for new AI model
April 27 (Reuters) - China's DeepSeek is offering developers a 75% discount on its newly unveiled AI model, DeepSeek-V4-Pro, until May 5. The company is also cutting prices for input cache hits across its entire DeepSeek API lineup to one-tenth of the original price, it said in a post on X. On Friday, DeepSeek launched a preview of its highly anticipated V4 model, which has been adapted for Huawei's chip technology. V4 comes in two versions: the more powerful and higher priced Pro, and the lighter, cheaper Flash variant. The Pro version outperforms other open-source models in world-knowledge benchmarks, trailing only Google's (GOOGL.O) closed-source Gemini-Pro-3.1, DeepSeek said. According to the Chinese startup, the V4 models are particularly suited to AI agents, which can execute more complex tasks than chatbots but require greater computing power. Reporting by Disha Mishra in Bengaluru; Editing by Sumana Nandy
[7]
DeepSeek promises its new AI model has 'world-class' reasoning
DeepSeek has released its latest AI models, the V4 Pro and Flash versions, a bit over a year after it went viral and became the top-rated free app on Apple's App Store in the US. "Welcome to the era of cost-effective 1 million context length," DeepSeek said in its announcement. Context length is what you call the maximum number of tokens that an AI model can remember, so the bigger it is, the more coherent and consistent an AI is when it comes to extended conversations. OpenAI's recently announced GPT-5.5 has a context window ranging from 400,000 to 1 million, for instance. The new model is still open-source, allowing users to download its code and modify it if they want. DeepSeek says V4 Pro has enhanced agentic capabilities and claims that it rivals top closed-source models when it comes to reasoning. It also says that it trails only Gemini-3.1-Pro in rich world knowledge. Meanwhile, V4 Flash isn't quite as powerful as the V4 Pro, but it has faster response times. Still, its reasoning abilities closely approach V4 Pro's, DeepSeek says, and it performs on par with the Pro version on simple agent tasks. Shortly after DeepSeek topped the App Store charts, it was banned for use by US federal agencies and on government-owned devices. Authorities believed it was a national security risk and posed a threat to US AI stocks. South Korea also paused downloads of its app over privacy concerns.
[8]
China's DeepSeek rolls out a long-anticipated update of its AI model
HONG KONG (AP) -- DeepSeek, the Chinese artificial intelligence startup that shook world markets last year, launched preview versions of its latest major update Friday as the AI rivalry between China and the U.S. heats up. DeepSeek's V4 has been keenly anticipated by users eager to test how it compares to U.S. competitors like OpenAI's ChatGPT, Anthropic's Claude and Google's Gemini. Anthropic and OpenAI have accused DeepSeek of unfairly building its technology off their own. Some industry analysts had expected the new model to arrive more than a month earlier at the start of the Lunar New Year. DeepSeek says the new V4 open-source models, which include "pro" and "flash" versions, have big improvements in knowledge, reasoning and in their "agentic" capabilities - the ability to perform complex tasks and workflows autonomously. V4 is a successor to V3, an AI model that DeepSeek released in late 2024. But it was DeepSeek's specialized "reasoning" AI model, called R1, that took markets by surprise with its release in January 2025. DeepSeek claimed it was more cost-effective than OpenAI's similar model and it became a symbol of how China was catching up with the U.S. in technological advancements. DeepSeek said the "V4 Pro Max" version has "superior performance" in terms of standard reasoning benchmarks relative to OpenAI's GPT-5.2 model and Google's Gemini 3.0-Pro. It falls "marginally" short of GPT-5.4 and Gemini 3.1-Pro, it said. In terms of "agentic" capabilities, the Chinese company said the V4 "pro" version could outperform Claude's Sonnet 4.5 and approach the level of Claude's Opus 4.5 model based on its own evaluation. The "flash" version of V4 performs on a par with the "pro" version on simple agent tasks and has reasoning capabilities closely approaching it, DeepSeek said. "Based on the benchmark results, it does appear DeepSeek V4 is going to be very competitive against its U.S. rivals," said Lian Jye Su, chief analyst at the technology research and advisory group Omdia. Marina Zhang, an associate professor at the University of Technology Sydney, said DeepSeek's V4 rollout is a "pivotal milestone for China's AI industry", especially as global competition intensifies in the pursuit of self-reliance in critical technologies. DeepSeek offers a free-to-use web and mobile chatbot. Unlike the top models from Anthropic, Google and OpenAI, it describes its technology as "open source" in that it enables developers to access, modify and build on its core technology. Both the V4's "pro" and "flash" versions have a 1 million token context window, a measure of how much information an AI model can process and recall, and run on a more efficient basis, the startup said. That is a significant improvement from before, since the V3 supported a 128,000 token context window. A report from Microsoft in January showed use of DeepSeek has been gaining ground in many developing nations. However, some analysts remain skeptical. Ivan Su, a senior equity analyst at Morningstar, said while V4 is a "competent" follow-up, it's not as big a breakthrough as the rollout of R1. "Domestic competition has intensified significantly since R1's release," Su said. "Against U.S. models, DeepSeek's own evaluation suggests its capabilities largely match on most fronts, but independent evaluations are needed before final conclusions can be drawn."
In February, Anthropic accused DeepSeek and two other China-based AI laboratories of "industrial-scale campaigns" to "illicitly extract Claude's capabilities to improve their own models." It said they did that using a technique called distillation that "involves training a less capable model on the outputs of a stronger one." OpenAI made similar allegations in a letter to U.S. lawmakers. This week, Michael Kratsios, chief science and technology adviser to U.S. President Donald Trump, also accused foreign tech companies "principally based in China" of distilling leading U.S. AI systems and "exploiting American expertise and innovation." China's embassy in Washington hit back at the allegations, describing them as "unjustified suppression of Chinese companies by the U.S."
[9]
China's DeepSeek releases preview of long-awaited V4 model as AI race intensifies
DeepSeek reportedly has not shared its upcoming AI model with American engineers and instead granted early access to Chinese companies, further intensifying the technological war between the U.S. and China, as of Feb. 26, 2026. Chinese artificial intelligence startup DeepSeek on Friday released a preview version of its long-awaited Version 4 large language model, allowing users to test its new capabilities and features. The release comes more than a year after DeepSeek introduced its R1 reasoning model, which rocked global tech markets due to its surprising performance and efficiency, despite reportedly being developed for much lower costs than U.S. rivals. Similar to DeepSeek's previous V3 model, the latest upgrade is open source, allowing developers to download the code, run it locally, and modify it. The Hangzhou-based company claimed that Version 4 achieves strong performance against domestic competitors, particularly in agent-based tasks, knowledge processing and inference. The company added that DeepSeek-V4 has been optimized for use with popular agent tools such as Anthropic's Claude Code and OpenClaw. The model is available in both a "pro" and a "flash" version, depending on size. Founded in 2023, DeepSeek gained attention in late 2024 with its free, open-source V3 model, which it said was trained with less powerful chips and at a fraction of the cost of models built by the likes of OpenAI and Google. Weeks later, in January 2025, it released a reasoning model, R1, that hit similar benchmarks or outperformed many of the world's leading LLMs. The emergence of a globally competitive open-source model raised questions in tech markets about how the scale of spending on AI infrastructure would be impacted. Since then, DeepSeek has released a series of model upgrades, but none have matched the impact of R1. The company now faces growing competition in China's booming AI sector, with players like Alibaba and ByteDance also releasing new models this year.
[10]
DeepSeek's Sequel Set to Extend China's Reach in Open-Source A.I.
Meaghan Tobin reported from Taipei, Taiwan, and Cade Metz from San Francisco. When the Chinese start-up DeepSeek published details about one of its artificial intelligence models last year, it sent shock waves through the tech industry. The company said it had built its system by spending far less on computer chips than American rivals like OpenAI and Anthropic. It marked the start of what became known as China's "DeepSeek moment," shorthand for the belief that Chinese A.I. companies were ready to showcase their technical capabilities to the world. The DeepSeek moment reflected a shift in the global A.I. landscape. The change was not only about lower costs, but also openness in how the technology is shared. DeepSeek released its models as open source, which means others can freely use and modify them. By contrast, OpenAI and Anthropic kept their leading models proprietary. The episode demonstrated that an open-source system could perform almost as well as closed versions. In the months that followed, Chinese firms released dozens of other open-source models. By the end of 2025, these models made up a significant share of global A.I. usage. On Friday, DeepSeek released a preview of V4, its long-awaited follow-up model, which it intends to open source. The new model excels at writing computer code, an increasingly important skill for leading A.I. systems. It significantly outperformed every other open-source system at generating code, according to tests from Vals AI, a company that tracks the performance of A.I. technologies. DeepSeek released its new model just days after Moonshot AI, another Chinese start-up, introduced its latest open-source model, Kimi 2.6. While these systems trail the coding capabilities of the leading U.S. models from Anthropic and OpenAI, the gap is narrowing. The implications are meaningful. Using A.I. to write code is faster and frees up human programmers to focus on bigger issues. It also means people can use DeepSeek's latest release to power A.I. agents, which are personal digital assistants that can use other software applications on behalf of office workers, including spreadsheets, online calendars and email services. As A.I. systems improve at writing computer code, they are also getting better at finding security vulnerabilities in software -- a skill that is fundamentally changing cybersecurity. That means tools like DeepSeek's can be used to both attack and defend computer networks. Across tasks, DeepSeek V4 is on par with Moonshot's latest model. "They are basically neck-and-neck," said Rayan Krishnan, the chief executive of Vals AI. In the months leading up to DeepSeek's latest release, foreign rivals moved to pre-empt another round of glowing headlines. Silicon Valley's A.I. giants, Anthropic and OpenAI, said DeepSeek had unfairly piggybacked on their technology through distillation, a process in which engineers mimic a rival model by querying it millions of times and copying its behavior. The competition to build the best-performing A.I. systems has transformed into a geopolitical power struggle. While Silicon Valley leaders at Anthropic and OpenAI warn that their technology would be dangerous in the hands of autocratic countries, China has invested billions to become an A.I. superpower, viewing the technology as a critical engine of economic growth. DeepSeek's open-source models are central to this strategy. 
While many Western companies guard their most valuable models, China has embraced open source and almost all of its top-performing systems are widely available. Even so, Chinese A.I. firms face major hurdles. Three U.S. administrations have imposed export controls limiting access to advanced chips needed for cutting-edge A.I. systems. And firms in Silicon Valley continue to outspend Chinese rivals in the race for top A.I. talent. China's push into open-source A.I. has become a major economic advantage at home, according to a new study by a U.S. congressional advisory body. With few barriers to use, the systems have spread across industries such as robotics, logistics and manufacturing. The study found that these industrial applications generate real-world data that are used to improve A.I. systems. This approach has allowed Chinese tech firms to capture global influence, as programmers and engineers around the world adopt their systems to build new products. From Lagos to Kuala Lumpur, developers on tight budgets are turning to Chinese open-source models because they are cheaper to run and therefore easier to experiment with. Last May, Malaysia's deputy minister of communications said the country's sovereign A.I. infrastructure would be built on DeepSeek's technology. Chinese open-source models accounted for roughly one-third of global A.I. usage last year, according to a study by OpenRouter, an A.I. model marketplace. DeepSeek was the most widely used, followed by models from Alibaba, the Chinese internet company. That reflects a broader strategy. As Chinese companies expand abroad, making their systems open-source helps them gain traction with coders by offering cheaper, more accessible tools. "Open source is the soft power of technology of the future," said Kevin Xu, the U.S.-based founder of Interconnected Capital, a hedge fund that invests in artificial intelligence technologies. Mr. Xu and his fund do not invest in DeepSeek. Wei Sun, a principal analyst in A.I. at Counterpoint Research in Beijing, said DeepSeek's success paved the way for China's tech giants to release their A.I. systems publicly rather than closely guarding them. Alibaba has since emerged as a leader. Its Qwen model family has surpassed 1 billion downloads. ByteDance, parent company of TikTok, has also shared some details of its technology after spending $11 billion on A.I. infrastructure in 2024. "The A.I. generation of open-source builders from China was arguably the biggest A.I. story in 2025," Mr. Xu said. "The progress of the models, the pace of the releases, and the number of A.I. labs that both compete with each other but also seem to cheer each other on came fast and furious with no signs of slowing down."
[11]
DeepSeek's Long-Awaited New Model Fails to Narrow US Lead in AI
When China's DeepSeek released a competitive new artificial intelligence model called R1 last January, purportedly built for less than many rivals spent, some feared the achievement posed a threat to America's lead in artificial intelligence. More than a year later, DeepSeek has unveiled preview versions of a long-awaited new flagship model called V4, which costs less than many alternatives to use but doesn't meaningfully narrow the US lead in AI capabilities. You could almost hear sighs of relief emanating from Washington and Silicon Valley. "It is not competitive with frontier US models, and does not appear to close the gap with the United States in AI," said Chris McGuire, a senior fellow for China and emerging technologies at the Council on Foreign Relations. Dean Ball, a former AI adviser in the Trump administration, echoed the sentiment. "R1 remains the closest I've seen Chinese models get to the U.S. frontier," he wrote in a post on social media. "V4 is further behind than that." DeepSeek unveiled the V4 Flash and V4 Pro series on Friday, touting top-tier performance in coding benchmarks and big advancements in reasoning and agentic tasks. But in a paper accompanying the release, the Chinese company conceded that, in certain respects, the new model lags the most cutting-edge AI software in the US. "DeepSeek-V4-Pro-Max demonstrates superior performance relative to GPT-5.2 and Gemini-3.0-Pro on standard reasoning benchmarks," the company said, referring to months-old models from OpenAI and Alphabet Inc.'s Google. "Nevertheless, its performance falls marginally short of GPT-5.4 and Gemini-3.1-Pro, suggesting a developmental trajectory that trails state-of-the-art frontier models by approximately 3 to 6 months." DeepSeek's model is likely even further behind OpenAI's GPT-5.5, a newer offering released the day before V4 that's designed to be better at completing tasks without much direction. While it will take time for people to adopt the model and test it out, the initial details may provide some relief to US executives and officials who have been bracing for DeepSeek's next release. US tech companies have accused DeepSeek and other Chinese firms of basing their chatbots at least in part on American AI models -- a practice known as distillation. House lawmakers have pushed for sanctions on Chinese firms, and the White House announced its own efforts to rein in adversarial distillation. In February of this year, OpenAI sent a memo to Congress noting it has found continued and sophisticated distillation attempts from actors in China and Russia. "DeepSeek's next model (whatever its form) should be understood in the context of its ongoing efforts to free-ride on the capabilities developed by OpenAI and other US frontier Labs," OpenAI wrote in the memo. Even if DeepSeek's new model fails to outperform the most advanced software from OpenAI, Anthropic PBC and Google, it could still pose risks to their businesses by undercutting them on price. DeepSeek, like its Chinese peers, has focused on open-weight models, meaning that parts of the underlying AI system are publicly available for users to freely download and run on their own platforms -- and are therefore cheaper to use. US officials have estimated that unauthorized distillation costs Silicon Valley labs billions of dollars in annual profit, Bloomberg News has reported. In a blog post Friday, developer Simon Willison described DeepSeek's V4 as "almost on the frontier, a fraction of the price." For DeepSeek, that may be good enough.
[12]
DeepSeek cuts V4-Pro prices by 75%
The promotional discount runs until 5 May 2026. Even at full price, V4-Pro already undercuts GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro on per-token costs. The move is a direct challenge to the pricing strategy of US AI providers at a moment when the Trump administration has accused Chinese firms of distilling American AI models on an industrial scale. DeepSeek announced on Monday that it is offering a 75% discount on its newly released DeepSeek-V4-Pro model to developers until 5 May 2026, and is simultaneously cutting the price of input cache hits across its entire API suite to one-tenth of previous levels, effective immediately. The discount was announced in a post on X. The move intensifies a pricing competition with US AI providers that DeepSeek first triggered in January 2025 with its R1 model, which claimed frontier-level reasoning performance at a fraction of the cost of comparable OpenAI products. The pricing context is important. At full price, before any promotional discount, DeepSeek-V4-Pro already costs $0.145 per million input tokens and $3.48 per million output tokens, undercutting OpenAI's GPT-5.5, Google's Gemini 3.1 Pro, and Anthropic's Claude Opus 4.7 on a per-token basis. The 75% promotional discount on input tokens reduces the V4-Pro input price to approximately $0.036 per million tokens. The Flash variant, V4's smaller, faster model, costs $0.14 per million input tokens and $0.28 per million output tokens at full price, already undercutting GPT-5.4 Nano, Gemini 3.1 Flash, GPT-5.4 Mini, and Claude Haiku 4.5. The cache-hit price cut to one-tenth of prior levels specifically targets frequent users and enterprise developers who send similar or repeated requests, which is the dominant pattern in production agentic applications. The strategic logic is explicit and well-documented in how DeepSeek has operated since R1. Open-source availability removes the model access barrier entirely; aggressive API pricing removes the cost barrier for production deployment; a 1-million-token context window makes the model viable for enterprise use cases involving large codebases or long documents that would otherwise require multiple API calls. V4-Pro also integrates natively with Claude Code, OpenClaw, and OpenCode, the dominant agentic coding frameworks used by developers already in the Western AI ecosystem. The combined effect is to lower the friction of switching from an OpenAI, Anthropic, or Google API to a DeepSeek API for any developer whose primary constraint is cost. Akshar Keremane, co-founder of Bangalore-based AI startup O-Health, described the combination of pricing, open-source availability, and the 1-million-token context window as lowering barriers "for developers, startups and small enterprises." The V4-Pro model, launched last Friday, is a mixture-of-experts model with 1.6 trillion total parameters and 49 billion active parameters per task, the largest open-weight model currently available, outstripping Moonshot AI's Kimi K2.6 and MiniMax's M1. Its Hybrid Attention Architecture is designed to maintain coherence across long contexts. It is trained on and optimised for Huawei's Ascend 950 chips and Cambricon hardware rather than Nvidia GPUs. Zhang Yi, founder of tech research firm iiMedia, told AFP that V4's architecture represents a "genuine inflection point" for long-context AI processing, predicting that ultra-long context support will move beyond research labs into mainstream commercial applications.
Wei Sun, principal analyst at Counterpoint Research, noted that V4 running on domestic chips "allows AI systems to be built and deployed without relying solely on Nvidia" and could "accelerate adoption domestically and contribute to faster global AI development overall." The pricing move arrives in a charged geopolitical context. On Thursday last week, White House Director of Science and Technology Policy Michael Kratsios accused foreign entities, primarily based in China, of conducting "industrial-scale" campaigns to distil frontier AI models from US companies, a process in which a smaller model is trained using the outputs of a larger model to acquire similar capabilities at lower cost. Kratsios's memo did not directly name DeepSeek, but DeepSeek has previously been accused by both Anthropic and OpenAI of distilling their models. CNN reported it has reached out to DeepSeek for comment on those accusations. The US government's distillation crackdown, alongside China's parallel move to restrict US investment in its AI firms, was announced the day before V4's launch. DeepSeek's response, three days later, is to cut prices rather than respond to the accusations directly: a competitive move that is also a political statement about where it believes the AI race will ultimately be decided. OpenAI has cut API prices multiple times; Anthropic has introduced tiered pricing for different Claude model sizes; Google has progressively reduced Gemini API costs. DeepSeek's Monday announcement is the latest move in that ongoing compression, but it is distinctive in its scale, a 75% promotional discount on top of a model that already undercuts the US frontier at standard pricing, and in its timing, which positions the Hangzhou startup as the low-cost challenger in the same week that OpenAI shipped GPT-5.5 and the US government moved to restrict Chinese model distillation.
[13]
DeepSeek’s Newest Models Take on Silicon Valley at a Fraction of the Cost
DeepSeek rolled out preview versions of its highly anticipated V4 AI models today, once again narrowing the gap with leading AI models from the world's largest tech companies. The Chinese startup released two open-source versions, a high-performance V4 Pro model and a cheaper, smaller V4 Flash model. The company is pitching both as competitive with frontier systems, highlighting strong performance in coding, improved reasoning, and more advanced agentic capabilities. One of the more eye-catching upgrades is the jump to a 1-million-token context window, which allows the models to process entire codebases or extremely long documents in a single prompt. But what really makes these models stand out is their focus on efficiency. The V4 models rely on a mixture-of-experts (MoE) architecture, a design that activates only a subset of the model's parameters at any given time. While the system may have trillions of parameters in total, only a fraction are used per task, which keeps inference costs low. The new models arrive just over a year after DeepSeek first made headlines with its R1 reasoning model. That system rivaled advanced models from companies like OpenAI and Google, but was reportedly built at a fraction of the cost and used fewer AI chips for training. The news even triggered a trillion-dollar selloff on Wall Street, with Nvidia losing nearly $600 billion in market value in a single day. In a technical paper, the company says its latest models are competitive, while acknowledging a small performance gap. "Through the expansion of reasoning tokens, DeepSeek-V4-Pro-Max demonstrates superior performance relative to GPT-5.2 and Gemini-3.0-Pro on standard reasoning benchmarks," the company said. "Nevertheless, its performance falls marginally short of GPT-5.4 and Gemini-3.1-Pro, suggesting a developmental trajectory that trails state-of-the-art frontier models by approximately 3 to 6 months." Still, for many users, the cost savings may outweigh any slight performance shortfall. Datasette creator Simon Willison compared token pricing across major models on his blog and found DeepSeek to be the cheapest in its class. DeepSeek is charging $0.14 per million input tokens and $0.28 per million output tokens for its V4 Flash model. For comparison, GPT-5.4 Nano costs $0.20 per million input tokens and $1.25 per million output tokens, while Claude Haiku 4.5 is priced at $1 and $5 per million input and output tokens, respectively. The gap becomes even more stark when it comes to the pro models. DeepSeek is charging $1.74 per million input tokens and $3.48 per million output tokens for its V4 Pro model. By comparison, Gemini 3.1 Pro costs $2 per million input tokens and $12 per million output tokens, while GPT-5.5 is priced at $5 and $30 per million input and output tokens, respectively. And of course, in keeping with DeepSeek's previous releases, V4 is MIT-licensed and open-weight, so if you have the resources to run it, it's "free" in the way a movie on Netflix is "free." Nobody is charging you at the moment you press play, but the meter is still running somewhere. In this case, it's your electricity bill.
[14]
DeepSeek-V4, the Chinese AI model adapted for Huawei chips
BEIJING, April 24 (Reuters) - Chinese startup DeepSeek on Friday released a preview version of V4, its new artificial intelligence model adapted to run on Huawei chips, marking another step in China's push to build a self-sufficient AI ecosystem. Here is what we know so far about the long-awaited open-source offering.
V4 MODEL CHARACTERISTICS
DeepSeek said V4 is designed to work with agent frameworks including Claude Code and OpenClaw, reflecting the industry shift away from prompt-based chatbots towards models that can complete complex, multi-step tasks with less human input. V4 comes in two versions: the more powerful and more expensive Pro, and the cheaper, lighter Flash. Pro is positioned as a higher-end model with performance comparable to leading closed-source systems, particularly in agentic coding, world knowledge, STEM (science, technology, engineering and mathematics) and competitive programming. In maximum reasoning mode, Pro outperforms all open-source models, though it still trails frontier closed-source systems such as Google's Gemini 3.1 Pro and OpenAI's GPT-5.4 in some areas, according to a DeepSeek paper released alongside the model. "DeepSeek-V4-Pro Max ... redefines the state of the art for open models, outperforming its predecessors in core tasks," DeepSeek said. Flash delivers similar reasoning ability in some areas but runs faster and at lower cost than Pro, with weaker world knowledge and lower performance on more demanding agent-based tasks. Both versions support a 1-million-token context window, matching the expansion DeepSeek introduced with V3 in February. DeepSeek said V4's architecture is designed to reduce compute and memory costs for long-context use.
ADAPTED FOR HUAWEI CHIPS
A key change from earlier DeepSeek releases is that V4 was adapted for Huawei's most advanced Ascend AI chips. Reuters reported in February that DeepSeek had not shared its new model with U.S. chipmakers for performance tuning, instead granting early access to domestic companies such as Huawei, despite previously working closely with Nvidia's technical staff. Hours after the preview release, Huawei said V4 is fully supported on its Ascend 950-based supernode clusters, and that its chips were used for part of V4-Flash's training. "Through close technical collaboration ... the entire Ascend supernode product line now supports the DeepSeek-V4 series models," Huawei said. DeepSeek's earlier V3 and R1 models were trained on Nvidia chips. The company did not say whether the same applied to V4.
SELF-SUFFICIENCY PUSH AND LIMITS
Lian Jye Su, chief analyst at tech research firm Omdia, said the partnership shows DeepSeek models can deliver similar performance on both Huawei and Nvidia hardware. "The popularity of DeepSeek in the domestic Chinese market encouraged Huawei to optimize the model for its hardware, and this, in turn, lowers the barriers for Chinese developers and companies to build AI apps entirely on domestic solutions," he said. He added that Huawei still trails Nvidia technologically, and moving developers away from Nvidia's ecosystem remains difficult. Even so, he said, "DeepSeek's pivot reveals real, tangible progress toward AI infrastructure self-sufficiency." DeepSeek also faces compute constraints under U.S. export controls on Nvidia chips and chipmaking equipment. The company said Pro can cost up to 12 times more than Flash because of "constraints in high-end compute capacity," limiting current Pro service availability.
DeepSeek said Pro pricing could fall sharply once Huawei Ascend 950 supernodes are deployed at scale in the second half of the year. Reporting by Eduardo Baptista; additional reporting by Liam Mo; Editing by Mark Potter
[15]
DeepSeek unveils V4 AI with 1M context window, targets US rivals
The artificial intelligence race is accelerating. OpenAI launched GPT-5.5 this week, while the White House accused China of copying US AI systems at scale. Now, DeepSeek has introduced preview versions of its V4 model, targeting direct competition with top US platforms. DeepSeek released the V4 Flash and V4 Pro series, highlighting gains in coding, reasoning, and agent-driven tasks. The models incorporate architectural upgrades and optimization improvements, with a clear focus on efficiency as systems grow more expensive to run.
[16]
DeepSeek returns with V4-Pro and V4-Flash, a year after its 'Sputnik moment'
The Hangzhou startup released preview versions of both models on Hugging Face on Friday. V4-Pro claims top performance on coding and maths among open models, trails only Gemini 3.1-Pro for world knowledge, and falls "marginally short" of GPT-5.4 and Gemini 3.1-Pro, a gap DeepSeek says is "approximately 3 to 6 months." Both models are open-source. DeepSeek, the Hangzhou-based AI startup that upended Silicon Valley with its R1 model in January 2025, released preview versions of its latest flagship models on Friday, approximately one year after that original shock. The company posted DeepSeek-V4-Pro and DeepSeek-V4-Flash on Hugging Face, describing the release as the most powerful open-source AI platform available and a direct challenge to rivals from OpenAI to Anthropic. Both models follow the open-source approach of their predecessors, meaning developers are free to use and modify the source code. The headline technical advance in V4 is the Hybrid Attention Architecture, a technique DeepSeek says improves the model's ability to retain context across long conversations. Combined with a 1-million-token context window, sufficient to process an entire codebase or a book-length document in a single prompt, the architecture is designed for agentic and long-horizon reasoning tasks, where previous models degraded in quality as context length increased. The Flash variant is optimised for speed and cost efficiency; the Pro variant for peak capability. DeepSeek's own benchmarks position V4-Pro as the strongest open-source model in coding and mathematics, with performance that trails only Google's closed-source Gemini 3.1-Pro in world knowledge. Against the current closed-source frontier -- OpenAI's GPT-5.4 and Gemini 3.1-Pro -- the company says V4-Pro falls only "marginally short," and offers a candid self-assessment: its "developmental trajectory" trails state-of-the-art frontier models by approximately 3 to 6 months. That framing is unusual in AI model releases, which typically emphasise comparisons where the new model leads. That DeepSeek is publishing a gap estimate rather than claiming parity reflects either unusual intellectual honesty or a strategic move to set conservative expectations ahead of independent evaluation. The chip story is the geopolitically charged element. DeepSeek worked with Chinese AI chipmakers Huawei and Cambricon to optimise V4 for their latest hardware, according to Reuters, citing The Information. The company did not give Nvidia or AMD early access for optimisation, a reversal of standard industry practice in which Western chipmakers are typically the first to receive new model weights for performance tuning. Running a frontier-class model at this scale on Huawei's Ascend chips, rather than on Nvidia H100s or H200s, would be a significant proof of concept for China's domestic AI hardware supply chain, which has been under US export restrictions since October 2022. The V4 release does not end that geopolitical constraint, but it tests its limits in a commercially visible way. The timing of the release, Friday, 24 April 2026, approximately one year after DeepSeek-R1, is deliberate. DeepSeek-R1's January 2025 launch erased roughly $600 billion from Nvidia's market capitalisation in a single day, as investors recalibrated their assumptions about the compute required to build frontier AI. Marc Andreessen called it "AI's Sputnik moment."
The practical claim was that a Chinese lab had matched OpenAI's best reasoning model while spending less than $6 million on compute, a figure some analysts disputed but which nonetheless reset global assumptions about the economics of frontier AI development. V4 arrives into a different market: one where OpenAI has released GPT-5.5 on the same day, where Anthropic is valued at $1 trillion on secondary markets, and where the US-China AI competition has become an explicit dimension of trade and technology policy. DeepSeek's second act is landing in a much more contested arena than its first. Both models are preview releases, not final production versions. Independent benchmarking has not yet been completed as of the time of writing, and DeepSeek's own benchmarks should be treated as preliminary until third-party evaluation confirms them. The same caveat applied to R1, whose claims were broadly validated by external testing within days of release. Whether V4 holds up to the same scrutiny will be clear within the week.
[17]
DeepSeek V4 arrives with near state-of-the-art intelligence at fraction of the cost of Opus 4.7, GPT-5.5
DeepSeek, the Chinese AI startup offshoot of quantitative analysis firm High-Flyer Capital Management, became a near-overnight sensation globally in January 2025 with the release of its open source R1 model that matched proprietary U.S. giants. It's been an epoch in AI since then, and while DeepSeek has released several updates to that model and its other V3 series, the international AI and business community has been largely waiting with bated breath for the follow-up to the R1 moment. Now it's arrived with last night's release of DeepSeek-V4, a 1.6-trillion-parameter Mixture-of-Experts (MoE) model available free under the commercially-friendly open source MIT License, which nears -- and on some benchmarks, surpasses -- the performance of the world's most advanced closed-source systems at approximately 1/6th the cost over the application programming interface (API). This release -- which DeepSeek AI researcher Deli Chen described on X as a "labor of love" 484 days after the launch of V3 -- is being hailed as the "second DeepSeek moment". As Chen noted in his post, "AGI belongs to everyone". It's available now on AI code sharing community Hugging Face and through DeepSeek's API.
Frontier-class AI gets pushed into a lower price band
The most immediate impact of the DeepSeek-V4 launch is economic. DeepSeek's pricing table shows the company is not pricing its new Pro model at near-zero levels, but it is still pushing high-end model access into a far lower cost tier than the leading U.S. frontier models. DeepSeek-V4-Pro is priced through its API at $1.74 USD per 1 million input tokens on a cache miss and $3.48 per million output tokens. That puts a simple one-million-input, one-million-output comparison at $5.22. With cached input, the input price drops to $0.145 per million tokens, bringing that same blended comparison down to $3.625. That is dramatically cheaper than the current premium pricing from OpenAI and Anthropic. GPT-5.5 is priced at $5.00 per million input tokens and $30.00 per million output tokens, for a combined $35.00 in the same simple comparison. Claude Opus 4.7 is priced at $5.00 input and $25.00 output, for a combined $30.00. On standard, cache-miss pricing, DeepSeek-V4-Pro comes in at roughly one-seventh the cost of GPT-5.5 and about one-sixth the cost of Claude Opus 4.7. With cached input, the gap widens: DeepSeek-V4-Pro costs about one-tenth as much as GPT-5.5 and about one-eighth as much as Claude Opus 4.7. The more extreme near-zero story belongs to DeepSeek-V4-Flash, not the Pro model. Flash is priced at $0.14 per million input tokens on a cache miss and $0.28 per million output tokens, for a combined $0.42. With cached input, that drops to $0.308. In that case, DeepSeek's cheaper model is more than 98% below GPT-5.5 and Claude Opus 4.7 in a simple input-plus-output comparison, or nearly 1/100th the cost -- though the performance dips significantly. DeepSeek is compressing advanced model economics into a much lower band, forcing developers and enterprises to revisit the cost-benefit calculation around premium closed models. For companies running large inference workloads, that price gap can change what is worth automating. Tasks that look too expensive on GPT-5.5 or Claude Opus 4.7 may become economically viable on DeepSeek-V4-Pro, and even more so on DeepSeek-V4-Flash. The launch does not make intelligence free, but it does make the market harder for premium providers to defend on performance alone.
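The blended-cost comparisons above are simple to reproduce. The short sketch below uses the per-million-token prices quoted in this article (taken at face value rather than independently verified) and computes the cost of a workload of one million input plus one million output tokens for each model, expressed as a share of the GPT-5.5 figure.

# Reproduces the simple 1M-input + 1M-output cost comparison described above,
# using the per-million-token prices quoted in the article (not verified here).
prices = {  # model: (input $, output $) per million tokens
    "DeepSeek-V4-Pro (cache miss)": (1.74, 3.48),
    "DeepSeek-V4-Pro (cached input)": (0.145, 3.48),
    "DeepSeek-V4-Flash (cache miss)": (0.14, 0.28),
    "GPT-5.5": (5.00, 30.00),
    "Claude Opus 4.7": (5.00, 25.00),
}

baseline = sum(prices["GPT-5.5"])  # $35.00 for the same workload
for model, (cost_in, cost_out) in prices.items():
    blended = cost_in + cost_out   # $ for 1M input + 1M output tokens
    print(f"{model:32s} ${blended:6.3f}  ({blended / baseline:6.1%} of GPT-5.5)")

Running it gives $5.22 (about one-seventh of GPT-5.5), $3.625 (about one-tenth), and $0.42 (a little over 1%) for the three DeepSeek rows, matching the ratios cited above.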
Benchmarking the frontier: DeepSeek-V4-Pro gets close, but GPT-5.5 and Opus 4.7 still lead on most shared tests
DeepSeek-V4-Pro-Max is best understood as a major open-weight leap, not a clean across-the-board defeat of the newest closed frontier systems. The model's strongest benchmark claims come from DeepSeek's own comparison tables, where it is shown against GPT-5.4 xHigh, Claude Opus 4.6 Max and Gemini 3.1 Pro High and bests them on several tests, including Codeforces and Apex Shortlist. But that is not the same as a head-to-head against OpenAI's newer GPT-5.5 or Anthropic's newer Claude Opus 4.7. Looking only at DeepSeek V4 versus the latest proprietary models, the picture is more restrained. On this shared set, GPT-5.5 and Claude Opus 4.7 still lead most categories. DeepSeek-V4-Pro-Max's best showing is on BrowseComp, the benchmark measuring agentic AI web browsing prowess (especially highly containerized information), where it scores 83.4%, narrowly behind GPT-5.5 at 84.4% and ahead of Claude Opus 4.7 at 79.3%. On Terminal-Bench 2.0, DeepSeek scores 67.9%, close to Claude Opus 4.7's 69.4%, but far behind GPT-5.5's 82.7%. The shared academic-reasoning results favor the closed models: On GPQA Diamond, DeepSeek-V4-Pro-Max scores 90.1%, while GPT-5.5 reaches 93.6% and Claude Opus 4.7 reaches 94.2%. On Humanity's Last Exam without tools, DeepSeek scores 37.7%, behind GPT-5.5 at 41.4%, GPT-5.5 Pro at 43.1% and Claude Opus 4.7 at 46.9%. With tools enabled, DeepSeek rises to 48.2%, but still trails GPT-5.5 at 52.2%, GPT-5.5 Pro at 57.2% and Claude Opus 4.7 at 54.7%. The agentic and software-engineering results are more mixed, but they still show DeepSeek-V4-Pro-Max trailing GPT-5.5 and Opus 4.7. On Terminal-Bench 2.0, DeepSeek's 67.9% is competitive with Claude Opus 4.7's 69.4%, but GPT-5.5 is much higher at 82.7%. On SWE-Bench Pro, DeepSeek's 55.4% trails GPT-5.5 at 58.6% and Claude Opus 4.7 at 64.3%. On MCP Atlas, DeepSeek's 73.6% is slightly behind GPT-5.5 at 75.3% and Claude Opus 4.7 at 79.1%. BrowseComp is the standout: DeepSeek's 83.4% beats Claude Opus 4.7's 79.3% and nearly matches GPT-5.5's 84.4%, though GPT-5.5 Pro's 90.1% remains well ahead. So ultimately, DeepSeek-V4-Pro-Max does not appear to dethrone GPT-5.5 or Claude Opus 4.7 on the benchmarks that can be directly compared across the companies' published tables. But it gets close enough on several of them -- especially BrowseComp, Terminal-Bench 2.0 and MCP Atlas -- that its much lower API pricing becomes the headline. In practical terms, DeepSeek does not need to win every leaderboard row to matter. If it can deliver near-frontier performance on many enterprise-relevant agent and reasoning tasks at roughly one-sixth to one-seventh the standard API cost of GPT-5.5 or Claude Opus 4.7, it still forces a major rethink of the economics of advanced AI deployment. DeepSeek-V4-Pro-Max is clearly the strongest open-weight model in the field right now, and it is unusually close to frontier closed systems on several practical benchmarks. While GPT-5.5 and Claude Opus 4.7 still retain the lead in most direct head-to-head comparisons across the company's benchmark charts, DeepSeek V4 Pro gets close while being dramatically cheaper and openly available.
A big jump from DeepSeek V3.2
To understand the magnitude of this release, one must look at the performance gains of the base models. DeepSeek-V4-Pro-Base represents a significant advancement over the previous generation, DeepSeek-V3.2-Base.
In World Knowledge, V4-Pro-Base achieved 90.1 on MMLU (5-shot) compared to V3.2's 87.8, and a massive jump on MMLU-Pro from 65.5 to 73.5. The improvement in high-level reasoning and verified facts is even more pronounced: on SuperGPQA, V4-Pro-Base reached 53.9 compared to V3.2's 45.0, and on the FACTS Parametric benchmark, it more than doubled its predecessor's performance, jumping from 27.1 to 62.6. SimpleQA Verified scores also saw a dramatic rise from 28.3 to 55.2. The Long Context capabilities have also been refined. On LongBench-V2, V4-Pro-Base scored 51.5, significantly outpacing the 40.2 achieved by V3.2-Base. In Code and Math, V4-Pro-Base reached 76.8 on HumanEval (Pass@1), up from 62.8 on V3.2-Base. These numbers underscore that DeepSeek has not just optimized for inference cost, but has fundamentally improved the intelligence density of its base architecture. The efficiency story is equally compelling for the Flash variant. DeepSeek-V4-Flash-Base, despite utilizing a substantially smaller number of parameters, outperforms the larger V3.2-Base across a wide range of benchmarks, particularly in long-context scenarios.

A new information 'traffic controller,' Manifold-Constrained Hyper-Connections (mHC)

DeepSeek's ability to offer these prices and performance figures is rooted in radical architectural innovations detailed in its technical report, also released today, "Towards Highly Efficient Million-Token Context Intelligence." The standout technical achievement of V4 is its native one-million-token context window. Historically, maintaining such a large context required massive memory (the key-value or KV cache). DeepSeek solved this by introducing a Hybrid Attention Architecture that combines Compressed Sparse Attention (CSA) to reduce initial token dimensionality and Heavily Compressed Attention (HCA) to aggressively compress the memory footprint for long-range dependencies. In practice, the V4-Pro model requires only 10% of the KV cache and 27% of the single-token inference FLOPs compared to its predecessor, DeepSeek-V3.2, even when operating at a 1M token context. To stabilize a network of 1.6 trillion parameters, DeepSeek moved beyond traditional residual connections. The company's researchers incorporated Manifold-Constrained Hyper-Connections (mHC) to strengthen signal propagation across layers while preserving the model's expressivity. mHC allows an AI to have a much wider flow of information (so it can learn more complex things) without the risk of the model becoming unstable or "breaking" during its training. It's like giving a city a 10-lane highway but adding a perfect AI traffic controller to ensure no one ever hits the brakes. This is paired with the Muon optimizer, which allowed the team to achieve faster convergence and greater training stability during pre-training on more than 32T diverse and high-quality tokens. This pre-training data was refined to remove auto-generated content, mitigating the risk of model collapse and prioritizing unique, high-value academic content. The model's 1.6T parameters utilize a Mixture-of-Experts (MoE) design where only 49B parameters are activated per token, further driving down compute requirements.

Training the mixture-of-experts (MoE) to work as a whole

DeepSeek-V4 was not simply trained; it was "cultivated" through a unique two-stage paradigm. The model's reasoning capabilities are further segmented into three increasing "effort" modes.
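DeepSeek's report is a paper, not code, but the activation arithmetic behind that 1.6T-total, 49B-active split is easy to illustrate. The toy Python sketch below was written for this article, with made-up dimensions and a plain top-k softmax gate rather than DeepSeek's actual router; it shows only the basic MoE pattern the section describes: a small gating network scores the experts for each token, and just the highest-scoring few are evaluated, so most parameters stay idle on any given forward pass.

```python
import numpy as np

# Toy mixture-of-experts layer: a gate scores all experts per token,
# but only the top-k experts are actually evaluated.
rng = np.random.default_rng(0)

N_EXPERTS, TOP_K, D_MODEL = 8, 2, 16   # tiny illustrative sizes, not V4's real config
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.02 for _ in range(N_EXPERTS)]
gate_w = rng.standard_normal((D_MODEL, N_EXPERTS)) * 0.02

def moe_layer(token: np.ndarray) -> np.ndarray:
    """Route one token through its top-k experts and mix the results."""
    logits = token @ gate_w                          # score every expert
    top = np.argsort(logits)[-TOP_K:]                # keep only the best k
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()  # renormalize
    # Only TOP_K of N_EXPERTS expert matrices are touched for this token.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(D_MODEL)
out = moe_layer(token)
print(out.shape, f"active experts per token: {TOP_K}/{N_EXPERTS}")
```

Scaled up to hundreds of experts and trillion-parameter totals, the same pattern is what lets a model of V4-Pro's size keep its per-token compute closer to that of a dense model around the size of its 49B active parameters.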
Breaking the Nvidia GPU stranglehold with local Chinese Huawei Ascend NPUs

While the model weights are the headline, the software stack released alongside them is arguably more important for the future of "Sovereign AI." Analyst Rui Ma highlighted a single sentence from the release as the most critical: DeepSeek validated its fine-grained Expert Parallelism (EP) scheme on Huawei Ascend NPUs (neural processing units). By achieving a 1.50x to 1.73x speedup on non-Nvidia hardware, DeepSeek has provided a blueprint for high-performance AI deployment that is resilient to Western GPU supply-chain pressures and export controls. However, it's important to note that DeepSeek still claims it used officially licensed, legal Nvidia GPUs for DeepSeek V4's training, in addition to the Huawei NPUs. DeepSeek has also open-sourced the MegaMoE mega-kernel as a component of its DeepGEMM library. This CUDA-based implementation delivers up to a 1.96x speedup for latency-sensitive tasks like RL rollouts and high-speed agent serving. This move ensures that developers can run these massive models with extreme efficiency on existing hardware, further cementing DeepSeek's role as the primary driver of open-source AI infrastructure. The technical report emphasizes that these optimizations are crucial for supporting a standard 1M context across all official services.

Licensing and local deployment

DeepSeek-V4 is released under the MIT License, the most permissive framework in the industry. This allows developers to use, copy, modify, and distribute the weights for commercial purposes without royalties -- a stark contrast to the "restricted" open-weight licenses favored by other companies. For local deployment, DeepSeek recommends setting sampling parameters to temperature = 1.0 and top_p = 1.0. For those utilizing the "Think Max" reasoning mode, the team suggests setting the context window to at least 384K tokens to avoid truncating the model's internal reasoning chains. The release includes a dedicated encoding folder with Python scripts demonstrating how to encode messages in OpenAI-compatible format and parse the model's output, including reasoning content (a brief client sketch along these lines appears at the end of this article). DeepSeek-V4 is also seamlessly integrated with leading AI agents like Claude Code, OpenClaw, and OpenCode. This native integration underscores its role as a bedrock for developer tools, providing an open-source alternative to the proprietary ecosystems of major cloud providers.

Community reactions and what comes next

The community reaction has been one of shock and validation. Hugging Face officially welcomed the "whale" back, stating that the era of cost-effective 1M context length has arrived. Industry experts noted that the "second DeepSeek moment" has effectively reset the developmental trajectory of the entire field, placing massive pressure on closed-source providers like OpenAI and Anthropic to justify their premiums. AI evaluation firm Vals AI noted that DeepSeek-V4 is now the "#1 open-weight model on our Vibe Code Benchmark, and it's not close". DeepSeek is moving quickly to retire its older architectures. The company announced that the legacy deepseek-chat and deepseek-reasoner endpoints will be fully retired on July 24, 2026. All traffic is currently being rerouted to the V4-Flash architecture, signifying a total transition to the million-token standard. DeepSeek-V4 is more than just a new model; it is a challenge to the status quo.
By proving that architectural innovation can substitute for raw compute maximalism, DeepSeek has made the highest levels of AI intelligence accessible to the global developer community at a far lower cost -- something that could benefit the globe, even at a time when lawmakers and leaders in Washington, D.C. are raising concerns about Chinese labs "distilling" from U.S. proprietary giants to train open source models, and voicing fears that such open source models, or jailbroken proprietary ones, could be used to create weapons and commit acts of terror. The truth is, while all of these are potential risks -- as they were and have been with prior technologies that broadened information access, like search and the internet itself -- the benefits seem to far outweigh them, and DeepSeek's quest to keep frontier AI models open is of benefit to the entire planet of potential AI users, especially enterprises looking to adopt the cutting edge at the lowest possible cost.
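Picking up the local-deployment guidance above: because V4 speaks the OpenAI-compatible message format, the same client code works whether it points at DeepSeek's hosted API or at a locally served copy of the weights. The sketch below is illustrative only -- the base URL, API key placeholder and model id are assumptions to confirm against DeepSeek's current documentation -- but the temperature and top_p values are the settings DeepSeek itself recommends.

```python
from openai import OpenAI  # pip install openai

# Assumed values -- check DeepSeek's own docs for the current base URL and model ids.
client = OpenAI(api_key="YOUR_DEEPSEEK_API_KEY", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-chat",  # per the article, this endpoint currently routes to V4-Flash
    messages=[
        {"role": "system", "content": "You are a concise coding assistant."},
        {"role": "user", "content": "Summarize what a 1M-token context window is useful for."},
    ],
    temperature=1.0,  # sampling settings DeepSeek recommends for V4
    top_p=1.0,
)
print(response.choices[0].message.content)
```

For a locally hosted copy of the open weights, the same call typically works against any OpenAI-compatible server simply by swapping the base_url and model name.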
[18]
DeepSeek V4 is out, touting some disruptive wins over Gemini, ChatGPT, and Claude
An open-source model that beats GPT-5.4 on Codeforces and costs a fraction of Claude isn't just a news story; it's a pricing ultimatum to the entire AI industry. China's DeepSeek has a habit of showing up, uninvited, to Silicon Valley's AI party, and this time, it has done so with the long-awaited V4 preview. The Hangzhou-based company has released its latest AI model, which beats popular American models in certain areas.

What exactly did DeepSeek release?

DeepSeek has launched two new models: V4-Pro (Expert mode) and V4-Flash (Instant mode). While the former is a massive 1.6 trillion parameter model, the latter is at a more manageable 284 billion parameters. However, both of them have a one-million-token context window. What's even more important is that both models are open source, meaning they're available to download from Hugging Face and run locally on your hardware. However, V4-Pro's sheer scale means that you'll need a considerable amount of VRAM to run it locally.

Where does V4-Pro beat the competition?

One of the most interesting parts of the announcement is the comparison with popular AI models like Gemini, ChatGPT, and Claude. For instance, V4-Pro punches hard in coding, scoring 3,206 on Codeforces ratings, clearing GPT-5.4's 3,168 and Gemini 3.1's 3,052. This makes it the strongest open model for competitive programming tasks. On LiveCodeBench, V4-Pro posts 93.5, ahead of Claude Opus 4.6's 88.8 and Gemini 3.1's 91.7, and likewise, for agentic tasks, it scores 51.8 on Toolathlon, beating both Claude (47.2) and Gemini (48.8). The faster and more efficient V4-Flash, meanwhile, matches V4-Pro on simple agent tasks at a fraction of the compute cost. There are several areas where DeepSeek's new model runs behind the competition, though. For instance, Claude's Opus 4.6 leads on long-context retrieval. It scores 92.9 on MRCR 1M versus V4-Pro's 83.5. GPT-5.4 still tops Terminal Bench 2.0 at 75.1 against V4-Pro's 67.9. Where DeepSeek truly disrupts the competition is the pricing. The V4-Pro costs $3.48 per million output tokens, which, compared to OpenAI's $30 and Anthropic's $25 for equivalent workloads, might sound much more attractive to potential customers. That gap is enormous for everyday developers building AI-powered apps.
[19]
DeepSeek unveils its newest model at rock-bottom prices and with 'full support' from Huawei chips | Fortune
Chinese AI company DeepSeek has unveiled its long-awaited V4 model. On Friday, the Hangzhou-based startup released its newest large language model in a preview capacity. The release comes over a year after it shook markets and reset the entire conversation around AI with its V3 and R1 models, which showed that frontier-level performance could be achieved through an open-source model trained on a relatively tiny budget. V4 may not have as large an effect on markets as its predecessor did. So far, the biggest impacts have been on the shares of other Chinese companies. Shares in Semiconductor Manufacturing International Corporation jumped 10% in Hong Kong trading. That Chinese chipmaker makes Huawei's Ascend AI processors, which DeepSeek said it used to train its new model. (Huawei has been trying to catch up to the performance of Nvidia's GPUs, which most companies use for training AI models.) Meanwhile, shares in MiniMax and Knowledge Atlas, two of DeepSeek's competitors, sank by more than 9%. Yet the increasingly narrow performance gap between DeepSeek and leading U.S. models, as well as its rock-bottom prices, will raise questions about the competitive moat that surrounds leading U.S. labs like OpenAI and Anthropic -- and the constraints that still hold back China's AI development. DeepSeek released two versions on Friday: "DeepSeek-V4-Pro," which it said had "performance rivaling the world's top closed-source models," and "DeepSeek-V4-Flash," a smaller and cheaper variant. The Pro version has 1.6 trillion parameters, the tunable knobs within a neural network that serve as a measure of a model's size, and a context length of 1 million tokens. (A token is the fundamental building block used by an LLM to understand and generate text; a million tokens is roughly equivalent to 750,000 words.) The startup claims its model beats all other open-source models in agentic coding and reasoning. DeepSeek also pointed to benchmarks that showed V4 performing favorably against Anthropic's Claude Opus 4.6, OpenAI's GPT-5.4, and Google's Gemini 3.1-Pro. The startup's tech report says that V4 "falls marginally short of GPT-5.4 and Gemini 3.1-Pro, suggesting a developmental trajectory that trails state-of-the-art frontier models by approximately 3 to 6 months." Just as important is how much DeepSeek is charging for access to its model. DeepSeek's V4-Pro model will cost $3.48 for one million tokens of output; by comparison, OpenAI and Anthropic charge $30 and $25 respectively for the same amount of work. Even Kimi, from fellow Chinese AI startup Moonshot AI, costs $4 for a million tokens of output. DeepSeek's V4-Flash costs even less, at just $0.28 for a million tokens. That pricing puts DeepSeek at odds with a trend across the wider AI sector. Both OpenAI and Anthropic have hiked prices and imposed rate limits to manage surging demand. Chinese developers have followed suit, also increasing prices and removing subscriptions that offered unlimited usage. DeepSeek's prices could get even cheaper: it expects to lower V4-Pro prices later in the year as Huawei scales up production of its new Ascend 950 AI processors. DeepSeek first captured global attention in December 2024, when it released its V3 large language model. The startup claimed it trained V3 on just $5.6 million worth of processors, which AI researcher Andrej Karpathy called a "joke of a budget." DeepSeek later released R1, a reasoning model that matched the equivalent offering from OpenAI. That sparked a massive sell-off in U.S.
tech stocks, as investors repriced how much money was needed to train and run AI models. At one point, tech stocks lost $1 trillion in value. While markets eventually recovered, DeepSeek's decision to release its model on an open-source basis ended up being more significant. The startup's release built on momentum started by Alibaba's Qwen to inspire several other Chinese labs to release their own open-source models. Even OpenAI was pushed to release its own open-source model, gpt-oss. DeepSeek's open-source approach allowed developers to download its model for free, tweak it for their own purposes, and run it on local hardware. That helped win DeepSeek, and later labs that also went open-source, support from resource-strapped developers, particularly those outside of the U.S. and Europe. V3 and R1 also sparked a revival of investor interest in China's beaten-down tech sector. In recent months, several AI startups have surged in public markets as investors try to build exposure to China's AI ecosystem. In January, MiniMax and Knowledge Atlas listed in Hong Kong, as some of the first AI labs to tap public markets: The two startups are up 370% and 700% respectively from their offer price. This also means that DeepSeek is releasing V4 into a new competitive landscape. Alibaba, Moonshot AI, MiniMax and Knowledge Atlas have all released high-performing open-source models this year. Both The Information and the Financial Times report that DeepSeek is trying to raise money from Tencent and Alibaba in a funding round that would value the lab at $20 billion. DeepSeek, which is owned by High-Flyer, a Chinese hedge fund, isn't short of cash. The reason, according to the FT, is talent: DeepSeek needs to raise money to convince its AI researchers to stick around, and not get poached by other labs with larger valuations. The AI frontier is also about to be pushed further out. This week, OpenAI released GPT 5.5, the latest version of its flagship LLM, as well as an updated version of its image generation model. Anthropic is also testing its latest Mythos model with a select group of companies; the Claude developer claims these restrictions are necessary to ensure the model can't be misused upon a more public release. DeepSeek worked closely with Huawei to ensure its new model runs on the Chinese tech giant's Ascend AI processors. On Friday, Huawei announced its Ascend processors would offer "full support" for DeepSeek's models. Processing power is in short supply, as tech companies snap up GPUs, CPUs and memory chips in order to build more data centers and AI processors. Chinese developers have an additional problem: U.S. export controls bar them from buying the most advanced processors made by companies like Nvidia. Ironically, these export controls may have helped Chinese AI startups, including DeepSeek, learn how to operate in a world of scarce processing power. Because Chinese developers have been forced to train and run models with limited resources, they had to come up with ways to make their models much more efficient. This power-efficiency has, in turn, allowed the Chinese developers to reduce prices substantially. U.S. pressure also forced China's semiconductor sector to accelerate plans to manufacture chips domestically. DeepSeek's ability to train and run its model on Huawei chips could represent another step away from U.S. chipmakers like Nvidia and AMD. 
"The best AI researchers in the world, because they are limited in compute, also come up with extremely smart algorithms," Nvidia CEO Jensen Huang said on the Dwarkesh Podcast last week, adding that "the day that DeepSeek comes out on Huawei first, that is a horrible outcome for [the U.S.]" U.S. officials and developers have long claimed that DeepSeek's success is based on Nvidia processors smuggled into China, or which the Chinese company has managed to access illegally in other countries in Asia, such as Malaysia through front companies, as well as illicit distillation of leading U.S. models. Distillation is the process of training a smaller, more efficient model on the outputs of a larger one. On Thursday, Michael Kratsios, science advisor to U.S. President Donald Trump, alleged that Chinese AI developers were engaged in "industrial-scale campaigns" to copy U.S. technology. "There is nothing innovative about systematically extracting and copying the innovations of American industry," Kratsios wrote in a memo, and promised to work closely with the private sector to protect them against these attacks. OpenAI and Anthropic have also accused Chinese AI developers, including DeepSeek, of conducting "illicit" distillation attacks. On Friday, China's foreign ministry called Kratsios's claims "groundless" and "a smear against the achievements of China's AI industry."
[20]
DeepSeek previews V4 models with Huawei integration, signaling shift in China's AI stack
Chinese artificial intelligence startup DeepSeek has released preview versions of its latest models -- DeepSeek-V4-Pro and V4-Flash -- marking a closer integration with domestic chipmaker Huawei and intensifying competition with US developers, including OpenAI and Google. The launch comes as artificial intelligence becomes an increasingly contested front in the US-China technology rivalry, amid heightened scrutiny over intellectual property and access to advanced semiconductors.

Performance positioning narrows gap with leading models

DeepSeek said the V4-Pro model outperforms other open-source systems in mathematics, coding and reasoning benchmarks, and trails only closed-source models such as Google's Gemini 3.1-Pro in world knowledge. The company added that performance is only "marginally short" of leading proprietary models, suggesting a gap of several months behind frontier systems, according to its disclosures. The V4 series includes a higher-performance Pro version and a lower-cost Flash variant, which offers similar reasoning capabilities with faster response times and improved cost efficiency. Both models support context windows of up to one million tokens. DeepSeek said the V4-Pro model is built on a mixture-of-experts architecture with 1.6 trillion total parameters and 49 billion active parameters, while the Flash version adopts a smaller configuration with 284 billion total parameters and 13 billion active parameters for improved efficiency. The company added that throughput for the Pro version is currently limited, reflecting constraints in high-end computing supply. The company also highlighted improvements in so-called agentic capabilities, particularly in coding-related tasks, and said the models adopt a sparse attention mechanism designed to reduce computational and memory requirements. Detailed technical specifications have not been independently verified.

Huawei collaboration highlights shift in AI supply chain

The V4 models are designed to run on Huawei's Ascend AI chip platform, reflecting closer collaboration between the two companies and a shift away from DeepSeek's earlier reliance on Nvidia chips, according to company statements and media reports. Huawei said its Ascend systems support the full DeepSeek V4 model series, underscoring its efforts to build a domestic AI computing ecosystem. The shift comes as US export restrictions on advanced semiconductors, introduced in 2022, continue to limit China's access to high-end chips, accelerating the development of local alternatives.

Pricing and open-source approach add pressure on market

DeepSeek continues to position its models as open-source, allowing developers to use and modify the code -- in contrast to the closed systems offered by major US firms. The company has emphasized cost efficiency as a key differentiator. Media reports indicate pricing for the V4 models is significantly lower than that of comparable proprietary systems, though final commercial terms have not been disclosed. DeepSeek added that the models are compatible with widely used interfaces from OpenAI and Anthropic, lowering barriers for developers looking to adopt the system. Market reports indicate the release weighed on shares of some domestic AI firms, reflecting concerns over rising competitive pressure.

Launch coincides with rising geopolitical scrutiny

The release follows heightened scrutiny from Washington.
The White House has accused China of conducting "industrial-scale" theft of US artificial intelligence intellectual property, according to official statements. DeepSeek has been cited in broader concerns over potential export control violations and model development practices. The company has acknowledged using Nvidia hardware in earlier systems, but has not confirmed whether those chips were subject to restrictions. Chinese authorities have rejected the allegations. The Chinese Embassy in Washington said the claims are "baseless" and reiterated that Beijing places importance on intellectual property protection. Artificial intelligence development remains a key battleground in global technology competition. While US firms continue to lead in advanced model development, Chinese companies have narrowed the performance gap in recent years, according to industry analyses.
[21]
DeepSeek V4 Is Here -- Its Pro Version Costs 98% Less Than GPT 5.5 Pro - Decrypt
DeepSeek is back, and it showed up a few hours after OpenAI dropped GPT-5.5. Coincidence? Maybe. But if you're a Chinese AI lab that the U.S. government has been trying to slow down with chip export bans for the past three years, your sense of timing gets pretty sharp. The Hangzhou-based lab released preview versions of DeepSeek-V4-Pro and DeepSeek-V4-Flash today, both open-weight, both with one-million-token context windows. That means you can work with a context roughly the size of the Lord of the Rings trilogy before running out of room. Both are also priced well below anything comparable in the West, and both are free for anyone capable of running them locally. DeepSeek's last major disruption -- R1 in January 2025 -- wiped $600 billion from Nvidia's market cap in a single day as investors questioned whether American companies really needed such huge investments to produce results that a small Chinese lab achieved at a fraction of the cost. V4 is a different kind of move: quieter, more technical, and more focused on efficiency for anyone actually building with AI. Of the two new models, DeepSeek's V4-Pro is the big one, with 1.6 trillion total parameters. To put that in perspective, parameters are the internal "settings" or "brain cells" that a model uses to store knowledge and recognize patterns -- the more parameters a model has, the more complex information it can theoretically hold. That makes it the biggest open-source model in the LLM market to date. The size may sound ridiculous until you learn it only activates 49 billion of them per inference pass. This is the Mixture-of-Experts trick DeepSeek has refined since V3: the full model sits there, but only the relevant slice of it wakes up for any given request. More knowledge, same compute bill. "DeepSeek-V4-Pro-Max, the maximum reasoning effort mode of DeepSeek-V4-Pro, significantly advances the knowledge capabilities of open-source models, firmly establishing itself as the best open-source model available today," DeepSeek wrote in the model's official card on Hugging Face. "It achieves top-tier performance in coding benchmarks and significantly bridges the gap with leading closed-source models on reasoning and agentic tasks." V4-Flash is the practical one: 284 billion total parameters, 13 billion active. It's designed to be faster, cheaper, and, according to DeepSeek's own benchmarks, "achieves comparable reasoning performance to the Pro version when given a larger thinking budget." Both support one million tokens of context. That's roughly 750,000 words -- roughly the entire "Lord of the Rings" trilogy plus change. And that's as a standard feature, not a premium tier. Here's the technical part for nerds or those interested in the magic powering the model. DeepSeek doesn't hide its secrets, and everything is available for free -- the full paper is available on GitHub. Standard AI attention -- the mechanism that lets a model understand relationships between words -- has a brutal scaling problem. Every time you double the context length, the compute cost roughly quadruples. So running a model on a million tokens isn't just twice as expensive as 500,000 tokens. It's four times as expensive. This is why long context has historically been a checkbox labs add and then silently throttle behind rate limits. DeepSeek invented two new attention types to get around this. The first, Compressed Sparse Attention, works in two steps. It first compresses groups of tokens -- say, every 4 tokens -- into a single entry.
Then, instead of attending to all of those compressed entries, it uses a "Lightning Indexer" to pick only the most relevant results for any given query. Your model goes from attending to a million tokens to attending to a much smaller set of the most important chunks, kind of like a librarian who doesn't read every book but knows exactly which shelf to check. The second, Heavily Compressed Attention, is more aggressive. It collapses every 128 tokens into a single entry -- no sparse selection, just brutal compression. You lose fine-grained detail, but you get an extremely cheap global view. The two attention types run in alternating layers, so the model gets both the detail and the overview. The result, from the technical paper: at one million tokens, V4-Pro uses 27% of the compute its predecessor (V3.2) needed. KV cache -- the memory the model needs to track context -- drops to just 10% of V3.2. V4-Flash pushes that further: 10% of compute, 7% of memory. That is what lets DeepSeek offer a much cheaper price per token than its competitors while providing comparable results. To put that in dollar terms: GPT-5.5 launched yesterday at $5 input and $30 output per million tokens, with GPT-5.5 Pro priced at $30 per million input tokens and $180 per million output tokens. DeepSeek V4-Pro is $1.74 input and $3.48 output. V4-Flash is $0.14 input and $0.28 output. Cline CEO Saoud Rizwan pointed out that if Uber had used DeepSeek instead of Claude, its 2026 AI budget -- reportedly enough for four months of usage -- would have lasted seven years. DeepSeek does something unusual in its technical report: it publishes the gaps. Most model releases cherry-pick the benchmarks where they win. DeepSeek ran the full comparison against GPT-5.4 and Gemini-3.1-Pro, found that V4-Pro's reasoning lags behind those models by about three to six months, and printed it anyway. Where V4-Pro-Max actually wins: Codeforces, a competitive programming benchmark rated like human chess. V4-Pro scored 3,206, placing it around 23rd among actual human contest participants. On Apex Shortlist, a curated set of hard math and STEM problems, it hit a 90.2% pass rate versus Opus 4.6's 85.9% and GPT-5.4's 78.1%. On SWE-Verified, which measures whether a model can resolve real GitHub issues pulled from actual open-source repositories, it scored 80.6% -- matching Claude Opus 4.6. Where it trails: multitasking benchmark MMLU-Pro (Gemini-3.1-Pro at 91.0% vs V4-Pro at 87.5%), expert knowledge benchmark GPQA Diamond (Gemini 94.3 vs V4-Pro 90.1), and Humanity's Last Exam, a graduate-level benchmark where Gemini-3.1-Pro's 44.4% still beats V4-Pro's 37.7%. On long context specifically, V4-Pro leads open-source models and beats Gemini-3.1-Pro on the CorpusQA benchmark (a test simulating real document analysis at one million tokens), but loses to Claude Opus 4.6 on MRCR -- a test measuring how well a model retrieves specific needles buried deep in a very long haystack. The agentic stuff is where this release gets interesting for developers actually shipping products. V4-Pro can run in Claude Code, OpenCode, and other AI coding tools. According to DeepSeek's internal survey of 85 developers who used V4-Pro as their primary coding agent, 52% said it was ready to be their default model, 39% leaned toward yes, and fewer than 9% said no. Internal employees said it outperforms Claude Sonnet and approaches Claude Opus 4.5 on agentic coding tasks.
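To make the two-step compression idea above concrete, here is a toy numpy sketch written for this article: it squashes blocks of keys into summaries, lets a cheap scorer (standing in for the "Lightning Indexer") pick the most relevant blocks, and then runs ordinary attention only over the tokens inside those blocks. The block size, dimensions and mean-pooling are illustrative choices, not DeepSeek's actual implementation.

```python
import numpy as np

# Toy two-step sparse attention: compress blocks of keys, score the blocks
# cheaply, then attend fully only within the selected blocks.
rng = np.random.default_rng(0)
SEQ_LEN, BLOCK, D, TOP_BLOCKS = 64, 4, 32, 4

keys = rng.standard_normal((SEQ_LEN, D))
values = rng.standard_normal((SEQ_LEN, D))
query = rng.standard_normal(D)

# Step 1: compress every BLOCK consecutive keys into one summary vector.
block_keys = keys.reshape(SEQ_LEN // BLOCK, BLOCK, D).mean(axis=1)

# Step 2: a cheap "indexer" scores the blocks and keeps only the top few.
block_scores = block_keys @ query
picked = np.argsort(block_scores)[-TOP_BLOCKS:]

# Full attention now runs only over tokens inside the selected blocks.
idx = np.concatenate([np.arange(b * BLOCK, (b + 1) * BLOCK) for b in picked])
scores = keys[idx] @ query / np.sqrt(D)
weights = np.exp(scores - scores.max())
weights /= weights.sum()
output = weights @ values[idx]

print(f"attended to {len(idx)} of {SEQ_LEN} tokens", output.shape)
```

Even in this toy, attention touches only 16 of 64 tokens; at a million tokens, with far larger blocks and a learned indexer, that same pruning is where the reported compute and KV-cache savings come from.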
Artificial Analysis, which runs independent evaluations of AI models on real-world tasks, ranked V4-Pro first among all open-weight models on GDPval-AA -- a benchmark testing economically valuable knowledge work across finance, legal, and research tasks, scored via Elo. V4-Pro-Max scored 1,554 Elo, ahead of GLM-5.1 (1,535) and MiniMax's M2.7 (1,514). For reference, Claude Opus 4.6 scores 1,619 on the same benchmark -- still ahead, but the gap is closing. DeepSeek's V4 also introduces something called "interleaved thinking." In previous models, if you were running an agent that made multiple tool calls -- say, it searched the web, then ran some code, then searched again -- the model's reasoning context got flushed between rounds. Each new step, the model had to rebuild its mental model from scratch. V4 retains the full chain of thought across tool calls, so a 20-step agent workflow doesn't suffer from amnesia halfway through (a rough sketch of the pattern appears at the end of this article). This matters more than it sounds for anyone running complex automated pipelines. The U.S. has been restricting high-end Nvidia chip exports to China since 2022. The stated goal was to slow Chinese AI development, but the chip ban didn't stop DeepSeek; instead, it pushed the lab to invent a more efficient architecture and build out a domestic hardware supply. DeepSeek didn't release V4 in a vacuum -- the AI space has been flush with activity as of late: Anthropic shipped Claude Opus 4.7 on April 16 -- a model Decrypt tested and found strong on coding and reasoning, with notably high token usage. The day before that, Anthropic also revealed it was sitting on Claude Mythos, a cybersecurity model it says it can't release publicly because it's too good at autonomous network attacks. Xiaomi dropped MiMo V2.5 Pro on April 22, going full multimodal -- image, audio, video. It costs $1 input and $3 output per million tokens and matches Opus 4.6 on most coding benchmarks. Three months ago, nobody was talking about Xiaomi as a frontier AI company. Now it's shipping competitive models faster than most Western labs. OpenAI's GPT-5.5 landed yesterday with costs spiking up to $180 per million tokens of output in the Pro version. It beats V4-Pro on Terminal Bench 2.0 (82.7% vs 70.0%), which tests complex command-line agent workflows. But it costs considerably more than V4-Pro for equivalent tasks. That same day, Tencent released Hy3, another state-of-the-art model focused on efficiency. So with so many new models available, the question developers are actually asking is: when is the premium worth it? For enterprise, the math may have changed. A model that leads open-source benchmarks at $1.74 per million input tokens means large-scale document processing, legal review, or code generation pipelines that were expensive six months ago are now much cheaper. The one-million-token context means you can feed entire codebases or regulatory filings in a single request instead of chunking them across multiple calls. Besides, its open-source nature means it can not only be run for free on local hardware but also be customized and improved based on the company's needs and use cases. For developers and solo builders, V4-Flash is the one to watch. At $0.14 input and $0.28 output, it's cheaper than models that were considered budget options a year ago -- and it handles most tasks the Pro version handles. DeepSeek's existing deepseek-chat and deepseek-reasoner endpoints already route to V4-Flash in non-thinking and thinking modes respectively, so if you're on the API, you're already using it. The models are text-only for now.
DeepSeek said it's working on multimodal capabilities, which means other big labs from Xiaomi to OpenAI still have that edge. Both models are MIT licensed and available on Hugging Face today. The old deepseek-chat and deepseek-reasoner endpoints retire on July 24, 2026.
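For developers wiring V4 into an agent loop, the "interleaved thinking" point above is the practical one. The sketch below, written for this article, shows the difference in plain Python: a single flag decides whether the model's reasoning from each step is kept in the running history or dropped before the next tool call. The message fields, the model_call callable and the "reasoning" key are illustrative assumptions, not DeepSeek's actual wire format.

```python
def run_agent(task, tools, model_call, keep_reasoning=True, max_steps=20):
    """Drive a multi-step tool-using loop, optionally retaining reasoning between steps."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = model_call(history)  # expected keys: "content", "reasoning", "tool", "args"
        step = {"role": "assistant", "content": reply["content"]}
        if keep_reasoning:
            # V4-style interleaved thinking: the chain of thought stays in context
            # across tool calls, so later steps don't replan from scratch.
            step["reasoning"] = reply.get("reasoning", "")
        history.append(step)
        if reply.get("tool") is None:  # no tool requested: treat as the final answer
            return reply["content"], history
        result = tools[reply["tool"]](**reply.get("args", {}))
        history.append({"role": "tool", "content": str(result)})
    return None, history
```

With keep_reasoning=False this behaves like the older models described above, rebuilding the plan at every round; with it set to True, a 20-step workflow carries its working notes all the way through.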
[22]
Everything to know about DeepSeek's new AI model V4
China's AI startup is back a year after it stirred up the AI industry with 'world-leading' processing power at a fraction of the cost of other models. The Chinese artificial intelligence company DeepSeek has unveiled a preview of its new large language model, V4, as the global AI race gathers pace. DeepSeek became one of the most closely watched AI companies last year after releasing models in January 2025 that appeared to deliver strong performance at lower cost than many US rivals. The Hangzhou-based startup released the model as DeepSeek V4-Pro and DeepSeek V4-Flash. V4-Pro is a larger model aimed at more demanding tasks, while V4-Flash is a smaller version designed to respond faster and cost less to run. "In world knowledge benchmarks, DeepSeek V4-Pro significantly leads other open source models and is only slightly outperformed by the top-tier closed-source model Gemini-3.1-Pro," DeepSeek said. The company said the new AI model supports a "one-million token context length," meaning it can take in much larger prompts, such as long documents or code, before producing an answer. In AI, context length means the amount of text, measured in tokens, that a model can consider at one time. A larger context window lets an AI model process longer inputs and include more information in each output, according to IBM. "Welcome to the era of cost-effective 1M context length," the company said in an announcement on Hugging Face, an open source AI platform. DeepSeek also said V4 offers "world-leading long context with drastically reduced compute and memory costs". DeepSeek compared V4-Pro with rival models from OpenAI, Anthropic and Google, though Google's Gemini-3.1-Pro scored higher in some tests. The new model can be used with popular AI agents such as Claude Code, OpenClaw and OpenCode, allowing developers to use the model for software tasks outside DeepSeek's own chatbot. Known for a more open approach than rivals with closed AI models, DeepSeek has also made V4 available on Hugging Face, where it can be downloaded, tested and adapted freely. The Chinese startup gained attention in late 2024 with its open source V3 model, which surprised the AI industry with strong performance at a lower cost and using less powerful chips than many US rivals, including OpenAI and Google. The company caught global attention in January 2025 when it released R1, an AI reasoning model that the company claimed was cheaper than, and performed as well as, the large language models behind OpenAI's rival ChatGPT. However, some countries, including Italy, the United States, and South Korea, banned government agencies from using DeepSeek last year, citing national security concerns. Germany also banned DeepSeek from Apple and Google app stores in 2025, citing illegal transfer of user data to China. The release of V4 comes a day after the US AI giant OpenAI also launched its latest AI model, GPT-5.5, which it called its "smartest and most intuitive" yet. The release also comes a day after the White House accused China of stealing American AI labs' IP on an industrial scale using thousands of proxy accounts. OpenAI, Anthropic, and Google warned this year that Chinese AI firms, including DeepSeek, were conducting model extraction attacks (MEA), otherwise known as "distillation". The method feeds the larger model thousands of questions, collects its answers, and uses those responses to teach a new, smaller model to think in the same way.
[23]
Five things to know about Chinese AI startup DeepSeek
Beijing (AFP) - As DeepSeek releases its first major new artificial intelligence model in over a year -- DeepSeek-V4 -- here are five things to know about the Chinese startup:

Founded by Liang Wenfeng in the eastern Chinese tech hub Hangzhou, DeepSeek started life in 2023 as a side project of Liang's data-driven hedge fund that had access to a cache of powerful AI processors made by US chip giant Nvidia. It shot to global attention in January 2025 with the release of its R1 deep-reasoning large language model, which sparked a US tech share sell-off. Industry insiders were stunned by R1's high performance -- at a level similar to ChatGPT and other leading US chatbots -- and DeepSeek's claims to have developed it at a fraction of the cost. Venture capitalist Marc Andreessen described it as a "Sputnik moment" -- referencing the 1957 launch of Earth's first artificial satellite by the Soviet Union that stunned the Western world.

Censorship concerns

Like other Chinese chatbots, DeepSeek's AI tools eschew topics usually censored in the world's second-largest economy, such as the 1989 Tiananmen crackdown. That and data privacy concerns have led DeepSeek AI to be banned or restricted on government-issued devices in several countries, including the United States, Australia and South Korea. However, its low cost and ease of deployment have made it a popular choice in developing countries, analysts say. The company holds four percent of global market share for chatbots, according to web traffic analysis company Similarweb. ChatGPT dominates at 68 percent.

Open source

DeepSeek's systems are open-source -- meaning their inner workings are public, allowing programmers to customise parts of the software to suit their needs. That is the same for other major Chinese AI players, including tech giant Alibaba, in contrast to the "closed" models sold by OpenAI and other Western rivals. The Chinese government has trumpeted its lead in open-source AI technology, which it says can accelerate innovation. "Chinese AI models are leading the way in the open-source innovation ecosystem," National People's Congress spokesman Lou Qinjian told policymakers this month.

Startup boost

The success of DeepSeek has galvanised China's AI scene, despite hurdles posed by rivalry with the United States, and fears of a global market bubble. Shares in two leading Chinese AI startups, Zhipu AI and MiniMax, soared on their market debuts in Hong Kong this year, and it has been a similar story for Chinese chipmakers such as MetaX. Shi Yaqiong and her team at Beijing-based Jinqiu Capital told AFP there has been a "clear surge" in enthusiasm around Chinese AI -- and competition among investors -- since the DeepSeek shock.

Chip smuggling reports

DeepSeek's rise has not been without controversy. Reports, including in technology outlet The Information, say DeepSeek has been skirting a US ban on the export of top-end chips to China to train its new V4 model. The Information said in December, citing six people with knowledge of the matter, that DeepSeek developed V4 using thousands of chips dismantled in third countries and smuggled to China. DeepSeek did not respond to AFP's request for comment. Nvidia did not respond to a request for comment but told The Information that they had not seen any evidence of this and that "such smuggling seems farfetched".
[24]
DeepSeek open-sources V4 large language model series - SiliconANGLE
Chinese artificial intelligence developer DeepSeek today released a new series of open-source large language models. V4, as the algorithm family is called, comprises two LLMs at launch. There's the flagship V4-Pro and a smaller model called V4-Flash that trades off some output quality for lower hardware usage. Both algorithms are based on a mixture of experts, or MoE, architecture. That means they comprise multiple neural networks rather than a single set of artificial neurons. V4-Pro has 1.6 trillion parameters and activates a subset of its neural networks with 49 billion parameters when answering user prompts. V4-Flash, in turn, contains 284 billion parameters and activates 13 billion at any given time. One of the new architectural features in the LLM series is a so-called hybrid attention mechanism. An LLM's attention mechanism ranks the data points in a user prompt based on their importance. The model takes the most relevant data points into consideration when generating responses and discards irrelevant details, which boosts output quality. Attention mechanisms don't process prompts in their original form, but rather use a mathematical representation called a KV cache. V4's hybrid attention architecture uses two different compression methods to reduce the size of the KV cache, which lowers memory requirements. As a result, the model family's KV cache uses 90% less memory during inference than the one in DeepSeek's previous-generation LLMs. Many of the other new features in the V4 lineup were added to optimize its training workflow. A neural network comprises artificial neuron collections called layers that process data in a specific order. Prompts enter the first layer, which carries out a series of calculations and transmits the results to the second layer. The second layer then performs calculations of its own, sends the results to the third layer and so forth. Data regularly moves between an LLM's layers during training. V4 includes a feature called mHC that enables data to travel directly between distant layers without going through the intermediate neuron clusters between them. That approach reduces training errors, which in turn boosts AI output quality. The neuron clusters between the first and last layers of an LLM are known as its hidden layers. According to DeepSeek, V4 uses a software module called Muon to optimize the hidden layers. It helps speed up training runs and reduce the associated infrastructure requirements. DeepSeek carried out V4's initial training using a dataset that comprised about 27 trillion tokens. It then applied a two-step post-training workflow. The first step separately optimized the neural networks that make up each V4 model, while the second improved their ability to coordinate their work.
[25]
China's DeepSeek unveils long-awaited V4 AI model
At its highest capacity, DeepSeek's V4 'redefines the state-of-the-art for open models'. Chinese AI darling DeepSeek has launched its long-awaited V4 large language model (LLM) in preview, as speculation around a possible first funding round swirls. The latest open source launch comes more than a year after the start-up released R1, whose cost effectiveness and performance sent Silicon Valley leaders into a flurry, igniting accusations of theft. R1 was trained using lower-capacity Nvidia chips. The V4 series comes in two versions, a "Pro" with 49bn activated parameters and a "Flash" version with 13bn activated parameters, both supporting a context length of 1m tokens. At its maximum capacity, the V4-Pro-Max mode "redefines the state-of-the-art for open models, outperforming its predecessors in core tasks", DeepSeek said. This mode has "significantly closed the gap" with Google's Gemini 3.1-Pro, the leading model in knowledge-based evaluations, according to the company, while outpacing OpenAI's GPT-5.2 and Gemini-3.0-Pro on standard reasoning benchmarks. In agentic tasks, DeepSeek's V4-Pro-Max is on par with leading open-source models, such as Kimi-K2.6 and GLM-5.1, but slightly worse than frontier closed models, it noted. Its internal evaluations revealed that the Pro-Max version outperforms Anthropic's Claude Sonnet 4.5 and approaches the level of Opus 4.5. Huawei has said that its Ascend supernode, based on Ascend 950 AI chips, will support the V4 model versions. OpenAI made fresh allegations against DeepSeek as recently as February, calling the company's distillation techniques part of "ongoing efforts to free-ride on the capabilities developed by OpenAI and other US frontier labs." Meanwhile, the US White House said yesterday (23 April) that it will work closely with AI companies to fight "industrial-scale campaigns" by foreign actors attempting to steal its technology. DeepSeek's AI competitors Tencent and Alibaba are reportedly in talks to join the company's first funding round. A source told Bloomberg that the benchmark for a valuation would be around $40bn. The publication further reported that Tencent has proposed taking a 20pc stake in the company. DeepSeek's Chinese contemporaries have made their own AI model launches in the months past, wishing to get ahead of V4, which was hyped to be the company's most important launch since R1 and, before it, V3 in late 2024. Latest launches include Alibaba's Qwen3.5; ByteDance's Seedance 2.0; Zhipu's GLM-5, trained entirely using Chinese chips; MiniMax, which released M2.5; and the Alibaba-backed Moonshot AI, which came out with Kimi K2.5.
[26]
DeepSeek slashes V4-Pro API pricing with major discount
DeepSeek announced major price reductions for its API service on Saturday, reducing input cache hit fees to one-tenth of their original price and offering a 75% limited-time discount on its flagship V4-Pro model through May 5. The new promotional price for V4-Pro's input cache hit is now 0.025 yuan, approximately $0.0036, per million tokens. The model's standard pricing during this promotional period is set at 3 yuan for input and 6 yuan for output per million tokens. These rates significantly undercut Western competitors, whose output prices range from $12 to $25 per million tokens, according to OpenRouter data. DeepSeek released the V4-Pro and V4-Flash models in a preview capacity on April 24, marking the company's first significant model launch since last December. V4-Pro features 1.6 trillion parameters, with 49 billion active parameters per inference pass, positioning it as the largest open-weight model currently available. In contrast, V4-Flash offers a smaller option with 284 billion parameters. Prior to these discounts, V4-Pro's standard pricing of $1.74 for input and $3.48 for output per million tokens was already approximately 98% less than OpenAI's GPT-5.5 Pro rates. The latest cuts further widen this price differential. DeepSeek's pricing strategy is a response to rising computing power costs within the AI sector. Wei Sun, principal AI analyst at Counterpoint Research told CNN that the company has implemented "the concept of 'AI price reduction'" amidst overall increases in industry costs. DeepSeek's V4 models operate on Huawei Ascend hardware rather than Nvidia chips. This transition is viewed as important for AI system development and deployment due to reduced dependency on Nvidia technology. Sun emphasized that this shift may accelerate domestic adoption and global AI advancements. Additionally, V4-Pro requires only 27% of the computing power needed by its predecessor, V3.2, for a one-million-token context window. DeepSeek has acknowledged that V4 models currently lag behind frontier models like GPT-5.4 and Gemini 3.1 Pro by about three to six months in performance capabilities.
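For readers converting those yuan figures themselves, here is a small sketch of the arithmetic. The exchange rate is an assumption (roughly 7 yuan to the US dollar, which is what the article's own $0.0036 conversion implies); everything else uses the numbers quoted above.

```python
# Quick check of the promotional yuan prices in dollar terms. The exchange rate
# is an assumption; adjust it before relying on these figures.
CNY_PER_USD = 7.0

promo_cny = {"input (cache hit)": 0.025, "input": 3.0, "output": 6.0}   # per million tokens
standard_usd = {"input": 1.74, "output": 3.48}                          # pre-discount list prices

for name, cny in promo_cny.items():
    usd = cny / CNY_PER_USD
    line = f"{name:18s} {cny:6.3f} CNY  ~${usd:.4f}/M tokens"
    if name in standard_usd:
        line += f"  ({100 * (1 - usd / standard_usd[name]):.0f}% below the standard price)"
    print(line)
```

At that rate, the promotional 3-yuan input and 6-yuan output prices work out to roughly 75% below the standard $1.74 and $3.48 list prices, matching the limited-time discount described above.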
[27]
US-China AI race intensifies as DeepSeek releases 'reduced' cost model
China's DeepSeek on Friday released a new AI model with "drastically reduced" costs that it said was capable of processing extra-long texts to help it complete tasks. The company caused shockwaves last year after it revealed a reasoning model that upended assumptions of US dominance in the sector.

Chinese startup DeepSeek released a new artificial intelligence model with "drastically reduced" costs Friday, more than a year after it stunned the world with a low-cost reasoning model that matched the capabilities of US rivals. The AI race has intensified the rivalry between China and the United States, and the White House on Thursday accused Chinese entities of a massive effort to steal artificial intelligence technology. Hangzhou-based DeepSeek burst onto the scene in January last year with a generative AI chatbot, powered by its R1 reasoning model, that upended assumptions of US dominance in the strategic sector. The new version, DeepSeek-V4, "features an ultra-long context of one million words", the company said in a statement on social media platform WeChat, hailing it as "world-leading... with drastically reduced compute (and) memory costs" in a separate announcement on X. The model's context length, which determines how much input a model is able to absorb to help it complete tasks, "(achieves) leadership in both domestic and open-source fields across agent capabilities, world knowledge, and reasoning performance", the WeChat statement said. A "preview version" of the open source model is now available, the company said. Experts say V4's release marks an "inflection point" in terms of hardware and cost. "This addresses the long-standing issues of slower performance and higher costs associated with long context lengths, marking a genuine inflection point for the industry," Zhang Yi, the founder of tech research firm iiMedia, told AFP. "For end users, this will bring widespread, accessible benefits. For instance, if ultra-long context support becomes a standard feature, long-text processing is expected to move beyond high-end research labs and enter mainstream commercial applications," he said. The new V4 is released as two versions, DeepSeek-V4-Pro and DeepSeek-V4-Flash, with the latter being "a more efficient and economical choice" because it has smaller parameters.

'Sputnik moment'

V4-Pro has 1.6 trillion parameters while V4-Flash has 284 billion; parameters are the internal values that refine a model's decision-making ability. The model has also been "optimised" for popular AI Agent products such as Claude Code, OpenClaw, OpenCode and CodeBuddy, the DeepSeek statement said. "In world knowledge benchmarks, DeepSeek-V4-Pro significantly leads other open-source models and is only slightly outperformed by the top-tier closed-source model, (Google's) Gemini-Pro-3.1," the statement added. Last year's so-called "DeepSeek shock" sparked a sell-off of AI-related shares and a reckoning on business strategy in what was also described as a "Sputnik moment" for the industry. The chatbot performed at a similar level to ChatGPT and other top American offerings, but the company said it had taken significantly less computing power to develop. However, its sudden popularity raised questions over data privacy and censorship, with the chatbot often refusing to answer questions on sensitive topics such as the 1989 Tiananmen crackdown. At home, DeepSeek's AI tools have been widely adopted by Chinese municipalities and healthcare institutions as well as the financial sector and other businesses.
This has been partly driven by DeepSeek's decision to make its systems open source, with their inner workings public - in contrast to the proprietary models sold by OpenAI and other Western rivals. But the White House has accused Chinese firms of vying to "steal" American technology, ahead of an expected summit between Donald Trump and Xi Jinping in Beijing next month. "The US has evidence that foreign entities, primarily in China, are running industrial-scale distillation campaigns to steal American AI," Trump's science and technology chief advisor Michael Kratsios said in a post on X. Distillation is a common practice within AI development, often used by companies to create cheaper, smaller versions of their own models. DeepSeek's Friday announcement also came as Meta said it planned to cut a tenth of its staff as it looks for productivity gains from the rest of the workforce while investing heavily in artificial intelligence. Reports said Microsoft was also looking to trim its ranks.
[28]
China's DeepSeek Rolls Out a Long-Anticipated Update of Its AI Model
HONG KONG (AP) -- DeepSeek, the Chinese artificial intelligence startup that shook world markets last year, launched preview versions of its latest major update Friday as the AI rivalry between China and the U.S. heats up. DeepSeek's V4 has been eagerly anticipated by users keen to test how it compares to U.S. competitors like OpenAI's ChatGPT, Anthropic's Claude and Google's Gemini. Anthropic and OpenAI have accused DeepSeek of unfairly building its technology off their own. Some industry analysts had expected the new model to arrive more than a month earlier at the start of the Lunar New Year. DeepSeek says the new V4 open-source models, which include "pro" and "flash" versions, have big improvements in knowledge, reasoning and in their "agentic" capabilities - the ability to perform complex tasks and workflows autonomously. V4 is a successor to V3, an AI model that DeepSeek released in late 2024. But it was DeepSeek's specialized "reasoning" AI model, called R1, that took markets by surprise with its release in January 2025. DeepSeek claimed it was more cost-effective than OpenAI's similar model and it became a symbol of how China was catching up with the U.S. in technological advancements. DeepSeek said the "V4 Pro Max" version has "superior performance" in terms of standard reasoning benchmarks relative to OpenAI's GPT-5.2 model and Google's Gemini 3.0-Pro. It falls "marginally" short of GPT-5.4 and Gemini 3.1-Pro, it said. In terms of "agentic" capabilities, the Chinese company said the V4 "pro" version could outperform Claude's Sonnet 4.5 and approaches the level of Claude's Opus 4.5 model based on its own evaluation. The "flash" version of V4 performs on a par with the "pro" version on simple agent tasks and has reasoning capabilities closely approaching it, DeepSeek said. "Based on the benchmark results, it does appear DeepSeek V4 is going to be very competitive against its U.S. rivals," said Lian Jye Su, chief analyst at the technology research and advisory group Omdia. Marina Zhang, an associate professor at the University of Technology Sydney, said DeepSeek's V4 rollout is a "pivotal milestone for China's AI industry", especially as global competition intensifies in the pursuit of self-reliance in critical technologies. DeepSeek offers a free-to-use web and mobile chatbot. Unlike the top models from Anthropic, Google and OpenAI, it describes its technology as "open source" in that it gives developers access to modify and build on its core technology. Both the V4's "pro" and "flash" versions have a 1 million token context window, a measure of how much information an AI model can process and recall, and run on a more efficient basis, the startup said. That is a significant improvement from before, since the V3 supported a 128,000 token context window. A report from Microsoft in January showed use of DeepSeek has been gaining ground in many developing nations. However, some analysts remain skeptical. Ivan Su, a senior equity analyst at Morningstar, said while V4 is a "competent" follow-up, it's not as big a breakthrough as the rollout of R1. "Domestic competition has intensified significantly since R1's release," Su said. "Against U.S. models, DeepSeek's own evaluation suggests its capabilities largely match on most fronts, but independent evaluations are needed before final conclusions can be drawn."
In February, Anthropic accused DeepSeek and two other China-based AI laboratories of "industrial-scale campaigns" to "illicitly extract Claude's capabilities to improve their own models." It said they did that using a technique called distillation that "involves training a less capable model on the outputs of a stronger one." OpenAI made similar allegations in a letter to U.S. lawmakers. This week, Michael Kratsios, chief science and technology adviser to U.S. President Donald Trump, also accused foreign tech companies "principally based in China" of distilling leading U.S. AI systems and "exploiting American expertise and innovation." China's embassy in Washington hit back at the allegations, describing them as "unjustified suppression of Chinese companies by the U.S."
[29]
DeepSeek launches V4 model with one million token context
China's DeepSeek has launched its AI model, DeepSeek-V4, claiming it offers enhanced capabilities over open-source alternatives. The new model is optimized for domestic chips and features an ultra-long context of one million words, asserting leadership in agent capabilities, world knowledge, and reasoning performance. DeepSeek-V4 is available in two editions: DeepSeek-V4-Pro and DeepSeek-V4-Flash. The latter is marketed as a more efficient and economical option. In world knowledge benchmarks, DeepSeek-V4-Pro significantly outperforms other open-source models and closely trails Google's closed-source model, Gemini-Pro-3.1. The DeepSeek-V4-Pro version introduces a "maximum reasoning effort mode," enhancing its knowledge capabilities compared to existing open-source models. This release follows a previous market downturn caused by DeepSeek's earlier R1 model, which effectively competed against ChatGPT at a lower cost. While the specific chip system used for training the V4 models remains undisclosed, DeepSeek stated its software is compatible with Nvidia and Huawei chips. This launch coincides with increasing U.S. semiconductor export restrictions to China, particularly for high-end GPUs crucial for AI development. The new model can process a maximum output of 384,000 tokens, which are the fundamental units of data for AI models. A token typically represents about four characters of text, so a one-million-token window corresponds to several million characters of input. DeepSeek claims a significant jump in computational efficiency, with the model able to take in context of up to one million tokens. DeepSeek-V4-Pro reportedly outperforms Google's Gemini-3.1-Pro in processing long text strings but still trails Anthropic's Claude Opus 4.6 model. The company aims to further improve the model's intelligence and usability across various applications. DeepSeek indicated that this breakthrough would initiate a new era of million-length contexts for next-generation language models. "This breakthrough enables efficient support for a context length of one million tokens," DeepSeek stated in its announcement.
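The context and output figures above are easy to reason about with a rough sketch. The snippet below uses the approximately-four-characters-per-token rule of thumb mentioned in the article to estimate whether a body of text fits in a one-million-token window; the constants and helper names are illustrative assumptions, not part of any DeepSeek tooling, and real tokenizers will give different counts.

```python
# Rough sketch: estimate whether a document or codebase fits in a
# 1M-token context window, using the ~4 characters per token rule of
# thumb cited above. Real tokenizers vary by language and content.

CHARS_PER_TOKEN = 4          # rough heuristic, not DeepSeek's tokenizer
CONTEXT_WINDOW = 1_000_000   # tokens supported by V4 Pro and Flash
MAX_OUTPUT = 384_000         # maximum output tokens reported for V4

def estimate_tokens(text: str) -> int:
    """Very rough token estimate from character count."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(text: str, reserved_for_output: int = 8_000) -> bool:
    """Check whether a prompt leaves room for the desired output length."""
    reserved = min(reserved_for_output, MAX_OUTPUT)
    return estimate_tokens(text) + reserved <= CONTEXT_WINDOW

if __name__ == "__main__":
    sample = "x" * 3_900_000   # ~3.9M characters, close to the window's limit
    print(estimate_tokens(sample), fits_in_context(sample))
```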
[30]
DeepSeek-V4, the Chinese AI model adapted for Huawei chips - The Economic Times
Chinese AI startup DeepSeek has launched V4, a new model designed for Huawei chips. This move strengthens China's drive for an independent AI sector. The V4 model comes in Pro and Flash versions, offering advanced capabilities. DeepSeek's adaptation for Huawei hardware signals progress in domestic AI infrastructure. Chinese startup DeepSeek on Friday released a preview version of V4, its new artificial intelligence model adapted to run on Huawei chips, marking another step in China's push to build a self-sufficient AI ecosystem. Here is what we know so far about the long-awaited open-source offering. DeepSeek said V4 is designed to work with agent frameworks including Claude Code and OpenClaw, reflecting the industry shift away from prompt-based chatbots towards models that can complete complex, multi-step tasks with less human input. V4 comes in two versions: the more powerful and more expensive Pro, and the cheaper, lighter Flash. Pro is positioned as a higher-end model with performance comparable to leading closed-source systems, particularly in agentic coding, world knowledge, STEM (science, technology, engineering and mathematics) and competitive programming. In maximum reasoning mode, Pro outperforms all open-source models, though it still trails frontier closed-source systems such as Google's Gemini 3.1 Pro and OpenAI's GPT-5.4 in some areas, according to a DeepSeek paper released alongside the model. "DeepSeek-V4-Pro Max ... redefines the state of the art for open models, outperforming its predecessors in core tasks," DeepSeek said. Flash delivers similar reasoning ability in some areas but runs faster and at lower cost than Pro, with weaker world knowledge and lower performance on more demanding agent-based tasks. Both versions support a 1-million-token context window, matching the expansion DeepSeek introduced with V3 in February. DeepSeek said V4's architecture is designed to reduce compute and memory costs for long-context use. Adapted for Huawei chips: A key change from earlier DeepSeek releases is that V4 was adapted for Huawei's most advanced Ascend AI chips. Reuters reported in February that DeepSeek had not shared its new model with U.S. chipmakers for performance tuning, instead granting early access to domestic companies such as Huawei, despite previously working closely with Nvidia's technical staff. Hours after the preview release, Huawei said V4 is fully supported on its Ascend 950-based supernode clusters, and that its chips were used for part of V4-Flash's training. "Through close technical collaboration ... the entire Ascend supernode product line now supports the DeepSeek-V4 series models," Huawei said. DeepSeek's earlier V3 and R1 models were trained on Nvidia chips. The company did not say whether the same applied to V4. Lian Jye Su, chief analyst at tech research firm Omdia, said the partnership shows DeepSeek models can deliver similar performance on both Huawei and Nvidia hardware. "The popularity of DeepSeek in the domestic Chinese market encouraged Huawei to optimize the model for its hardware, and this, in turn, lowers the barriers for Chinese developers and companies to build AI apps entirely on domestic solutions," he said. He added that Huawei still trails Nvidia technologically, and moving developers away from Nvidia's ecosystem remains difficult. Even so, he said, "DeepSeek's pivot reveals real, tangible progress toward AI infrastructure self-sufficiency." DeepSeek also faces compute constraints under U.S.
export controls on Nvidia chips and chipmaking equipment. The company said Pro can cost up to 12 times more than Flash because of "constraints in high-end compute capacity," limiting current Pro service availability. DeepSeek said Pro pricing could fall sharply once Huawei Ascend 950 supernodes are deployed at scale in the second half of the year.
[31]
Why Developers Are Switching to DeepSeek V4 Flash for Open-Source AI
DeepSeek V4 Flash, as outlined by Gen AI Spotlight, introduces a dual-model system designed to address varying operational needs. The Pro model, featuring 1.6 trillion parameters, excels in handling complex tasks like large-scale data analysis and advanced natural language processing. In contrast, the Flash model prioritizes speed and efficiency, making it well-suited for real-time applications or environments with limited computational resources. Both models share a notable feature: a 1 million token context window, allowing them to process extensive content such as lengthy documents or intricate codebases. Additionally, their open source framework allows developers and researchers to adapt the models without proprietary restrictions. Explore the Flash model's cost structure, including its $0.30 per 1 million output tokens pricing, and understand how it balances affordability with dependable performance. Gain insight into its practical applications, such as autonomous workflows and skill chaining, and examine its compatibility with platforms like Nvidia and OpenCode for streamlined integration. This breakdown highlights the specific features and use cases of DeepSeek V4 Flash. DeepSeek V4 offers two configurations designed to address varying computational and operational needs: This dual-model approach ensures that users can select a solution aligned with their specific priorities, whether they require maximum computational power or a more streamlined, efficient option. A defining feature of DeepSeek V4 is its open source framework, which sets it apart from many proprietary AI systems. Both the Pro and Flash models, along with their weights, are freely available on platforms such as Hugging Face. This level of accessibility enables developers and researchers to: By promoting transparency and collaboration, DeepSeek V4 is not only advancing the field of AI but also providing widespread access to powerful tools that were previously restricted to a select few. DeepSeek V4 has undergone rigorous benchmarking against leading models such as GPT-5.4 and Opus, demonstrating competitive performance across key metrics. Its real-world testing highlights its strengths in managing complex, multi-step tasks, particularly in agentic workflows. Key performance highlights include: These benchmarks underscore DeepSeek V4's reliability and versatility, making it a strong contender for users seeking high-performance AI solutions capable of handling diverse and demanding applications. Affordability is a cornerstone of DeepSeek V4's appeal, making sure that advanced AI technology is accessible to a wide range of users. This pricing strategy makes DeepSeek V4 an attractive option for businesses of all sizes, from startups to large enterprises, looking to integrate advanced AI into their operations without incurring prohibitive costs. DeepSeek V4 demonstrates remarkable versatility, excelling in a wide range of use cases across industries. Its capabilities include: These features make DeepSeek V4 a versatile tool for applications requiring adaptability, efficiency and precision, from content creation to data analysis and beyond. DeepSeek V4 is designed for ease of integration, making sure that it fits seamlessly into existing workflows and operational setups.
This focus on integration and compatibility enhances DeepSeek V4's utility, making it a practical choice for organizations looking to incorporate advanced AI into their existing systems without significant disruption. DeepSeek V4 distinguishes itself in the competitive AI market through its: However, challenges remain. Concerns about data usage and the implications of its Chinese origin may influence adoption in certain markets, particularly those with stringent data security regulations. While these factors warrant consideration, they do not overshadow the technical strengths and potential impact of DeepSeek V4. DeepSeek V4 supports a robust ecosystem of tools designed to empower developers and foster innovation. Key features include: These tools and integrations make DeepSeek V4 a powerful resource for developers aiming to push the boundaries of what AI can achieve. DeepSeek V4 sets a new standard for open source AI technology by combining power, accessibility and cost-effectiveness. Its dual-model approach, competitive performance and robust developer ecosystem make it a versatile and practical solution for a wide range of applications. By fostering transparency and collaboration, DeepSeek V4 is not only advancing the field of AI but also making sure that its benefits are accessible to a global audience. As the AI landscape continues to evolve, DeepSeek V4 is well-positioned to play a pivotal role in shaping its future.
[32]
DeepSeek V4 Cuts KV Cache by 90% at 1M Tokens, But Aggressive Compression Could Risk 'Needle in a Haystack' Failures
Chinese artificial intelligence lab DeepSeek claims to significantly reduce the computing and memory resources required for token inference with its latest V4 model, according to its release notes. DeepSeek claims that the V4 AI model requires just 27% of the single-token inference FLOPs and 10% of the key-value (KV) cache of its predecessor, the DeepSeek V3.2 model. The reduction in cache requirements conserves memory and increases the context available to model builders when creating their models. In its release notes for DeepSeek V4, DeepSeek outlines that the new model is capable of using just 27% of the single-token inference FLOPs and 10% of the key-value (KV) cache when running a one-million-token context window. A context window is the amount of text that an artificial intelligence large language model can process before it has to free up memory resources. This improved memory utilization is particularly important when it comes to the Decode phase of AI computing, which is broadly divided into two phases, Prefill and Decode. Since the AI model generates outputs in the Decode phase, it has to store the context of the conversation or the prompt it received in the Prefill stage. As a result, the Decode phase requires more memory than Prefill, particularly when it comes to the key-value (KV) cache. As the number of tokens in a context increases, so do the requirements from the KV cache, which means that at one million tokens, a model that uses a smaller cache is able to process more requests or require fewer memory resources. DeepSeek's other claim, that the V4 model requires just 27% of the single-token inference FLOPs, only improves performance if there is adequate memory available for the GPU to perform calculations. Additionally, using significantly less cache memory requires the model to rely on trade-offs, which can make it miss out on specifics. This is called a "needle in a haystack" failure and might lead to imprecise outputs. The latest improvements rely on DeepSeek's Multi-Head Latent Attention architecture, which it introduced in earlier models. This architecture was designed with memory constraints in mind, as it compresses the model's keys and values into a single latent unit, which is then expanded at computation time to allow the model to utilize resources efficiently.
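To make the KV-cache scaling concrete, here is a back-of-the-envelope sketch of how cache size grows with context length and what a 90% reduction would mean in absolute terms. The layer count, head dimensions and two-bytes-per-value assumption are illustrative placeholders rather than DeepSeek's published configuration, and the 10% figure is simply the ratio claimed in the release notes.

```python
# Back-of-the-envelope KV cache sizing. With standard attention, every
# cached token keeps a key and a value vector per layer, so cache size
# grows linearly with context length. Dimensions below are illustrative
# assumptions, not DeepSeek's actual architecture.

def kv_cache_bytes(context_tokens: int,
                   layers: int = 60,
                   kv_heads: int = 8,
                   head_dim: int = 128,
                   bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for a single sequence."""
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_value  # K and V
    return context_tokens * per_token

full = kv_cache_bytes(1_000_000)   # hypothetical uncompressed cache at 1M tokens
compressed = int(full * 0.10)      # the claimed 10% of the predecessor's cache

print(f"cache at 1M tokens: {full / 1e9:.1f} GB")
print(f"at 10% of that:     {compressed / 1e9:.1f} GB")
```

Whatever the exact architecture, the linear growth is the point: shrinking the per-token cache directly multiplies how many long-context requests fit on the same hardware.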
[33]
China's DeepSeek V4 shocks with power and price -- but is it moving too fast to survive?
China's fast-rising AI contender DeepSeek has unveiled its latest model, DeepSeek V4, with massive scale and aggressive pricing. But behind the headline numbers, a wave of scrutiny is building, from chip-related allegations to safety concerns, raising a bigger question: can this Chinese AI giant sustain its breakneck pace, or is it heading toward a potential stumble? The newly released DeepSeek V4 comes in two versions: a high-end "Pro" model with 1.6 trillion parameters and a lighter "Flash" version designed for efficiency. Both boast a one-million-token context window, a major technical leap aimed at handling longer, more complex prompts. Like earlier iterations, the model remains open-source, allowing developers to download, modify and deploy it freely. That strategy has helped DeepSeek gain traction globally, positioning it as a low-cost alternative to systems from companies like OpenAI and Google. Yet unlike the surprise success of its earlier R1 model, the V4 launch has been met with a more cautious reaction. When DeepSeek's earlier models hit the market, they disrupted expectations around cost and performance. But analysts say that novelty has worn off. Instead of shock, the conversation around DeepSeek V4 is shifting toward risk. Experts now broadly accept that Chinese AI can compete on price and capability, but questions remain about reliability, compliance and long-term sustainability. This change in tone suggests the company is no longer judged as an underdog innovator, but as a major player facing global scrutiny. One of the biggest concerns surrounding DeepSeek V4 is the pace of development. Reports suggest that China's AI race is accelerating rapidly, with companies under pressure to deliver cutting-edge systems quickly. But that speed may come at a cost. Security researchers have flagged vulnerabilities in earlier models, while academic testing has reportedly shown weaknesses in safety guardrails. These findings raise concerns about how well the systems can prevent misuse or harmful outputs. As models grow more powerful, those risks scale as well, making safety not just a technical issue, but a global one. The concerns are no longer theoretical. Several U.S. states have already restricted DeepSeek's use on government devices, while federal-level limitations have also been reported. In Europe, regulators across multiple countries are examining the company's data practices, with coordinated responses being discussed at the policy level. This growing scrutiny reflects a broader tension in the global AI race, where innovation is often followed closely by regulation. Another major issue hanging over DeepSeek V4 is how it was trained. Reports have alleged that the company may have used advanced chips restricted under U.S. export controls, particularly high-end processors from Nvidia. There are also suggestions that alternatives from Huawei could have been involved. Compounding the issue are claims that training methods may have relied on outputs from other AI systems, a process known as distillation. If true, it could raise intellectual property questions involving major players like Anthropic and others. DeepSeek has not publicly addressed these allegations, leaving uncertainty around one of the most critical aspects of AI development: how the model was built. One of DeepSeek's strongest advantages remains its pricing. The company continues to offer powerful models at significantly lower costs than competitors, making it attractive to developers and businesses worldwide. 
But affordability alone may not be enough. As the company scales, it must navigate export controls, regulatory pressure and trust concerns, all while maintaining technical performance. That balance is difficult even for established players. DeepSeek V4 represents both ambition and risk. It showcases how far Chinese AI has come, but also highlights the challenges of scaling rapidly in a highly scrutinized industry. The path forward will likely depend on whether DeepSeek can address safety gaps, clarify its training methods and build confidence among regulators and users alike. Without that, the warning that it could "crash and burn" may not be entirely far-fetched. DeepSeek V4 is the latest AI model from DeepSeek, featuring advanced capabilities, large-scale parameters and open-source access. It faces concerns over safety, alleged use of restricted chips and possible reliance on outputs from other AI systems.
[34]
Deepseek V4 : Why Its 1.6 Trillion Parameters Aren't Quite Enough
Deepseek v4 has officially undergone comprehensive testing, revealing both its potential and its limitations. Developed as an open source AI model, it is available in two versions: the high-performance Deepseek v4 Pro and the cost-efficient Deepseek v4 Flash. The Pro model, with its 1.6 trillion parameters and focus on advanced tasks like STEM applications and code generation, aims to cater to demanding use cases. Meanwhile, the Flash model offers a streamlined alternative with 284 billion parameters, targeting users with simpler needs. However, as highlighted by World of AI, real-world testing has exposed critical gaps in performance, particularly in areas requiring creativity, nuanced reasoning, or precision. Explore the strengths and weaknesses of Deepseek v4 through a closer look at its pricing structure, task-specific performance and how it compares to competitors like Kimi K2.6 and Opus 4.6. Gain insight into why the Pro model struggles with consistency despite its technical specifications and learn how the Flash model balances affordability with practical constraints. This breakdown also examines where Deepseek v4 excels, such as long-context processing and considers what improvements are necessary for it to better meet the demands of its users. Deepseek v4 introduces two models designed to meet the diverse needs of AI users: Both models are open source and licensed under the MIT framework, ensuring accessibility for developers and researchers. This licensing approach encourages innovation and collaboration within the AI community. The pricing structure reflects the models' target audiences: These features position Deepseek v4 as a versatile tool for a wide range of applications, but its performance in real-world scenarios has raised questions about its overall reliability. Despite its ambitious design, Deepseek v4 has struggled to consistently meet expectations in practical applications. While benchmark tests suggest competitive capabilities, real-world use cases reveal significant performance gaps. For instance, both models underperform in areas like UI design, 3D modeling and cloning applications, where precision and innovation are critical. These shortcomings highlight a disconnect between the models' theoretical capabilities and their practical utility, limiting their appeal for users seeking reliable, high-quality outputs. In the competitive landscape of AI, Deepseek v4 faces formidable rivals. Models such as Kimi K2.6, Qwen 3.6 Plus, Minimax M2.7 and Opus 4.6/4.7 consistently outperform Deepseek in key areas, including code generation, reasoning algorithms and creative tasks. For example, in the Code Arena, a widely recognized benchmark for evaluating AI performance, Deepseek v4 Pro ranked third, trailing behind GLM 5.1 and Kimi K2.6. These results underscore the need for significant refinement if Deepseek is to establish itself as a serious contender in the crowded AI market. The competition highlights the importance of not only technical specifications but also the ability to deliver consistent, high-quality results in real-world scenarios. Deepseek v4 is not without its strengths. Its open source nature and cost efficiency make it an attractive option for developers and researchers seeking affordable AI solutions. Additionally, the models excel in long-context processing, which could serve as a foundation for future scalability and improvements.
However, these strengths are counterbalanced by several critical weaknesses: These limitations diminish the models' overall appeal, particularly when compared to more sophisticated proprietary and open source alternatives. While the affordability and accessibility of Deepseek v4 are commendable, its performance issues make it less suitable for users with high-stakes or complex requirements. Deepseek v4 represents a work in progress, offering a glimpse into its potential as an open source AI solution. While the current iteration falls short in several key areas, there is room for improvement. Addressing issues such as inconsistent output quality, refining reasoning algorithms and optimizing performance for task-specific applications could significantly enhance its value. If these challenges are effectively addressed, Deepseek v4 could emerge as a competitive alternative in the AI ecosystem. Its strengths in affordability and long-context processing provide a solid foundation for future development, particularly for users who prioritize cost efficiency and open source accessibility. However, achieving this potential will require a concerted effort to bridge the gap between the models' theoretical capabilities and their practical performance. Deepseek v4's journey underscores the complexities of developing AI models that balance technical innovation with real-world utility. While it is not yet a definitive solution, its open source framework and scalability suggest that it could play a meaningful role in the evolving AI landscape.
[35]
Can DeepSeek's New AI Models be the Answer for US Models' Token Gluttony?
Once again DeepSeek costs only a sixth of its competitor US models, but the Chinese product falls behind on knowledge tests. The East versus West battle for AI supremacy got a fresh impetus with the release of Chinese AI lab DeepSeek's two new versions of its frontier model DeepSeek V4. The company has pitched its newest model against token-guzzling competitors such as Claude, ChatGPT, and Gemini, claiming it could lower inference costs drastically. Of course, the news of the release was greeted in the US by a State Department order claiming widespread efforts by Chinese companies like DeepSeek to steal intellectual property from the local AI labs. The State Department did this via a cable sent to diplomatic and consular posts across the world asking staff to speak to their foreign counterparts about "concerns over adversaries' extraction and distillation of US AI models." In the past, OpenAI has claimed that DeepSeek was targeting it and other leading AI companies to replicate their models. This flip-flop is actually surprising, as prior to DeepSeek's announcement the western media was agog with reports of why the Chinese AI major's latest model was delayed by a year and four months after the V3.2 model and R1 reasoning had taken the AI world by storm last January. Most reports claimed that DeepSeek was hit by the US blockade of chips and their internal efforts to upgrade indigenous processors to do the same job. The same story was flipped around by the Chinese media, which said the delay was aimed at a wider goal of achieving technology self-reliance - simply put, they wanted Huawei Ascend chips. Either way, the Chinese did have a smooth landing with claims that DeepSeek V4 Flash and V4 Pro models with context windows of a million tokens each would be more than sufficient to allow large codebases to be used in prompts. This mixture-of-experts approach activates only a specific number of parameters per task, thus lowering costs. This probably explains why DeepSeek V4 is priced way below the others like Gemini, ChatGPT and Claude. Their V4 Pro costs $1.74 per million input tokens and $3.48 per million output tokens. In comparison, GPT-5.5 costs $5 and $30 respectively while Claude Opus 4.7 comes at $5 and $25. Gemini is the cheapest at $2 and $12 for the same range. On the performance front, DeepSeek claims their latest model has taken big strides in agentic tasks and coding, stating that "DeepSeek-V4 is seamlessly integrated with leading AI agents like Claude Code, OpenClaw & OpenCode." It even showcased some benchmark results that indicate similar results to comparable models. However, there is a catch. The new DeepSeek models fall behind frontier models in knowledge tests, specifically OpenAI's GPT-5.4 and the Gemini 3.1 Pro. The Chinese company admitted the lag and claimed it was a "developmental trajectory that trails state-of-the-art frontier models by approximately 3 to 6 months." Moreover, both the V4 Flash and V4 Pro support text only in their current avatar, unlike the others that use better understanding (context) to generate audio, video and images. However, both models are far more efficient and performant than DeepSeek V3.2, which the company credits to architectural improvements. Must say the US State Department has a knack with their timing, even if they haven't covered themselves with glory on other fronts. The DeepSeek launch comes exactly a day after the US accused China of stealing American IP, which is an accusation made by both OpenAI and Anthropic in the past.
Maybe, the State Department's timing wasn't all that good after all. Or maybe, it is the Chinese who are blessed with better timing, especially given that US AI companies are battling each other and their respective customers are engaged in a scrap with their own employees over high token gluttony.
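The price gap described above is easiest to appreciate as plain arithmetic. The sketch below applies the per-million-token rates quoted in the article to an invented monthly workload; the traffic numbers are made up for illustration, and the prices are the article's quoted figures rather than independently verified list prices.

```python
# Monthly cost comparison using the per-million-token prices quoted above.
# The workload (input/output volume) is an invented example.

PRICES = {  # (input $, output $) per million tokens, as quoted in the article
    "DeepSeek V4 Pro": (1.74, 3.48),
    "GPT-5.5":         (5.00, 30.00),
    "Claude Opus 4.7": (5.00, 25.00),
    "Gemini":          (2.00, 12.00),
}

def monthly_cost(input_millions: float, output_millions: float) -> dict:
    """Dollar cost of a monthly token volume for each listed model."""
    return {
        model: round(inp * input_millions + out * output_millions, 2)
        for model, (inp, out) in PRICES.items()
    }

# Example: 500M input tokens and 100M output tokens per month.
for model, cost in monthly_cost(500, 100).items():
    print(f"{model:>16}: ${cost:,.2f}")
```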
[36]
China's DeepSeek rolls out a long-anticipated update of its AI model - The Korea Times
HONG KONG -- DeepSeek, the Chinese artificial intelligence startup that shook world markets last year, launched preview versions of its latest major update Friday as the AI rivalry between China and the U.S. heats up. DeepSeek's V4 has been keenly anticipated by users looking to test how it compares to U.S. competitors like OpenAI's ChatGPT, Anthropic's Claude and Google's Gemini. Anthropic and OpenAI have accused DeepSeek of unfairly building its technology off their own. Some industry analysts had expected the new model to arrive more than two months earlier at the start of the Lunar New Year. DeepSeek says the new V4 open-source models, which include "pro" and "flash" versions, have big improvements in knowledge, reasoning and in their "agentic" capabilities - the ability to perform complex tasks and workflows autonomously. Another big change is they are supported in part by computer chips made by Chinese tech giant Huawei, reducing DeepSeek's reliance on U.S. chipmakers like Nvidia. V4 is a successor to V3, an AI model that DeepSeek released in late 2024. But it was DeepSeek's specialized "reasoning" AI model, called R1, that took markets by surprise with its release in January 2025. DeepSeek claimed it was more cost-effective than OpenAI's similar model and it became a symbol of how China was catching up with the U.S. in technological advancements. DeepSeek said the "V4 Pro Max" version has "superior performance" in terms of standard reasoning benchmarks relative to OpenAI's GPT-5.2 model and Google's Gemini 3.0-Pro. It falls "marginally" short of GPT-5.4 and Gemini 3.1-Pro, it said. DeepSeek's release came hours after OpenAI released its new GPT-5.5 model on Thursday. In terms of "agentic" capabilities, the Chinese company said the V4 "pro" version could outperform Claude's Sonnet 4.5 and approach the level of Claude's Opus 4.5 model based on its own evaluation. The "flash" version of V4 performs on a par with the "pro" version on simple agent tasks and has reasoning capabilities closely approaching it, DeepSeek said. "Based on the benchmark results, it does appear DeepSeek V4 is going to be very competitive against its U.S. rivals," said Lian Jye Su, chief analyst at the technology research and advisory group Omdia. Marina Zhang, an associate professor at the University of Technology Sydney, said DeepSeek's V4 rollout is a "pivotal milestone for China's AI industry," especially as global competition intensifies in the pursuit of self-reliance in critical technologies. DeepSeek offers a free-to-use web and mobile chatbot. Unlike the top models from Anthropic, Google and OpenAI, it describes its technology as "open source" in that it gives developers access to modify and build on its core technology. Both the V4's "pro" and "flash" versions have a 1 million token context window, a measure of how much information an AI model can process and recall, and run on a more efficient basis, the startup said. That is a significant improvement from before, since the V3 supported a 128,000 token context window. Huawei said in a separate statement Friday that its Ascend chips and related technology are compatible with the DeepSeek V4 models. It's a demonstration of the technical feasibility of operating outside the Nvidia-dominated computing ecosystem "amid sustained technological decoupling between China and the U.S.," said Zhang.
A report from Microsoft in January showed use of DeepSeek has been gaining ground in many developing nations, particularly those where Huawei phones are widely used. However, some analysts remain skeptical. Ivan Su, a senior equity analyst at Morningstar, said while V4 is a "competent" follow-up, it's not as big a breakthrough as the rollout of R1. "Domestic competition has intensified significantly since R1's release," Su said. "Against U.S. models, DeepSeek's own evaluation suggests its capabilities largely match on most fronts, but independent evaluations are needed before final conclusions can be drawn." In February, Anthropic accused DeepSeek and two other China-based AI laboratories of "industrial-scale campaigns" to "illicitly extract Claude's capabilities to improve their own models." It said they did that using a technique called distillation that "involves training a less capable model on the outputs of a stronger one." OpenAI made similar allegations in a letter to U.S. lawmakers. This week, Michael Kratsios, chief science and technology adviser to U.S. President Donald Trump, also accused foreign tech companies "principally based in China" of distilling leading U.S. AI systems and "exploiting American expertise and innovation." China's embassy in Washington hit back at the allegations, describing them as "unjustified suppression of Chinese companies by the U.S."
[37]
How DeepSeek 4'S Massive 1M Token Context Window is Changing Open-Source AI
DeepSeek 4 introduces two open source language models designed to meet varying computational requirements, as detailed by Prompt Engineering. The Pro model, with 1.6 trillion parameters, is optimized for tasks demanding high precision and processing power, while the Flash model, featuring 284 billion parameters, is suited for environments with limited resources. Both models include a 1 million token context window, allowing them to process extensive text sequences. A notable feature, compressed sparse attention, reduces memory usage during token generation, allowing efficient operation even on less capable hardware. Discover how these models perform in areas such as technical problem-solving and large-scale content generation. Learn about specific efficiency gains, including a 27% reduction in resource consumption for the Pro model and explore their open source framework, which supports customization and collaborative development. Additionally, understand their hardware compatibility and how their pricing structure aligns with cost-conscious organizational needs. DeepSeek 4 introduces two distinct models, each designed to cater to specific user needs and technical environments: Both models are equipped with an unprecedented 1 million token context window, allowing them to process and generate extensive, coherent text sequences. Trained on a vast dataset of approximately 32-33 trillion tokens, these models exhibit exceptional adaptability and precision across a wide range of language tasks. This scalability ensures that users can tackle both simple and complex challenges effectively. Efficiency is a cornerstone of DeepSeek 4's architecture. The Pro model achieves a 27% reduction in computational resource usage compared to its predecessor, while the Flash model operates at just 10% of the previous version's FLOPs. These advancements result in faster processing speeds and lower hardware demands, making the models accessible to a broader audience. A critical innovation driving this efficiency is the implementation of compressed sparse attention. This architectural enhancement minimizes memory requirements for key-value caching, significantly accelerating token generation and reducing computational overhead. As a result, users can experience smooth performance even on less powerful hardware, broadening the practical applications of these models. DeepSeek 4 reinforces its commitment to open source principles by making its model weights, including base weights, freely available for fine-tuning. This transparency enables developers to customize the models for specific use cases, fostering collaboration and innovation within the AI community. Historically, open source models have lagged behind their closed-source counterparts in terms of performance and availability. DeepSeek 4 narrows this gap considerably, delivering innovative capabilities while maintaining its dedication to accessibility. This approach not only democratizes advanced AI technology but also encourages a more inclusive ecosystem for AI development. DeepSeek 4 has been rigorously tested on multiple hardware platforms, including Nvidia GPUs and Huawei Ascend NPUs. The latter have emerged as a cost-effective alternative for inference tasks, offering users additional flexibility in hardware selection. While specific details about the training hardware remain undisclosed, the models' compatibility with diverse systems highlights their versatility.
To further enhance accessibility, DeepSeek 4 introduces a competitive pricing structure: This pricing model positions DeepSeek 4 as an attractive option for organizations seeking high-quality AI solutions without incurring prohibitive costs. In benchmark evaluations, DeepSeek 4 demonstrates strong agentic capabilities, excelling in tasks that require planning, execution and adaptability. While it slightly trails competitors like Gemini 3.1 in knowledge and reasoning tasks, it remains highly effective for real-time applications and complex instructions. Potential applications for DeepSeek 4 include: However, the quality of outputs is heavily influenced by the specificity of prompts. Vague or overly simplistic prompts may result in less refined outputs, emphasizing the importance of precise input design to maximize the models' potential. A standout feature of DeepSeek 4 is its compressed sparse attention, which reduces memory overhead while enhancing token generation speed. This innovation allows the models to handle larger context windows without compromising performance, making them suitable for tasks requiring extensive contextual understanding. Additionally, integration with external agentic harnesses expands the models' functionality, allowing more sophisticated applications across diverse fields such as healthcare, finance and education. These integrations pave the way for advanced AI-driven solutions that can adapt to complex, real-world scenarios. Despite its many strengths, DeepSeek 4 is not without limitations. Users have reported the following challenges: While these issues are notable, they do not significantly detract from the overall utility of the models. Moreover, they are likely to be addressed in future updates, reflecting the ongoing commitment to refinement and user feedback. Looking ahead, DeepSeek 4 is poised to expand its capabilities further. The planned deployment of Huawei's Ascend 950 supernodes is expected to enhance service capacity and reduce operational costs, making the models even more accessible to a wider audience. Additionally, continued integration with external agentic harnesses promises to unlock new possibilities for advanced AI applications. These developments highlight the forward-thinking approach of DeepSeek 4's creators, making sure that the models remain at the forefront of open source AI innovation. By addressing current limitations and exploring new opportunities, DeepSeek 4 is well-positioned to shape the future of language modeling. DeepSeek 4 represents a significant advancement in the field of open source AI, combining state-of-the-art technology with a commitment to accessibility and efficiency. Whether you are a researcher, developer, or organization seeking innovative AI solutions, DeepSeek 4 offers a compelling blend of performance, affordability and innovation. Its release marks a pivotal moment in the evolution of language models, setting a new standard for what open source AI can achieve.
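DeepSeek has not published the internals of its compressed sparse attention, but the general idea of spending attention compute on only a subset of cached tokens can be illustrated with a generic top-k sparse attention sketch. The NumPy snippet below is a toy illustration of sparse attention in general, not DeepSeek's actual mechanism, and the dimensions are arbitrary.

```python
# Generic top-k sparse attention: each query attends only to the k
# highest-scoring cached keys instead of the full context. A toy
# illustration of the sparse-attention idea, not DeepSeek's design.
import numpy as np

def topk_sparse_attention(q, K, V, k=64):
    """q: (d,), K: (n, d), V: (n, d); attend to the k best-matching keys."""
    scores = K @ q / np.sqrt(q.shape[0])          # similarity of q to each key
    k = min(k, scores.shape[0])
    top = np.argpartition(scores, -k)[-k:]        # indices of the top-k keys
    w = np.exp(scores[top] - scores[top].max())   # softmax over the subset only
    w /= w.sum()
    return w @ V[top]                             # weighted mix of k values

rng = np.random.default_rng(0)
n, d = 10_000, 64                                 # 10k cached tokens (toy size)
q = rng.normal(size=d)
K = rng.normal(size=(n, d))
V = rng.normal(size=(n, d))
print(topk_sparse_attention(q, K, V, k=128).shape)   # (64,)
```

The saving comes from mixing only k value rows instead of all n; production schemes differ in how they select and compress that subset, which is where "needle in a haystack" style accuracy trade-offs can appear.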
[38]
DeepSeek previews new AI model adapted to run on Huawei chips
BEIJING -- DeepSeek, the Chinese startup whose low-cost AI model stunned the world last year, on Friday launched a preview of a highly awaited new model adapted for Huawei chip technology, underlining China's growing self-sufficiency in the sector. The Pro version of the new model outperforms other open-source models in world-knowledge benchmarks, trailing only Google's GOOGL.O closed-source Gemini-Pro-3.1, DeepSeek said. The close collaboration with Huawei HWT.UL on the new model, the V4, contrasts with DeepSeek's past reliance on Nvidia's NVDA.O chips. "This is a big deal for China's AI industry," said He Hui, director of semiconductor research at consultancy Omdia. "Huawei's Ascend chips are the country's best homegrown alternative to Nvidia, and supporting DeepSeek V4 shows that top Chinese AI models can now run on Chinese hardware." Most leading AI models are trained and run on chips made by Nvidia. Huawei said its chips were used in some of the V4's training process. DeepSeek's pivot underscores concerns raised by Nvidia CEO Jensen Huang and others that the U.S. firm risks losing its developer ecosystem in China due to U.S. export controls and Beijing's self-sufficiency push. "The day that DeepSeek comes out on Huawei first, that is a horrible outcome for our nation," Huang said in a Dwarkesh Podcast interview this month. DeepSeek has often been hit with accusations by Washington and U.S. rivals that its success owes much to the improper use of U.S. know-how, and Friday's launch comes one day after the White House accused China of stealing U.S. AI labs' intellectual property on an industrial scale. It was not clear, however, if the issue is large enough to strain relations ahead of a visit by U.S. President Donald Trump to Beijing next month to meet with China's leader, Xi Jinping. For its part, DeepSeek has acknowledged the use of Nvidia chips but has not commented on whether those particular chips were subject to export bans. It has also said it has not intentionally used synthetic data generated by OpenAI. Washington began to restrict China's access to advanced AI chips made by U.S. firms in 2022 and since then Beijing has accelerated its push to achieve tech self-sufficiency. Huawei said it had worked closely with DeepSeek so the new V4 models could run across its full line of high-performance Ascend systems. Nvidia now faces even more of an uphill battle in regaining market share in China. The Trump administration in January gave the green light for Nvidia's powerful H200 chips to be sold in China but sources have said that shipments have been stymied by disagreements over the terms of the sales both in China and the U.S. Nvidia's shares were flat in pre-market trading. Chinese chipmakers rallied on expectations for wider use of homegrown chips, with Huahong Semiconductor 1347.HK and SMIC 0981.HK surging 15% and 10% respectively. Many Western and some Asian governments have banned their institutions and officials from using DeepSeek, citing data privacy concerns. Nevertheless, DeepSeek's models have consistently been among the most used on international platforms that host open-source models. In China, despite rocketing to national champion status a year ago, its lead has evaporated amid a slew of competitive offerings from domestic rivals. DeepSeek said on Friday that the V4 would be particularly suited to AI agent work, which can execute more complex tasks than chatbots but needs more computing power. Just how successful it will be remains to be seen. 
"The rapid iteration of new AI models makes it difficult to determine an ultimate winner. We believe the market is not a 'winner-takes-all' scenario; instead, success will depend on monetization, adoption rates, and sustainable revenue from services," Citi analysts said in a note. That said, the release of the V4 sent shares of rivals tumbling, with Zhipu AI 2513.HK and MiniMax 0100.HK both losing 9%. The V4 also comes in a lower-cost Flash version. Preview versions allow the company to incorporate real-world feedback and make changes ahead of a final product launch. DeepSeek did not provide a timeline for when the model is expected to be finalized. Owned by China's High-Flyer Capital Management, DeepSeek is aiming to raise funds at a valuation exceeding $20 billion, according to a report by The Information this month, which also said that tech giants Alibaba 9988.HK and Tencent 0700.HK were in discussions to take stakes.
[39]
Deepseek Unveils New Open-Source AI Model V4 Preview to Challenge OpenAI, Anthropic
According to reports, the company positions V4 as an upgraded tool for reasoning, coding, and long-context handling. DeepSeek aims to attract developers with open-source AI access and flexible deployment. The Hangzhou-based company claimed, "V4 achieves strong performance against domestic competitors, particularly in agent-based tasks, knowledge processing and inference. DeepSeek-V4 has been optimized for use with popular agent tools such as Anthropic's Claude Code and OpenClaw." The model achieves better performance through its implementation of sparse attention techniques and reduced cost. The lower pricing of DeepSeek services enables the company to provide better value than its competitors, which include OpenAI and Gemini. Experts suggest that mass adoption of the product will increase because of its affordable pricing. This allows businesses to operate at scale without incurring high compute expenses. The company has made fundamental improvements to its hardware development plan. DeepSeek has aligned V4 with Huawei's Ascend AI chips.
[40]
China's DeepSeek releases long-awaited new AI model - The Korea Times
BEIJING -- Chinese startup DeepSeek released "preview" versions of a new artificial intelligence (AI) model on Friday, more than a year after it stunned the world with a low-cost reasoning model that matched the capabilities of U.S. rivals. The long-awaited DeepSeek-V4 "features an ultra-long context of 1 million words, achieving leadership in both domestic and open-source fields across agent capabilities, world knowledge and reasoning performance," the company said in a statement on social media. A "preview version" is now available, the company said. The new DeepSeek-V4 model is released as two versions, DeepSeek-V4-Pro and DeepSeek-V4-Flash, with the latter being "a more efficient and economical choice" because it has smaller parameters. "In world knowledge benchmarks, DeepSeek-V4-Pro significantly leads other open-source models and is only slightly outperformed by the top-tier closed-source model, (Google's) Gemini-Pro-3.1," the statement added.
[41]
DeepSeek's new AI model does not wow markets in fast-changing industry
BEIJING, April 27 (Reuters) - Market reaction to DeepSeek's preview of its long-awaited next-generation artificial intelligence model has so far been subdued compared with the Chinese startup's outsized global breakthrough last year after the launch of its low-cost AI models. The release and overseas reception of DeepSeek-V3 and R1, which the Hangzhou-based company said were trained with a fraction of the computing power used by U.S. rivals, triggered a global tech share selloff as investors questioned the logic behind massive spending on AI infrastructure. That moment was widely viewed by analysts as a "black swan" event that forced a sudden repricing of assumptions about cost, competition, and China's ability to innovate under U.S. chip restrictions. But the muted reaction so far to DeepSeek-V4, launched on Friday, highlights how quickly those assumptions have shifted. Markets and industries have grown accustomed to low-cost, highly efficient models developed under computing constraints, reducing the element of surprise. "This announcement followed a rather predictable path," said Lian Jye Su, chief analyst at Omdia, noting that advances in model architectures and efficiency have since been widely explored across industry and academia. Benchmark data support that view. According to Artificial Analysis, DeepSeek-V4 Pro shows significant improvement over previous versions but overall ranks among leading open-weight models rather than clearly surpassing rivals, with competitors such as Kimi and Qwen narrowing the gap. That contrasts with last year, when DeepSeek appeared to leap ahead of domestic peers, driving rapid adoption in China and amplifying its global impact. Analysts say the earlier shock was driven by a convergence of factors: lofty valuations of U.S. tech firms, expectations of continued dominance by a handful of players, and the emergence of a relatively unknown Chinese startup delivering unexpectedly strong results. Those conditions are no longer present. "The expectation that new players will emerge is now baked into valuations," Su said, adding that markets have become more realistic about both the capabilities and limits of AI. At the same time, competition within China has intensified, with multiple firms releasing increasingly capable models, eroding DeepSeek's relative lead. On Monday, stock markets in South Korea and Taiwan hit new highs, buoyed by broad optimism for AI-related stocks. Alfredo Montufar-Helu, managing director at Ankura China Advisors, said the significance of V4 lies less in market impact and more in the U.S.-China race for tech supremacy. He pointed to DeepSeek's adaptation of V4 to run best on Huawei chips, as tightening U.S. export controls are designed to cut off the Chinese market's access to cutting-edge U.S. chips that power AI model development. "The 'wow factor' was last year - that's already priced in," he said. "What matters now is whether China can continue advancing on AI development, and potentially do so with its own chips - the geopolitical implications would be significant." (Reporting by Eduardo Baptista; Editing by Jacqueline Wong)
[42]
DeepSeek V4 just dropped: 3 things you should know about it
DeepSeek is back. The Chinese AI firm that rattled the entire industry in January 2025 with its shockingly efficient R1 model has released V4, its most ambitious model since. It comes in two versions - V4-Pro, built for heavy coding and agentic tasks, and V4-Flash, a lighter, faster variant - and both are open source. But beyond the headline numbers, there are three things worth paying attention to. From the start, DeepSeek has marketed itself as the value option to the American giants, and this is true once again with the V4. The price of V4-Pro is $1.74 for a million input tokens and $3.48 for a million output tokens, which is very cheap compared to what OpenAI or Anthropic charge for a similar performance level. In addition, V4-Flash is $0.14 and $0.28 per million tokens of input and output, respectively, making it perhaps one of the most budget-friendly best-in-class options on the market. On benchmarks, DeepSeek claims V4-Pro matches Anthropic's Claude Opus 4.6, OpenAI's GPT-5.4, and Google's Gemini 3.1 - while outperforming every other open-source model on coding, math, and STEM. An internal survey of 85 developers found more than 90% ranked V4-Pro among their top choices for coding tasks. That last number is self-reported and should be taken with a grain of salt, but the pricing alone gives developers a real reason to experiment. Both V4 versions support a one-million-token context window, large enough to hold the entire Lord of the Rings trilogy. That alone isn't novel (pun intended); Gemini and Claude already operate at similar scales. What sets DeepSeek apart is how it got there. Rather than treating every part of a long prompt as equally important, V4 compresses older context and focuses compute on what's most likely to matter right now. The result is dramatic: in a one-million-token context, V4-Pro uses just 27% of the compute and 10% of the memory that its predecessor V3.2 required. V4-Flash is even leaner, at 10% of the compute and 7% of the memory. For anyone building tools that need to reason across an entire codebase or a large document archive, that efficiency gap is meaningful, as it directly translates to lower costs and faster responses at scale. This is the most geopolitically loaded aspect of V4, and probably the least understood. For the first time, DeepSeek has optimised a model for domestic Chinese chips - specifically Huawei's Ascend series - and notably denied early access to Nvidia and AMD ahead of launch. Huawei has confirmed its Ascend 950 supernodes will support V4, and DeepSeek says prices could fall further once those chips ship at scale later this year. But the move is partial. DeepSeek appears to have adapted V4 for Chinese chips primarily for inference, when users query the model, while training may still rely heavily on Nvidia hardware. Researchers note that Chinese chips remain behind Nvidia's on raw training performance. What V4 represents, then, isn't a clean break from American silicon. It's a first, deliberate step toward building a parallel AI infrastructure, one that Beijing has been pushing for, and that the industry will be watching closely. V4 probably won't shake the world the way R1 did. That moment was a surprise. This one was anticipated. But across pricing, architecture, and chip strategy, DeepSeek is making moves that matter and developers, investors, and policymakers would be unwise to scroll past.
Chinese AI company DeepSeek has released V4, its most powerful open-weight AI model yet, featuring 1.6 trillion parameters and a one million token context window. The model competes with leading systems from OpenAI, Google, and Anthropic while offering dramatic cost reductions—up to 75% off for developers. The release intensifies the Chinese AI price war and showcases architectural innovations that could reshape the global AI landscape.
Chinese AI company DeepSeek has unveiled its highly anticipated DeepSeek V4 AI model, marking a significant leap in the competitive landscape of large language models. Available in two versions, V4 Flash and V4 Pro, both models employ a mixture-of-experts architecture with a one million token context window, large enough to process entire codebases or lengthy documents in a single prompt [1]. The larger V4 Pro variant boasts 1.6 trillion parameters with 49 billion active, making it the biggest open-weight model available and more than double the size of DeepSeek V3.2's 671 billion parameters [1]. The smaller V4 Flash contains 284 billion parameters with 13 billion active [1].
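The "active parameters" figures are what make the mixture-of-experts design economical: only a small fraction of the total weights participate in any single token. A quick sketch of that ratio, using the parameter counts quoted above:

```python
# Fraction of total parameters that are active per token, from the counts
# quoted above. In a mixture-of-experts model only the routed experts run,
# so per-token compute tracks the active count, not the total.
MODELS = {
    "V4 Pro":   {"total": 1_600e9, "active": 49e9},
    "V4 Flash": {"total": 284e9,   "active": 13e9},
}

for name, p in MODELS.items():
    share = p["active"] / p["total"]
    print(f"{name}: {share:.1%} of parameters active per token")
# V4 Pro: ~3.1% active; V4 Flash: ~4.6% active.
```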
According to benchmarks shared by the company, DeepSeek V4 competes with frontier models including OpenAI's GPT-5.4, Google's Gemini 3.1 Pro, and Anthropic's Claude Opus 4.6 [1][2]. The Chinese AI lab claims its V4-Pro-Max model outperforms open-source peers across reasoning benchmarks and excels in coding tasks, with performance "comparable to GPT-5.4" in coding competition benchmarks [1]. In an internal survey of 85 experienced developers, more than 90% included V4-Pro among their top model choices for coding tasks [2].
DeepSeek is aggressively positioning V4 as a cost-effective alternative to Silicon Valley's best models, offering a 75% discount to developers using the DeepSeek-V4-Pro. The V4 Flash model costs just $0.14 per million input tokens and $0.28 per million output tokens, undercutting GPT-5.4 Nano, Gemini 3.1 Flash, and Claude Haiku 4.5 [1]. The larger V4 Pro model costs $1.74 per million input tokens and $3.48 per million output tokens, a fraction of the cost of comparable models from OpenAI and Anthropic [2]. The company has also reduced fees for input cache hits to a tenth of their original pricing.
These substantial inference cost savings stem from novel architectural improvements. DeepSeek V4 introduces a hybrid attention mechanism combining Compressed Sparse Attention and Heavy Compressed Attention, which reduces compute requirements during inference and significantly compresses the key-value caches used to track model state [5]. This innovation allows the model to support the one million token context window while using 9.5x to 13.7x less memory than DeepSeek V3.2 [5]. The models also employ a mixture of FP8 and FP4 precision, with quantization-aware training for the mixture-of-experts expert weights, effectively halving memory requirements compared to FP8 [5].
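A minimal sketch of what the FP8-to-FP4 shift means for raw weight storage, counting bytes for a 1.6-trillion-parameter model; it ignores activations, the KV cache and which subset of weights is actually quantized, so the totals are illustrative rather than a measurement of DeepSeek's deployment.

```python
# Raw weight-storage arithmetic for a 1.6T-parameter model at different
# precisions. Plain byte counting only; ignores activations, KV cache and
# runtime overhead, and assumes every parameter is stored at one precision.
PARAMS = 1.6e12
BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "FP4": 0.5}

for fmt, b in BYTES_PER_PARAM.items():
    print(f"{fmt}: {PARAMS * b / 1e12:.1f} TB")
# FP16: 3.2 TB, FP8: 1.6 TB, FP4: 0.8 TB -- moving expert weights from FP8
# toward FP4 roughly halves their footprint, in line with the claim above.
```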
The release threatens to reignite the Chinese AI price war that erupted after DeepSeek upended the industry with R1 last year. Chinese AI firms are discounting aggressively to incentivize users to switch platforms, accelerating adoption in a crowded global AI field in an attempt to reshape the US-China AI race. DeepSeek is betting that pricing, accessibility, and sophisticated features will differentiate its models for next-generation developers and enterprise users.
The model has been specifically optimized for popular agent frameworks such as Claude Code, OpenClaw, and CodeBuddy, allowing easy integration within the broader AI ecosystem [2]. "The pricing, open source availability and 1 million context window features all lower barriers for developers, startups and small enterprises," said Akshar Keremane, co-founder of Bangalore-based AI startup O-Health. "It allows users to experiment at a model capability and scale that wasn't available earlier."
DeepSeek V4 explicitly highlights compatibility with domestic Huawei technology, marking a milestone for China's chip industry [3]. While DeepSeek V3 was heavily optimized for Nvidia GPUs, V4 has been validated to run on both Nvidia and Huawei's Ascend family of AI accelerators [5]. The company validated its "fine-grained EP [Expert Parallel] scheme on both Nvidia GPUs and Ascend NPU platforms," though it remains unclear whether the model was trained entirely on Huawei hardware or used a combination of chipsets [5].

The launch arrives amid heightened geopolitical tensions, coming just one day after the U.S. accused China of stealing American AI labs' IP on an industrial scale using thousands of proxy accounts [1]. DeepSeek itself has been accused by Anthropic and OpenAI of "distilling," essentially copying, their AI models [1][3]. U.S. officials have also accused the company of using banned Nvidia chips [3]. DeepSeek has not disclosed V4's training costs or what hardware it was trained on [3].

Despite its strengths, the models appear to trail frontier models in knowledge tests, specifically OpenAI's GPT-5.4 and Google's latest Gemini 3.1 Pro. This lag suggests a "developmental trajectory that trails state-of-the-art frontier models by approximately 3 to 6 months," according to the lab [1]. Both V4 Flash and V4 Pro support text only, unlike many closed-source peers from OpenAI, Google, and Anthropic which offer support for understanding and generating audio, video, and images [1].

For developers and enterprises watching the AI landscape, DeepSeek V4's combination of competitive performance, dramatic cost reductions, and open availability could accelerate adoption of Chinese AI technology globally. The model's emphasis on coding capabilities positions it well for the growing AI agent market, while its architectural innovations in attention mechanisms and memory efficiency may influence future model designs across the industry. As the Chinese AI price war intensifies and geopolitical tensions around IP theft allegations escalate, V4 represents both a technical achievement and a strategic move in the ongoing competition between Chinese and American AI labs.