22 Sources
[6]
The DeepSeek effect: How China's new AI is supercharging product design
Twenty-four hours before the White House and Silicon Valley announced the $500 billion Project Stargate to secure the future of AI, China dropped a technological love bomb called DeepSeek. DeepSeek R1 is a whole lot like OpenAI's top-tier reasoning model, o1. It offers state-of-the-art artificial thinking: the sort of logic that doesn't just converse convincingly, but can code apps, calculate equations, and think through a problem more like a human. DeepSeek largely matches o1's performance, but it runs at a mere 3% of the cost, is open source, can be installed on a company's own servers, and lets researchers, engineers, and app developers look inside, and even tune, the black box of advanced AI. In the two weeks since it launched, the AI industry has been supercharged with fresh energy around the products that could be built next. Through a dozen conversations with product developers, entrepreneurs, and AI server companies, it's clear that the worried narratives most of us have heard about DeepSeek -- it's Chinese propaganda, it's techie hype -- don't really matter to a free market. "Everyone wants OpenAI-like quality for less money," says Andrew Feldman, CEO and cofounder of Cerebras Systems, an AI cloud hosting service that is hosting DeepSeek on its servers. DeepSeek has already driven down OpenAI's own pricing on a comparable model by 13.6x. Beyond cost, DeepSeek is also demonstrating the value of open technologies versus closed ones, and wooing interest from Fortune 500s and startups alike. OpenAI declined an interview for this piece.
[7]
DeepSeek has rattled large AI players -- but smaller chip firms see it as a force multiplier
Many smaller AI chip companies say they've seen an uptick in demand amid the DeepSeek fervor and that it might represent a catalyzing event for the market. DeepSeek has rattled the U.S.-led AI ecosystem with its latest model, shaving hundreds of billions of dollars off chip leader Nvidia's market cap. While the sector leaders grapple with the fallout, smaller AI companies see an opportunity to scale with the Chinese startup. Several AI-related firms told CNBC that DeepSeek's emergence is a "massive" opportunity for them, rather than a threat. "Developers are very keen to replace OpenAI's expensive and closed models with open source models like DeepSeek R1..." said Andrew Feldman, CEO of artificial intelligence chip startup Cerebras Systems. The company competes with Nvidia's graphics processing units and offers cloud-based services through its own computing clusters. Feldman said the release of the R1 model generated one of Cerebras' largest-ever spikes in demand for its services. "R1 shows that [AI market] growth will not be dominated by a single company -- hardware and software moats do not exist for open-source models," Feldman added. Open source refers to software in which the source code is made freely available on the web for possible modification and redistribution. DeepSeek's models are open source, unlike those of competitors such as OpenAI. DeepSeek also claims its R1 reasoning model rivals the best American tech, despite running at lower costs and being trained without cutting-edge graphics processing units, though industry watchers and competitors have questioned these assertions. "Like in the PC and internet markets, falling prices help fuel global adoption. The AI market is on a similar secular growth path," Feldman said.
[8]
DeepSeek's breakthrough emboldens open-source AI models like Meta's Llama
DeepSeek's powerful new artificial intelligence model isn't just a win for China -- it's a victory for open-source versions of the tech from the likes of Meta, Databricks, Mistral and Hugging Face, according to industry experts who spoke with CNBC. Last month, DeepSeek released R1, an open-source reasoning model that claims to rival the performance of OpenAI's o1 model using a cheaper, less energy-intensive process. The development caused the market values of Nvidia and other chipmakers to plummet on fears that it could lead to reduced spending on high-performance computing infrastructure. DeepSeek is a Chinese AI lab that focuses on developing large language models with the ultimate aim of achieving artificial general intelligence, or AGI. It was founded in 2023 by Liang Wenfeng, co-founder of AI-focused quantitative hedge fund High-Flyer. AGI loosely refers to the idea of an AI that equals or surpasses human intellect on a wide range of tasks.
[9]
The beginning of the end? DeepSeek goes 100% Chinese for AI inference as it uses Huawei's controversial Ascend 910x accelerators
The pricing is much lower than that offered by Azure and AWS, which have started trialing DeepSeek. DeepSeek recently massively unsettled global markets with the launch of its open reasoning LLM, which was built and trained for a fraction of the cost of models from much larger US competitors, although OpenAI has since accused DeepSeek's developers of using its models to train theirs. A new paper claimed DeepSeek's V3 LLM was trained on a cluster of just 2,048 Nvidia H800 GPUs - crippled versions of the H100 designed to comply with US export restrictions to China. Rumors around DeepSeek's newer reasoning model, R1, suggest it may have been trained on as many as 50,000 Nvidia "Hopper" GPUs, including H100, H800, and the newer H20, although DeepSeek hasn't - and likely won't - confirm this. If true, it raises serious questions about China's access to advanced AI hardware despite ongoing trade restrictions, although it's no secret there's a thriving black market for advanced Nvidia AI hardware there. Now, in a move that's going to further shake Western firms, the South China Morning Post reports Huawei Technologies' cloud computing unit has partnered with Beijing-based AI infrastructure start-up SiliconFlow to make DeepSeek's models available to end users for an incredibly low price. This collaboration, worked on during the Chinese Lunar New Year holidays, provides efficient, cost-effective access to DeepSeek's V3 and R1 models through Huawei's Ascend cloud service, which is powered by Huawei's own homegrown solutions, including the controversial Ascend 910x accelerators that are banned in the US, UK, and Europe. Huawei has made no secret that it wants to become the Chinese Nvidia, and Huawei Cloud claims its performance levels are comparable to those of models running on premium global GPUs. SiliconFlow, which hosts the DeepSeek models, has come out swinging with aggressive pricing: 1 yuan (approximately US$0.13) per 1 million input tokens and 2 yuan per 1 million output tokens for V3, while R1 access is priced at 4 yuan and 16 yuan, respectively. Microsoft added DeepSeek to its Azure AI Foundry a few days ago, and Amazon swiftly followed suit, adding the LLM to its AWS Bedrock managed service. AWS showcased the AI model using an ml.p5e.48xlarge instance, powered by eight Nvidia H200 GPUs delivering 1128GB of GPU memory. It's early days for both cloud offerings though, and they work out much more expensive than SiliconFlow's super-low pricing. The collaboration between Huawei, SiliconFlow and DeepSeek highlights China's broader strategy to strengthen its domestic AI capabilities while reducing reliance on Nvidia hardware. The South China Morning Post notes, "The move to launch DeepSeek's models on a homegrown hardware backbone highlights China's progress in cutting dependency on foreign technology and bolstering its domestic AI industry amid growing efforts by the US to choke off China's access to high-end chips that the US government said could be used to advance military aims."
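To put those per-token rates in perspective, here is a minimal Python sketch that estimates a monthly bill from the SiliconFlow prices quoted above. The traffic volumes and the yuan-to-dollar rate are illustrative assumptions, not figures from the article.

```python
# Rough cost estimate from the SiliconFlow prices quoted above (yuan per
# 1M tokens). The traffic volumes and CNY/USD rate below are illustrative
# assumptions for this sketch, not figures from the article.
PRICES_CNY_PER_M = {
    "deepseek-v3": {"input": 1.0, "output": 2.0},
    "deepseek-r1": {"input": 4.0, "output": 16.0},
}
CNY_TO_USD = 0.13  # assumed exchange rate

def monthly_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    p = PRICES_CNY_PER_M[model]
    cny = (input_tokens / 1e6) * p["input"] + (output_tokens / 1e6) * p["output"]
    return cny * CNY_TO_USD

# Example workload: 200M input tokens and 50M output tokens in a month.
for model in PRICES_CNY_PER_M:
    print(model, f"${monthly_cost_usd(model, 200_000_000, 50_000_000):,.2f}")
```

At those rates, the example workload costs roughly $39 on V3 and $208 on R1 per month, which illustrates why the article calls the pricing "super-low" relative to the early Azure and AWS offerings.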
[10]
Chinese chip makers, cloud providers rush to embrace homegrown DeepSeek
SHANGHAI (Reuters) - Chinese companies, from chip makers to cloud service providers, are rushing to support DeepSeek's artificial intelligence models, spurring analysts to hail a "watershed moment" for the industry. Moore Threads and Hygon Information Technology, which make AI chips and look to compete with Nvidia, said on Monday their computing clusters and accelerators would be able to support DeepSeek's R1 and V3 models. "We pay tribute to DeepSeek," Moore Threads headlined its post on WeChat, adding that progress by the firm's models using domestically made graphics processing units (GPUs) could "set on fire" China's AI industry. On Saturday, Huawei Technologies, which also has its own line of AI chips, said it was working with AI infrastructure start-up SiliconFlow to make DeepSeek's models available to customers on its Ascend cloud service. Their performance was comparable to models run on global, high-end chips, it added. The news that Huawei had integrated DeepSeek's models with its Ascend chips marked a "watershed moment," Bernstein analysts said in a note on Sunday. "DeepSeek demonstrates that competitive large language models (LLM) can be deployed on China's 'good enough' chips, easing reliance on cutting-edge U.S. hardware," they added, citing Ascend and planned chips from Cambricon and Hygon. Alibaba, Baidu and Tencent's cloud arms have also said they have made DeepSeek's models accessible via their services. Last month, DeepSeek launched a free AI assistant that it says uses less data at a fraction of the cost of existing services. Within a few days, its app overtook U.S. rival ChatGPT in downloads from Apple's App Store, triggering a global selloff in tech shares. The company earlier drew attention in global AI circles with a research paper in December that said the training of DeepSeek-V3 required less than $6 million worth of computing power from Nvidia's H800 chips, versus the billions of dollars spent by the likes of tech giants Meta and Microsoft. China has welcomed its success, turning the startup based in the eastern city of Hangzhou, and the firm's founder, Liang Wenfeng, into pop culture celebrities. Microsoft and Amazon's cloud services have also started offering DeepSeek's models, but several countries such as Italy and the Netherlands have blocked, or are investigating, DeepSeek's AI app over privacy concerns. (Reporting by Brenda Goh and Lin Yuhan; Editing by Clarence Fernandez)
[11]
DeepSeek's rise was no accident: Here's the master plan behind it
Artificial intelligence is reshaping industries at a breakneck pace, but few transitions have been as bold -- or as successful -- as the one pulled off by DeepSeek. Born from the quantitative trading powerhouse High-Flyer, DeepSeek has emerged as a leader in AI research. But how did they do it? What made their pivot from financial markets to artificial general intelligence (AGI) not just plausible, but a record-breaking success? Here's a deep dive into the strategy, infrastructure, and vision that set them apart. High-Flyer was never just another hedge fund. At its peak, it managed nearly 100 billion yuan (about $13.79 billion) using algorithmic trading that relied heavily on artificial intelligence. The firm had always been at the cutting edge, leveraging advanced machine learning for high-frequency trading. But by 2023, High-Flyer saw the writing on the wall -- AI's potential extended far beyond financial markets. Instead of simply refining trading algorithms, they went all in on AGI. The result? DeepSeek, a dedicated research lab focused on pushing AI beyond narrow tasks and into broader, human-level intelligence. DeepSeek's edge comes down to raw computational power. Years before the global AI boom, High-Flyer had already been stockpiling high-performance GPUs for its trading operations. Those investments turned out to be DeepSeek's secret weapon. By securing this infrastructure before U.S. export restrictions on high-end AI chips, DeepSeek insulated itself from global supply chain disruptions -- a move that gave them a critical advantage over competitors scrambling for hardware. DeepSeek's strategy is a masterclass in resource allocation and AI-first thinking. Here's what set them apart: DeepSeek didn't need to start from scratch. The same GPU clusters that once fueled high-frequency trading were seamlessly transitioned into AI model training. That meant a zero-to-one jump without massive upfront costs. They didn't just buy GPUs -- they brought in the brains to match. DeepSeek aggressively recruited top AI researchers from China's elite universities and global AI firms, creating a think tank dedicated to AGI. Building a cutting-edge AI model isn't about running a single training process. It's about iteration, experimentation, and failure. DeepSeek dedicates up to four times the compute power to research as it does to final model deployment. This commitment to experimentation fuels breakthroughs. Even as DeepSeek advances AGI, High-Flyer continues to refine its financial models. The synergy between the two allows AI models to be tested in real-world financial applications, giving DeepSeek a practical edge that few pure AI labs can match. None of this would be possible without a leader with the vision and risk appetite to bet big on AI. That leader is Liang Wenfeng, the founder and controlling shareholder of High-Flyer. While many in finance see AI as a tool, Wenfeng saw it as the future of everything. Under his leadership, DeepSeek isn't just trying to compete with OpenAI and Google DeepMind -- they're aiming to leapfrog them by developing AI that can perform economically valuable tasks at human or superhuman levels. DeepSeek's meteoric rise is a case study in seizing opportunity. They didn't just dabble in AI; they bet the house on it.
By repurposing existing trading infrastructure, securing high-performance GPUs before the global chip crunch, and building a world-class AI research team, DeepSeek has established itself as a serious contender in the AGI race.
[12]
A paradigm shift? The view from China on DeepSeek and the global AI race
DeepSeek-R1, available under an MIT license, is fully open-source, allowing developers to use, modify, and distribute it freely. DeepSeek's bombshell in January continues to reverberate across the global AI community. One remarkable thing about this AI assistant is that, through algorithmic improvements, it achieved ChatGPT-4o-level performance using 2,048 Nvidia H800 graphics processing units (GPUs).
[13]
Top Chinese cloud platforms rally behind DeepSeek as AI 'black swan' redefines cloud intelligence
Amid rising US restrictions on China's tech sector, leading Chinese cloud computing and AI companies are forging alliances with DeepSeek to challenge global industry leaders and expand their footprint in the AI and cloud ecosystem. Baidu Cloud, Alibaba Cloud, Tencent Cloud, and Huawei Cloud have integrated the DeepSeek AI model into their platforms, reflecting China's AI progress through collaboration with cloud service providers (CSPs). DeepSeek surged in popularity during the Lunar New Year, and shortly after the holidays, Baidu AI Cloud and Alibaba Cloud announced their successful integration.
Domestic CSPs embrace DeepSeek
Baidu AI Cloud's Qianfan Platform has launched DeepSeek-R1 and DeepSeek-V3, providing ultra-low pricing plans and limited-time free access. Users can test these models for free until February 18 via ModelBuilder. Since DeepSeek's open-source launches of V3, R1, and Janus-Pro, platforms including Huawei Cloud, Tencent Cloud, Qihoo 360, and ZStack Cloud have integrated the DeepSeek AI model. On February 2, Tencent Cloud announced that developers can deploy DeepSeek-R1 on its HAI platform in just three minutes. Huawei Cloud, in partnership with SiliconFlow, has launched a DeepSeek model optimized for Ascend processors. On February 3, Alibaba Cloud announced that the PAI Model Gallery now supports one-click deployment of DeepSeek-V3 and DeepSeek-R1. The platform allows users to complete training, deployment, and inference without writing code. ZStack's AIOS platform Zhita now supports enterprise-grade private deployment of DeepSeek V3, R1, and Janus-Pro. It is compatible with Hygon and Ascend chips, as well as Nvidia and Intel CPUs/GPUs.
DeepSeek's market position
Since going open source, DeepSeek has quickly established itself as a key player in China's AI ecosystem. As more platforms adopt the model, DeepSeek is emerging as a formidable competitor to OpenAI, influencing the global AI market landscape. The DeepSeek model has attracted interest from global tech leaders, including AWS, Microsoft Azure, and Nvidia, which have started integrating it into their platforms. This signals DeepSeek's growing presence in the international AI market. Despite tightening US restrictions on China's tech sector, the country's AI industry continues to advance, driven by the emergence of DeepSeek, an open-source AI model that has become a "black swan" event drawing attention from the US tech industry. As Chinese cloud computing platforms increasingly integrate DeepSeek AI, industry experts anticipate heightened competition among cloud providers. This shift is expected to drive AI investments from Baidu, Alibaba, Huawei, and Tencent, fueling cost-efficient AI solutions for enterprises and developers. DeepSeek's integration with cloud computing highlights China's push for AI self-sufficiency. Its rollout is expected to bolster China's cloud market, encouraging broader enterprise AI adoption. AI, natural language processing (NLP), and intelligent customer service are gaining traction across industries. This expansion positions China's AI sector closer to global leaders. DeepSeek's ability to rival OpenAI and other global AI giants remains a key industry focus.
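Most of these hosted endpoints, like DeepSeek's own first-party API, expose an OpenAI-compatible chat interface, so switching providers is largely a matter of changing a base URL and model name. Here is a minimal sketch using the openai Python client against DeepSeek's documented endpoint; the environment variable name is an assumption, and each cloud's base URL and model identifiers will differ.

```python
# Minimal sketch of calling a hosted DeepSeek model through an
# OpenAI-compatible chat API. The base_url and model names follow DeepSeek's
# own API documentation; clouds hosting the open-source weights expose
# similar endpoints under their own URLs and identifiers.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],  # assumed environment variable name
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-reasoner",  # R1; use "deepseek-chat" for V3
    messages=[{"role": "user", "content": "Summarize the CAP theorem in two sentences."}],
)
print(response.choices[0].message.content)
```

This interface compatibility is one reason adoption spread so quickly: existing OpenAI-based applications can be pointed at a DeepSeek-serving endpoint with minimal code changes.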
[14]
DeepSeek might not be as disruptive as claimed, firm reportedly has 50,000 Nvidia GPUs and spent $1.6 billion on buildouts
The fabled $6 million was just a portion of the total training cost. Chinese startup DeepSeek recently took center stage in the tech world with its startlingly low usage of compute resources for its advanced AI model called R1, a model that is believed to be competitive with OpenAI's o1. However, SemiAnalysis reports that it took DeepSeek $1.6 billion in hardware costs and 50,000 Hopper GPUs to develop its model, a finding that undermines the idea that DeepSeek reinvented AI training and inference. DeepSeek operates an extensive computing infrastructure with approximately 50,000 Hopper GPUs, the report claims. This includes 10,000 H800s and 10,000 H100s, with additional purchases of H20 units, according to SemiAnalysis. These resources are distributed across multiple locations and serve purposes such as AI training, research, and financial modeling. The company's total capital investment in servers is around $1.6 billion, with an estimated $944 million spent on operating costs, according to SemiAnalysis. DeepSeek took the AI world by storm when it disclosed the minuscule hardware requirements of its DeepSeek-V3 Mixture-of-Experts (MoE) AI model, which are vastly lower than those of U.S.-based models. Then DeepSeek shook the high-tech world with an OpenAI-competitive R1 AI model. But then the reputable market intelligence company SemiAnalysis revealed findings indicating that DeepSeek used some $1.6 billion of hardware for R1. DeepSeek originates from High-Flyer, a Chinese hedge fund that adopted AI early and heavily invested in GPUs. In 2023, High-Flyer launched DeepSeek as a separate venture solely focused on AI. Unlike many competitors, DeepSeek remains self-funded, giving it flexibility and speed in decision-making. Despite claims that it is a minor offshoot, the company has invested over $500 million into its technology, according to SemiAnalysis. A major differentiator for DeepSeek is its ability to run its own datacenters, unlike most other AI startups that rely on external cloud providers. This independence allows for full control over experiments and AI model optimizations. In addition, it enables rapid iteration without external bottlenecks, making DeepSeek highly efficient compared to traditional players in the industry. Then there is something one might not expect from a Chinese company: talent acquisition exclusively from mainland China, with no poaching from Taiwan or the U.S. DeepSeek hires only from within China, focusing on skills and problem-solving abilities rather than formal credentials, according to SemiAnalysis. Recruitment efforts target institutions like Peking University and Zhejiang University, offering highly competitive salaries. Some AI researchers at DeepSeek earn over $1.3 million, exceeding compensation at other leading Chinese AI firms such as Moonshot, according to the research. Thanks to this talent inflow, DeepSeek has pioneered innovations like Multi-Head Latent Attention (MLA), which required months of development and substantial GPU usage, SemiAnalysis reports. DeepSeek emphasizes efficiency and algorithmic improvements over brute-force scaling, reshaping expectations around AI model development. This approach has led some to believe that rapid advancements may reduce the demand for high-end GPUs, impacting companies like Nvidia. A recent claim that DeepSeek trained its latest model for just $6 million has fueled much of the hype.
However, this figure refers only to a portion of the total training cost -- specifically, the GPU time required for pre-training. It does not account for research, model refinement, data processing, or overall infrastructure expenses. In reality, DeepSeek has spent well over $500 million on AI development since its inception. Unlike larger firms burdened by bureaucracy, DeepSeek's lean structure enables it to push forward aggressively in AI innovation, SemiAnalysis believes. DeepSeek's rise underscores how a well-funded, independent AI company can challenge industry leaders. However, the public discourse has been driven by hype. The reality is more complex: DeepSeek's success is built on strategic investments of billions of dollars, technical breakthroughs, and a competitive workforce. In other words, there are no miracles here. As Elon Musk noted a year or so ago, if you want to be competitive in AI, you have to spend billions per year -- and that is apparently what was spent.
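To make the MLA idea mentioned above concrete, here is a toy numpy sketch of its core trick: caching one small latent vector per token instead of full per-head keys and values, and re-expanding that latent at attention time. All dimensions are made up for illustration, and details of DeepSeek's actual design (such as decoupled rotary embeddings) are omitted.

```python
# Toy sketch of the core idea behind Multi-Head Latent Attention (MLA):
# cache a small latent per token rather than full per-head keys/values,
# and re-expand it when attention is computed. Dimensions are illustrative;
# DeepSeek's real design (e.g., RoPE decoupling) is more involved.
import numpy as np

d_model, d_latent, n_heads, d_head = 1024, 64, 8, 128

rng = np.random.default_rng(0)
W_down = rng.standard_normal((d_model, d_latent)) * 0.02            # compress to latent
W_up_k = rng.standard_normal((d_latent, n_heads * d_head)) * 0.02   # latent -> keys
W_up_v = rng.standard_normal((d_latent, n_heads * d_head)) * 0.02   # latent -> values

seq_len = 16
h = rng.standard_normal((seq_len, d_model))  # hidden states of cached tokens

# The KV cache stores only the latent vectors.
latent_cache = h @ W_down

# At attention time, expand the latents back into per-head keys and values.
k = (latent_cache @ W_up_k).reshape(seq_len, n_heads, d_head)
v = (latent_cache @ W_up_v).reshape(seq_len, n_heads, d_head)

full_kv = seq_len * n_heads * d_head * 2   # standard per-layer KV cache entries
mla_kv = seq_len * d_latent                # latent cache entries
print(f"cache entries per layer: {full_kv} (standard) vs {mla_kv} (latent), "
      f"a {full_kv / mla_kv:.0f}x reduction")
```

The payoff is a much smaller inference-time memory footprint per token, which is one way algorithmic work substitutes for raw hardware.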
[15]
DeepSeek-R1: Why So Startled?
Last week saw the unmatched hysteria surrounding DeepSeek's release of an open-source model that purportedly matches GPT-4's capabilities at a fraction of the cost. Now that the dust is settling, it's helpful to take a step back and understand all the kerfuffle in proper context.
Understanding DeepSeek's Innovations and Optimizations
While DeepSeek's breakthrough in cost efficiency is noteworthy, treating it as a "Sputnik moment" for AI misses a fundamental truth: both incremental improvements and sudden breakthroughs in price-to-performance ratios are natural parts of any emerging technology's evolution. Over the last two years, diverse research efforts have been underway across academia and commercial organizations, focused not only on enhancing reasoning performance but also on improving the price-performance ratios of generative AI models. DeepSeek-R1 (and the earlier V3 model, where many of these innovations were introduced before they made their way into, or were improved in, R1) represents a significant leap forward in reducing the price-to-performance ratio through several technical innovations and optimizations that span the entire AI stack. These innovations collectively enable DeepSeek to achieve competitive performance at a fraction of the traditional cost. However, they are more a series of sensible optimizations than unexpected miracles. Moreover, the MIT-licensed nature of these innovations, the relative openness of DeepSeek in sharing its approach through published papers, and the stage of innovation in the AI industry at large ensure that these innovations will cross-pollinate and be improved upon by others in the coming months. The rapid pace of AI development means that today's cost breakthrough often becomes tomorrow's baseline. The swift introduction of efficiency-optimized models like Alibaba's Qwen 2.5-Max and OpenAI's o3-mini just days after the DeepSeek-R1 announcement illustrates how DeepSeek's cost-efficiency breakthrough is already accelerating the industry's shift toward more resource-efficient AI development.
The Economics of AI Will Continue To Evolve
Disruptions like DeepSeek's breakthrough serve as a reminder that AI progress will be shaped by a mix of gradual improvements and step-function changes contributed by a broad ecosystem of startups, open-source communities, and established tech giants. Focusing too heavily on a singular breakthrough risks missing the forest for the trees. DeepSeek's success highlights the growing importance of open-source innovation in driving down AI development costs and the potential for resource-efficient methodologies to accelerate progress. The AI revolution is a marathon, not a sprint, and this is but one leg of the race. The winners in this race will be those who can nimbly navigate an environment of perpetual disruption, not those who react with knee-jerk hysteria to every new development.
[16]
The DeepSeek Competitors Vying to Be China's Next AI Champion
The next DeepSeek to disrupt the global artificial intelligence industry could also come from China. The country of 1.4 billion has seeded several promising AI startups and projects, while its leading internet players have spent years investing in and developing the infrastructure to support such new ventures. With the DeepSeek experience casting doubt on the need for expensive bleeding-edge AI hardware from the likes of Nvidia Corp. -- whose top chips the US has banned from sale in China -- the prospects of China's up-and-coming AI companies look to be improving.
[17]
DeepSeek's AI costs far exceed $5.5 million claim, may have reached $1.6 billion with 50,000 Nvidia GPUs
Serving tech enthusiasts for over 25 years. TechSpot means tech analysis and advice you can trust. In brief: China's DeepSeek threw the multi-billion-dollar AI industry into chaos recently with the release of its R1 model, which is said to compete with OpenAI's o1 despite being trained on 2,048 Nvidia H800s at a cost of $5.576 million. However, a new report claims that the true costs incurred by the firm were $1.6 billion, and that DeepSeek has access to around 50,000 Hopper GPUs. The claim that DeepSeek was able to train R1 using a fraction of the resources required by big tech companies invested in AI wiped a record $600 billion off Nvidia's market value in one day. If the Chinese startup could make a model this powerful without spending billions on Team Green's most powerful AI GPUs, what would stop everyone else from doing it? But did DeepSeek really create its Mixture-of-Experts model, which still tops the Apple App Store charts, at such a low cost? SemiAnalysis claims that it didn't. The market intelligence firm writes that DeepSeek has access to around 50,000 Hopper GPUs, including 10,000 H800s and 10,000 H100s. It also has orders for many more China-specific H20s. The GPUs are shared between High-Flyer, the quantitative hedge fund behind DeepSeek, and the startup. They are distributed across several geographical locations and are used for trading, inference, training, and research. SemiAnalysis writes that DeepSeek has invested much more than the claimed $5.5 million figure that sent the stock market into a tailspin - the report states that this pre-training cost is a very narrow portion of the total. The company's overall investment in servers is around $1.6 billion, with around $944 million spent on operating costs. The GPU investments, meanwhile, account for more than $500 million. As a reference example, Anthropic's Claude 3.5 Sonnet cost tens of millions of dollars to train, but the company still needed to raise billions of dollars of investment from Google and Amazon. It's noted that DeepSeek has sourced all its talent exclusively from China. That is a contrast to reports of other Chinese tech companies, such as Huawei, trying to poach workers from overseas, with Taiwanese employees of TSMC being highly sought-after targets. DeepSeek allegedly offers salaries of over $1.3 million for promising candidates, much more than competing Chinese AI firms pay. DeepSeek also has the advantage of mostly running its own datacenters, rather than having to rely on external cloud providers. This allows for more experimentation and innovation across its AI product stack. SemiAnalysis writes that it is the single best "open weights" lab today, beating out Meta's Llama effort, Mistral, and others.
[22]
DeepSeek's Hedge Fund Owner Had Multiple Top Projects Before Creating The Firm That Fueled $1 Trillion Market Cap Erosion Of AI Stocks - NVIDIA (NASDAQ:NVDA)
Listed below are High-Flyer's top projects before DeepSeek. Chinese hedge fund High-Flyer, also known as Hangzhou Huanfang Technology Ltd. Co., founded DeepSeek in 2023 with the intention of exploring artificial general intelligence (AGI). However, well before the AI startup became popular and sent shockwaves through the world of technology, this quantitative fund was involved in several innovations in China.
Top Projects Before DeepSeek
High-Flyer developed its first AI supercomputer, "Yinghuo-1," which translates to Fire-Flyer I, in 2020, with an approximate investment of 200 million yuan ($27.44 million), according to ChinaTalk. Deployed in the same year with 1,100 graphics processing units (GPUs), it provided computing power for AI-led research to facilitate quantitative trading. With an investment of one billion yuan ($137.2 million), it launched "Yinghuo-2," or Fire-Flyer II, in 2021, which featured around 10,000 of Nvidia Corp.'s A100 GPUs, along with cutting-edge hardware and 200 Gbps high-bandwidth connectivity. After losing over a third of its investors' money in 2022, dropping from a peak of over $12 billion, as reported by Reuters, the hedge fund announced a shift in strategy. High-Flyer, via its official WeChat account, said that it would focus exclusively on developing AI technology for the benefit of humanity, forming a new independent research group dedicated to exploring the potential of AGI. This restructuring led to the creation of DeepSeek in 2023.
History Of High-Flyer
The company was founded by Liang Wenfeng, Xu Jin, and Zheng Dawei in February 2016 in Ningbo, Zhejiang Province. Relying on mathematics and AI for quantitative investment, it took its first stock position using a deep-learning model and GPUs on Oct. 21, 2016. Its subsidiaries, Zhejiang High-Flyer Asset Management Co., Ltd., and Ningbo High-Flyer Quant Investment Management Partnership LLP, are regulated by the Asset Management Association of China. The company won the Golden Bull Fund Award for the first time in 2018, and it established High-Flyer Capital Management (Hong Kong) Ltd in 2019, licensed and regulated by the Securities and Futures Commission of Hong Kong. Wenfeng, who owns approximately 51% of High-Flyer, is now the CEO of DeepSeek. As of December 2024, High-Flyer has assets under management worth $8 billion, as per the report.
What Is A Quantitative Hedge Fund?
Quantitative hedge funds, or 'quant' funds, use sophisticated mathematical models, statistical analysis, and computer algorithms to identify trading opportunities. Relying on data analysis rather than traditional fundamental analysis, they employ strategies like statistical arbitrage, risk parity, and factor investing, using complex computer programs to analyze massive datasets and uncover exploitable market patterns.
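As a concrete illustration of the statistical-arbitrage idea mentioned above, here is a minimal pairs-trading sketch: it flags trades when the spread between two historically related stocks drifts unusually far from its recent average. This is a generic textbook example, not High-Flyer's actual strategy, and the window and threshold parameters are arbitrary choices for the sketch.

```python
# Generic pairs-trading (statistical arbitrage) signal: trade when the
# log-price spread between two related stocks strays far from its rolling
# mean. Purely illustrative; not High-Flyer's actual strategy.
import numpy as np
import pandas as pd

def pairs_signal(prices_a: pd.Series, prices_b: pd.Series,
                 window: int = 20, entry: float = 2.0) -> pd.Series:
    """Return +1 (long A / short B), -1 (short A / long B), or 0 per day."""
    spread = np.log(prices_a) - np.log(prices_b)
    z = (spread - spread.rolling(window).mean()) / spread.rolling(window).std()
    signal = pd.Series(0, index=spread.index)
    signal[z > entry] = -1   # spread unusually wide: bet on it narrowing
    signal[z < -entry] = 1   # spread unusually tight: bet on it widening
    return signal

# Usage with synthetic random-walk prices.
rng = np.random.default_rng(42)
idx = pd.date_range("2024-01-01", periods=250, freq="B")
a = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, len(idx)))), index=idx)
b = a * np.exp(rng.normal(0, 0.02, len(idx)))  # correlated, with extra noise
print(pairs_signal(a, b).value_counts())
```

Real quant funds layer far more sophistication on top (execution, risk models, machine-learned signals), but the pattern of mining data for mean-reverting relationships is the same.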
[0]
DeepSeek faces Internet censorship hurdles in China
Chinese AI startup DeepSeek, founded by Wenfeng Liang, has developed a generative AI (GenAI) service that claims to surpass large language models (LLMs) in the US, such as OpenAI's, with a low-cost, open-source approach that does not rely on advanced GPUs or HBM. With a reported development cost of under US$6 million, DeepSeek developed its V3 model in just two months, positioning itself as one of the most significant disruptors in AI for 2025. The company's R1 model has exceeded expectations with impressive depth, breadth, objectivity, and rigor at an extraordinarily low price. Many believe DeepSeek's success could signify a shift in generative AI from the cloud to edge devices, potentially opening up new opportunities for AI applications in laptops and smartphones. However, the company's operations are constrained by internet censorship in China. As DeepSeek's data centers are located in China, content generation is subject to government regulations. Users often encounter interruptions when querying sensitive topics such as the Xinjiang Uygur Autonomous Region or the Princelings, with the AI abruptly halting content generation. Furthermore, concerns have arisen regarding the legality of DeepSeek's training data, with some questioning whether it contains intellectual property used without proper consent. In contrast, US companies like OpenAI, Meta, Google, and Microsoft have invested billions into acquiring AI hardware. A collaborative project known as the Stargate Project, spearheaded by OpenAI, SoftBank, and Oracle, is set to invest US$500 billion, mostly in AI hardware procurement. This massive initiative will also require an immense amount of electricity to power its operations. While maintaining a lean approach, DeepSeek has managed to develop a competitive product with R1, showing remarkable performance at a fraction of the cost compared to Western counterparts. However, DeepSeek's operations are heavily influenced by China's Interim Measures for the Management of Generative Artificial Intelligence Services, which came into effect on August 15, 2023. The measures require that AI services ensure their data sources are legal, avoid intellectual property infringement, and comply with stringent data protection regulations. Any illegal content must be stopped immediately and reported to authorities -- explaining why DeepSeek halts content generation when it encounters sensitive subjects. The Chinese government also mandates that any generative AI must operate using legally sourced data and models, and AI systems involving personal data must obtain consent or meet other legal criteria. Furthermore, companies must take effective measures to ensure data quality, accuracy, and diversity, while adhering to cybersecurity regulations and oversight. In Taiwan, a report by the National Communications Commission (NCC) indicates that 52.5% of people have used AI services, primarily in the forms of smart navigation, voice input, and biometric authentication. Meanwhile, global AI investment continues to rise. A KPMG survey for the fourth quarter of 2024 found that AI companies attracted massive venture capital funding, with Databricks securing US$10 billion, OpenAI raising US$6.6 billion, xAI acquiring US$6 billion, and Anthropic pulling in US$4 billion.
Despite a second consecutive year of declines in angel and seed-stage investments, the AI industry remains a notable exception, with startups in the sector still attracting substantial venture capital across all stages of funding. DeepSeek's journey highlights both the vast potential of generative AI and the significant challenges it faces, particularly in a geopolitically charged environment. As the company seeks to disrupt the AI landscape, its ability to navigate these obstacles while maintaining its edge will be critical to its future success.
[19]
DeepSeek's Parent Company High-Flyer Once Sued for Poaching Trader With 'Core Commercial Secrets' in $5M AI Case
In 2023, High-Flyer began focusing more on developing foundational AI and launched DeepSeek. The parent company of China's famous new AI firm, DeepSeek, was embroiled in a legal dispute over accusations it poached a trading strategist from a rival firm. In March 2023, High-Flyer, the Liang Wenfeng-led hedge fund that later branched out into DeepSeek, was sued by rival Shanghai Ruitian Investment over potential "infringement on technology secrets." Talent Poaching Accusations Shanghai Ruitian Investment claimed its former trader, Yang Yunhao, had been poached while having knowledge of the company's quantitative strategy codes, which were likened to "core commercial secrets." "Technology development is the core competitiveness of quantitative investment managers," Ruitian said, according to court documents seen by Bloomberg. High-Flyer's hiring of Yang meant Ruitian could suffer significant losses "measured in hundreds of millions of yuan," it added. The hedge fund sought 5 million yuan in compensation from High-Flyer in an anti-competition lawsuit, Bloomberg reported at the time. High-Flyer Denied Wrongdoing High-Flyer denied the claims, stating its hiring processes complied with relevant laws. The Wenfeng-led hedge fund said the employee had joined after all non-compete obligations had ended. In May 2023, the court ruled in favor of High-Flyer and rejected Ruitian's requests for an appeal. The court agreed that Ruitian should pay Yang 3.5 million yuan ($495,000) in economic damages. High-Flyer to DeepSeek Launched in 2016, High-Flyer began as a highly successful quantitative hedge fund that leveraged cutting-edge AI algorithms to trade stocks and deliver record returns. Founded by Wenfeng and two former classmates, the firm grew rapidly in China -- managing over $14 billion in assets -- and earned a reputation for using innovative, AI-powered learning models to spot market opportunities. However, in 2021, despite years of outperformance, even beating the CSI 500 index in 2020, High-Flyer's proprietary AI models started to falter. Reports indicate that the models began mistiming trades during periods of high market volatility, leading to poor returns. The situation was so severe that in December 2021, the firm was forced to apologize publicly for the losses. This setback, compounded by regulatory pressures on quantitative funds in China, highlighted the limitations of using AI solely for rapid trading decisions. Recognizing both the potential and the shortcomings of its AI-driven approach, Wenfeng decided to pivot. Rather than continuing to pour resources into a trading system that was proving vulnerable under certain market conditions, High-Flyer redirected its focus toward developing foundational AI. Likely influenced by OpenAI's ChatGPT, the firm announced the creation of a new, independent research arm, DeepSeek, dedicated to pursuing the long-term goal of artificial general intelligence.
[0]
Security firm discovers DeepSeek has 'direct links' to Chinese government servers
Beyond investor and CEO panic, DeepSeek presents a host of security concerns. Here's what the experts think you should know. Just weeks into its new-found fame, Chinese AI startup DeepSeek is moving at breakneck speed, toppling competitors and sparking axis-tilting conversations about the virtues of open source software. However, numerous security concerns have surfaced about the company, prompting private and government organizations to ban the use of DeepSeek. Here's what you need to know. Founded by Liang Wenfeng in May 2023 (and thus not even two years old), the Chinese startup has challenged established AI companies with its open-source approach. According to Forbes, DeepSeek's edge may lie in the fact that it is funded only by High-Flyer, a hedge fund also run by Wenfeng, which gives the company a funding model that supports fast growth and research. The company's ability to create successful models by strategically optimizing older chips -- a result of the export ban on US-made chips, including Nvidia's -- and distributing query loads across models for efficiency is impressive by industry standards. Released in full on January 21st, R1 is DeepSeek's flagship reasoning model, which performs at or above OpenAI's lauded o1 model on several math, coding, and reasoning benchmarks. Built on V3, with smaller distilled versions based on Alibaba's Qwen and Meta's Llama, what makes R1 interesting is that, unlike most other top models from tech giants, it's open source, meaning anyone can download and use it. That said, DeepSeek has not disclosed R1's training dataset. So far, all other models it has released are also open source. DeepSeek is cheaper than comparable US models. For reference, R1 API access starts at $0.14 for a million tokens, a fraction of the $7.50 that OpenAI charges for the equivalent tier. DeepSeek claims in a company research paper that its V3 model, which can be compared to a standard chatbot model like Claude, cost $5.6 million to train, a number that's circulated (and disputed) as the entire development cost of the model. As the AP reported, some lab experts believe the paper only refers to the final training run for V3, not its entire development cost (which would be a fraction of what tech giants have spent to build competitive models). Some experts suggest DeepSeek's costs don't include earlier infrastructure, R&D, data, and personnel costs. One drawback that could impact the model's long-term competition with o1 and US-made alternatives is censorship. Chinese models often include blocks on certain subject matter, meaning that while they function comparably to other models, they may not answer some queries (see how DeepSeek's AI assistant responds to questions about Tiananmen Square and Taiwan here). As DeepSeek use increases, some are concerned its models' stringent Chinese guardrails and systemic biases could be embedded across all kinds of infrastructure. You can access uncensored, US-based versions of DeepSeek through platforms like Perplexity, which have removed its censorship weights and run it on local servers to avoid security concerns. In December, ZDNET's Tiernan Ray compared R1-Lite's ability to explain its chain of thought to that of o1, and the results were mixed.
That said, DeepSeek's AI assistant reveals its train of thought to the user during queries, a novel experience for many chatbot users given that ChatGPT does not externalize its reasoning. Of course, all popular models come with red-teaming backgrounds, community guidelines, and content guardrails. However, at least at this stage, American-made chatbots are unlikely to refrain from answering queries about historical events. Data privacy worries that have circulated on TikTok -- the Chinese-owned social media app now somewhat banned in the US -- are also cropping up around DeepSeek. On Wednesday, Ivan Tsarynny, CEO of Feroot Security, told ABC that his firm had discovered "direct links to servers and to companies in China that are under control of the Chinese government," which he said they "have never seen in the past." After decrypting some of DeepSeek's code, Feroot found hidden programming that can send user data -- including identifying information, queries, and online activity -- to China Mobile, a Chinese government-operated telecom company that has been banned from operating in the US since 2019 due to national security concerns. On Thursday, NowSecure recommended organizations "forbid" the use of DeepSeek's mobile app after finding several flaws, including unencrypted data (meaning anyone monitoring traffic can intercept it) and poor data storage. Last week, research firm Wiz discovered that an internal DeepSeek database was publicly accessible "within minutes" of conducting a security check. The "completely open and unauthenticated" database contained chat histories, user API keys, and other sensitive data. "More critically, the exposure allowed for full database control and potential privilege escalation within the DeepSeek environment, without any authentication or defense mechanism to the outside world," Wiz's report explains. According to Wired, which initially published the research, though Wiz did not receive a response from DeepSeek, the database appeared to be taken down within 30 minutes of Wiz notifying the company. It's unclear how long it was accessible or if any other entity discovered it before it was taken down. Even without this alarming development, DeepSeek's privacy policy raises some flags. "The personal information we collect from you may be stored on a server located outside of the country where you live," it states. "We store the information we collect in secure servers located in the People's Republic of China." The policy outlines that DeepSeek collects plenty of information, and continues: "Where we transfer any personal information out of the country where you live, including for one or more of the purposes as set out in this Policy, we will do so in accordance with the requirements of applicable data protection laws." The policy does not mention GDPR compliance. "Users need to be aware that any data shared with the platform could be subject to government access under China's cybersecurity laws, which mandate that companies provide access to data upon request by authorities," Adrianus Warmenhoven, a member of NordVPN's security advisory board, told ZDNET via email.
According to some observers, the fact that R1 is open source means increased transparency, allowing users to inspect the model's source code for signs of privacy-related activity. However, DeepSeek also released smaller versions of R1, which can be downloaded and run locally to avoid any concerns about data being sent back to the company (as opposed to accessing the chatbot online). All chatbots, including ChatGPT, collect some degree of user data when queried via the browser. AI safety researchers have long been concerned that powerful open-source models could be applied in dangerous and unregulated ways once out in the wild. Tests by AI safety firm Chatterbox found DeepSeek R1 has "safety issues across the board." Even in varying degrees, US AI companies employ some kind of safety oversight team. DeepSeek has not publicized whether it has a safety research team, and has not responded to ZDNET's request for comment on the matter. "Most companies will keep racing to build the strongest AI they can, irrespective of the risks, and will see enhanced algorithmic efficiency as a way to achieve higher performance faster," said Peter Slattery, a researcher on MIT's FutureTech team who led its Risk Repository project. "That leaves us even less time to address the safety, governance, and societal challenges that will come with increasingly advanced AI systems." "DeepSeek's breakthrough in training efficiency also means we should soon expect to see a large number of local, specialized 'wrappers' -- apps built on top of the DeepSeek R1 engine -- which will each introduce their own privacy risks, and which could each be misused if they fell into the wrong hands," added Ryan Fedasiuk, director of US AI governance at The Future Society, an AI policy nonprofit. Some analysts note that DeepSeek's lower-lift compute model is more energy efficient than that of US AI giants. "DeepSeek's new AI model likely does use less energy to train and run than larger competitors' models," said Slattery. "However, I doubt this marks the start of a long-term trend in lower energy consumption. AI's power stems from data, algorithms, and compute -- which rely on ever-improving chips. When developers have previously found ways to be more efficient, they have typically reinvested those gains into making even bigger, more powerful models, rather than reducing overall energy usage." "DeepSeek isn't the only AI company that has made extraordinary gains in computational efficiency. In recent months, U.S.-based Anthropic and Google Gemini have boasted similar performance improvements," Fedasiuk said. "DeepSeek's achievements are remarkable in that they seem to have independently engineered breakthroughs that promise to make large language models much more efficient and less expensive, sooner than many industry professionals were expecting -- but in a field as dynamic as AI, it's hard to predict just how long the company will be able to bask in the limelight." R1's success highlights a sea change in AI that could empower smaller labs and researchers to create competitive models and diversify the options. For example, organizations without the funding or staff of OpenAI can download R1 and fine-tune it to compete with models like o1.
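As a concrete illustration of running one of those smaller R1 versions locally, here is a minimal sketch using Hugging Face's transformers library and one of DeepSeek's published distilled checkpoints. The prompt and generation settings are illustrative, and in practice you would pick the checkpoint size that fits your hardware.

```python
# Minimal sketch: run a small distilled R1 checkpoint locally so prompts
# never leave the machine. The model id is one of DeepSeek's published
# distills; larger variants (7B, 14B, 32B, 70B) follow the same pattern.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"  # place on GPU if available
)

messages = [{"role": "user", "content": "Why is the sky blue?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=512)
# Strip the prompt tokens and print only the model's (reasoning + answer) text.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Run this way, queries and responses stay on the local machine, which is precisely the mitigation the observers quoted above have in mind.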
Just before R1's release, researchers at UC Berkeley created an open-source model on par with o1-preview, an early version of o1, in just 19 hours and for roughly $450. Given how exorbitant AI investment has become, many experts speculate that this development could burst the AI bubble (the stock market certainly panicked). Some see DeepSeek's success as debunking the thought that cutting-edge development means big models and spending. It also casts Stargate, a $500 billion infrastructure initiative spearheaded by several AI giants, in a new light, creating speculation around whether competitive AI requires the energy and scale of the initiative's proposed data centers. DeepSeek's ascent comes at a critical time for Chinese-American tech relations, just days after the long-fought TikTok ban went into partial effect. Ironically, DeepSeek lays out in plain language the fodder for security concerns that the US struggled to prove about TikTok in its prolonged effort to enact the ban. The US Navy has already banned DeepSeek, and lawmakers are trying to ban the app from all government devices.
[0]
OpenAI Seems to Be Low Key Panicking
Chinese AI startup DeepSeek shocked the tech industry with its extremely cost-effective AI model, trained at a fraction of the cost of its far more resource-intensive rivals in the US. DeepSeek turned out to be a force to be reckoned with, triggering a more than $1 trillion selloff, with spooked investors wondering whether they had grossly overvalued the likes of OpenAI and Meta for years. And while OpenAI CEO Sam Altman has since congratulated DeepSeek for its "impressive" R1 reasoning model, he also promised investors to "deliver much better models." But considering the big changes the ChatGPT maker is planning, as Bloomberg reports, OpenAI seems genuinely shaken, not just invigorated by a "new competitor," as Altman termed DeepSeek. In fact, it seems a lot like the company is quietly panicking as it tries to deliver a worthy retort to DeepSeek's foundations-shaking offering. For one, OpenAI executives revealed the company would mimic DeepSeek's R1 by showing more of its AI models' "reasoning steps," according to Bloomberg. There's also the question of pricing. OpenAI made its o3-mini reasoning model free for all users last month, in another apparent attempt to compete with DeepSeek. The company is also reportedly reconsidering its approach to open source. Despite its misleading name, OpenAI has become a largely for-profit, closed-source company that keeps its cards extremely close to its chest. By contrast, DeepSeek's R1 is an open-source model. During a Reddit AMA earlier this week, Altman said that he thinks "we have been on the wrong side of history here and need to figure out a different open source strategy." "Not everyone at OpenAI shares this view, and it's also not our current highest priority," he added. And it's not just OpenAI. Meta CEO Mark Zuckerberg has also addressed the topic during a recent earnings call. "I think that there's a number of novel things that they did that I think we're still digesting," he told investors last month, admitting that Meta is hoping to "implement" some aspects of DeepSeek's tech "in our systems." "And that's part of the nature of how this works, whether it's a Chinese competitor or not," Zuckerberg said. Other executives are openly hoping that US protectionism will save them. Anthropic CEO Dario Amodei has called for US lawmakers to implement new restrictions on AI chip exports to stop China from catching up too quickly. Still, the US AI industry hasn't changed its plan to pour hundreds of billions of dollars into expanding infrastructure. At the heart of this movement is President Donald Trump's OpenAI-backed Stargate initiative, which aims to raise a whopping half a trillion dollars in four years. (Whether it'll be successful in doing so remains to be seen.) DeepSeek, meanwhile, has ironically become a victim of its own success, limiting access to its AI models due to shortages in server capacity. In other words, even with its extremely efficient reasoning model, scaling up the operation to meet demand remains an ongoing issue -- which could offer OpenAI at least a little bit of breathing room.
[0]
World's top AI brains debate if DeepSeek's model is game changer
Major figures in artificial intelligence (AI) acknowledge the accomplishment of Chinese start-up DeepSeek, but caution against exaggerating the company's success, as the tech industry weighs the implications of the firm's advanced models developed at a fraction of the usual cost. Industry heavyweights from OpenAI CEO Sam Altman to former Baidu and Google scientist Andrew Ng have praised the open-source approach of DeepSeek, following its release of two advanced AI models. Based in Hangzhou, capital of eastern Zhejiang province, DeepSeek stunned the global AI industry with its open-source reasoning model, R1. Released on Jan. 20, the model showed capabilities comparable to closed-source models from ChatGPT creator OpenAI, but was said to be developed at significantly lower training costs. DeepSeek said its foundation large language model, V3, released a few weeks earlier, cost only $5.5 million to train. That statement stoked concerns that tech companies had been overspending on graphics processing units for AI training, leading to a major sell-off of AI chip supplier Nvidia's shares last week. OpenAI "has been on the wrong side of history here and needs to figure out a different open-source strategy," Altman said last week in an "Ask Me Anything" session on internet forum Reddit. The U.S. start-up has been taking a closed-source approach, keeping information such as the specific training methods and energy costs of its models tightly guarded. Still, "not everyone at OpenAI shares this view" and "it's also not our current highest priority," Altman added. Ng, founder and former lead of Google Brain and former chief scientist at Baidu, said products from DeepSeek and its local rivals showed that China was quickly catching up to the U.S. in AI. "When ChatGPT was launched in November 2022, the U.S. was significantly ahead of China in generative AI ... but in reality, this gap has rapidly eroded over the past two years," Ng wrote on X, formerly Twitter. "With models from China such as Qwen, Kimi, InternVL and DeepSeek, China had clearly been closing the gap, and in areas such as video generation there were already moments where China seemed to be in the lead," he said. The Qwen model series is developed by Alibaba Group Holding, owner of the South China Morning Post, while Kimi and InternVL are from start-up Moonshot AI and the state-backed Shanghai Artificial Intelligence Laboratory, respectively. "If the U.S. continues to stymie open source, China will come to dominate this part of the supply chain and many businesses will end up using models that reflect China's values much more than America's," said Ng. Recognition of DeepSeek's achievements comes as large U.S. tech companies are "fully promoting" the Chinese start-up, Shawn Kim, equity analyst at Morgan Stanley, wrote in a research note on Monday. Nvidia has made DeepSeek's R1 model available to users of its NIM microservice since Thursday, while OpenAI investor Microsoft last week launched support for R1 on its Azure cloud computing platform and GitHub. Amazon.com also enabled clients to create applications with R1 through Amazon Web Services. However, some experts said the significance of DeepSeek's breakthrough might have been overblown.
Meta Platforms chief AI scientist Yann LeCun said it was wrong to think that "China is surpassing the U.S. in AI" because of DeepSeek. "The correct reading is: open-source models are surpassing proprietary ones," he wrote on Threads.

DeepSeek, which was spun off in May 2023 from founder Liang Wenfeng's hedge fund High-Flyer Quant, still faces plenty of doubts about the true cost and training methodology of its AI models. Fudan University computer science professor Zheng Xiaoqing pointed out that DeepSeek's reported training expenditure for its V3 model excluded the costs associated with prior research and experiments, according to the start-up's technical report. DeepSeek's success stemmed from "engineering optimisation," which "will not have a huge impact on chip purchases or shipments," Zheng was quoted as saying in an interview with the Chinese newspaper National Business Daily.

Read the full story at SCMP.
[18]
Three things to know as the dust settles from DeepSeek
The launch of a single new AI model does not normally cause much of a stir outside tech circles, nor does it typically spook investors enough to wipe out $1 trillion in the stock market. Now, a couple of weeks since DeepSeek's big moment, the dust has settled a bit. The news cycle has moved on to calmer things, like the dismantling of long-standing US federal programs, the purging of research and data sets to comply with recent executive orders, and the possible fallout from President Trump's new tariffs on Canada, Mexico, and China. Within AI, though, what impact is DeepSeek likely to have in the longer term? Here are three seeds DeepSeek has planted that will grow even as the initial hype fades.

First, it's forcing a debate about how much energy AI models should be allowed to use up in pursuit of better answers. You may have heard (including from me) that DeepSeek is energy efficient. That's true for its training phase, but for inference, which is when you actually ask the model something and it produces an answer, it's complicated. It uses a chain-of-thought technique, which breaks down complex questions -- like whether it's ever okay to lie to protect someone's feelings -- into chunks, and then logically answers each one. The method allows models like DeepSeek to do better at math, logic, coding, and more. The problem, at least to some, is that this way of "thinking" uses up a lot more electricity than the AI we've been used to. Though AI is responsible for a small slice of total global emissions right now, there is increasing political support to radically increase the amount of energy going toward AI.

Whether or not the energy intensity of chain-of-thought models is worth it, of course, depends on what we're using the AI for. Scientific research to cure the world's worst diseases seems worthy. Generating AI slop? Less so. Some experts worry that the impressiveness of DeepSeek will lead companies to incorporate it into lots of apps and devices, and that users will ping it for scenarios that don't call for it. (Asking DeepSeek to explain Einstein's theory of relativity is a waste, for example, since it doesn't require logical reasoning steps, and any typical AI chat model can do it with less time and energy.) Read more from me here.

Second, DeepSeek made some creative advancements in how it trains, and other companies are likely to follow its lead. Advanced AI models don't just learn from lots of text, images, and video. They also rely heavily on humans to clean that data, annotate it, and help the AI pick better responses, often for paltry wages. One way human workers are involved is through a technique called reinforcement learning with human feedback: the model generates an answer, human evaluators score that answer, and those scores are used to improve the model. OpenAI pioneered this technique, though it's now used widely by the industry.
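To make that feedback loop concrete, here is a minimal toy sketch in Python, assuming PyTorch is available. It stands in for the idea only: it uses a simple REINFORCE-style update rather than the PPO machinery labs actually deploy, and every model, name, and dimension below is illustrative rather than drawn from any real system.

```python
# Toy sketch of the RLHF loop: generate an answer, score it, nudge the
# policy toward higher-scoring answers. Illustrative only -- not any
# lab's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Tiny "policy": maps a prompt embedding to a choice among 4 canned answers.
policy = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))

# Tiny "reward model": stands in for aggregated human preference scores.
# In practice this is itself trained on human comparisons of answers.
reward_model = nn.Sequential(nn.Linear(8 + 4, 8), nn.ReLU(), nn.Linear(8, 1))

optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)

for step in range(100):
    prompt = torch.randn(8)                      # stand-in for an embedded prompt
    dist = torch.distributions.Categorical(logits=policy(prompt))
    answer = dist.sample()                       # the model "generates an answer"

    answer_vec = F.one_hot(answer, 4).float()
    with torch.no_grad():                        # "human evaluators score that answer"
        reward = reward_model(torch.cat([prompt, answer_vec])).squeeze()

    # REINFORCE-style update: raise the probability of well-scored answers.
    loss = -dist.log_prob(answer) * reward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In a production pipeline the reward model is a large network trained on human preference comparisons, and the policy update is regularized (for instance with a KL penalty against the original model) so the system doesn't drift into text that merely games the reward.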
[20]
Was it a lie by the Chinese startup? Industry analyst says DeepSeek incurred $1.6 billion in hardware costs and has a fleet of 50,000 Nvidia Hopper GPUs
Chinese startup DeepSeek's claims of cost-efficient AI development have been disputed by industry analysts, who estimate that DeepSeek invested $1.6 billion in hardware, with roughly $944 million more in operating costs.

DeepSeek shocked the AI world last week with its claims of a cost-efficient approach to AI model development. The company touted training its model with only about $6 million and 2,048 GPUs (a figure its technical report attributes to the V3 base model, though it has been widely repeated for R1). However, a report from industry analysts at SemiAnalysis painted a very different picture. According to their findings, DeepSeek has actually invested $1.6 billion in hardware, including a fleet of 50,000 Nvidia Hopper GPUs -- far more than initially reported, according to a Tom's Hardware report.

While DeepSeek gained attention for its impressive AI models, including R1 and the DeepSeek-V3 Mixture-of-Experts (MoE) model, SemiAnalysis suggested the company's public claims about minimal hardware usage may have been an oversimplification. The 50,000 GPUs, spread across multiple data centres, form the backbone of DeepSeek's AI training, research, and development. This includes 10,000 H800s and 10,000 H100s, with additional purchases of H20 units, according to SemiAnalysis. The company's total capital investment in servers is around $1.6 billion, with an estimated $944 million spent on operating costs.

Beyond its massive hardware investment, DeepSeek has also stood out for its self-funded operations and its recruitment strategy. It hires exclusively from mainland China, drawing talent from prestigious universities such as Peking and Zhejiang, and offers competitive salaries -- some researchers reportedly earn over $1.3 million, industry analysts found. This influx of talent drove innovations like Multi-Head Latent Attention (MLA), emphasizing efficiency and algorithmic improvements over sheer scaling, Tom's Hardware reported. As a result, DeepSeek's approach could challenge the need for high-end GPUs, potentially impacting major players like Nvidia.

How much did DeepSeek actually invest in its AI models? Around $1.6 billion in servers, including a fleet of 50,000 Nvidia Hopper GPUs, according to industry analysts.

What makes DeepSeek stand out in AI development? It hires top talent from China's best universities and prioritizes efficient algorithms, which has led to innovations like Multi-Head Latent Attention.
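For readers wondering what Multi-Head Latent Attention actually changes, the publicly described idea is to compress each token's keys and values into one small latent vector and cache only that latent during generation, re-expanding it when attention is computed. The single-head Python sketch below, assuming PyTorch, illustrates just that compression step; the names, dimensions, and simplifications are ours, not DeepSeek's code.

```python
# Single-head sketch of the latent key/value compression behind MLA.
# Illustrative dimensions; real models use many heads and extra details
# such as rotary position embeddings.
import torch
import torch.nn as nn

d_model, d_latent, seq_len = 64, 8, 10

down_proj = nn.Linear(d_model, d_latent, bias=False)  # compress to latent
up_k = nn.Linear(d_latent, d_model, bias=False)       # reconstruct keys
up_v = nn.Linear(d_latent, d_model, bias=False)       # reconstruct values
q_proj = nn.Linear(d_model, d_model, bias=False)

hidden = torch.randn(seq_len, d_model)                # token representations

# Only the small latent is cached while generating: 8 floats per past
# token here, instead of 2 * 64 for separate key and value caches.
latent_cache = down_proj(hidden)                      # (seq_len, d_latent)

q = q_proj(hidden[-1])                                # query for the newest token
k = up_k(latent_cache)                                # (seq_len, d_model)
v = up_v(latent_cache)

weights = torch.softmax(k @ q / d_model ** 0.5, dim=0)
context = weights @ v                                 # attended output, (d_model,)
print(context.shape)                                  # torch.Size([64])
```

The memory saving shows up in the cache: each past token costs 8 cached floats here instead of 128, in exchange for two extra matrix multiplies when keys and values are rebuilt.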
[21]
DeepSeek misled the AI market: training costs range in the billions, not millions
TL;DR: Research firm SemiAnalysis found that DeepSeek's R1 model cost significantly more to train than the claimed $5 million.

Research firm SemiAnalysis conducted an extensive analysis of the new AI model that shook up the tech industry when it was announced, after its Chinese creator, DeepSeek, stated it cost only $6 million to create a model on par with OpenAI's most advanced o1 model. The announcement of the cost of DeepSeek's R1 model threw the tech space into a tailspin, wiping approximately $1 trillion from the stock market and $600 billion from NVIDIA. For comparison, OpenAI's GPT-4 model is estimated to have cost anywhere between $100 million and $200 million to train.

After DeepSeek's announcement, the market sentiment was that the cost of training new AI models had been wildly overestimated, but now, according to SemiAnalysis, DeepSeek's R1 model cost the company a lot more than $5 million. The report states that DeepSeek purchased 10,000 units of NVIDIA's A100 GPUs in 2021, then acquired 10,000 "China-specific" NVIDIA H800 AI GPUs and an additional 10,000 NVIDIA H100 GPUs. All of these GPUs and the associated training costs translate to $1.6 billion in capital expenditures for DeepSeek, with estimated running costs of approximately $944 million.
[0]
DeepSeek: Everything you need to know about the AI chatbot app | TechCrunch
Chinese AI lab DeepSeek broke into the mainstream consciousness this week after its chatbot app rose to the top of the Apple App Store charts (and Google Play, as well). DeepSeek's AI models, which were trained using compute-efficient techniques, have led Wall Street analysts -- and technologists -- to question whether the U.S. can maintain its lead in the AI race and whether the demand for AI chips will be sustained.

But where did DeepSeek come from, and how did it rise to international fame so quickly? DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions. AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015. Liang, who reportedly began dabbling in trading while a student at Zhejiang University, launched High-Flyer Capital Management as a hedge fund in 2019 focused on developing and deploying AI algorithms. In 2023, High-Flyer started DeepSeek as a lab dedicated to researching AI tools separate from its financial business. With High-Flyer as one of its investors, the lab spun off into its own company, also called DeepSeek.

From day one, DeepSeek built its own data center clusters for model training. But like other AI companies in China, DeepSeek has been affected by U.S. export bans on hardware. To train one of its more recent models, the company was forced to use Nvidia H800 chips, a less powerful version of the H100 available to U.S. companies.

DeepSeek's technical team is said to skew young. The company reportedly recruits doctorate AI researchers aggressively from top Chinese universities. DeepSeek also hires people without any computer science background to help its tech better understand a wide range of subjects, per The New York Times.

DeepSeek unveiled its first set of models -- DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat -- in November 2023. But it wasn't until last spring, when the startup released its next-gen DeepSeek-V2 family of models, that the AI industry started to take notice. DeepSeek-V2, a general-purpose text- and image-analyzing system, performed well in various AI benchmarks -- and was far cheaper to run than comparable models at the time. It forced DeepSeek's domestic competition, including ByteDance and Alibaba, to cut the usage prices for some of their models, and to make others completely free.

DeepSeek-V3, launched in December 2024, only raised DeepSeek's profile further. According to DeepSeek's internal benchmark testing, DeepSeek V3 outperforms both downloadable, openly available models like Meta's Llama and "closed" models that can only be accessed through an API, like OpenAI's GPT-4o.

Equally impressive is DeepSeek's R1 "reasoning" model. Released in January, DeepSeek claims R1 performs as well as OpenAI's o1 model on key benchmarks. Being a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer -- usually seconds to minutes longer -- to arrive at solutions than a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science, and math.

There is a downside to R1, DeepSeek V3, and DeepSeek's other models, however. Being Chinese-developed AI, they're subject to benchmarking by China's internet regulator to ensure their responses "embody core socialist values." In DeepSeek's chatbot app, for example, R1 won't answer questions about Tiananmen Square or Taiwan's autonomy.
If DeepSeek has a business model, it's not clear what that model is, exactly. The company prices its products and services well below market value -- and gives others away for free. The way DeepSeek tells it, efficiency breakthroughs have enabled it to maintain extreme cost competitiveness. Some experts dispute the figures the company has supplied, however.

Whatever the case may be, developers have taken to DeepSeek's models, which aren't open source as the phrase is commonly understood but are available under permissive licenses that allow for commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms hosting DeepSeek's models, developers on Hugging Face have created over 500 "derivative" models of R1 that have racked up 2.5 million downloads combined.

DeepSeek's success against larger and more established rivals has been described as both "upending AI" and "over-hyped." The company's success was at least in part responsible for Nvidia's stock price dropping 18% on Monday, and for eliciting a public response from OpenAI CEO Sam Altman. Microsoft announced that DeepSeek is available on its Azure AI Foundry service, Microsoft's platform that brings together AI services for enterprises under a single banner. When asked about DeepSeek's impact on Meta's AI spending during its first-quarter earnings call, CEO Mark Zuckerberg said spending on AI infrastructure will continue to be a "strategic advantage" for Meta. At the same time, some companies are banning DeepSeek -- and so are entire countries and governments.

As for what DeepSeek's future might hold, it's not clear. Improved models are a given. But the U.S. government appears to be growing wary of what it perceives as harmful foreign influence.
Chinese AI startup DeepSeek has shaken the tech industry with its highly efficient and cost-effective AI models, challenging established players and raising questions about the future of AI development.
Chinese AI startup DeepSeek has emerged as a formidable player in the artificial intelligence industry, challenging established tech giants with its cost-effective and efficient AI models. Founded by Liang Wenfeng in May 2023, DeepSeek has quickly gained attention for its innovative approach to AI development [1][3].
DeepSeek's flagship models, R1 and V3, have demonstrated impressive capabilities:
R1: An open-source reasoning model released in January that reportedly performs on par with OpenAI's o1 on key benchmarks.
V3: A foundation large language model released in December 2024 that, per DeepSeek's internal benchmarks, outperforms openly available models such as Meta's Llama and competes with closed models like OpenAI's GPT-4o.
What sets DeepSeek apart is its claim of developing these models at a fraction of the cost compared to its competitors. The company reported that its V3 model cost only $5.5 million to train, significantly less than the billions invested by U.S. tech giants [4][5].
DeepSeek's open-source strategy has garnered praise from industry leaders:
Sam Altman: The OpenAI CEO called R1 "impressive" and conceded that OpenAI has "been on the wrong side of history" on open source.
Andrew Ng: The Google Brain founder and former Baidu chief scientist said models like DeepSeek's show China rapidly closing the AI gap with the U.S.
Yann LeCun: Meta's chief AI scientist argued the correct reading of DeepSeek's rise is that open-source models are surpassing proprietary ones.
The company's success has led to widespread adoption of its models:
Cloud platforms: Nvidia offers R1 through its NIM microservice, Microsoft supports it on Azure and GitHub, and Amazon Web Services lets clients build applications with it.
Developers: More than 500 derivative models of R1 have been created on Hugging Face, racking up some 2.5 million combined downloads.
Consumers: DeepSeek's chatbot app climbed to the top of both the Apple App Store and Google Play charts.
Despite its rapid rise, DeepSeek faces several challenges:
Chinese internet censorship: Because DeepSeek is based in China, its AI models are subject to strict content regulations, limiting responses on sensitive topics [2][4].
Data privacy concerns: Security researchers have discovered potential vulnerabilities in DeepSeek's systems, including links to Chinese government servers [2].
Questions about development costs: Some experts dispute DeepSeek's claimed development costs, suggesting that the reported figures may not include all associated expenses [3][5].
DeepSeek's success has sparked debates about the future of AI development:
Open versus closed models: Its permissively licensed releases have strengthened the case that open models can match proprietary ones.
Hardware spending: Its claimed efficiency has raised questions about whether tech giants have been overspending on GPUs, though analysts dispute the reported figures.
Energy use: Chain-of-thought reasoning models consume considerably more electricity at inference time, prompting debate over when that cost is justified.
The rise of DeepSeek has elicited varied responses from different stakeholders:
OpenAI: Made its o3-mini reasoning model free, plans to expose more of its models' reasoning steps, and is reconsidering its open-source strategy.
Meta: Mark Zuckerberg said the company is "still digesting" DeepSeek's innovations and hopes to implement some of them, while maintaining that AI infrastructure spending remains a strategic advantage.
Anthropic: CEO Dario Amodei has called for tighter U.S. restrictions on AI chip exports to China.
Governments and companies: Some have banned DeepSeek outright over censorship and data privacy concerns.
As DeepSeek continues to disrupt the AI landscape, its ability to navigate regulatory challenges, address security concerns, and maintain its competitive edge will be crucial in determining its long-term impact on the global AI industry.