Curated by THEOUTPOST
On Fri, 24 Jan, 12:02 AM UTC
102 Sources
[1]
Is DeepSeek's rise a threat to American AI dominance? Here is all you need to know
In a stunning turn of events, Chinese AI startup DeepSeek has unseated OpenAI's ChatGPT as the most downloaded free app on the U.S. App Store. Powered by its R1 reasoning model, DeepSeek's chatbot has captivated users with its performance, triggering ripples across the global tech sector. Shares of AI giants Nvidia, Microsoft, and Meta took a hit on Monday, as industry leaders grapple with the implications of this low-cost yet effective rival. Adding to the frenzy, DeepSeek announced temporary limits on user registrations due to "large-scale malicious attacks" on its systems. However, existing users can continue to access the app, keeping the buzz alive. DeepSeek is the creation of Liang Wenfeng, a 40-year-old engineer and co-founder of the AI-focused hedge fund High-Flyer. Based in Hangzhou, China, Liang launched DeepSeek in 2023, utilizing an innovative approach that blends advanced Nvidia A100 chips with lower-end alternatives. Remarkably, DeepSeek's R1 model was developed at a fraction of the cost incurred by its American rivals -- reportedly as low as $6 million compared to the billions spent by OpenAI, Meta, and others. In a 2024 interview, Liang revealed his surprise at the global reaction to DeepSeek's affordability. "We were simply following our own pace, calculating costs, and setting prices accordingly," he explained. The rise of DeepSeek has been described by Silicon Valley venture capitalist Marc Andreessen as "AI's Sputnik moment," likening it to the Soviet Union's 1957 satellite launch that shocked the United States. The app's swift ascent has raised critical questions about the future of AI dominance, particularly for U.S. companies pouring billions into infrastructure projects. Last week, OpenAI and other firms pledged $500 billion to develop AI infrastructure in the U.S., with President Donald Trump hailing it as "the largest AI investment in history." 
Yet, DeepSeek's success suggests that costly infrastructure may not be the only path to cutting-edge AI development. DeepSeek stands out not just for its cost efficiency but also for its open-source model, DeepSeek-V3, which allows developers worldwide to access and modify its technology for free. Despite operating under U.S. export restrictions on advanced chips, Chinese AI firms like DeepSeek have adapted by sharing resources and experimenting with new, low-power approaches. This has led to AI models that require less computing power, slashing development costs and shaking up industry norms. While DeepSeek's innovation has won accolades, it also faces criticism for its limitations. Like other Chinese chatbots, it avoids politically sensitive topics, raising questions about its usability in global markets. The emergence of DeepSeek has sent shockwaves through Wall Street and beyond. Shares of European chipmaker ASML and Siemens Energy plummeted, while analysts voiced concerns about the profitability of high-cost AI models. "This low-cost alternative has caught the market off guard," said Fiona Cincotta, a senior market analyst at City Index. Others, however, believe U.S. access to advanced chips will continue to provide an edge in the long run. Despite challenges, DeepSeek's rise underscores the need for adaptability in a rapidly evolving AI landscape. As the global race for artificial intelligence heats up, DeepSeek has not only disrupted the market but also highlighted the potential for low-cost innovation to redefine industry norms. As the world watches the unfolding AI arms race, the spotlight remains on DeepSeek and its ambitious goals. With its R1 model already making waves, the company may signal a new era of AI development -- one where ingenuity trumps investment. For now, DeepSeek's success serves as a wake-up call to Silicon Valley: innovation isn't always about who spends the most but who thinks the smartest.
[2]
What is DeepSeek? The Chinese AI start-up rivalling OpenAI
DeepSeek's success is a win for open source, says Meta VP and chief AI scientist Yann LeCun. DeepSeek, a Chinese AI start-up, released its latest reasoning model last week, and now, the company's AI chat assistant app has taken the top spots in the Apple App stores in both the UK and the US, overthrowing ChatGPT. Released on 20 January, DeepSeek's large language model R1 left Silicon Valley leaders in a flurry, especially as the start-up claimed that its model is leagues cheaper than its US competitors - taking only $5.6m to train - while performing on par with industry heavyweights like OpenAI's GPT-4 and Anthropic's Claude 3.5 Sonnet models. However, in order to build its models, DeepSeek, which was founded in 2023 by Liang Wenfeng - who is also the founder of one of China's top hedge funds, High-Flyer - needed to strategically adapt to the increasing constraints imposed by the US on its AI chip exports. Earlier this month, the outgoing US administration capped the number of AI chips that could be exported from the US to most countries, while maintaining a block on exports to countries including China and Russia. Although seen as a measure to ensure the US its leadership in AI innovation, the regulations have seemingly allowed China to reduce its reliance on American-made technology. Zihan Wang, a former DeepSeek employee told MIT Technology Review that, in order to create R1, DeepSeek had to rework its training process to reduce strain on the GPUs it uses - a variety specifically released by Nvidia for the Chinese market that caps its performance at half the speed. Moreover, rather than weakening China's AI capabilities, US sanctions on the country seem to be pushing start-ups like DeepSeek to innovate in newer ways that prioritise efficiency, resource-pooling, and collaboration, the publication further reported. 
"DeepSeek's surprising rise to the top of the Apple download charts in the United States, even under the weight of sanctions, poses an interesting question around the prevailing narrative of US dominance in artificial intelligence," said John Clancy, the founder and CEO of Galvia AI. "Could this be an indicator of over investment in the sector, and could the market be overestimating the long-term demand for chips? If so, valuations of companies like Nvidia might need a reality check." World-leading chip maker Nvidia invested $1bn in AI companies in 2024, becoming a crucial backer of start-ups in the sector, including OpenAI, as more and more tech giants employ and offer an increasing number of AI products and services. The company, which has invested heavily in AI in recent years, reported "record" revenue of $35.1bn for its latest financial quarter. However, following R1's release, Nvidia's shares have plummeted, falling by more than 11pc today. Yann LeCun, Meta's VP and chief AI scientist, said nonetheless that DeepSeek's capabilities should be seen as a win for open source models, and not as a competition between the US and China. "DeepSeek has profited from open research and open source (eg PyTorch and Llama from Meta). They came up with new ideas and built them on top of other people's work. Because their work is published and open source, everyone can profit from it. That is the power of open research and open source."
[3]
China's DeepSeek AI model tops App Store charts, rivaling OpenAI with fewer resources
What just happened? Chinese AI company DeepSeek released an open version of its reasoning model, R1, on January 20, 2025. The model has garnered much attention in the tech industry for its performance, which reportedly matches or exceeds OpenAI's o1 on certain AI benchmarks. Since its release, conversations on social media have been fast and furious about its potential impact on AI development and competition between Chinese and American tech companies. Prominent venture capitalist Marc Andreessen was one of those impressed by the feat, writing on X that DeepSeek's model was "one of the most amazing and impressive breakthroughs I've ever seen." DeepSeek's accomplishment is particularly noteworthy given the company's claim to have trained a model with 671 billion parameters using just 2,048 Nvidia H800s and $5.6 million, a fraction of the resources typically required by industry giants like OpenAI and Google. This cost-effectiveness is even more remarkable considering the U.S. sanctions that restrict the sale of advanced chips to Chinese companies. Commentators said that for these reasons, the model also has geopolitical implications. "The impressive performance of DeepSeek's distilled models [...] means that very capable reasoners will continue to proliferate widely and be runnable on local hardware, far from the eyes of any top-down control regime," Dean Ball, an AI researcher at George Mason University, wrote. Some observers believe that DeepSeek's success could potentially benefit the entire AI industry. "If training models get cheaper faster and easier, the demand for inference (actual real world use of AI) will grow and accelerate even faster, which assures the supply of compute will be used," Garry Tan, CEO of Y Combinator, wrote on X. However, not all reactions have been uniformly positive.
Neal Khosla, CEO of Curai, expressed skepticism, suggesting that the company might be a "ccp state psyop" aimed at undermining U.S. AI competitiveness. However, this claim has been challenged for lack of evidence. DeepSeek-R1 is a reasoning model that employs a step-by-step approach to problem-solving, making it particularly adept at tasks in physics, science, and mathematics. The model contains 671 billion parameters, which contribute to its problem-solving capabilities. DeepSeek has also released smaller "distilled" versions of R1, ranging from 1.5 billion to 70 billion parameters, with the smallest capable of running on a laptop. R1 is available under an MIT license, allowing for commercial use without restrictions. According to DeepSeek, the model outperforms OpenAI's o1 on benchmarks such as AIME, MATH-500, and SWE-bench Verified. These assess various aspects of AI performance, including mathematical problem-solving and programming tasks. One notable limitation of R1 is its adherence to Chinese regulatory requirements. As a Chinese model, it's subject to benchmarking by China's internet regulator to ensure compliance with "core socialist values." Consequently, R1 refrains from answering questions about sensitive topics such as Tiananmen Square or Taiwan's autonomy. Despite these constraints, DeepSeek's achievement has sparked significant interest. As of Sunday afternoon, DeepSeek's AI assistant has become the top free app in the Apple App Store, surpassing even ChatGPT. The success of DeepSeek has catapulted its creator Liang Wenfeng into the national spotlight. Recently, he was the sole AI industry representative invited to a high-profile meeting with Li Qiang, China's Premier and second-most powerful leader. Liang, a Chinese entrepreneur and hedge fund manager, began his journey to AI prominence in the world of quantitative finance. 
In 2015, Liang founded High-Flyer, a quantitative hedge fund that quickly rose to one of China's "Big Four" quantitative private funds. Under Liang's leadership, High-Flyer pioneered the integration of AI-driven strategies in quantitative investment, transitioning to a fully AI-based approach by 2017. Liang's foray into AI development began in earnest in 2021 when he started acquiring thousands of Nvidia GPUs for what was initially perceived as an eccentric side project. This prescient move laid the groundwork for DeepSeek, which Liang founded in 2023 with the ambitious goal of developing human-level AI. Liang's unconventional background has proven to be a unique advantage in the AI field. His team's experience in utilizing Nvidia chips for stock trading has translated well into the challenges posed by U.S. export restrictions on advanced AI chips to China. This adaptability has allowed DeepSeek to innovate in the face of limited access to cutting-edge hardware.
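The figures quoted in this article (671 billion parameters, distilled versions of 1.5 billion to 70 billion parameters, 2,048 H800s, $5.6 million) lend themselves to a quick back-of-envelope check. The sketch below is only an illustration: the bytes-per-parameter values are assumptions about how weights might be stored, not details reported by DeepSeek, and the function names are hypothetical. It estimates the memory needed just to hold a model's weights, which is one rough way to see why the smallest distilled version could plausibly fit on a laptop while the full model could not.

```python
# Back-of-envelope figures derived from numbers quoted in the article.
# The bytes-per-parameter values below are ASSUMPTIONS about numeric
# precision, not details reported by DeepSeek.

BYTES_PER_PARAM = {"fp16": 2.0, "int4": 0.5}  # assumed storage formats


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate memory to hold the weights alone, in GB (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def cost_per_gpu(total_cost_usd: float, num_gpus: int) -> float:
    """Average claimed training spend per GPU, in US dollars."""
    return total_cost_usd / num_gpus


if __name__ == "__main__":
    models = [
        ("R1 distilled (smallest)", 1.5e9),
        ("R1 distilled (largest)", 70e9),
        ("R1 (full)", 671e9),
    ]
    for name, params in models:
        fp16 = weight_memory_gb(params, "fp16")
        int4 = weight_memory_gb(params, "int4")
        print(f"{name}: about {fp16:.1f} GB at fp16, {int4:.2f} GB at int4")

    # Claimed figures: $5.6 million spread across 2,048 Nvidia H800s.
    print(f"Claimed spend per GPU: ${cost_per_gpu(5.6e6, 2048):,.0f}")
```

These estimates cover weights only; real memory use during inference is higher, and the per-GPU figure says nothing about how long the chips were in use.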
[4]
DeepSeek: Chinese AI model overtakes ChatGPT to top app charts
After DeepSeek-R1 was launched earlier this month, the company boasted of "performance on par with" one of ChatGPT maker OpenAI's latest models - when used for tasks such as maths, coding and natural language reasoning. Silicon Valley venture capitalist and Donald Trump advisor Marc Andreessen described DeepSeek-R1 as "AI's Sputnik moment", in a reference to the first artificial Earth satellite that was launched by the Soviet Union in 1957. Advanced chips power the training of AI models like ChatGPT and DeepSeek. But since 2021 the US government has widened its restrictions on advanced chips being sold to China. In order to continue their work without steady supplies of imported advanced chips, Chinese AI developers have shared their work with each other and experimented with new approaches to the technology. This has resulted in AI models that require far less computing power than before. It also means that they cost a lot less than previously thought possible, which has the potential to upend the industry. Shares in AI-related companies based in the US, such as Nvidia, Microsoft and Meta, were down on Monday morning. Some estimates put the cost of training DeepSeek at a fraction of what the large US AI firms spend. "It can potentially derail the investment case for the entire AI supply chain, which is driven by high spending from a small handful of hyperscalers," Singapore-based technology equity advisor Vey-Sern Ling told the BBC. But Wall Street banking giant Citi cautioned that while DeepSeek could challenge the dominant positions of American companies like OpenAI, issues faced by Chinese firms could hamper their development. "We estimate that in an inevitably more restrictive environment, US' access to more advanced chips is an advantage," its analysts said in a report. Last week, a consortium of US tech firms and foreign investors announced The Stargate Project, a company which is putting $500bn into AI infrastructure in Texas.
[5]
DeepSeek: Chinese AI firm making waves in Silicon Valley
Chinese firm DeepSeek's artificial intelligence chatbot has soared to the top of the Apple Store's download charts, stunning industry insiders and analysts with its ability to match its US competitors. The program has shaken up the tech industry and hit US titans including Nvidia and Meta, which has spent vast sums of cash to get ahead in the fast-developing AI sector. DeepSeek was developed by a start-up based in the eastern Chinese city of Hangzhou, known for its high density of tech firms. Available as an app or on desktop, DeepSeek can do many of the things that its Western competitors can do -- write song lyrics, help work on a personal development plan, or even write a recipe for dinner based on what's in the fridge. It can communicate in multiple languages, though it told AFP that it was strongest in English and Chinese. It is subject to many of the limitations seen in other Chinese-made chatbots like Baidu's Ernie Bot -- asked about leader Xi Jinping or Beijing's policies in the western region of Xinjiang, it implored AFP to "talk about something else". But from writing complex code to solving difficult sums, industry insiders have been astonished by just how well DeepSeek's abilities match the competition. "What we've found is that DeepSeek... is the top performing, or roughly on par with the best American models," Alexandr Wang, CEO of Scale AI, told CNBC. That's all the more surprising given what is known about how it was made. In a paper detailing its development, the firm said the model was trained using only a fraction of the chips used by its Western competitors. 'Sputnik moment'? Analysts had long thought that the United States' critical advantage over China when it comes to producing high-powered chips -- and its ability to prevent the Asian power from accessing the technology -- would give it the edge in the AI race. 
But DeepSeek said it spent only $5.6 million developing its model -- peanuts when compared with the billions US tech giants have poured into AI. Shares in major tech firms in the US and Japan have tumbled as the industry takes stock of the challenge from DeepSeek. Chip making giant Nvidia -- the world's dominant supplier of AI hardware and software -- closed down more than three percent on Wall Street on Friday. And Japanese firm SoftBank, a key investor in US President Donald Trump's announcement of a new $500 billion venture to build infrastructure for artificial intelligence in the United States, lost more than eight percent Monday. Venture capitalist Marc Andreessen, a close advisor to Trump, described it as "AI's Sputnik moment" -- a reference to the Soviet satellite launch that sparked the Cold War space race. "DeepSeek R1 is one of the most amazing and impressive breakthroughs I've ever seen," he wrote on X. Open source model Like its Western competitors ChatGPT, Meta's Llama and Claude, DeepSeek uses a large language model, which is trained on massive quantities of text to learn everyday language use. But unlike Silicon Valley rivals, who have developed proprietary LLMs, DeepSeek is open source, meaning anyone can access the app's code, see how it works and modify it themselves. "We are living in a timeline where a non-US company is keeping the original mission of OpenAI alive -- truly open, frontier research that empowers all," Jim Fan, a senior research manager at Nvidia, wrote on X. DeepSeek said it "tops the leaderboard among open-source models" -- and "rivals the most advanced closed-source models globally". Scale AI's Wang wrote on X that "DeepSeek is a wake up call for America". 'Great things' Beijing's leadership has vowed to be the world leader in AI technology by 2030 and is projected to spend tens of billions in support for the industry over the next few years.
And the success of DeepSeek suggests that Chinese firms may have begun leaping the hurdles placed in their way. Last week DeepSeek's founder, hedge fund manager Liang Wenfeng, sat alongside other entrepreneurs at a symposium with Chinese Premier Li Qiang -- highlighting the firm's rapid rise. Its viral success also sent it to the top of the trending topics on China's X-like Weibo website Monday, with related hashtags pulling in tens of millions of views. "This really is an example of spending a little money to do great things," one user wrote.
[7]
DeepSeek: Chinese AI model rockets to top of app charts
It is powered by the open-source DeepSeek-V3 model, which its researchers claim was developed for less than $6m - significantly less than the billions spent by rivals. After DeepSeek-R1 was launched earlier this month, the company boasted of "performance on par with" one of ChatGPT maker OpenAI's latest models - when used for tasks such as maths, coding and natural language reasoning. Silicon Valley venture capitalist and Donald Trump advisor Marc Andreessen described DeepSeek-R1 as "AI's Sputnik moment", in a reference to the first artificial Earth satellite that was launched by the Soviet Union in 1957. Advanced chips power the training of AI models like ChatGPT and DeepSeek. But since 2021 the US government has widened its restrictions on advanced chips being sold to China. In order to continue their work without steady supplies of imported advanced chips, Chinese AI developers have shared their work with each other and experimented with new approaches to the technology. This has resulted in AI models that require far less computing power than before. It also means that they cost a lot less than previously thought possible, which has the potential to upend the industry. Shares in AI-related companies based in the US, such as Nvidia, Microsoft and Meta, were down on Monday morning. Some estimates put the cost of training DeepSeek at a fraction of what the large US AI firms spend. "It can potentially derail the investment case for the entire AI supply chain, which is driven by high spending from a small handful of hyperscalers," Singapore-based technology equity advisor Vey-Sern Ling told the BBC. But Wall Street banking giant Citi cautioned that while DeepSeek could challenge the dominant positions of American companies like OpenAI, issues faced by Chinese firms could hamper their development. "We estimate that in an inevitably more restrictive environment, US' access to more advanced chips is an advantage," its analysts said in a report.
Last week, a consortium of US tech firms and foreign investors announced The Stargate Project, a company which is putting $500bn into AI infrastructure in Texas.
[8]
DeepSeek praised by Silicon Valley: The $6 million AI disruption
In the world of AI, where American tech giants like OpenAI, Google, and Nvidia have long reigned supreme, a new player has emerged to disrupt the status quo. DeepSeek, a Chinese AI startup founded in 2023, has stunned the global tech community with its R1 model. Developed on a modest budget of under $6 million using reduced-capability Nvidia H800 chips, R1 has outperformed some of the most advanced AI systems in the world. Its success has sparked both admiration and anxiety across Silicon Valley, raising questions about the future of global AI dominance. The shockwaves began when DeepSeek's R1 system surpassed OpenAI's GPT-4o and Meta's Llama 3.1 in third-party benchmark tests across tasks like complex problem-solving, coding, and mathematics. Unlike its American counterparts, which have poured billions into AI development, DeepSeek achieved this feat with remarkable efficiency. As Yann LeCun, Chief AI Scientist at Meta, noted on LinkedIn: "DeepSeek has profited from open research and open source (e.g., PyTorch and Llama from Meta). They came up with new ideas and built them on top of other people's work. That is the power of open research and open source." LeCun's comments reflect the collaborative ethos that underpins DeepSeek's success - a stark contrast to the heavily capitalised approaches of U.S. tech giants. Marc Andreessen, co-founder of Andreessen Horowitz and a prominent venture capitalist, called DeepSeek's achievement "one of the most amazing and impressive breakthroughs I've ever seen." In a post on X (formerly Twitter), he likened it to an "AI Sputnik moment," drawing parallels to the Soviet Union's launch of Sputnik during the Cold War - a moment that spurred intense competition between global superpowers. Andreessen also praised DeepSeek's commitment to open-source development, describing its R1 model as "a profound gift to the world."
While many celebrated DeepSeek's innovation, others expressed concerns about its implications for U.S. competitiveness in AI. Neal Khosla, CEO of Curai, took a sceptical stance, labelling DeepSeek's narrative as "a CCP state psyop" designed to destabilise American dominance by faking low costs to disrupt market pricing. Though his claims came with little evidence and drew some criticism for their speculative nature, they highlight broader anxieties about China's growing influence in AI. The emergence of DeepSeek has also had tangible economic repercussions. On Monday morning, Nasdaq futures dropped nearly 2%, reflecting investor fears about potential disruptions to the U.S. tech market. Chris Mattmann, Chief Data and AI Officer at UCLA, explained why DeepSeek is causing such alarm: "Absolutely it [scares American companies] because [DeepSeek] democratises the capabilities for solving complex math and also more principled logic, which previously was only available generally by purchasing OpenAI and other licenses." The idea that cutting-edge AI could now be developed with fewer resources has shaken confidence in Silicon Valley's resource-intensive strategies. Garry Tan, CEO of Y Combinator, offered a more optimistic perspective on X: "If training models get cheaper faster and easier, the demand for inference will grow and accelerate even faster." His comments reflect a belief that innovations like DeepSeek's could lower barriers to entry in AI development, fostering greater competition and innovation across the industry. Aravind Srinivas, CEO of Perplexity AI, echoed this sentiment while congratulating DeepSeek for becoming the #1 app on Apple's App Store in the U.S.: "Congrats to @deepseek_ai for getting to #1 on the App Store... Look forward to using all their models for search, assistant, and agents this year."
Srinivas highlighted how necessity drove DeepSeek's efficiency: "Necessity drives innovation... As long as it's mathematically possible, you can always find more efficient approaches." The implications of DeepSeek's rise extend beyond Silicon Valley. At the World Economic Forum in Davos last week, Microsoft CEO Satya Nadella urged global leaders not to underestimate China's advancements in AI: "We should take the developments out of China very, very seriously." His remarks come as President Donald Trump announced Stargate - a $500 billion private-sector initiative aimed at bolstering America's AI infrastructure. However, scepticism abounds regarding whether such projects can match China's rapid pace of innovation. Founded by Liang Wenfeng under Chinese hedge fund High-Flyer Quant in 2023, DeepSeek represents a new wave of Chinese startups challenging Western dominance in technology. Its success has reignited debates about resource efficiency versus capital intensity in AI development. While U.S. companies like Meta are expected to spend upwards of $65 billion on AI this year alone, DeepSeek has demonstrated that ingenuity can often outpace sheer financial might. As Silicon Valley scrambles to respond to this unexpected challenger, one thing is clear: DeepSeek has redefined what is possible in artificial intelligence - and it has everyone watching closely.
[9]
Budget AI Model DeepSeek Overtakes ChatGPT on App Store
A new China-based AI chatbot challenger called DeepSeek has reached the number one position on Apple's App Store free charts in multiple countries, including the US, raising questions about Silicon Valley's perceived leadership in artificial intelligence development. Released last week, the iOS app has garnered attention for its ability to match or exceed the performance of leading AI models like ChatGPT, while reportedly requiring only a fraction of the development costs. The Chinese AI startup behind DeepSeek was founded by quant fund chief Liang Wenfeng, and claims its R1 model was developed for approximately $6 million, compared to the billions invested by U.S. competitors. For example, Microsoft and Meta alone have committed over $65 billion each this year largely to AI infrastructure. The app uses DeepSeek-V3 model with over 600B parameters, and supports file uploads and web search. Investor Marc Andreessen is already calling the app "one of the most amazing and impressive breakthroughs" for its ability to show its work and reasoning as it addresses a user's written query or prompt. DeepSeek has also taken an open-source approach, allowing developers to freely inspect and build upon its technology. What's particularly notable is that DeepSeek apparently achieved this breakthrough despite US export restrictions on advanced AI chips to China. The company's success suggests Chinese developers have found ways to create more efficient AI models with limited computing resources, potentially challenging the assumption that cutting-edge AI development requires massive computing infrastructure investments. The emergence of DeepSeek has already sparked debate in Silicon Valley. While some view it as a concerning development for US technological leadership, others, like Y Combinator CEO Garry Tan, suggest it could benefit the entire AI industry by making model training more accessible and accelerating real-world AI applications. 
The app's success has already affected financial markets, with some AI-related stocks experiencing volatility as investors reconsider the necessity of extensive capital expenditure for AI development. Shares of Nvidia, for example, slid 10% in premarket trading on Monday on news of DeepSeek's popularity.
[10]
Explainer-What is DeepSeek and why is it disrupting the AI sector?
Chinese startup DeepSeek's launch of its latest AI models, which it says are on a par with or better than industry-leading models in the United States at a fraction of the cost, is threatening to upset the technology world order. The company has attracted attention in global AI circles after writing in a paper last month that the training of DeepSeek-V3 required less than $6 million worth of computing power from Nvidia H800 chips. DeepSeek's AI Assistant, powered by DeepSeek-V3, has overtaken rival ChatGPT to become the top-rated free application available on Apple's App Store in the United States. This has raised doubts about the reasoning behind some US tech companies' decision to pledge billions of dollars in AI investment and shares of several big tech players, including Nvidia, have been hit. Below are some facts about the company shaking up the AI sector worldwide. DeepSeek causing a stir The release of OpenAI's ChatGPT in late 2022 caused a scramble among Chinese tech firms, who rushed to create their own chatbots powered by artificial intelligence. But after the release of the first Chinese ChatGPT equivalent, made by search engine giant Baidu, there was widespread disappointment in China at the gap in AI capabilities between US and Chinese firms. The quality and cost efficiency of DeepSeek's models have flipped this narrative on its head.
The two models that have been showered with praise by Silicon Valley executives and US tech company engineers alike, DeepSeek-V3 and DeepSeek-R1, are on par with OpenAI and Meta's most advanced models, the Chinese startup has said. They are also cheaper to use. The DeepSeek-R1, released last week, is 20 to 50 times cheaper to use than OpenAI o1 model, depending on the task, according to a post on DeepSeek's official WeChat account. But some have publicly expressed scepticism about DeepSeek's success story. Scale AI CEO Alexandr Wang said during an interview with CNBC on Thursday, without providing evidence, that DeepSeek has 50,000 Nvidia H100 chips, which he claimed would not be disclosed because that would violate Washington's export controls that ban such advanced AI chips from being sold to Chinese companies. DeepSeek did not immediately respond to a request for comment on the allegation. Bernstein analysts on Monday highlighted in a research note that DeepSeek's total training costs for its V3 model were unknown but were much higher than the $5.58 million the startup said was used for computing power. The analysts also said the training costs of the equally-acclaimed R1 model were not disclosed. Who is behind DeepSeek? DeepSeek is a Hangzhou-based startup whose controlling shareholder is Liang Wenfeng, cofounder of quantitative hedge fund High-Flyer, based on Chinese corporate records. Liang's fund announced in March 2023 on its official WeChat account that it was "starting again", going beyond trading to concentrate resources on creating a "new and independent research group, to explore the essence of AGI" (Artificial General Intelligence). DeepSeek was created later that year. ChatGPT makers OpenAI define AGI as autonomous systems that surpass humans in most economically valuable tasks. It is unclear how much High-Flyer has invested in DeepSeek. 
High-Flyer has an office located in the same building as DeepSeek, and it also owns patents related to chip clusters used to train AI models, according to Chinese corporate records. High-Flyer's AI unit said on its official WeChat account in July 2022 that it owns and operates a cluster of 10,000 A100 chips. How does Beijing view DeepSeek? DeepSeek's success has already been noticed in China's top political circles. On January 20, the day DeepSeek-R1 was released to the public, founder Liang attended a closed-door symposium for businesspeople and experts hosted by Chinese premier Li Qiang, according to state news agency Xinhua. Liang's presence at the gathering is potentially a sign that DeepSeek's success could be important to Beijing's policy goal of overcoming Washington's export controls and achieving self-sufficiency in strategic industries like AI. A similar symposium last year was attended by Baidu CEO Robin Li.
[11]
DeepSeek: Chinese AI firm making waves in Silicon Valley
Chinese firm DeepSeek's artificial intelligence chatbot has soared to the top of the Apple App Store's download charts, stunning industry insiders and analysts with its ability to match its US competitors. The programme has shaken up the tech industry and hit US titans including Nvidia and Meta, which have spent vast sums of cash to get ahead in the fast-developing AI sector. Here's what you need to know about DeepSeek: Top performer DeepSeek was developed by a start-up based in the eastern Chinese city of Hangzhou, known for its high density of tech firms. Available as an app or on desktop, DeepSeek can do many of the things that its Western competitors can do -- write song lyrics, help work on a personal development plan, or even write a recipe for dinner based on what's in the fridge. It can communicate in multiple languages, though it told AFP that it was strongest in English and Chinese. It is subject to many of the limitations seen in other Chinese-made chatbots like Baidu's Ernie Bot -- asked about leader Xi Jinping or Beijing's policies in the western region of Xinjiang, it implored AFP to "talk about something else". But from writing complex code to solving difficult sums, industry insiders have been astonished by just how well DeepSeek's abilities match the competition. "What we've found is that DeepSeek... is the top performing, or roughly on par with the best American models," Alexandr Wang, CEO of Scale AI, told CNBC. That's all the more surprising given what is known about how it was made. 
In a paper detailing its development, the firm said the model was trained using only a fraction of the chips used by its Western competitors. 'Sputnik moment'? Analysts had long thought that the United States' critical advantage over China when it comes to producing high-powered chips -- and its ability to prevent the Asian power from accessing the technology -- would give it the edge in the AI race. But DeepSeek said it spent only $5.6 million developing its model -- peanuts when compared with the billions US tech giants have poured into AI. Shares in major tech firms in the US and Japan have tumbled as the industry takes stock of the challenge from DeepSeek. Chip-making giant Nvidia -- the world's dominant supplier of AI hardware and software -- closed down more than three percent on Wall Street on Friday. And Japanese firm SoftBank, a key investor in US President Donald Trump's announcement of a new $500 billion venture to build infrastructure for artificial intelligence in the United States, lost more than eight percent Monday. Venture capitalist Marc Andreessen, a close advisor to Trump, described it as "AI's Sputnik moment" -- a reference to the Soviet satellite launch that sparked the Cold War space race. "DeepSeek R1 is one of the most amazing and impressive breakthroughs I've ever seen," he wrote on X. Open source model Like its Western competitors ChatGPT, Meta's Llama and Claude, DeepSeek uses a large language model, which is trained on massive quantities of text to learn everyday language use. But unlike Silicon Valley rivals, who have developed proprietary LLMs, DeepSeek is open source, meaning anyone can access the app's code, see how it works and modify it themselves. "We are living in a timeline where a non-US company is keeping the original mission of OpenAI alive -- truly open, frontier research that empowers all," Jim Fan, a senior research manager at Nvidia, wrote on X. 
DeepSeek said it "tops the leaderboard among open-source models" -- and "rivals the most advanced closed-source models globally". Scale AI's Wang wrote on X that "DeepSeek is a wake up call for America". 'Great things' Beijing's leadership has vowed to be the world leader in AI technology by 2030 and is projected to spend tens of billions in support for the industry over the next few years. And the success of DeepSeek suggests that Chinese firms may have begun leaping the hurdles placed in their way. Last week DeepSeek's founder, hedge fund manager Liang Wenfeng, sat alongside other entrepreneurs at a symposium with Chinese Premier Li Qiang -- highlighting the firm's rapid rise. Its viral success also sent it to the top of the trending topics on China's X-like Weibo website Monday, with related hashtags pulling in tens of millions of views. "This really is an example of spending a little money to do great things," one user wrote.
[12]
What is China's DeepSeek and Why is it freaking out the AI world?
DeepSeek, a Chinese AI startup that's just over a year old, has stirred awe and consternation in Silicon Valley after demonstrating breakthrough artificial intelligence models that offer comparable performance to the world's best chatbots at seemingly a fraction of the cost. DeepSeek's emergence may offer a counterpoint to the widespread belief that the future of AI will require ever-increasing amounts of power and energy to develop. Global technology stocks tumbled in late January as hype around DeepSeek's innovation snowballed and investors began to digest the implications for its U.S.-based rivals and their hardware suppliers. What exactly is DeepSeek? DeepSeek was founded in 2023 by Liang Wenfeng, the chief of AI-driven quant hedge fund High-Flyer. The company develops AI models that are open-source, meaning the developer community at large can inspect and improve the software. Its mobile app surged to the top of the iPhone download charts in the US after its release in early January. The app distinguishes itself from other chatbots like OpenAI's ChatGPT by articulating its reasoning before delivering a response to a prompt. The company claims its R1 release offers performance on par with OpenAI's latest and has granted license for individuals interested in developing chatbots using the technology to build on it. How does DeepSeek R1 compare to OpenAI or Meta AI? Though not fully detailed by the company, the cost of training and developing DeepSeek's models appears to be only a fraction of what's required for OpenAI or Meta Platforms' best products. The much better efficiency of the model puts into question the need for vast expenditures of capital to acquire the latest and most powerful AI accelerators from the likes of Nvidia. That also amplifies attention on US export curbs of such advanced semiconductors to China -- which were intended to prevent a breakthrough of the sort that DeepSeek appears to represent. 
DeepSeek says R1 is near or better than rival models in several leading benchmarks such as AIME 2024 for mathematical tasks, MMLU for general knowledge and AlpacaEval 2.0 for question-and-answer performance. It also ranks among the top performers on a UC Berkeley-affiliated leaderboard called Chatbot Arena. What's raising alarm in the U.S.? Washington has banned the export of high-end technologies like GPU semiconductors to China, in a bid to stall the country's advances in AI, the key frontier in the US-China contest for tech supremacy. But DeepSeek's progress suggests Chinese AI engineers have worked their way around the restrictions, focusing on greater efficiency with limited resources. While it remains unclear how much advanced AI-training hardware DeepSeek has had access to, the company's demonstrated enough to suggest the trade restrictions have not been entirely effective in stymieing China's progress. When did DeepSeek spark global interest? The AI developer has been closely watched since the release of its earliest model in 2023. Then in November, it gave the world a glimpse of its DeepSeek R1 reasoning model, designed to mimic human thinking. That model underpins its mobile chatbot app, which together with the web interface in January rocketed to global renown as a much cheaper OpenAI alternative, with investor Marc Andreessen calling it "AI's Sputnik moment." The DeepSeek mobile app was downloaded 1.6 million times by Jan. 25 and ranked No. 1 in iPhone app stores in Australia, Canada, China, Singapore, the US and the UK, according to data from market tracker App Figures. Who is DeepSeek's founder? Born in Guangdong in 1985, Liang received bachelor's and master's degrees in electronic and information engineering from Zhejiang University. He founded DeepSeek with 10 million yuan ($1.4 million) in registered capital, according to company database Tianyancha. 
The bottleneck for further advances is not more fundraising, Liang said in an interview with Chinese outlet 36kr, but US restrictions on access to the best chips. Most of his top researchers were fresh graduates from top Chinese universities, he said, stressing the need for China to develop its own domestic ecosystem akin to the one built around Nvidia and its AI chips. "More investment does not necessarily lead to more innovation. Otherwise, large companies would take over all innovation," Liang said. Where does DeepSeek stand in China's AI landscape? China's technology leaders, from Alibaba Group Holding Ltd. and Baidu Inc. to Tencent Holdings, have poured significant money and resources into the race to acquire hardware and customers for their AI ventures. Alongside Kai-Fu Lee's 01.AI startup, DeepSeek stands out with its open-source approach -- designed to recruit the largest number of users quickly before developing monetization strategies atop that large audience. Because DeepSeek's models are more affordable, it's already played a role in helping drive down costs for AI developers in China, where the bigger players have engaged in a price war that's seen successive waves of price cuts over the past year and a half. What are the implications for the global AI marketplace? DeepSeek's success may push OpenAI and other U.S. providers to lower their pricing to maintain their established lead. It also calls into question the vast spending by companies like Meta and Microsoft -- each of which has committed to capex of $65 billion or more this year, largely on AI infrastructure -- if more efficient models can compete with a much smaller outlay. That roiled global stock markets as investors sold off companies like Nvidia and ASML that have benefited from booming demand for AI services. Shares in Chinese names linked to DeepSeek, such as Iflytek Co., climbed. 
Already, developers around the world are experimenting with DeepSeek's software and looking to build tools with it. That could quicken the adoption of advanced AI reasoning models -- while also potentially touching off additional concern about the need for guardrails around their use. DeepSeek's advances may hasten regulation to control how AI is developed. What are DeepSeek's shortcomings? Like all other Chinese AI models, DeepSeek self-censors on topics deemed sensitive in China. It deflects queries about the 1989 Tiananmen Square protests or geopolitically fraught questions such as the possibility of China invading Taiwan. In tests, the DeepSeek bot is capable of giving detailed responses about political figures like Indian Prime Minister Narendra Modi, but declines to do so about Chinese President Xi Jinping. DeepSeek's cloud infrastructure is likely to be tested by its sudden popularity. The company briefly experienced a major outage on Jan. 27 and will have to manage even more traffic as new and returning users pour more queries into its chatbot.
[13]
What is DeepSeek? The AI chatbot is topping app store charts
China-based AI app DeepSeek, which sits atop the app store charts, made its presence widely known Monday by triggering a sharp drop in share prices for some tech giants. Shares of Nvidia, the top AI chipmaker, plunged more than 17% in early trading on Monday, losing nearly $590 billion in market value. The tech-heavy Nasdaq plummeted 3.5%. The DeepSeek chatbot, known as R1, responds to user queries just like its U.S.-based counterparts. Early testing released by DeepSeek suggests that its quality rivals that of other AI products, while the company says it costs less and uses far fewer specialized chips than do its competitors. The product could upend the AI industry, putting pressure on other firms to lower their costs while intensifying competition between U.S. and Chinese companies, analysts told ABC News. Concerns about data security and censorship also may expose DeepSeek to the type of scrutiny endured by social media platform TikTok, the experts added. "This is a big moment," Gary Marcus, a professor emeritus of psychology and neuroscience at New York University, who specializes in AI, told ABC News. DeepSeek did not immediately respond to ABC News' request for comment. Here's what to know about DeepSeek, and its implications for the future of AI. DeepSeek, like OpenAI's ChatGPT, is a chatbot fueled by an algorithm that selects words based on lessons learned from scanning billions of pieces of text across the internet. While made in China, the app is available in multiple languages, including English. DeepSeek soared to the top of Apple's App Store chart over the weekend and remained there as of Monday. The company said it experienced some outages on Monday affecting user signups. "Due to large-scale malicious attacks on DeepSeek's services, we are temporarily limiting registrations to ensure continued service," DeepSeek said in a statement. DeepSeek is fully available to users free of charge. 
By contrast, ChatGPT retains a version available for free, but offers paid monthly tiers of $20 and $200 to access additional capabilities. Developers of the system powering the DeepSeek AI, called DeepSeek-V3, published a research paper indicating that the technology relies on far fewer specialized computer chips than its U.S. competitors do. DeepSeek also says that it developed the chatbot for only $5.6 million, which if true is far less than the hundreds of millions of dollars spent by U.S. companies in the sector. "The lower cost to train DeepSeek means it can be offered at a lower cost," Oren Etzioni, CEO of Allen Institute for AI and a computer science professor at the University of Washington, told ABC News. "That puts pressure on other companies to lower their prices." DeepSeek's developers opted to release it as an open-source product, meaning the code that underlies the AI system is publicly available for other companies to adapt and build upon. Llama, the AI model released by Meta in 2023, is also open source. By contrast, ChatGPT as well as Alphabet's Gemini are closed-source models. Proponents of open source say that publicly available code allows a wide community of AI engineers to identify and fix flaws in a system, or tailor it for a purpose separate from its originally intended function. Supporters of closed source argue that the best way to safeguard AI is to keep the computer code private so it stays out of the hands of bad actors who might repurpose it for malicious ends. DeepSeek flexes the strength of the Chinese tech sector in an industry previously viewed by many analysts as one dominated by U.S. firms, experts said. For some, the sudden arrival of a top China-based competitor raises concerns about data security and content manipulation similar to those leveled at TikTok, which is owned by Chinese company ByteDance. 
Regarding the latter, users on social media have posted numerous examples of DeepSeek choosing to forgo answers on topics potentially sensitive to the Chinese government, such as the 1989 Tiananmen Square massacre of anti-government protestors. "It's clear that there's some censorship built into the system," Marcus said. "We don't want governments in control of what people can say." Both foreign-owned and U.S.-based AI chatbots pose data concerns, however, since users sometimes enter personal or intimate information, which in turn could be used for commercial purposes, Marcus added. "You have to assume that anything you type into a large language model could be seen by someone else," Marcus said. Still, DeepSeek's low-cost technological advance using outdated chips marks a major new entry into the AI market, Sosnick said, in turn forcing people to "rethink assumptions that they had." The emergence of a relatively cost-efficient alternative to other AI chatbots helps explain Monday's dip in the share value of Nvidia, which premises future growth on increased demand for new, expensive chips, Etzioni said. In a statement to ABC News, Nvidia described DeepSeek as "an excellent AI advancement," noting how the firm leveraged "widely-available models and compute that is fully export control compliant" - the latter apparently referring to controls put in place by the Biden administration that prohibit U.S. manufacturers from selling such specialized chips to firms in China. Etzioni said the open-source nature of DeepSeek also could challenge the profitability of other AI firms aiming to charge third parties for access to their products. "We're soon going to see many models built on DeepSeek," Etzioni said.
[14]
Chinese startup DeepSeek threatens ChatGPT's dominance of AI sector
The company has attracted attention in global AI circles after writing in a paper last month that the training of DeepSeek-V3 required less than $6 million worth of computing power from Nvidia H800 chips. DeepSeek's AI Assistant, powered by DeepSeek-V3, has overtaken rival ChatGPT to become the top-rated free application available on Apple's App Store in the United States. This has raised doubts about the reasoning behind some U.S. tech companies' decision to pledge billions of dollars in AI investment and shares of several big tech players, including Nvidia, have been hit. Below are some facts about the company shaking up the AI sector worldwide. Why is DeepSeek causing a stir? The release of OpenAI's ChatGPT in late 2022 caused a scramble among Chinese tech firms, who rushed to create their own chatbots powered by artificial intelligence. But after the release of the first Chinese ChatGPT equivalent, made by search engine giant Baidu, there was widespread disappointment in China at the gap in AI capabilities between U.S. and Chinese firms. The quality and cost efficiency of DeepSeek's models have flipped this narrative on its head. The two models that have been showered with praise by Silicon Valley executives and U.S. tech company engineers alike, DeepSeek-V3 and DeepSeek-R1, are on par with OpenAI and Meta's most advanced models, the Chinese startup has said. They are also cheaper to use. DeepSeek-R1, released last week, is 20 to 50 times cheaper to use than OpenAI's o1 model, depending on the task, according to a post on DeepSeek's official WeChat account. But some have publicly expressed scepticism about DeepSeek's success story. 
Scale AI CEO Alexandr Wang said during an interview with CNBC on Thursday, without providing evidence, that DeepSeek has 50,000 Nvidia H100 chips, which he claimed would not be disclosed because that would violate Washington's export controls that ban such advanced AI chips from being sold to Chinese companies. DeepSeek did not immediately respond to a request for comment on the allegation. Bernstein analysts on Monday highlighted in a research note that DeepSeek's total training costs for its V3 model were unknown but were much higher than the $5.58 million the startup said was used for computing power. The analysts also said the training costs of the equally-acclaimed R1 model were not disclosed. Who is behind DeepSeek? DeepSeek is a Hangzhou-based startup whose controlling shareholder is Liang Wenfeng, co-founder of quantitative hedge fund High-Flyer, based on Chinese corporate records. Liang's fund announced in March 2023 on its official WeChat account that it was "starting again", going beyond trading to concentrate resources on creating a "new and independent research group, to explore the essence of AGI" (Artificial General Intelligence). DeepSeek was created later that year. ChatGPT makers OpenAI define AGI as autonomous systems that surpass humans in most economically valuable tasks. It is unclear how much High-Flyer has invested in DeepSeek. High-Flyer has an office located in the same building as DeepSeek, and it also owns patents related to chip clusters used to train AI models, according to Chinese corporate records. High-Flyer's AI unit said on its official WeChat account in July 2022 that it owns and operates a cluster of 10,000 A100 chips. How does Beijing view DeepSeek? DeepSeek's success has already been noticed in China's top political circles. 
On January 20, the day DeepSeek-R1 was released to the public, founder Liang attended a closed-door symposium for businesspeople and experts hosted by Chinese premier Li Qiang, according to state news agency Xinhua. Liang's presence at the gathering is potentially a sign that DeepSeek's success could be important to Beijing's policy goal of overcoming Washington's export controls and achieving self-sufficiency in strategic industries like AI. A similar symposium last year was attended by Baidu CEO Robin Li.
[15]
Chinese AI App DeepSeek Soars in Popularity, Startling Rivals
The company said Monday it was temporarily limiting new sign-ups due to "large-scale malicious attacks" on its services. An AI assistant created by Chinese startup DeepSeek became the number one most-downloaded app in Apple's US App Store over the weekend, sending shockwaves through Silicon Valley and causing the price of major tech stocks to plummet. Nvidia saw more than $460 billion erased from its market capitalization on Monday, a drop Bloomberg characterized as the "biggest in US stock market history." The shakeup stems from an open source model developed by DeepSeek called R1, which debuted earlier this month. The company said that it rivals the current industry leader: OpenAI's o1. But what stunned the tech industry the most was that DeepSeek claimed to have built its model using only a small fraction of the specialized computer chips that AI companies typically need to develop cutting-edge systems. On Monday, DeepSeek said it was temporarily limiting new registrations, citing "large-scale malicious attacks" on the company's services, according to a message on its website. DeepSeek's R1 model "challenges the notion that Western AI companies hold a significant lead over Chinese ones," Jack Clark, cofounder of the AI startup Anthropic, wrote in his newsletter. Venture capitalist Marc Andreessen called it "AI's Sputnik moment." Cheng Lu, a research scientist at OpenAI, said DeepSeek's chatbot demonstrated impressive Chinese conversational skills. "It's the first time I can feel the beauty of Chinese language created by a chatbot," he said in an X post on Sunday. DeepSeek's AI assistant is currently available for free and comes with three main functions. First, users can ask the chatbot questions and receive direct answers. 
For example, when WIRED asked for recipe ideas incorporating pomegranate seeds, DeepSeek's chatbot quickly provided a list of 15 options ranging from yogurt parfaits to a "Middle Eastern-inspired" rice pilaf, but it didn't cite any specific chefs or recipes. DeepSeek's app also has a search mode that surfaces answers from the internet. When WIRED asked "What are some important news stories today?" DeepSeek's chatbot cited the Israel-Hamas ceasefire and linked to several Western news outlets such as BBC News, but not all of the stories appeared to be relevant to the topic. Ironically, one was a New York Times story about DeepSeek's impact on the stock market. Lastly, there's a "DeepThink" mode that allows users to tap into DeepSeek's R1 model, which was built upon the company's existing V3 model. The difference between the two is that R1 has so-called "reasoning" abilities that allow it to explain step by step how it reached its conclusions. For example, when asked "What are the most important historical events of the 20th century?" DeepSeek initially provided a long meandering answer that began with a number of broad questions. "That's a hundred years, so there's a lot that happened," read part of its reply. "I should probably break it down by decades or major themes like wars, political changes, technological advancements, social movements, etc." DeepSeek's chatbot then went on to cite World War II, the Cold War, and the Holocaust. But before R1 could finish its reply, the entire answer disappeared and was replaced by a message that read "Sorry, I'm not sure how to approach this type of question yet. Let's chat about math, coding, and logic problems instead!" 
A number of experts and early adopters have noted that DeepSeek, like other tech platforms that operate in China, appears to extensively censor topics deemed sensitive by the Chinese Communist Party. But despite these limitations, DeepSeek's free chatbot could pose a serious threat to competitors like OpenAI, which charges $20 per month to access its most powerful AI models. Unlike its Chinese counterpart, OpenAI doesn't disclose the underlying "weights" of its models, which determine how the AI processes information. It also has declined to make public the full "chains of thought" produced by its own reasoning models.
[16]
China's DeepSeek AI dethrones ChatGPT on App Store: Here's what you should know
Tech leaders, analysts, investors and developers say that the hype -- and ensuing fear of falling behind in the ever-changing AI hype cycle -- may be warranted. Especially in the era of the generative AI arms race, where tech giants and startups alike are racing to ensure they don't fall behind in a market predicted to top $1 trillion in revenue within a decade. DeepSeek was founded in 2023 by Liang Wenfeng, co-founder of High-Flyer, a quantitative hedge fund focused on AI. The AI startup reportedly grew out of the hedge fund's AI research unit in April 2023 to focus on large language models and reaching artificial general intelligence, or AGI -- a branch of AI that equals or surpasses human intellect on a wide range of tasks, which OpenAI and its rivals say they're fast pursuing. DeepSeek is still wholly owned by and funded by High-Flyer, according to analysts at Jefferies. The buzz around DeepSeek began picking up steam earlier this month, when the startup released R1, its reasoning model that rivals OpenAI's o1. It's open-source, meaning that any AI developer can use it, and has rocketed to the top of app stores and industry leaderboards, with users praising its performance and reasoning capabilities. Like other Chinese chatbots, it has its limitations when asked about certain topics: When asked about some of Chinese leader Xi Jinping's policies, for instance, DeepSeek reportedly steers the user away from similar lines of questioning. Another key part of the discussion: DeepSeek's R1 was built despite the U.S. curbing chip exports to China three times in three years. Estimates differ on exactly how much DeepSeek's R1 costs, or how many GPUs went into it. Jefferies analysts estimated that a recent version had a "training cost of only US$5.6m (assuming US$2/H800 hour rental cost). That is less than 10% of the cost of Meta's Llama." 
But regardless of the specific numbers, reports agree that the model was developed at a fraction of the cost of rival models by OpenAI, Anthropic, Google and others. As a result, the AI sector is awash with questions, including whether the industry's increasing number of astronomical funding rounds and billion-dollar valuations is necessary -- and whether a bubble is about to burst.
[17]
What is DeepSeek and why is it disrupting the AI sector?
BEIJING (Reuters) - Chinese startup DeepSeek's launch of its latest AI models, which it says are on a par with or better than industry-leading models in the United States at a fraction of the cost, is threatening to upset the technology world order. The company has attracted attention in global AI circles after writing in a paper last month that the training of DeepSeek-V3 required less than $6 million worth of computing power from Nvidia H800 chips. DeepSeek's AI Assistant, powered by DeepSeek-V3, has overtaken rival ChatGPT to become the top-rated free application available on Apple's App Store in the United States. This has raised doubts about the reasoning behind some U.S. tech companies' decision to pledge billions of dollars in AI investment and shares of several big tech players, including Nvidia, have been hit. Below are some facts about the company shaking up the AI sector worldwide. WHY IS DEEPSEEK CAUSING A STIR? The release of OpenAI's ChatGPT in late 2022 caused a scramble among Chinese tech firms, who rushed to create their own chatbots powered by artificial intelligence. But after the release of the first Chinese ChatGPT equivalent, made by search engine giant Baidu, there was widespread disappointment in China at the gap in AI capabilities between U.S. and Chinese firms. The quality and cost efficiency of DeepSeek's models have flipped this narrative on its head. The two models that have been showered with praise by Silicon Valley executives and U.S. tech company engineers alike, DeepSeek-V3 and DeepSeek-R1, are on par with OpenAI and Meta's most advanced models, the Chinese startup has said. They are also cheaper to use. DeepSeek-R1, released last week, is 20 to 50 times cheaper to use than OpenAI's o1 model, depending on the task, according to a post on DeepSeek's official WeChat account. But some have publicly expressed scepticism about DeepSeek's success story. 
Scale AI CEO Alexandr Wang said during an interview with CNBC on Thursday, without providing evidence, that DeepSeek has 50,000 Nvidia H100 chips, which he claimed would not be disclosed because that would violate Washington's export controls that ban such advanced AI chips from being sold to Chinese companies. DeepSeek did not immediately respond to a request for comment on the allegation. Bernstein analysts on Monday highlighted in a research note that DeepSeek's total training costs for its V3 model were unknown but were much higher than the $5.58 million the startup said was used for computing power. The analysts also said the training costs of the equally-acclaimed R1 model were not disclosed. WHO IS BEHIND DEEPSEEK? DeepSeek is a Hangzhou-based startup whose controlling shareholder is Liang Wenfeng, co-founder of quantitative hedge fund High-Flyer, based on Chinese corporate records. Liang's fund announced in March 2023 on its official WeChat account that it was "starting again", going beyond trading to concentrate resources on creating a "new and independent research group, to explore the essence of AGI" (Artificial General Intelligence). DeepSeek was created later that year. ChatGPT makers OpenAI define AGI as autonomous systems that surpass humans in most economically valuable tasks. It is unclear how much High-Flyer has invested in DeepSeek. High-Flyer has an office located in the same building as DeepSeek, and it also owns patents related to chip clusters used to train AI models, according to Chinese corporate records. High-Flyer's AI unit said on its official WeChat account in July 2022 that it owns and operates a cluster of 10,000 A100 chips. HOW DOES BEIJING VIEW DEEPSEEK? DeepSeek's success has already been noticed in China's top political circles. 
On January 20, the day DeepSeek-R1 was released to the public, founder Liang attended a closed-door symposium for businessmen and experts hosted by Chinese premier Li Qiang, according to state news agency Xinhua. Liang's presence at the gathering is potentially a sign that DeepSeek's success could be important to Beijing's policy goal of overcoming Washington's export controls and achieving self-sufficiency in strategic industries like AI. A similar symposium last year was attended by Baidu CEO Robin Li. (Reporting by Eduardo Baptista. Editing by Jane Merriman)
[18]
What Is China's DeepSeek and Why Is It Freaking Out the AI World?
(Bloomberg) -- DeepSeek, a Chinese AI startup that's just over a year old, has stirred awe and consternation in Silicon Valley after demonstrating breakthrough artificial-intelligence models that offer comparable performance to the world's best chatbots at seemingly a fraction of the cost. DeepSeek's emergence may offer a counterpoint to the widespread belief that the future of AI will require ever-increasing amounts of power and energy to develop. Global technology stocks tumbled in late January as hype around DeepSeek's innovation snowballed and investors began to digest the implications for its US-based rivals and their hardware suppliers. What exactly is DeepSeek? DeepSeek was founded in 2023 by Liang Wenfeng, the chief of AI-driven quant hedge fund High-Flyer. The company develops AI models that are open-source, meaning the developer community at large can inspect and improve the software. Its mobile app surged to the top of the iPhone download charts in the US after its release in early January. The app distinguishes itself from other chatbots like OpenAI's ChatGPT by articulating its reasoning before delivering a response to a prompt. The company claims its R1 release offers performance on par with OpenAI's latest and has granted license for individuals interested in developing chatbots using the technology to build on it. How does DeepSeek R1 compare to OpenAI or Meta AI? Though not fully detailed by the company, the cost of training and developing DeepSeek's models appears to be only a fraction of what's required for OpenAI or Meta Platforms Inc.'s best products. The much better efficiency of the model puts into question the need for vast expenditures of capital to acquire the latest and most powerful AI accelerators from the likes of Nvidia Corp. That also amplifies attention on US export curbs of such advanced semiconductors to China -- which were intended to prevent a breakthrough of the sort that DeepSeek appears to represent. 
DeepSeek says R1 is near or better than rival models in several leading benchmarks such as AIME 2024 for mathematical tasks, MMLU for general knowledge and AlpacaEval 2.0 for question-and-answer performance. It also ranks among the top performers on a UC Berkeley-affiliated leaderboard called Chatbot Arena. What's raising alarm in the US? Washington has banned the export of high-end technologies like GPU semiconductors to China, in a bid to stall the country's advances in AI, the key frontier in the US-China contest for tech supremacy. But DeepSeek's progress suggests Chinese AI engineers have worked their way around the restrictions, focusing on greater efficiency with limited resources. While it remains unclear how much advanced AI-training hardware DeepSeek has had access to, the company's demonstrated enough to suggest the trade restrictions have not been entirely effective in stymieing China's progress. When did DeepSeek spark global interest? The AI developer has been closely watched since the release of its earliest model in 2023. Then in November, it gave the world a glimpse of its DeepSeek R1 reasoning model, designed to mimic human thinking. That model underpins its mobile chatbot app, which together with the web interface in January rocketed to global renown as a much cheaper OpenAI alternative, with investor Marc Andreessen calling it "AI's Sputnik moment." The DeepSeek mobile app was downloaded 1.6 million times by Jan. 25 and ranked No. 1 in iPhone app stores in Australia, Canada, China, Singapore, the US and the UK, according to data from market tracker App Figures. Who is DeepSeek's founder? Born in Guangdong in 1985, Liang received bachelor's and masters' degrees in electronic and information engineering from Zhejiang University. He founded DeepSeek with 10 million yuan ($1.4 million) in registered capital, according to company database Tianyancha. 
The bottleneck for further advances is not more fundraising, Liang said in an interview with Chinese outlet 36kr, but US restrictions on access to the best chips. Most of his top researchers were fresh graduates from top Chinese universities, he said, stressing the need for China to develop its own domestic ecosystem akin to the one built around Nvidia and its AI chips. "More investment does not necessarily lead to more innovation. Otherwise, large companies would take over all innovation," Liang said. Where does DeepSeek stand in China's AI landscape? China's technology leaders, from Alibaba Group Holding Ltd. and Baidu Inc. to Tencent Holdings Ltd., have poured significant money and resources into the race to acquire hardware and customers for their AI ventures. Alongside Kai-Fu Lee's 01.AI startup, DeepSeek stands out with its open-source approach -- designed to recruit the largest number of users quickly before developing monetization strategies atop that large audience. Because DeepSeek's models are more affordable, it's already played a role in helping drive down costs for AI developers in China, where the bigger players have engaged in a price war that's seen successive waves of price cuts over the past year and a half. What are the implications for the global AI marketplace? DeepSeek's success may push OpenAI and other US providers to lower their pricing to maintain their established lead. It also calls into question the vast spending by companies like Meta and Microsoft Corp. -- each of which has committed to capex of $65 billion or more this year, largely on AI infrastructure -- if more efficient models can compete with a much smaller outlay. That roiled global stock markets as investors sold off companies like Nvidia Corp. and ASML Holding NV that have benefited from booming demand for AI services. 
Shares in Chinese names linked to DeepSeek, such as Iflytek Co., climbed. Already, developers around the world are experimenting with DeepSeek's software and looking to build tools with it. That could quicken the adoption of advanced AI reasoning models -- while also potentially touching off additional concern about the need for guardrails around their use. DeepSeek's advances may hasten regulation to control how AI is developed. What are DeepSeek's shortcomings? Like all other Chinese AI models, DeepSeek self-censors on topics deemed sensitive in China. It deflects queries about the 1989 Tiananmen Square protests or geopolitically fraught questions such as the possibility of China invading Taiwan. In tests, the DeepSeek bot is capable of giving detailed responses about political figures like Indian Prime Minister Narendra Modi, but declines to do so about Chinese President Xi Jinping. DeepSeek's cloud infrastructure is likely to be tested by its sudden popularity. The company briefly experienced a major outage on Jan. 27 and will have to manage even more traffic as new and returning users pour more queries into its chatbot. --With assistance from Luz Ding, Zheping Huang, Claire Che, Ville Heiskanen and Mayumi Negishi.
[19]
What is Deepseek, Chinese AI model that rattled Chatgpt, Nvidia and freaked out AI world: Explained in 10 points
There's a new player in the artificial intelligence market: DeepSeek. DeepSeek, a new AI model from China, has captured the attention of the tech world, surpassing the likes of ChatGPT, Gemini, and Claude AI in performance. The DeepSeek app, which was launched last week, has overtaken rivals including ChatGPT to become the most downloaded free app in the United States. Shares in US-based and AI-related tech firms such as Nvidia, Microsoft and Meta were all down on Monday ahead of the US markets opening, reports BBC.
- DeepSeek is a Hangzhou-based AI research lab and the R1 is its latest AI model. The company was founded in 2023 by Liang Wenfeng, an engineer and entrepreneur with a background in AI and quantitative finance, according to India Today.
- Before creating DeepSeek, he led a hedge fund known for using AI to analyse financial data. Liang's team at DeepSeek mostly consists of fresh graduates from top Chinese universities like Tsinghua and Peking University.
- The company has now developed AI models that are open-source and are helping developers across the world to improve their technologies. Its app is powered by the open-source DeepSeek-V3 model, which its researchers claim was developed for less than $6m (£7.5m) - significantly less than the billions spent by rivals.
- DeepSeek is an advanced open-source AI system that surpassed OpenAI's ChatGPT to become the top-rated free app on Apple's App Store in several countries, including the US, the UK, and China. DeepSeek's rise to prominence made a splash in Silicon Valley, challenging the belief that the US held a dominant position in this space.
- DeepSeek's latest release, R1, rivals industry leaders such as OpenAI and Anthropic but stands out for being cost-efficient, open source, and offering unlimited free usage. It enables high-performance AI access without the steep costs typically associated with such technologies, according to an NDTV report.
- The one thing that sets DeepSeek R1, which is said to mimic human thinking, apart from other AI models is that it also provides its reasoning before delivering a response to a prompt. The company claims its performance is on par with OpenAI's latest technology.
- DeepSeek's AI models are developed at a fraction of the cost compared to leading products from OpenAI and Meta, raising questions about the need for huge investments in high-end AI accelerators like those from Nvidia.
- One of the key reasons why DeepSeek has managed to create a buzz is its cost to end users: it is free. In a way, this is the first extremely advanced AI system that is available for free to users. Other systems, like the OpenAI o1 and Claude Sonnet, require a paid subscription. Google Gemini is also available for free, but the free version is limited to older models and has usage limits.
- The key reason for the buzz around DeepSeek is that the Chinese AI assistant is celebrated for its transparency, efficiency, and ability to make AI accessible to everyone. DeepSeek's breakthrough comes at a time when the US has placed restrictions on advanced semiconductor exports to China, with the aim of limiting China's ability to advance in AI. DeepSeek has managed to work around these restrictions by developing models that require fewer resources.
- The rise of DeepSeek has had an immediate effect on stock markets. Shares of major US AI-related companies, including Nvidia, Meta, and Microsoft, dropped in response to DeepSeek's success. Analysts have warned that the rise of low-cost Chinese alternatives to established US AI models could disrupt the investment landscape, forcing companies to reconsider their AI strategies.
[20]
What is China government's view on DeepSeek? How is Chinese startup disrupting the AI sector?
Chinese startup DeepSeek has shaken the AI world after its latest models hogged all the limelight. Chinese startup DeepSeek's launch of its latest AI models, which it says are on a par or better than industry-leading models in the United States at a fraction of the cost, is threatening to upset the technology world order. The company has attracted attention in global AI circles after writing in a paper last month that the training of DeepSeek-V3 required less than $6 million worth of computing power from Nvidia H800 chips. DeepSeek's AI Assistant, powered by DeepSeek-V3, has overtaken rival ChatGPT to become the top-rated free application available on Apple's App Store in the United States. This has raised doubts about the reasoning behind some U.S. tech companies' decision to pledge billions of dollars in AI investment and shares of several big tech players, including Nvidia, have been hit. DeepSeek's success has already been noticed in China's top political circles. On January 20, the day DeepSeek-R1 was released to the public, founder Liang attended a closed-door symposium for businessmen and experts hosted by Chinese premier Li Qiang, according to state news agency Xinhua. Liang's presence at the gathering is potentially a sign that DeepSeek's success could be important to Beijing's policy goal of overcoming Washington's export controls and achieving self-sufficiency in strategic industries like AI. A similar symposium last year was attended by Baidu CEO Robin Li. The release of OpenAI's ChatGPT in late 2022 caused a scramble among Chinese tech firms, who rushed to create their own chatbots powered by artificial intelligence. But after the release of the first Chinese ChatGPT equivalent, made by search engine giant Baidu, there was widespread disappointment in China at the gap in AI capabilities between U.S. and Chinese firms. The quality and cost efficiency of DeepSeek's models have flipped this narrative on its head. 
The two models that have been showered with praise by Silicon Valley executives and U.S. tech company engineers alike, DeepSeek-V3 and DeepSeek-R1, are on par with OpenAI and Meta's most advanced models, the Chinese startup has said. They are also cheaper to use. DeepSeek-R1, released last week, is 20 to 50 times cheaper to use than OpenAI's o1 model, depending on the task, according to a post on DeepSeek's official WeChat account. But some have publicly expressed scepticism about DeepSeek's success story. Scale AI CEO Alexandr Wang said during an interview with CNBC on Thursday, without providing evidence, that DeepSeek has 50,000 Nvidia H100 chips, which he claimed would not be disclosed because that would violate Washington's export controls that ban such advanced AI chips from being sold to Chinese companies. DeepSeek did not immediately respond to a request for comment on the allegation. Bernstein analysts on Monday highlighted in a research note that DeepSeek's total training costs for its V3 model were unknown but were much higher than the $5.58 million the startup said was used for computing power. The analysts also said the training costs of the equally-acclaimed R1 model were not disclosed. Q1. When was OpenAI's ChatGPT released? A1. OpenAI's ChatGPT was released in late 2022. Q2. How does the Chinese government view DeepSeek? A2. DeepSeek's success has already been noticed in China's top political circles. On January 20, the day DeepSeek-R1 was released to the public, founder Liang attended a closed-door symposium for businessmen and experts hosted by Chinese premier Li Qiang, according to state news agency Xinhua.
[21]
Chinese AI startup DeepSeek is rattling markets. Here's what to know
A Chinese artificial intelligence startup is rattling Silicon Valley and Wall Street after it demonstrated AI models on par with OpenAI's -- for a fraction of the cost and energy. At just over a year old, Hangzhou-based DeepSeek released results for its latest open-source reasoning model, DeepSeek-R1, last week. It showed a comparable performance to OpenAI's reasoning models, o1-mini and o1, on several industry benchmarks. According to DeepSeek, training and developing one of the latest AI models cost $5.6 million. Meanwhile, U.S. rivals such as OpenAI and Meta have touted spending tens of billions on cutting-edge chips from Nvidia. The results sparked a global sell-off of tech stocks, with Nasdaq, Dow Jones Industrial Average, and S&P 500 futures all falling Monday morning. Here's what to know about DeepSeek and its AI models. The Chinese AI startup was founded in 2023 by Liang Wenfeng, co-founder of Chinese AI quantitative hedge fund High-Flyer. DeepSeek develops open-source models, which means developers have access to and can work on its software. DeepSeek introduced its first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1, last week. The former model was trained by large-scale reinforcement learning and without supervised fine-tuning, DeepSeek said. DeepSeek-R1-Zero "demonstrates remarkable reasoning capabilities," but has challenges with "poor readability" and mixing language, according to the startup. The mobile app for DeepSeek's AI chatbot, also called DeepSeek, has surged to the top of Apple's App Store downloads, while the DeepSeek site is experiencing outages from an influx of new users. The startup announced "large-scale malicious attacks" on Monday, prompting a temporary limit on registrations. The chatbot was powered by DeepSeek-V3, which DeepSeek said performed comparably with Meta's Llama 3.1 and OpenAI's 4o at its release in December. 
Unlike ChatGPT and its other chatbot competitors, DeepSeek explains its "reasoning" before responding to inquiries. However, the Chinese-developed chatbot does not directly answer prompts about politically sensitive topics such as President Xi Jinping or Taiwan. According to DeepSeek, R1 performed comparably with OpenAI's and Meta's models on leading benchmarks such as the AIME 2024, which tests mathematics, and the Massive Multitask Language Understanding (MMLU), which evaluates general knowledge. On the community-driven Chatbot Arena leaderboard, DeepSeek-R1 comes in under Google's Gemini 2.0 Flash Thinking model and ChatGPT-4o. DeepSeek-V3, meanwhile, fell just below OpenAI's o1-preview and full o1 models. Meta, which also develops open-source models, is reportedly concerned that the next version of its flagship Llama will fall behind DeepSeek's models. Specialized groups of researchers at Meta are looking into DeepSeek's models for ways to improve the next Llama model, The Information reported, citing unnamed people familiar with the matter. In a technical report for its V3 model, DeepSeek said it used a cluster of more than 2,000 graphics processing units (GPUs) from Nvidia for training -- much less than the tens of thousands of chips U.S. firms are buying to train similarly-sized models. DeepSeek's seemingly efficient and competitive models could challenge Nvidia's business, which relies on major AI firms such as OpenAI, Meta, and Google spending billions of dollars on its GPUs. Last week, Meta chief executive Mark Zuckerberg said the tech giant is planning to invest between $60 billion and $65 billion in capital expenditures on AI in 2025. He added that Meta's Llama 4 model is expected to "become the leading state of the art model" this year, and that the company plans to "build an AI engineer" that can contribute more code to its research and development efforts. 
Meanwhile, OpenAI, SoftBank, and Oracle recently announced a half-a-trillion-dollar AI infrastructure plan with the Trump administration called Stargate. The new joint venture "intends to invest $500 billion over the next four years building new AI infrastructure for OpenAI in the United States," the AI startup said in a statement. Aside from prompting questions about AI chip spending, DeepSeek's success challenges U.S. efforts to keep advanced chips from entering China. Before leaving office earlier this month, the Biden administration introduced even more measures focused on keeping AI chips out of China. The new regulations reinforce and build upon previous U.S. export controls aimed at restricting China's access to advanced semiconductors that can be used for AI and military development. Under the rules, foundries and packaging companies that want to export certain chips are subject to a broader license requirement unless certain conditions are met. The U.S. also published new guidelines aimed at curbing AI chip sales from U.S. firms, including Nvidia, to specific countries and companies. The new export controls include three tiers of chip restrictions, which give friendly nations full access to U.S.-made chips but add new limitations to others.
[22]
'Everyone Can Profit From It': What Is DeepSeek? China's 'Cheap' to Make AI Chatbot Climbs to the Top of Apple, Google U.S. App Stores
R1 is free and offers capabilities on par with OpenAI's latest ChatGPT model but at a lower development cost. The Chinese startup DeepSeek released its flagship AI model R1 on January 20, surprising Silicon Valley with the model's advanced capabilities. R1 matched or surpassed the functionality of AI released by OpenAI, Google, and Meta -- on a much smaller budget and without the latest AI chips. Over the past week, the DeepSeek app has proven popular with the public. It surged past ChatGPT in popularity, reaching No. 1 on the U.S. Apple app store and within the top free Android apps on the Google Play Store at the time of publication. DeepSeek's R1 release has prompted questions about whether the billions of dollars of AI spending in the past few years was worth it -- and challenged the notion that the U.S. is the world's leader in AI, per BBC. DeepSeek's AI arrives as the U.S. looks to ramp up spending on AI. Last week, President Donald Trump announced a joint project with OpenAI, Oracle, and Softbank called Stargate that commits up to $500 billion over the next four years to data centers and other AI infrastructure. DeepSeek is a Chinese AI startup that creates open AI models -- so any developer can access and build on the technology. DeepSeek is different from ChatGPT because it states its chain-of-thought reasoning before giving a response to a prompt. Apple App Store and Google Play Store reviews praised that level of transparency, per Bloomberg. It is free to download and use, though it does require users to sign up before they can access the AI. In a paper released last month, DeepSeek researchers stated that they built and trained the AI model for under $6 million in only two months. 
They stated that they used around 2,000 Nvidia H800 chips, which Nvidia tailored exclusively for China with lower data transfer rates, or slowed-down speeds when compared to the H100 chips used by U.S. companies. In October, the U.S. Department of Commerce banned the sale of the H800 chip to China with the goal of preventing access to chips that could fuel AI breakthroughs, especially for military purposes. In contrast, Dario Amodei, the CEO of U.S. AI startup Anthropic, said in July that it takes $100 million to train AI -- and there are models today that cost closer to $1 billion to train. So how did DeepSeek pull ahead of the competition with fewer resources? Meta's chief AI scientist Yann LeCun stated in a Threads post on Saturday that DeepSeek had "profited from open research and open source." "They came up with new ideas and built them on top of other people's work," LeCun stated. "Because their work is published and open source, everyone can profit from it." DeepSeek is free and offers top-of-the-line performance. Last week, the scientific journal Nature published an article titled, "China's cheap, open AI model DeepSeek thrills scientists." The article showed that R1's performances on certain chemistry, math, and coding tasks were on par with one of OpenAI's most advanced AI models, the o1 model OpenAI released in September. In addition to high performance, R1 is open-weight, so researchers can study, reuse, and build on it. It isn't considered fully open source because DeepSeek hasn't made its training data public. The AI model has also received stellar reviews. 
Investor Marc Andreessen called it "one of the most amazing and impressive breakthroughs" he had "ever seen" in a Friday post on X while Microsoft CEO Satya Nadella called it "super impressive" at last week's World Economic Forum in Switzerland. Nvidia shares fell by 13% after the opening bell on Monday, wiping $465 billion from the AI chipmaker's market cap. It was the biggest drop in value in U.S. stock market history, per Bloomberg. According to NBC News, Microsoft and Alphabet fell 4% each while Meta fell nearly 2%. NBC News also notes that the Nasdaq Composite fell by 3.4% at the opening bell, while the Dow fell 180 points and the S&P 500 fell nearly 2%.
[23]
What is DeepSeek? How a small Chinese startup shook up the AI sector
Chinese startup DeepSeek released a cost-efficient artificial intelligence (AI) model that is rivaling U.S. AI programs. U.S. tech stocks went reeling after a small Chinese artificial intelligence startup said it can compete with the likes of ChatGPT and other U.S.-based AI models at a fraction of the cost. DeepSeek, a Hangzhou-based startup founded in 2023, shot to the top of Apple's App Store free app chart after releasing a new open-source AI model it says rivals OpenAI's o1 model. Its website was hit by outages amid a spike in interest. Tech stocks, including those for leading AI chip manufacturer Nvidia, tumbled Monday. "Clearly tech stocks are under massive pressure led by Nvidia as the Street will view DeepSeek as a major perceived threat to US tech dominance and owning this AI Revolution," Wedbush Securities analyst Daniel Ives said in a note. What is DeepSeek? DeepSeek is an artificial intelligence lab. The startup says its AI models, DeepSeek-V3 and DeepSeek-R1, are on par with OpenAI and Meta's most advanced models. Chinese corporate records show the controlling shareholder is Liang Wenfeng, co-founder of the hedge fund High-Flyer. Analysts say the technology is impressive, especially since DeepSeek says it used less-advanced chips to power its AI training. Former President Joe Biden's administration had limited the export of certain advanced AI chips. "In the wrong hands, powerful AI systems have the potential to exacerbate significant national security risks, including by enabling the development of weapons of mass destruction, supporting powerful offensive cyber operations, and aiding human rights abuses, such as mass surveillance," the Biden administration said in a statement earlier this month. But DeepSeek says the chip restrictions haven't stopped it from releasing a model that is 20 to 50 times cheaper than the OpenAI o1 model, depending on the task. 
"We were primed to expect that AI development would grow by leaps and bounds since the public launch of ChatGPT, but the U.S. was caught by surprise when the latest leap came from China," Damian Rollison, director of market insights for AI marketing firm SOCi, told USA TODAY in an emailed statement. Some have disputed the startup's claims. Scale AI CEO Alexandr Wang argued during a CNBC interview last week that the startup used advanced Nvidia chips. Bernstein analysts also said in a note that total training costs were higher than DeepSeek claims. DeepSeek did not respond to a request for comment from USA TODAY. How is the stock market reacting to DeepSeek? Tech stocks tumbled Monday morning. Chip manufacturer Nvidia fell as much as 18%, wiping out about $560 billion from the company's market cap ‒ the largest in market history, according to Bloomberg. The benchmark S&P 500 was down 1.9%, the Dow Jones Industrial Average was up 0.3% and the tech-heavy Nasdaq composite was down 3.4% early Monday afternoon. Is the DeepSeek app free? DeepSeek's app is powered by the DeepSeek-V3 model. The startup describes its app as using "state-of-the-art" AI that "leads global standards and matches top-tier international models." The app hit the top of Apple's App Store "top free apps" chart after a Jan. 10 launch. Are there security concerns? DeepSeek is the latest app with connections to China to hit the top of the Apple App Store charts. TikTok competitor RedNote shot to the top of the social networking app rankings earlier this month. DeepSeek, even more than TikTok and RedNote, is expected to raise security concerns, according to Rollison of SOCi. "It seems likely that the AI arms race, as it's already being called, will have geopolitical implications that go beyond mere economic competition, which will in turn impact the future of these transformative technologies," Rollison said. 
DeepSeek hit by cyberattack DeepSeek's website on Monday said registration may be busy "due to large-scale malicious attacks" on services. It suggested new users wait and try again. Registered users can log in normally, according to the company.
[25]
China's DeepSeek AI is suddenly on top of Apple's App Store -- dethroning ChatGPT
DeepSeek, a Chinese artificial intelligence startup, has recently captured significant attention by surpassing ChatGPT on Apple Inc.'s App Store download charts. The company's AI assistant reached the number one position shortly after the release of its latest open-source AI model, DeepSeek-R1. This advancement is creating ripples in the global AI landscape, as companies and experts -- particularly those based in the United States -- reassess their positions in the competitive AI market. The rise of DeepSeek is underscored by its performance benchmarks, which show it outperforming some of the industry's leading models, including OpenAI's ChatGPT. The open-source nature of DeepSeek's platform further enhances its appeal, as it allows users and developers worldwide to access and contribute to its ongoing development. This approach is seen by many as a cornerstone in the escalation of an international AI arms race, threatening the technological advantage traditionally held by U.S.-based firms. DeepSeek's emergence in the spotlight has been attributed to its innovative resource optimization strategies. Unlike many AI developers that focus heavily on acquiring advanced hardware, DeepSeek has concentrated its efforts on maximizing the potential of software. Marina Zhang, an associate professor at the University of Technology Sydney, highlights that this approach differentiates DeepSeek from other firms and contributes significantly to its competitiveness. The app itself, currently topping the App Store charts, is praised for its functionality. Users appreciate the seamless performance comparable to premium versions of other popular AI models, notably ChatGPT. Reviews highlight the transparency offered by DeepSeek, as it demonstrates its processes and reasoning, instilling greater confidence in the accuracy of its outputs. 
This level of transparency, coupled with the app being entirely free to use without any subscription tiers, adds a compelling edge in attracting a growing user base. The unexpected ascent of DeepSeek has generated a palpable sense of urgency among Silicon Valley technology leaders. Questions are mounting about the future dynamics of the AI industry as Chinese innovations continue to gain strength. Nigel Green, chief executive of financial advisory firm deVere Group, noted that DeepSeek's advancements are disrupting the global tech landscape and raising the stakes in the AI arms race. The broader implications of DeepSeek's rise are being closely watched by industry experts. The open-source nature of DeepSeek-R1 invites broader collaboration and potential enhancements from the global developer community, a factor that could accelerate its evolution and competitiveness. Analysts suggest that this model of open research could reshape how AI is developed and deployed, potentially setting new benchmarks for collaboration and innovation. In the face of DeepSeek's rapid success, other AI companies, including those from China such as Kimi AI, are also making moves to establish a foothold in this burgeoning market. Kimi AI's recent announcement of its Kimi k1.5 AI model is indicative of the rapidly intensifying competition within the AI sector, suggesting that the push for innovation is far from over. As DeepSeek continues to gain traction, the global AI community remains attentive to its developments. The unfolding scenario underscores the dynamic nature of technological advancements and the unpredictable shifts in market leadership. The progress made by DeepSeek is a testament to the growing influence of Chinese tech companies in the global arena, and a reminder of the ever-evolving landscape of artificial intelligence development.
[26]
Chinese AI DeepSeek Deep Sixes OpenAI on the App Store, Stocks Tank
Chinese AI chatbot DeepSeek has displaced OpenAI’s ChatGPT as the most downloaded app on the Apple App Store and the market is panicking. Stocks for major AI-connected companies like NVIDIA fell on Monday morning following the news. DeepSeek released its R1 chatbot at the end of December and it’s only taken the app about a month to claim its spot at the top of the App Store. Upon the news that it reached the top, share prices for GPU maker NVIDIA fell 11%, Oracle dropped 8%, and Palantir was down 5% (Cramer effect). Stocks are adjusting to the revelation that China can build AI faster, cheaper, and just as well as America. The reason the market is flipping out is that DeepSeek-R1 took two months to build, cost $6 million, and was trained using cheaper and lower-powered NVIDIA H800 GPUs. It’s also open source so anyone can go to GitHub right now, pull it apart, and tinker with it. When it comes to training AI models, Silicon Valley has always said that bigger and more expensive is better. OpenAI CEO Sam Altman seems to believe he’s building some kind of literal god on earth called AGI and that the construction of this beast would require massive resources. OpenAI’s business models require sacrificing the present to construct a terrifying future in its own image. OpenAI said it needed to steal the creative output of everyone on the planet to refine its models and that it couldn’t pay them for it. It would need to build vast data centers to process information powered by new nuclear power plants. Meta, Anthropic, and other companies are investing billions to build infrastructure to power AI with the hope of massive returns and a future made easier and simpler for tech billionaires. Softbank and OpenAI teamed up with President Trump to announce Stargate, a $500 billion AI initiative. The U.S. passed the CHIPS Act to make sure China didn’t get the most advanced GPUs and wasn’t able to build AI as fast as America. 
Silicon Valley’s AI lords were set to spend billions to rebuild pieces of the country and, they hoped, change the way America worked. Six million dollars. Two months. NVIDIA H800 SXM5s. That’s all it took to call into question this dream of the future. “To see the DeepSeek new model, it’s super impressive in terms of both how they have really effectively done an open-source model that does this inference-time compute, and is super-compute efficient,” Satya Nadella, Microsoft CEO and OpenAI business partner, said at Davos last week. “We should take the developments out of China very, very seriously.”
[27]
DeepSeek gets Silicon Valley talking | TechCrunch
Since Chinese AI company DeepSeek released an open version of its reasoning model R1 at the beginning of this week, many in the tech industry have been making grand pronouncements about what the company achieved, and what it means for the state of AI. Venture capitalist Marc Andreessen, for example, posted that DeepSeek is "one of the most amazing and impressive breakthroughs I've ever seen." R1 seemingly matches or beats OpenAI's o1 model on certain AI benchmarks. And the company claims one of its models only cost $5.6 million to train, compared to the hundreds of millions of dollars that leading American companies pay to train theirs. It also seems to have achieved that in the face of U.S. sanctions that prohibit the sale of advanced chips to Chinese companies. The MIT Technology Review writes that the company's success illustrates how sanctions are "driving startups like DeepSeek to innovate in ways that prioritize efficiency, resource-pooling, and collaboration." (On the other hand, the Wall Street Journal reports that DeepSeek's Liang Wenfeng recently told China's premier that American export restrictions still pose a bottleneck.) Curai CEO Neal Khosla offered a simpler explanation, claiming that the company is a "ccp state psyop" that's "faking the cost was low to justify setting price low and hoping everyone switches to it [to] damage AI competitiveness in the us." (A Community Note has been attached to his post pointing out that Khosla offers no evidence for this, and that his father Vinod is an OpenAI investor.) Meanwhile, journalist Holger Zschaepitz suggested DeepSeek "could represent the biggest threat to US equity markets" -- if a Chinese company can build a cutting-edge model at low cost, without access to advanced chips, it would call into question "the utility of the hundreds of billions worth of capex being poured into this industry." 
In response, Y Combinator CEO Garry Tan argued DeepSeek's success would actually be good for its American competitors: "If training models get cheaper faster and easier, the demand for inference (actual real world use of AI) will grow and accelerate even faster, which assures the supply of compute will be used." And Meta's Chief AI Scientist Yann LeCun argued against looking at DeepSeek's announcement through the lens of China vs. the United States. Instead, he suggested the real lesson is that "open source models are surpassing proprietary ones." "DeepSeek has profited from open research and open source (e.g. PyTorch and Llama from Meta)," LeCun wrote. "They came up with new ideas and built them on top of other people's work. Because their work is published and open source, everyone can profit from it." All of the debate seems to be driving consumers to try the product -- as of Sunday afternoon, DeepSeek's AI assistant is the top free app in the Apple App Store, just ahead of ChatGPT.
[28]
DeepSeek vs OpenAI: How China's AI giant is outpacing the ChatGPT?
Chinese AI research lab DeepSeek has gained global attention with its new AI model, DeepSeek-R1, which rivals industry leaders like OpenAI in reasoning, code generation, and efficiency. Operating independently of major Chinese tech firms, DeepSeek has focused on resource-efficient AI development despite US chip restrictions. With an open-source approach and young research talent, the company aims to democratise AI technology and challenge Western dominance in the field. DeepSeek, a Chinese AI research lab, has introduced an advanced AI model, DeepSeek-R1, which is reported to rival leading AI systems from global giants such as OpenAI. The company claims that its model outperforms competitors in areas including mathematical reasoning, code generation, and cost efficiency. The release of this model signals a potential shift in the global AI landscape, where DeepSeek is emerging as a serious competitor, as reported by Wired. DeepSeek originated as Fire-Flyer, a deep-learning research branch of High-Flyer, a Chinese quantitative hedge fund founded in 2015. High-Flyer gained recognition by using advanced computing to analyse financial data. In 2023, Liang Wenfeng, the fund's founder, shifted its focus towards AI research, establishing DeepSeek to develop pioneering AI models. Unlike many Chinese AI firms, DeepSeek operates independently of major corporations such as Baidu and Alibaba. Liang's motivation behind DeepSeek was scientific curiosity rather than immediate financial gain. He stated, "Basic science research rarely offers high returns on investment." DeepSeek-R1 employs reinforcement learning (RL) techniques and multi-stage training to enhance its capabilities. The company has also open-sourced its flagship model along with six smaller variants, ranging from 1.5 billion to 70 billion parameters, under an MIT licence. This allows developers to refine and commercialise the models freely. 
In contrast to conventional models reliant on supervised fine-tuning, DeepSeek-R1-Zero developed strong reasoning abilities through RL training alone. To address language inconsistencies and enhance usability, DeepSeek later introduced DeepSeek-R1, which reportedly matches OpenAI's o1 model in reasoning performance. DeepSeek has implemented several cost-effective strategies, making its models highly resource-efficient. It incorporated innovations such as multi-head latent attention (MLA) and a mixture of experts, which allowed it to achieve significant computational efficiency. According to Epoch AI, DeepSeek's model required just one-tenth of the computing power used by Meta's Llama 3.1 model. "DeepSeek represents a new wave of Chinese companies focused on long-term innovation over short-term gains," a tech analyst told Wired. DeepSeek's workforce is composed mainly of young graduates from prestigious Chinese institutions such as Peking University and Tsinghua University. Liang noted in an interview with 36Kr that hiring fresh graduates fosters a collaborative culture ideal for tackling complex challenges. "Our core technical positions are mostly filled by people who graduated this year or in the past one or two years," Liang stated. He emphasised that these young researchers are driven by a mission to elevate China's status in AI innovation. DeepSeek's progress comes despite US export controls imposed in October 2022, which limited China's access to advanced computing hardware, including Nvidia's H100 chips. Initially, DeepSeek had a stockpile of 10,000 H100s, but it soon faced challenges in acquiring additional resources. Liang remarked, "The problem we are facing has never been funding, but the export control on advanced chips." To overcome these challenges, DeepSeek focused on software-driven resource optimisation and alternative engineering approaches. This strategic adaptation allowed the company to progress without heavily relying on high-end chips. 
DeepSeek's decision to open-source its AI models has gained it significant recognition within the AI research community. By providing access to its model weights and outputs, the company aims to empower developers worldwide and challenge Western dominance in AI. "DeepSeek has embraced open source methods, pooling collective expertise and fostering collaborative innovation," said Marina Zhang, an associate professor at the University of Technology Sydney. DeepSeek's advancements have placed pressure on Western AI firms to remain competitive. Industry analysts suggest that the company's focus on resource efficiency and innovation could disrupt the current AI landscape, which traditionally depends on extensive computational power. As the competition in AI development intensifies, DeepSeek's success underscores the potential of alternative approaches in the face of technological restrictions. The company's unique strategy of blending scientific curiosity with cost-effective AI solutions could redefine global AI development trends.
[29]
What is DeepSeek, the Chinese AI company upending the stock market?
A frenzy over an artificial intelligence chatbot made by Chinese tech startup DeepSeek was upending stock markets Monday and fueling debates over the economic and geopolitical competition between the U.S. and China in developing AI technology. DeepSeek's AI assistant became the No. 1 downloaded free app on Apple's iPhone store Monday, propelled by curiosity about the ChatGPT competitor. Part of what's worrying some U.S. tech industry observers is the idea that the Chinese startup has caught up with the American companies at the forefront of generative AI at a fraction of the cost. That, if true, calls into question the huge amounts of money U.S. tech companies say they plan to spend on the data centers and computer chips needed to power further AI advancements. But hype and misconceptions about DeepSeek's technological advancements also sowed confusion. "The models they built are fantastic, but they aren't miracles either," said Bernstein analyst Stacy Rasgon, who follows the semiconductor industry and was one of several stock analysts describing Wall Street's reaction as overblown. "They're not using any innovations that are unknown or secret or anything like that," Rasgon said. "These are things that everybody's experimenting with." The startup DeepSeek was founded in 2023 in Hangzhou, China and released its first AI large language model later that year. Its CEO Liang Wenfeng previously co-founded one of China's top hedge funds, High-Flyer, which focuses on AI-driven quantitative trading. DeepSeek began attracting more attention in the AI industry last month when it released a new AI model that it boasted was on par with similar models from U.S. companies such as ChatGPT maker OpenAI, and was more cost-effective in its use of expensive Nvidia chips to train the system on huge troves of data. The chatbot became more widely accessible when it appeared on Apple and Google app stores early this year. 
But it was a follow-up research paper published last week -- on the same day as President Donald Trump's inauguration -- that set in motion the panic that followed. That paper was about another DeepSeek AI model called R1 that showed advanced "reasoning" skills -- such as the ability to rethink its approach to a math problem -- and was significantly cheaper than a similar model sold by OpenAI called o1. "What their economics look like, I have no idea," Rasgon said. "But I think the price points freaked people out." Behind the drama over DeepSeek's technical capabilities is a debate within the U.S. over how best to compete with China on AI. "Deepseek R1 is AI's Sputnik moment," said venture capitalist Marc Andreessen in a Sunday post on social platform X, referencing the 1957 satellite launch that set off a Cold War space exploration race between the Soviet Union and the U.S. Andreessen, who has advised Trump on tech policy, has warned that over regulation of the AI industry by the U.S. government will hinder American companies and enable China to get ahead. But the attention on DeepSeek also threatens to undermine a key strategy of U.S. foreign policy in recent years to restrict the sale of American-designed AI semiconductors to China. Some experts on U.S.-China relations don't think that is an accident. "The technology innovation is real, but the timing of the release is political in nature," said Gregory Allen, director of the Wadhwani AI Center at the Center for Strategic and International Studies. Allen compared DeepSeek's announcement last week to U.S.-sanctioned Chinese company Huawei's release of a new phone during diplomatic discussions over Biden administration export controls in 2023. "Trying to show that the export controls are futile or counterproductive is a really important goal of Chinese foreign policy right now," Allen said. 
Trump signed an order on his first day in office last week that said his administration would "identify and eliminate loopholes in existing export controls," signaling that he is likely to continue and harden Biden's approach.
[30]
DeepSeek defies America's AI supremacy
America's superinflated tech stocks had seemed due for a correction for months, but the trigger has come from an unexpected source. The latest large language model from China's artificial intelligence start-up DeepSeek may not be quite a "Sputnik moment". There are parallels, even so, with the early space race, when ingenuity helped Soviet engineers keep pace with and sometimes surpass US rivals, despite their relative lack of computing power and cutting-edge technologies. The US went on to win the Moon race, and establish a sustainable edge in space. But DeepSeek's breakthrough upturns the assumptions that have underpinned US tech valuations, of an unassailable supremacy in AI that would be extended by spending billions of dollars on chips and infrastructure. DeepSeek's achievement is to have developed an LLM that AI experts say achieves a performance similar to US rivals OpenAI and Meta but claims to use far fewer -- and less advanced -- Nvidia chips, and to have been trained for a fraction of the cost. Some of its assertions remain to be verified. If they are true, however, it represents a potentially formidable competitor. First, like Meta of the US but unlike OpenAI or Google's Gemini, it is open source -- ready to share the recipe for its secret sauce rather than keep it locked away in hope of extracting maximum financial gain. That makes it appealing for developers to use and build on. Second, it can be put together on much more of a shoestring budget and with much less computing power. This explains Monday's plunging stock prices not just of front-line tech companies but of those that make chip equipment and supply electrical hardware for data centres. The big US companies are focused on a particular model of scaling AI by throwing vast amounts of capital, data and computing power at the problem. First Mistral, a French AI start-up, and now DeepSeek have demonstrated that companies with fewer resources may end up with smarter and more efficient models. 
The increasing efficiency of these models is also likely to spur increased demand for their use. DeepSeek's advance highlights, too, that China is managing to make technological leaps in AI despite export controls introduced by the Biden administration intended to deprive it of both the most powerful chips and the advanced tools needed to make them. Chinese AI start-ups have been compelled to find inventive ways of extracting the most juice from the chips they do have. Far from stifling Chinese innovation, Washington may have stimulated it. And the success of often domestically trained Chinese engineers in increasing efficiency and finding workarounds raises questions over whether the technological "moat" established by high-spending US groups such as Meta, Google, OpenAI and Anthropic is as wide and impregnable as they had thought. Many advanced democracies will be wary of a Chinese government seen in many ways as inimical to western interests potentially acquiring leadership in the most transformative technology of our era. Yet some political leaders elsewhere, along with many consumers and developers, may welcome a market that is less dominated by a handful of American companies. The open question now is not necessarily who will develop the best AI models but who can apply them best to real-world tasks. Kai-Fu Lee, a Chinese AI pioneer, has long argued that China excels on the application front even if it may lag behind in infrastructure. That was before the Chinese start-up world was squeezed by the political clampdown on tech entrepreneurs and the surge of investment in US AI start-ups. But after DeepSeek's achievement, it looks a much more even game.
[31]
DeepSeek Tops iPhone Downloads: What to Know About the AI Assistant
Samantha Kelly is a freelance writer with a focus on consumer technology, AI, social media, Big Tech, emerging trends and how they impact our everyday lives. Her work has been featured on CNN, NBC, NPR, the BBC, Mashable and more. China's AI assistant has officially entered the chat - and it's putting the US on notice. DeepSeek surged to the top of Apple's App Store over the weekend, surpassing OpenAI's ChatGPT with lightning-fast, highly logical responses. Some users claim its natural language processing, writing quality and reasoning surpass US counterparts including OpenAI, Meta and Google. At the World Economic Forum in Davos, Switzerland, last week, excitement intensified around DeepSeek, a startup founded by Chinese hedge fund manager Liang Wenfeng, following the release of its AI model DeepSeek R1. The model was reportedly built in just a few months, at a fraction of the cost of US models and runs on less advanced Nvidia chips, raising questions about how China is managing to compete without access to cutting-edge US technology. Here's a deeper dive on what you need to know. DeepSeek is capturing attention for its speed, efficiency and reasoning capabilities, prompting comparisons to leading US models like ChatGPT. It's also sparked discussion about its high performance despite limited computing resources that stem from US export restrictions on advanced AI chips. DeepSeek researchers previously claimed they spent about $6 million to develop an earlier AI model, using approximately 2,000 H800 Nvidia chips - hardware with lower data transfer rates. Meanwhile, US companies are pouring billions of dollars into AI development. DeepSeek launched on Jan. 20 - the same day as President Donald Trump's inauguration - coinciding with renewed US efforts to maintain a lead in the growing AI arms race. Last week, Trump announced a new AI infrastructure initiative pledging up to $500 billion in partnership with OpenAI and other tech firms. 
The timing also follows growing scrutiny of Chinese tech companies, with tensions already high over TikTok's data privacy concerns. Unlike TikTok, however, DeepSeek is an open-source model, which means users can run it on their own computers. But its Chinese origins and long-term strategic implications remain under scrutiny. DeepSeek's debut is already being hailed as a potential turning point in the global AI race. Prominent Silicon Valley investor Marc Andreessen even called it "AI's Sputnik moment," referring to the Russian satellite launch in the late 1950s that kicked off the space race. The launch highlights how China may be closing the AI gap faster than expected.
[32]
Why you should pay attention to DeepSeek AI
Despite a cyber attack, the open-source startup is rapidly climbing over its more established competitors. Here's what we know. A new player has very much entered the artificial intelligence (AI) villa, and it's already causing some significant disruption. DeepSeek, a Chinese AI startup, made waves last week when it released the full version of R1, the company's open-source reasoning model that can outperform OpenAI's o1. On Monday, downloads of DeepSeek's AI assistant topped ChatGPT in the App Store, which had previously been the most downloaded free app. DeepSeek has also already climbed to the third spot overall on HuggingFace's Chatbot Arena, under several Gemini models as well as ChatGPT-4o. But almost as soon as it dethroned OpenAI, DeepSeek began limiting signups due to a cyberattack. ZDNET is currently testing DeepSeek, as we do all other popular AI chatbots, to see how it shapes up, pending signup limitations. Founded in May 2023 by Liang Wenfeng (and thus not even two years old), the Chinese startup has challenged established AI companies with its open-source approach. According to Forbes, DeepSeek's edge may lie in the fact that it is funded only by High-Flyer, a hedge fund also run by Wenfeng, which gives the company a funding model that supports fast growth and research. Released in full last week, R1 is DeepSeek's flagship reasoning model, which performs at or above OpenAI's lauded o1 model on several math, coding, and reasoning benchmarks. What makes R1 most interesting is that, unlike other top models from tech giants, it's open-source, meaning anyone can download and use it. The model also cost significantly less to train than comparable options and is therefore cheaper to access. For reference, R1 API access starts at $0.14 for a million tokens, which is a fraction of the $7.50 OpenAI charges for the equivalent tier. 
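To put the quoted prices in perspective, the gap can be worked out directly. What follows is a minimal Python sketch, assuming the two figures above ($0.14 and $7.50 per million tokens) are directly comparable flat rates; the function name `cost_usd` and the 5-million-token workload are hypothetical, chosen only for illustration.

```python
def cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Cost of processing `tokens` tokens at a flat per-million-token price."""
    return tokens / 1_000_000 * price_per_million_usd

# Prices as quoted in the article, in USD per million tokens.
R1_PRICE = 0.14
O1_PRICE = 7.50

# Hypothetical workload, used only to make the comparison concrete.
workload = 5_000_000

r1_cost = cost_usd(workload, R1_PRICE)
o1_cost = cost_usd(workload, O1_PRICE)
ratio = O1_PRICE / R1_PRICE  # how many times more expensive the o1 tier is

print(f"R1: ${r1_cost:.2f}, o1: ${o1_cost:.2f}, ratio: {ratio:.1f}x")
```

On these numbers, the o1 tier works out to roughly 54 times the R1 price, whatever the size of the workload.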
One drawback that could impact its long-term competition with o1 and other American-made models is censorship. Chinese models often include blocks on certain subject matter, meaning that while they function comparably to other models, they may not answer some queries. In December, ZDNET's Tiernan Ray compared R1-Lite's ability to explain its chain of thought to that of o1, and the results were mixed. Of course, all popular models come with their own red-teaming background, community guidelines, and content guardrails -- but at least at this stage, American-made chatbots are unlikely to refrain from answering queries about historical events. Data privacy worries that have circulated around TikTok -- the Chinese-owned social media app that is now somewhat banned in the US, ostensibly for that reason -- are also cropping up about DeepSeek. It's unclear what user data the company may be collecting or potentially sharing with the Chinese government (claims made by the US government that TikTok owner ByteDance has repeatedly denied). "The personal information we collect from you may be stored on a server located outside of the country where you live," DeepSeek's privacy policy reads. "We store the information we collect in secure servers located in the People's Republic of China." The policy continues that "Where we transfer any personal information out of the country where you live, including for one or more of the purposes as set out in this Policy, we will do so in accordance with the requirements of applicable data protection laws." However, some believe the fact that R1 is open-source means increased transparency, giving users the opportunity to inspect the model's source code for signs of privacy-related activity. 
Regardless, DeepSeek also released smaller versions of R1, which can be downloaded and run locally to avoid any concerns about data being sent back to the company (as opposed to accessing the chatbot online). All chatbots, including ChatGPT, are collecting some degree of user data when queried via the browser. R1's success highlights a sea change in AI that could empower smaller labs and researchers to create competitive models and diversify the field of available options. For example, organizations without the funding or staff of OpenAI can download R1 and fine-tune it to compete with models like o1. Just before R1's release, researchers at UC Berkeley created an open-source model that is on par with o1-preview, an early version of o1, in just 19 hours and for roughly $450. Given how exorbitant AI investment has become, many are speculating that this development could burst the AI bubble. Multiple reports indicate the stock market is already panicking. DeepSeek's ascent comes at a critical time for Chinese-American tech relations, just days after the long-fought TikTok ban went into (at least partial?) effect.
[33]
DeepSeek: China's open-sourced AI challenges Google, Meta dominance
Last week, the one-year-old company unveiled its open-sourced AI model, DeepSeek-R1, rivaling the likes of OpenAI, Google, and Meta and questioning the widely held view that AI development needs increasing money and energy. DeepSeek's arrival on the AI scene has sent stocks of US-based global technology companies tumbling, raising questions about their valuations and the position of the US as the technology leader in this area. Founded in 2023, DeepSeek works to develop open-sourced AI models. Its founder, Liang Wenfeng, runs a quantitative hedge fund, High Flyer, which has been known for using advanced computing to analyze financial data since 2015. Liang's scientific curiosity soon took over, and he created DeepSeek to develop groundbreaking models that could be inspected and improved by the developer community and not aimed at generating monetary returns. DeepSeek's mobile app was released in January this year and has surged to the top of the iPhone download charts. Unlike OpenAI's ChatGPT, DeepSeek's chatbot articulates its reasoning before giving a response to a prompt, Bloomberg reported.
[34]
Why China's DeepSeek could burst our AI bubble
The startup's open-source reasoning model R1 is clobbering AI rivals and panicking investors. But there's more. A new player has made a big entrance in the AI villa, and it's creating significant disruption. Chinese AI startup DeepSeek made waves last week when it released the full version of R1, the company's open-source reasoning model that can outperform OpenAI's o1. On Monday, App Store downloads of DeepSeek's AI assistant topped ChatGPT, which had previously been the most downloaded free app. DeepSeek has also already climbed to the third spot overall on HuggingFace's Chatbot Arena, under several Gemini models as well as ChatGPT-4o. But almost as soon as it dethroned OpenAI, DeepSeek began limiting signups due to a cyberattack. ZDNET is currently testing DeepSeek, as we do all other popular AI chatbots, to see how it shapes up, pending signup limitations. Founded by Liang Wenfeng in May 2023 (and thus not even two years old), the Chinese startup has challenged established AI companies with its open-source approach. According to Forbes, DeepSeek's edge may lie in the fact that it is funded only by High-Flyer, a hedge fund also run by Wenfeng, which gives the company a funding model that supports fast growth and research. Released in full last week, R1 is DeepSeek's flagship reasoning model, which performs at or above OpenAI's lauded o1 model on several math, coding, and reasoning benchmarks. What makes R1 most interesting is that, unlike other top models from tech giants, it's open-source, meaning anyone can download and use it. The model also costs significantly less to train than comparable options and is therefore cheaper to access. For reference, R1 API access starts at $0.14 for a million tokens, which is a fraction of the $7.50 that OpenAI charges for the equivalent tier. 
One drawback that could impact its long-term competition with o1 and other US-made models is censorship. Chinese models often include blocks on certain subject matter, meaning that while they function comparably to other models, they may not answer some queries. In December, ZDNET's Tiernan Ray compared R1-Lite's ability to explain its chain of thought to that of o1, and the results were mixed. Of course, all popular models come with their own red-teaming background, community guidelines, and content guardrails -- but at least at this stage, American-made chatbots are unlikely to refrain from answering queries about historical events. Data privacy worries that have circulated around TikTok -- the Chinese-owned social media app that is now somewhat banned in the US -- are also cropping up about DeepSeek. It's unclear what user data DeepSeek may be collecting or potentially sharing with the Chinese government (according to claims made by the US government that TikTok owner ByteDance has repeatedly denied). "The personal information we collect from you may be stored on a server located outside of the country where you live," DeepSeek's privacy policy states. "We store the information we collect in secure servers located in the People's Republic of China." The policy continues: "Where we transfer any personal information out of the country where you live, including for one or more of the purposes as set out in this Policy, we will do so in accordance with the requirements of applicable data protection laws." According to some observers, the fact that R1 is open-source means increased transparency, giving users the opportunity to inspect the model's source code for signs of privacy-related activity. 
Regardless, DeepSeek also released smaller versions of R1, which can be downloaded and run locally to avoid any concerns about data being sent back to the company (as opposed to accessing the chatbot online). All chatbots, including ChatGPT, are collecting some degree of user data when queried via the browser. R1's success highlights a sea change in AI that could empower smaller labs and researchers to create competitive models and diversify the field of available options. For example, organizations without the funding or staff of OpenAI can download R1 and fine-tune it to compete with models like o1. Just before R1's release, researchers at UC Berkeley created an open-source model that is on par with o1-preview, an early version of o1, in just 19 hours and for roughly $450. Given how exorbitant AI investment has become, many are speculating that this development could burst the AI bubble. Multiple reports indicate the stock market is already panicking. DeepSeek's ascent comes at a critical time for Chinese-American tech relations, just days after the long-fought TikTok ban went into (partial?) effect.
[35]
Chinese startup stuns AI world with low-cost model that rivals the best from US firms - SiliconANGLE
Chinese startup stuns AI world with low-cost model that rivals the best from US firms Technology stocks were hammered in early Monday trading on news that a Chinese startup has built and released to open source a chatbot based on an AI model that rivals the performance of the most capable models built by US companies. DeepSeek, which is operated by Hangzhou DeepSeek Artificial Intelligence Co. Ltd. and Beijing DeepSeek Artificial Intelligence Co. Ltd., made waves last week when it published a paper outlining the development process for its two primary models called DeepSeek-R1-Zero and DeepSeek-R1. The R1 model is reported to have been trained for just $5.6 million, a stark contrast to the hundreds of millions or billions spent by U.S. companies like OpenAI LLC, Google LLC, and Meta Platforms Inc. The cost advantage suggests that significant progress can be made without the massive capital expenditures that have been a cornerstone of AI development. DeepSeek's claims that it needs fewer and less advanced chips than other AI models immediately raised doubts about whether the massive spending on artificial intelligence that is expected over the next few years is necessary. Stocks of US companies with heavy investments in AI were hit hard on Monday, led by Nvidia Corp., which was down more than 15% at noon. Shares of other AI-focused chipmakers were also slammed; Broadcom Inc. fell 16%, Taiwan Semiconductor Manufacturing Corp. dropped more than 14% and Marvell Technology, Inc. sank 14%. The Nasdaq Composite Index was off more than 3% in early trading. Meanwhile, DeepSeek quickly rocketed up the charts to become the number one productivity app on the Apple App Store. The market reaction reflects anxiety over whether the U.S. can maintain its dominance in AI technology amid growing competition. The advancements sow doubts about the "need for huge western hardware investment," wrote the Financial Times. 
Writing on X, venture capitalist Marc Andreessen called DeepSeek R1 "AI's Sputnik moment," referring to the Soviet Union's surprise 1957 launch of a satellite that kicked off the space race. He added that the model is "one of the most amazing and impressive breakthroughs I've ever seen." China is regarded as the U.S.'s biggest competitor in AI, but China's competitive position has been hobbled by difficulty obtaining high-end chips, many of which are covered by export controls. DeepSeek's ability to achieve comparable performance to the largest and costliest AI models may now force investors and chipmakers to rethink their approach to AI investment. Some analysts argued that the market sell-off may be an overreaction, noting that U.S. companies still hold a strong position in AI and have invested heavily in hardware infrastructure. "Markets had gotten too complacent on the beginning of the Trump 2.0 era and may have been looking for an excuse to pull back," Michael Block, a market strategist at Third Seven Capital LLC, told CNN. The news comes as U.S. companies and the U.S. government have signaled their intentions to step up spending on AI. Last week a consortium of companies announced plans to invest $500 billion to build a cluster of AI data centers. Meta Platforms last week also said it plans to spend $65 billion this year to expand AI infrastructure. Keith Lerner, an analyst at Truist Financial Corp., told CNN, "The DeepSeek model rollout is leading investors to question the lead that US companies have and how much is being spent and whether that spending will lead to profits (or overspending)." In their paper posted on arXiv, a consortium of authors explained how the two primary models - DeepSeek-R1-Zero and DeepSeek-R1 - were initially trained using large-scale reinforcement learning without the initial supervised fine-tuning stage. 
Early models demonstrated "impressive reasoning capabilities" but suffered from poor readability and language mixing issues, they wrote. DeepSeek-R1 was based on and incorporated a small amount of cold-start data and multi-stage training. The approach improved reasoning performance and readability, achieving results comparable to OpenAI-o1-1217, a generative pre-trained transformer that was released last September. The authors described the reinforcement learning process, emphasizing the role of reward modeling and the challenges encountered, such as reward hacking. Smaller models distilled from DeepSeek-R1 showed notable performance improvements with distilled models using less than 70 billion parameters, compared to the trillions that are used to train the largest AI models. DeepSeek-R1 achieved high scores on multiple benchmarks and performs well on coding, general knowledge and open-ended generation tasks.
[36]
DeepSeek #1 on App Store, shocks AI experts, sends shares tumbling
China's DeepSeek - an AI chatbot intended to rival ChatGPT - is currently the number one download in the App Store, after its performance took US companies by surprise. The Chinese startup appears to be rivalling the performance of OpenAI's ChatGPT despite having cost far less to develop, and that's hitting the market valuations of major US AI players ... Bloomberg reports. Chinese startup DeepSeek's eponymous AI assistant rocketed to the top of Apple Inc.'s iPhone download charts, stirring doubts in Silicon Valley about the strength of America's lead in AI. The app's underlying artificial intelligence model is widely seen as competitive with OpenAI and Meta Platforms Inc.'s latest. Its claim that it cost much less to train and develop triggered share moves across Asia's supply chain. Reviews praise the app as being as good as the pro version of ChatGPT, with DeepSeek also liked for showing its work, lending greater confidence in its results. We recently flagged it as something to watch. VentureBeat says AI experts are echoing the view that DeepSeek truly is competitive with the top US models. The open-source availability of DeepSeek-R1, its high performance, and the fact that it seemingly "came out of nowhere" to challenge the former leader of generative AI, has sent shockwaves throughout Silicon Valley and far beyond, based on my conversations with and readings of various engineers, thinkers and leaders [...] Web entrepreneur Arnaud Bertrand didn't mince words about the startling implications of DeepSeek's success either, writing on X: "There's no overstating how profoundly this changes the whole game. And not only with regards to AI, it's also a massive indictment of the US's misguided attempt to stop China's technological development, without which Deepseek may not have been possible (as the saying goes, necessity is the mother of inventions)." Generative AI models are powered by Nvidia GPUs. 
Mainstream chatbots from OpenAI, Google, and Anthropic use more than half a million of these. US export restrictions on the advanced tech means that DeepSeek was only able to obtain around 50,000 GPUs, meaning it's delivering comparable results with only 10% of the processing power. Wired reports that this forced the Chinese company to focus on software. "Unlike many Chinese AI firms that rely heavily on access to advanced hardware, DeepSeek has focused on maximizing software-driven resource optimization," explains Marina Zhang, an associate professor at the University of Technology Sydney, who studies Chinese innovations. "DeepSeek has embraced open source methods, pooling collective expertise and fostering collaborative innovation. This approach not only mitigates resource constraints but also accelerates the development of cutting-edge technologies, setting DeepSeek apart from more insular competitors." The Financial Times reports that US tech shares have been hit hard. The results sent a shockwave through markets on Monday, with Nvidia on course to lose more than $300bn of market value, the biggest recorded drop for any company, as investors reassessed the likely future investment in AI hardware [...] European chip equipment maker ASML was down 10 per cent. Microsoft fell 6 per cent and Meta slid 5 per cent. Stock futures pointed to a 4.2 per cent drop in the tech-heavy Nasdaq, while the S&P 500 index was set to decline 2.4 per cent.
[37]
Chinese AI startup DeepSeek overtakes ChatGPT on Apple App Store
BEIJING (Reuters) - Chinese startup DeepSeek's AI Assistant on Monday overtook rival ChatGPT to become the top-rated free application available on Apple's App Store in the United States. Powered by the DeepSeek-V3 model, which its creators say "tops the leaderboard among open-source models and rivals the most advanced closed-source models globally", the artificial intelligence application has surged in popularity among U.S. users since it was released on Jan. 10, according to app data research firm Sensor Tower. The milestone highlights how DeepSeek has left a deep impression on Silicon Valley, upending widely held views about U.S. primacy in AI and the effectiveness of Washington's export controls targeting China's advanced chip and AI capabilities. AI models from ChatGPT to DeepSeek require advanced chips to power their training. The Biden administration has since 2021 widened the scope of bans designed to stop these chips from being exported to China and used to train Chinese firms' AI models. However, DeepSeek researchers wrote in a paper last month that the DeepSeek-V3 used Nvidia's H800 chips for training, spending less than $6 million. Although this detail has since been disputed, the claim that the chips used were less powerful than the most advanced Nvidia products Washington has sought to keep out of China, as well as the relatively cheap training costs, has prompted U.S. tech executives to question the effectiveness of tech export controls. Little is known about the company behind DeepSeek, a small Hangzhou-based startup founded in 2023, when search engine giant Baidu released the first Chinese AI large-language model. Since then, dozens of Chinese tech companies large and small have released their own AI models, but DeepSeek is the first to be praised by the U.S. tech industry as matching or even surpassing the performance of cutting-edge U.S. models. (Reporting by Eduardo Baptista. Editing by Gerry Doyle)
[38]
What is DeepSeek? China's ChatGPT rival sends Nvidia and Microsoft stocks tumbling, rattles AI giants
America's AI industry was left reeling over the weekend after a small Chinese company called DeepSeek released an updated version of its chatbot last week, which appears to outperform even the most recent version of ChatGPT. But it's not just DeepSeek's performance that is rattling U.S. artificial intelligence giants. It's the fact that DeepSeek built its model in just a few months, using inferior hardware, and at a cost so low it was previously nearly unthinkable. Here's what you need to know about DeepSeek. DeepSeek is a Chinese artificial intelligence lab. It was founded in 2023 and is based in Hangzhou, in China's Zhejiang province. It has released an open-source AI model, also called DeepSeek. The latest version of DeepSeek, called DeepSeek-V3, appears to rival and, in many cases, outperform OpenAI's ChatGPT -- including its GPT-4o model and its latest o1 reasoning model. However, the idea that the DeepSeek-V3 chatbot could outperform OpenAI's ChatGPT, as well as Meta's Llama 3.1, and Anthropic's Claude Sonnet 3.5, isn't the only thing that is unnerving America's AI experts. It's the fact that DeepSeek appears to have developed DeepSeek-V3 in just a few months, using AI hardware that is far from state of the art, and at a minute fraction of what other companies have spent developing their LLM chatbots.
[39]
What is DeepSeek and why is it disrupting the AI sector? | BreakingNews.ie
Chinese start-up DeepSeek's launch of its latest AI models, which it says are on a par or better than industry-leading models in the United States at a fraction of the cost, is threatening to upset the technology world order. The company has attracted attention in global AI circles after writing in a paper last month that the training of DeepSeek-V3 required less than $6 million (€5.7 million) worth of computing power from Nvidia H800 chips. DeepSeek's AI Assistant, powered by DeepSeek-V3, has overtaken rival ChatGPT to become the top-rated free application available on Apple's App Store in the United States. This has raised doubts about the reasoning behind some US tech companies' decision to pledge billions of dollars in AI investment and shares of several big tech players, including Nvidia, have been hit. Below are some facts about the company shaking up the AI sector worldwide. Why is DeepSeek causing a stir? The release of OpenAI's ChatGPT in late 2022 caused a scramble among Chinese tech firms, who rushed to create their own chatbots powered by artificial intelligence. But after the release of the first Chinese ChatGPT equivalent, made by search engine giant Baidu, there was widespread disappointment in China at the gap in AI capabilities between US and Chinese firms. The quality and cost efficiency of DeepSeek's models have flipped this narrative on its head. The two models that have been showered with praise by Silicon Valley executives and US tech company engineers alike, DeepSeek-V3 and DeepSeek-R1, are on par with OpenAI and Meta's most advanced models, the Chinese start-up has said. They are also cheaper to use. The DeepSeek-R1, released last week, is 20 to 50 times cheaper to use than OpenAI o1 model, depending on the task, according to a post on DeepSeek's official WeChat account. But some have publicly expressed scepticism about DeepSeek's success story. 
Scale AI chief executive Alexandr Wang said during an interview with CNBC on Thursday, without providing evidence, that DeepSeek has 50,000 Nvidia H100 chips, which he claimed would not be disclosed because that would violate Washington's export controls that ban such advanced AI chips from being sold to Chinese companies. DeepSeek did not immediately respond to a request for comment on the allegation. Bernstein analysts on Monday highlighted in a research note that DeepSeek's total training costs for its V3 model were unknown but were much higher than the $5.58m (€5.3m) the startup said was used for computing power. The analysts also said the training costs of the equally-acclaimed R1 model were not disclosed. Who is behind DeepSeek? DeepSeek is a Hangzhou-based start-up whose controlling shareholder is Liang Wenfeng, co-founder of quantitative hedge fund High-Flyer, based on Chinese corporate records. Liang's fund announced in March 2023 on its official WeChat account it was "starting again", going beyond trading to concentrate resources on creating a "new and independent research group, to explore the essence of AGI" [Artificial General Intelligence]. DeepSeek was created later that year. ChatGPT makers OpenAI define AGI as autonomous systems that surpass humans in most economically valuable tasks. It is unclear how much High-Flyer has invested in DeepSeek. High-Flyer has an office located in the same building as DeepSeek, and it also owns patents related to chip clusters used to train AI models, according to Chinese corporate records. High-Flyer's AI unit said on its official WeChat account in July 2022 that it owns and operates a cluster of 10,000 A100 chips. How does Beijing view DeepSeek? DeepSeek's success has already been noticed in China's top political circles. 
On January 20, the day DeepSeek-R1 was released to the public, founder Liang attended a closed-door symposium for businessmen and experts hosted by Chinese premier Li Qiang, according to state news agency Xinhua. Liang's presence at the gathering is potentially a sign DeepSeek's success could be important to Beijing's policy goal of overcoming Washington's export controls and achieving self-sufficiency in strategic industries like AI. A similar symposium last year was attended by Baidu CEO Robin Li. - Reuters
[40]
DeepSeek tops the App Store, but it has a rocky road ahead
Chinese AI app DeepSeek is on top of the App Store, challenging Apple Intelligence, and shaking Wall Street confidence in big tech. The market for generative AI is already dominated by a few major names in the West, with Apple Intelligence alongside Google Gemini and OpenAI's ChatGPT. While there are many smaller efforts to offer AI services, one has seemingly come out of nowhere in a surge of popularity. Called DeepSeek, it's an app that has leaped to the top of the U.S. App Store, as well as many other countries, in a very short space of time. Following the release of its LLM one week ago, it has seen massive growth in a short space of time. It has already been heralded as a milestone in AI. However, just as quickly as it became a major event in AI, it may just as easily disappear from view. DeepSeek is a chat app in a similar vein to ChatGPT. You can ask it questions and to perform tasks, and it will use its large language model (LLM) to offer a response. This is practically what ChatGPT and other competitors offer users, and isn't massively transformative on its own. However, the circumstances behind it are more interesting. DeepSeek-V3, the LLM, was officially released in December, with a mobile app introduced just one week ago by the Chinese AI company. That app uses R1, a reasoning model that is based on V3. However, the LLM was released as an open source model, instead of being a proprietary model. Due to the U.S. limiting the shipments of AI chips to China, developers had to work on alternative ways to not only train models, but also to perform the actual queries. This necessitated collaboration between development teams, which led to the open-source LLM's development. It's a model that anyone can download, including from Hugging Face. With limited hardware, it had to pull off some new tricks to be as resource-efficient as it can. This included analyzing the query to only use relevant parts of the LLM itself, cutting processing costs considerably. 
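The article does not name the technique behind "only use relevant parts of the LLM". One common approach that fits the description is mixture-of-experts routing, where a small gating function scores a set of sub-networks ("experts") and only the top few are run for a given input. The sketch below is a toy illustration of that idea, not DeepSeek's implementation; the function names, the four stand-in experts, and the gate scores are all hypothetical.

```python
import math

def top_k_route(gate_scores, k=2):
    """Pick the k highest-scoring experts and softmax-normalize their weights.

    Hypothetical helper for illustration; real systems learn the gate scores.
    """
    top = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)[:k]
    peak = max(gate_scores[i] for i in top)
    exps = {i: math.exp(gate_scores[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs by gate weight."""
    weights = top_k_route(gate_scores, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Four stand-in "experts"; in a real model each would be a neural sub-network.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x ** 2]
gate_scores = [0.1, 2.0, -1.0, 2.0]  # assumed scores for one input

weights = top_k_route(gate_scores, k=2)
output = moe_forward(3.0, experts, gate_scores, k=2)
print(weights, output)
```

With `k=2`, only two of the four experts are evaluated for this input, which is where the compute saving in this kind of design comes from.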
These limitations were also affected by the team having a considerably smaller budget than typical AI projects cost in the West. It allegedly cost DeepSeek just $5.58 million to train the model, reports The Register. Bearing in mind Microsoft's billion-dollar investments into OpenAI and the U.S. government's $500 billion Stargate AI project, DeepSeek's development is a drop in the ocean. The sudden massive popularity of DeepSeek has been felt by the money markets, stunned by a newcomer that could feasibly offer ChatGPT responses for minimal cost. The shares of major AI companies have been badly hit in the face of a cheaper, faster rival competing against well-heeled projects. Most affected was chip maker Nvidia, which saw its shares dip by as much as 17 percent on Monday, reports the BBC. The incident wiped more than $600 billion in share value in one day alone. Other tech companies also saw their shares affected, including Microsoft and Meta. Apple, meanwhile, seems largely unaffected by the news. It is likely that Nvidia was hit the most due to it being a supplier of high-priced hardware used for machine learning purposes. With a model demonstrating training can be efficiently done with minimal resources, it means there could be less reliance on Nvidia's chips in the future. While popularity is often a great thing for business, sometimes the attention can cause problems. In the case of DeepSeek, it's from a suspected distributed denial of service (DDoS) attack. A note on the status page advises that "large-scale malicious attacks on DeepSeek's services" are underway. To try and counter the problem, and to limit its impact on new users, DeepSeek has limited the registration of new accounts temporarily. For the moment, only users in China with an +86 country code phone number can sign up for a new account. Existing users who already have an account can continue using it without issue. 
It is likely that DeepSeek will keep the limitation in play until it can mitigate the DDoS, or if the attack eventually dies down. The impact of DeepSeek cannot be ignored by the AI industry as a whole. Indeed, tech investor Marc Andreessen referred to it as "AI's Sputnik moment," with US efforts blindsided by an AI produced in a rival superpower. However, as history has shown, it's likely that DeepSeek will have a bumpy ride in the United States and in the West in general. The United States is very wary of Chinese developments, especially in the tech field. With claims of national security issues at risk, the United States has already diminished Huawei's presence in the West, leaving it to be a major force in its home country. Then there's the constant battle over TikTok. While current events may see it rescued from a ban in the U.S., the entire issue stemmed from the U.S. deeming it a potential security risk to consumers. It's likely that, if DeepSeek sticks around and stays extremely prominent, the U.S. government may become interested. When that happens, it's likely that there will be accusations of it being a security risk, calls for investigations, and possible bans too. Even if that occurs, DeepSeek may have already inflicted enough damage on the West's AI efforts, simply by demonstrating that it can be done in cheaper ways, without significant hardware acquisition or ongoing running costs.
[41]
What is DeepSeek? Why China's latest AI model is spooking Wall Street...
Wall Street and Silicon Valley got clobbered on Monday over rising fears about DeepSeek - a Chinese artificial intelligence startup that claims to have developed an advanced model at a fraction of the cost of its US counterparts. Shares of Nvidia and other major tech giants shed more than $1 trillion in market value as investors parsed details. DeepSeek claims it built its AI model in a matter of months for just $6 million, upending expectations in an industry that has forecast hundreds of billions of dollars in spending on chips and other infrastructure. DeepSeek immediately surged to the top of the charts in Apple's App Store over the weekend - displacing OpenAI's ChatGPT and other competitors. The rapid rise has sparked panic that the US could lose its AI advantage to China. Here is what to know about DeepSeek: DeepSeek was founded in May 2023. Based in Hangzhou, China, the company develops open-source AI models, which means they are readily accessible to the public and any developer can use it. DeepSeek released details earlier this month on R1, the reasoning model that underpins its chatbot. The AI firm turned heads in Silicon Valley with a research paper explaining how it built the model. DeepSeek says the model excels at problem-solving despite being much cheaper to train and run than its rivals. DeepSeek claims that the performance of its R1 model is "on par" with the latest release from OpenAI. Alexandr Wang, the CEO of Scale AI, recently told CNBC that DeepSeek is "the top-performing, or roughly on par with the best American models." DeepSeek's top shareholder is Liang Wenfeng, who runs the $8 billion Chinese hedge fund High-Flyer. High-Flyer has an office in the same building as its headquarters, according to Chinese corporate records obtained by Reuters. The AI chatbot has already faced allegations of rampant censorship in line with the Chinese Communist Party's preferences. 
DeepSeek admitted that its "programming and knowledge base are designed to follow China's laws and regulations, as well as socialist core values," according to an output posted on the US House's select committee on China. The chatbot self-censored its responses when asked about China's leader Xi Jinping, the Tiananmen Square massacre of 1989, China's human rights abuses toward Uighurs in its Xinjiang Province and whether Taiwan is a country, according to examples circulating on social media. DeepSeek is a privately held startup and is not publicly traded in the US. Shares of Nvidia plunged a whopping 17% in Monday trading on panic related to DeepSeek, erasing more than $600 billion in value from its market cap. However, the long-term threat that DeepSeek's success poses to Nvidia's business model remains to be seen. For example, analysts at Citi said access to advanced computer chips, such as those made by Nvidia, will remain a key barrier to entry in the AI market. Wedbush analyst Dan Ives described the chaos around DeepSeek's launch as a "buying opportunity." Nvidia is also heavily involved in President Trump's $500 billion "Stargate" AI infrastructure project. Nvidia bulls have noted that DeepSeek still used its chips to train the model, albeit with a less costly footprint than previously thought possible.
[42]
China's DeepSeek Excites Scientists, Sparks Global Market Turmoil
The Hangzhou-based startup founded by Liang Wenfeng in 2023, which released the model quietly, shot to No. 1 position on Apple's App Store, outperforming US tech players' products. US stocks, particularly in the semiconductor sector, saw significant losses, while Chinese markets held steady, with their strongest performance relative to US markets in over two years. Chip maker Nvidia's shares were down nearly 15% on NYSE in early trade on Monday. Speaking at the World Economic Forum earlier this month, Satya Nadella said, "We should take the developments out of China very, very seriously." For years, the AI debate centered on open-source vs. closed models. But today, it's moving beyond and questioning who has or will have AI dominance. Meta's chief AI scientist Yann LeCun said, "To people who see the performance of DeepSeek and think China is surpassing the US in AI. You are reading this wrong. The correct reading is: Open source models are surpassing proprietary ones." He added that DeepSeek's success stems from open research and tools like PyTorch and Llama, which were developed by Meta. By building on existing work and contributing back, they've shown the power of open-source collaboration in advancing the AI ecosystem. Additionally, HuggingFace is replicating the entire DeepSeek R1 pipeline for the open-source community. Now, it's about who actually controls AI's future. Despite US trade restrictions, the country, with limited resources, proved to the world that it could develop a cost-effective alternative in a short span. However, at the WEF in Davos, Scale AI CEO Alexandr Wang said that DeepSeek has around 50,000 NVIDIA H100s that they cannot openly discuss due to US export controls. Interesting to note that while OpenAI is charging a whopping $200 a month for its version of the o1 reasoning model, DeepSeek is offering it for free. DeepSeek has in a way proved that the US has no AI moat. 
"While NVIDIA has a strong foothold, AI dominance cannot be taken for granted," warned Charu Chanana, chief investment strategist at Saxo Markets. "The emergence of China's DeepSeek indicates that competition is intensifying. Future competitors will evolve faster and challenge established companies more quickly." Meanwhile, OpenAI's o3 model -- its next big release -- still isn't public. DeepSeek's disruption isn't just shaking up Wall Street -- it's forcing India to rethink its AI strategy. For years, Indian tech leaders have argued over whether to build foundational models or just focus on AI applications. But now? "We need to build, not just wrap existing AI," said Perplexity AI CEO Aravind Srinivas, after seeing DeepSeek's success. Srinivas is now offering $10 million to any Indian team that can build a competitive AI model from scratch. Others, like Turing's Dream, Sarvam AI and Zoho's Sridhar Vembu, are shifting their focus to deep AI research. Ajai Chowdhry, HCL co-founder, stressed the need for strategic autonomy, warning, "We're heading towards the weaponisation of tech," and urged India to develop its own AI doctrine and domestic data hardware. Meanwhile, Jio's Gaurav Aggarwal echoed this urgency, saying, "I left Google Research to ensure India doesn't get AI-colonised." Sarvam AI's Pratyush Kumar is already building sovereign AI models and inviting collaboration from Srinivas. Researchers like Paras Chopra and Zerodha's Kailash Nadh emphasised grassroots efforts and deeper research funding as crucial for long-term success. India's IT giants, including TCS, Infosys, Wipro, and HCLTech, are working on AI projects, mostly focusing on the applications and client side of things, not building foundational technologies for the country. But there's one exception. Tech Mahindra is probably the only system integrator that has built a foundational model from scratch. 
Last year, at MachineCon GCC Summit 2024, former Tech Mahindra chief CP Gurnani revealed that it was able to develop an Indian LLM for local languages and 37+ dialects in just 5 months, spending less than $5 million. The message is clear: AI dominance isn't about who spends the most money -- it's about who moves the fastest and smartest.
[43]
What Is DeepSeek and Why Is It Disrupting the AI Sector?
Chinese startup DeepSeek's launch of its latest AI models, which it says are on a par or better than industry-leading models in the United States at a fraction of the cost, is threatening to upset the technology world order. The company has attracted attention in global AI circles after writing in a paper last month that the training of DeepSeek-V3 required less than $6 million worth of computing power from Nvidia H800 chips. DeepSeek's AI Assistant, powered by DeepSeek-V3, has overtaken rival ChatGPT to become the top-rated free application available on Apple's App Store in the United States. This has raised doubts about the reasoning behind some U.S. tech companies' decision to pledge billions of dollars in AI investment and shares of several big tech players, including Nvidia, have been hit.
[44]
DeepSeek AI is now on top of Apple's App Store
Just a few days after China's AI startup DeepSeek launched its latest reasoning model, DeepSeek R1, the company's iOS app surged to the top of Apple's App Store, leaving OpenAI's ChatGPT in second place. DeepSeek claims that DeepSeek R1 beats competing AI models in several key benchmarks. In my testing, it definitely held its own, helping me design a web project in minutes and helping me improve my chess skills. But what truly propelled DeepSeek's popularity is the fact that it's open source, as well as its pricing. DeepSeek is free to use on the web and as a smartphone app, and the company charges a lot less for API access than OpenAI does for its most advanced reasoning AI model. Numerous threads on social networks popped up over the weekend with examples of people running DeepSeek on fairly modest hardware or using it to perform complex tasks. The result? DeepSeek is now the top app on Apple's App Store, and the 20th most popular app on Google Play. The launch and sudden popularity of DeepSeek could also impact the U.S. stock market, where Nvidia currently reigns supreme as the largest company by market cap. There are indications that DeepSeek has been built and trained for far less than competing U.S.-based models such as Meta's Llama and OpenAI's ChatGPT models. This, in turn, could mean a weaker demand for Nvidia hardware, which is commonly used for high-end AI applications. Currently, tech stock index NASDAQ's futures are down 3.2 percent ahead of market open on Monday.
[45]
DeepSeek Overthrows ChatGPT; Becomes #1 App on the App Store
DeepSeek's remarkable success has hit the US tech stock market, amid concerns over high-priced AI infrastructure and training cost. DeepSeek, the Chinese AI pioneer, has surpassed ChatGPT and taken the top position on the Apple App Store. This comes after DeepSeek released its latest breakthrough, the R1 model that matches or even surpasses OpenAI's o1 reasoning model in major benchmarks. In the US, DeepSeek is currently the #1 app in the 'Top Free Apps' category while ChatGPT is in third place. This development is significant because DeepSeek, backed by High-Flyer, a China-based Quant fund, managed to train the R1 model for just $5.58 million -- just 3% of what OpenAI spent to train its o1 model. It goes on to show that DeepSeek has highly optimized the model architecture to reduce the training and operating costs. Unlike ChatGPT which serves the smaller GPT-4o mini model for free, DeepSeek is offering its most capable R1 model for free, which is equivalent to OpenAI's o1 model that costs $20 per month. Reacting to the development, OpenAI CEO Sam Altman tweeted that the upcoming o3-mini model will be available to free users as well. DeepSeek's remarkable success has also impacted the US tech stock market. Nasdaq Futures declined by 400 points this morning amid concerns that US tech leadership in AI might be disrupted by China. Nvidia (NVDA) is down by 3.12%, reflecting apprehension over high-priced hardware for AI training and inference. You can download DeepSeek on your smartphone (Android / iOS) and use it for free. On the web, you can chat with the R1 model at chat.deepseek.com. Bear in mind that users get 50 free messages per day for the DeepThink (R1) model, and you also have the option to search the web.
[46]
How China's new AI model DeepSeek is threatening U.S. dominance
A little-known AI lab out of China has ignited panic throughout Silicon Valley after releasing AI models that can outperform America's best despite being built more cheaply and with less-powerful chips. DeepSeek, as the lab is called, unveiled a free, open-source large-language model in late December that it says took only two months and less than $6 million to build, using reduced-capability chips from Nvidia called H800s. The new developments have raised alarms on whether America's global lead in artificial intelligence is shrinking and called into question big tech's massive spend on building AI models and data centers. In a set of third-party benchmark tests, DeepSeek's model outperformed Meta's Llama 3.1, OpenAI's GPT-4o and Anthropic's Claude Sonnet 3.5 in accuracy ranging from complex problem-solving to math and coding. DeepSeek on Monday released r1, a reasoning model that also outperformed OpenAI's latest o1 in many of those third-party tests. "To see the DeepSeek new model, it's super impressive in terms of both how they have really effectively done an open-source model that does this inference-time compute, and is super-compute efficient," Microsoft CEO Satya Nadella said at the World Economic Forum in Davos, Switzerland, on Wednesday. "We should take the developments out of China very, very seriously." DeepSeek also had to navigate the strict semiconductor restrictions that the U.S. government has imposed on China, cutting the country off from access to the most powerful chips, like Nvidia's H100s. The latest advancements suggest DeepSeek either found a way to work around the rules, or that the export controls were not the chokehold Washington intended. "They can take a really good, big model and use a process called distillation," said Benchmark General Partner Chetan Puttagunta. "Basically you use a very large model to help your small model get smart at the thing you want it to get smart at. That's actually very cost-efficient." 
Little is known about the lab and its founder, Liang Wenfeng. DeepSeek was born of a Chinese hedge fund called High-Flyer Quant that manages about $8 billion in assets, according to media reports. But DeepSeek isn't the only Chinese company making inroads. Leading AI researcher Kai-Fu Lee has said his startup 01.ai trained its model using only $3 million. TikTok parent company ByteDance on Wednesday released an update to its model that claims to outperform OpenAI's o1 in a key benchmark test. "Necessity is the mother of invention," said Perplexity CEO Aravind Srinivas. "Because they had to figure out work-arounds, they actually ended up building something a lot more efficient."
[47]
China's DeepSeek AI assistant becomes top free iPhone app as US tech stocks take a hit
It requires far less computing power than rivals like OpenAI, according to the developer. Chinese AI assistant DeepSeek has become the top rated free app on Apple's App Store in the US and elsewhere, beating out ChatGPT and other rivals. It's powered by the open-source DeepSeek V3 model, which reportedly requires far less computing power than competitors and was developed for under $6 million, according to (disputed) claims by the company. At the same time, it offers performance that's on par with Claude-3.5, GPT-4o and other rivals, DeepSeek said last week. The news that DeepSeek topped the App Store charts caused a sharp drop in tech stocks like NVIDIA and ASML this morning. Google parent company Alphabet and Microsoft were also down this morning. As the New York Times notes, the overall Nasdaq and S&P 500 dropped as well, and markets in Europe and Japan also took a bit of a hit. Available on web, app and API, DeepSeek is similar to AI assistants like ChatGPT, with features like coding, content creation and research. Its first DeepSeek-R1 release is available under an MIT license, so it can be used commercially and without restrictions. The company is headquartered in Hangzhou, China and was founded in 2023 by Liang Wenfeng, who also launched the hedge fund backing DeepSeek. To develop the tech, he reportedly stockpiled NVIDIA A100 chips prior to the US export ban and paired those with less powerful chips that can still be imported, according to MIT Technology Review. However, DeepSeek was still at a significant hardware disadvantage next to rival models from OpenAI, Google and others. That forced the company to be more efficient with its AI models, and it has supposedly been able to build and train them at a far lower cost than previously thought possible. Analysts from Citi and elsewhere have questioned those claims, though, and pointed out that China is a "more restrictive environment" for AI development than the US. 
Still, the rise of DeepSeek has raised concerns about the potential profits of rivals like OpenAI that have already invested billions in AI infrastructure. Update, January 27 2025, 11:27AM ET: This story and its headline were updated with more details on the stock price drops that appear to have hit this morning in conjunction with the news around DeepSeek.
[49]
Why China's DeepSeek AI Is Blowing Everyone's Minds -- And Blowing Up the Market - Decrypt
A Chinese artificial intelligence lab has done more than just build a cheaper AI model -- it's exposed the inefficiency of the entire industry's approach. DeepSeek's breakthrough showed how a small team, in an effort to save money, was able to rethink how AI models are built. While tech giants like OpenAI and Anthropic spend several billions of dollars on compute power alone, DeepSeek purportedly achieved similar results for just over $5 million. The company's model matches or beats GPT-4o (OpenAI's best LLM), OpenAI o1 -- OpenAI's best reasoning model currently available -- and Anthropic's Claude 3.5 Sonnet on many benchmark tests, using roughly 2.788M H800 GPU hours for its full training. That's a very small fraction of the hardware traditionally thought necessary. The model is so good and efficient, it climbed to the top of Apple's iOS productivity apps category in a matter of days, challenging OpenAI's dominance. Necessity is the mother of innovation. The team was able to achieve this using techniques that American developers didn't need to even consider -- and don't even dominate today. Perhaps the most important one was that instead of using full precision for calculations, DeepSeek implemented 8-bit training, cutting memory requirements by 75%. "They figured out floating-point 8-bit training, at least for some of the numerics," Perplexity CEO Aravind Srinivas told CNBC. "To my knowledge, I think floating-point 8 training is not that well understood. Most of the training in America is still running in FP16." FP8 uses half the memory bandwidth and storage compared to FP16. For large AI models with billions of parameters, this reduction is substantial. DeepSeek needed to master this because its hardware was weaker, but OpenAI has never had this constraint. DeepSeek also developed a "multi-token" system that processes entire phrases at once rather than individual words, making the system twice as fast while maintaining 90% accuracy. 
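The memory and cost figures quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the 10-billion parameter count is a hypothetical round number, the byte widths are the standard storage sizes of FP32, FP16 and FP8 values, and the $2 per GPU hour rental rate is an assumed price (the one DeepSeek's V3 technical report uses for its own estimate), not a measured cost.

```python
# Bytes needed to store one value in each numeric format (standard widths).
BYTES_PER_VALUE = {"fp32": 4, "fp16": 2, "fp8": 1}


def weight_memory_gb(num_params: int, fmt: str) -> float:
    """Memory needed just to hold the weights, in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_VALUE[fmt] / 1e9


def saving(fmt_from: str, fmt_to: str) -> float:
    """Fractional reduction in storage when switching numeric formats."""
    return 1 - BYTES_PER_VALUE[fmt_to] / BYTES_PER_VALUE[fmt_from]


def training_cost_usd(gpu_hours: float, usd_per_gpu_hour: float) -> float:
    """Rental cost of a training run: GPU hours times an hourly price."""
    return gpu_hours * usd_per_gpu_hour


if __name__ == "__main__":
    params = 10_000_000_000  # hypothetical 10B-parameter model
    for fmt in BYTES_PER_VALUE:
        print(f"{fmt}: {weight_memory_gb(params, fmt):.0f} GB of weights")

    print(f"FP32 -> FP8 saving: {saving('fp32', 'fp8'):.0%}")
    print(f"FP16 -> FP8 saving: {saving('fp16', 'fp8'):.0%}")

    # 2.788M H800 GPU hours (from the article) at an assumed $2 per hour.
    print(f"Estimated rental cost: ${training_cost_usd(2_788_000, 2.0):,.0f}")
```

Read this way, the two memory claims are consistent: the 75% reduction is relative to 32-bit full precision, while the "half" figure is relative to FP16. The cost line shows how 2.788M GPU hours maps onto a figure just under $5.6 million, which covers only the rental price of the training run under the assumed rate.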
Another technique it used was something called "distillation" -- making a small model replicate the outputs of a larger one without having to train it on the same knowledge database. This made it possible to release smaller models that are extremely efficient, accurate, and competitive. The firm also used a technique called "mixture of experts," which added to the model's efficiency. While traditional models keep all of their parameters active constantly, DeepSeek's system uses 671 billion total parameters but only activates 37 billion at once. It's like having a large team of specialists, but only calling in the experts needed for certain tasks. "We use DeepSeek-R1 as the teacher model to generate 800K training samples, and fine-tune several small dense models. The results are promising: DeepSeek-R1-Distill-Qwen-1.5B outperforms GPT-4o and Claude-3.5-Sonnet on math benchmarks with 28.9% on AIME and 83.9% on MATH," DeepSeek wrote in its paper. For context, 1.5 billion is such a small amount of parameters for a model that it's not considered an LLM or large language model, but rather a SLM or small language model. SLMs require so little computation and vRAM that users can run them on weak machines like their smartphones. The cost implications are staggering. Beyond the 95% reduction in training costs, Deepseek's API charges just 10 cents per million tokens, compared to $4.40 for similar services. One developer reported processing 200,000 API requests for about 50 cents, with no rate limiting. The "DeepSeek Effect" is already noticeable. "Let me say the quiet part out loud: AI model building is a money trap," said investor Chamath Palihapitiya. And despite the punches thrown at DeepSeek, OpenAI CEO Sam Altman quickly pumped the brakes on his quest to squeeze users for money, after all the raves on social media about people achieving for free with DeepSeek what OpenAI charges $200 a month to do. 
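The "team of specialists" description of mixture of experts can be made concrete with a toy router. This is a minimal sketch of generic top-k gating, not DeepSeek's actual routing code: the expert functions, the gate scores, and the helper names `route_top_k`, `moe_forward` and `active_fraction` are all hypothetical, chosen only to show that a few experts run per input while the rest stay idle.

```python
import math


def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    # Renormalise so the selected experts' weights sum to 1.
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, gate_scores, k):
    """Weighted sum of the outputs of only the selected experts."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))


def active_fraction(active_params, total_params):
    """Share of all parameters that take part in one forward pass."""
    return active_params / total_params


if __name__ == "__main__":
    # Four toy "experts", each just scaling its input by a different factor.
    experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
    gate_scores = [0.1, 2.0, 0.3, 1.5]  # hypothetical router output

    print(route_top_k(gate_scores, k=2))
    print(moe_forward(10.0, experts, gate_scores, k=2))

    # Figures quoted in the article: 37 billion active of 671 billion total.
    print(f"{active_fraction(37e9, 671e9):.1%} of parameters active")
```

With the article's numbers, only about one parameter in eighteen is used for any given token, which is where the compute saving comes from.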
Most of the AI stocks are down as investors question whether the hype is at bubble levels. Both AI hardware (Nvidia, AMD) and software stocks (Microsoft, Meta, and Google) are suffering the consequences of the apparent paradigm shift triggered by DeepSeek's announcement, and the results shared by users and developers. Even AI crypto tokens took a hit, with scads of DeepSeek AI token imposters popping up in an attempt to scam degens. Aside from the financial wreckage, the takeaway from all this is that DeepSeek's breakthrough suggests that AI development might not require massive data centers and specialized hardware. This could fundamentally alter the competitive landscape, transforming what many considered permanent advantages of major tech companies into temporary leads. The timing is almost comical. Just days before DeepSeek's announcement, President Trump, OpenAI's Sam Altman, and Oracle's founder unveiled Project Stargate -- a $500 billion investment in U.S. AI infrastructure. Meanwhile, Mark Zuckerberg doubled down on Meta's commitment to pour billions into AI development, and Microsoft's $13 billion investment in OpenAI suddenly looks less like strategic genius and more like expensive FOMO fueled by a waste of resources.
[50]
DeepSeek's AI Assistant from China has become the top free iPhone app
It requires far less computing power than rivals like OpenAI, according to the developer. Chinese AI assistant DeepSeek has become the top rated free app on Apple's App Store in the US and elsewhere, beating out ChatGPT and other rivals. It's powered by the open-source DeepSeek V3 model, which reportedly requires far less computing power than competitors and was developed for under $6 million, according to (disputed) claims by the company. At the same time, it offers performance that's on par with Claude-3.5, GPT-4o and other rivals, DeepSeek said last week. Available on web, app and API, DeepSeek is similar to AI assistants like ChatGPT, with features like coding, content creation and research. Its first DeepSeek-R1 release is available under an MIT license, so it can be used commercially and without restrictions. The company is headquartered in Hangzhou, China and was founded in 2023 by Liang Wenfeng, who also launched the hedge fund backing DeepSeek. To develop the tech, he reportedly stockpiled NVIDIA A100 chips prior to the US export ban and paired those with less powerful chips that can still be imported, according to MIT Technology Review. However, DeepSeek was still at a significant hardware disadvantage next to rival models from OpenAI, Google and others. That forced the company to be more efficient with its AI models, and it has supposedly been able to build and train them at a far lower cost than previously thought possible. Analysts from Citi and elsewhere have questioned those claims, though, and pointed out that China is a "more restrictive environment" for AI development than the US. Still, the rise of DeepSeek has raised concerns about the potential profits of rivals like OpenAI that have already invested billions in AI infrastructure. In fact, the news that DeepSeek topped the App Store charts caused a sharp drop in tech stocks like NVIDIA and ASML this morning.
[57]
DeepSeek iOS App Overtakes ChatGPT, Ranks on Top of the App Store
DeepSeek's API pricing is also a fraction of what the OpenAI API charges
DeepSeek for iOS has overtaken OpenAI's ChatGPT to top the App Store's "Top free apps" chart in the US. The eponymous Chinese company released the open-source DeepSeek-R1 artificial intelligence (AI) model last week, which outperforms OpenAI's o1 AI model in several benchmarks. This release has made the company and its AI chatbot the talk of the town, with several Silicon Valley tech leaders reacting to its sudden rise. Notably, the DeepSeek app is entirely free to use, with no subscription tiers announced so far. Ever since OpenAI launched the ChatGPT for iOS app in May 2023, it has consistently been among the top free applications on the App Store and has remained the highest-ranked AI app on the platform. However, this throne was taken by the DeepSeek app recently as it reached the top of the chart. The rise is being attributed to the recent launch of the DeepSeek-R1 AI model. Notably, little is known about the Hangzhou-based AI firm which was founded in 2023 and has released several open-source large language models (LLMs). While US-based tech companies have also released open-source models, with Meta being a notable mention in the community, none comes close to the capabilities and scale of what the DeepSeek models offer. The company also claimed to have built the model with a cost of $6 million (roughly Rs. 51.8 crore), significantly cheaper than AI models of this scale. Several Silicon Valley-based tech leaders have reacted to the sudden rise of these models. Venture capitalist Marc Andreessen called the R1 AI model "one of the most amazing and impressive breakthroughs I've ever seen," while Perplexity AI Co-Founder and CEO Aravind Srinivas congratulated the Chinese firm for becoming the first AI app to beat ChatGPT. 
In a LinkedIn post, Yann LeCun, one of the godfathers of AI and the current Vice President and Chief AI Scientist at Meta, said "Open source models are surpassing proprietary ones." Holger Zschäpitz, a senior financial reporter at German free-to-air television news channel Welt, argued DeepSeek "could represent a big threat to US equity markets." While DeepSeek has broken the trend of companies open-sourcing smaller and less capable AI models while keeping the frontier models under a proprietary paywall, it is not the only one. Another Chinese firm, Kimi AI, announced the release of the Kimi k1.5 AI model. The company claims it is an "o1-level multimodal LLM that outperforms GPT-4o and Claude Sonnet 3.5 on several benchmarks". Notably, Kimi AI has made the web version of its chatbot completely free to use with unlimited usage, similar to DeepSeek. The AI model can also perform real-time web searches, analyse up to 50 files across a diverse range of formats, and has image understanding capabilities as well. While its technical report is available on GitHub, it is currently not available in open-source.
[59]
A new AI assistant from China has Silicon Valley talking
DeepSeek's latest models, created by a small company with limited resources, are already beating many of the leading AI models in the United States. The sudden appearance of an advanced AI assistant from DeepSeek, a previously little-known company in the Chinese city of Hangzhou, has sparked discussion and debate within the U.S. tech industry about what it says about the broader AI development race. DeepSeek's assistant hit No. 1 on the Apple App Store in recent days, and the AI models powering the assistant are already outperforming top U.S. models, with the company saying that they were made with a fraction of the resources. DeepSeek released its latest large language model, R1, a week ago. Second only to OpenAI's o1 model in the Artificial Analysis Quality Index, a well-followed independent AI analysis ranking, R1 is already beating a range of other models including Google's Gemini 2.0 Flash, Anthropic's Claude 3.5 Sonnet, Meta's Llama 3.3-70B and OpenAI's GPT-4o. "DeepSeek R1 is AI's Sputnik moment," entrepreneur Marc Andreessen, known for coauthoring Mosaic, one of the world's first web browsers, wrote on X Sunday, likening it to the space race between the U.S. and the Soviet Union and the event that forced the U.S. to realize that its technological abilities were not unassailable. One of R1's core competencies is its ability to explain its thinking through chain-of-thought reasoning, which is intended to break complex tasks into smaller steps. This method enables the model to backtrack and revise earlier steps -- mimicking human thinking -- while allowing users to also follow its rationale. At last week's World Economic Forum in Switzerland, Microsoft CEO Satya Nadella -- whose company is one of OpenAI's biggest investors -- called DeepSeek's new model "super impressive," adding that he believes "we should take the developments out of China very, very seriously." 
Both R1 and o1 are part of an emerging class of "reasoning" models meant to solve more complex problems than previous generations of AI models. But unlike OpenAI's o1, DeepSeek's R1 is free to use and open weight, meaning anyone can study and copy how it was made. R1 was based on DeepSeek's previous model V3, which had also outscored GPT-4o, Llama 3.3-70B and Alibaba's Qwen2.5-72B, China's previous leading AI model. Upon its release in late December, V3 was performing on par with Claude 3.5 Sonnet. Part of what makes R1 so impressive are the claims from DeepSeek about its development. R1 took only two months and less than $6 million to build, according to a DeepSeek technical report, even as leading tech companies in the United States continue to spend billions of dollars a year on AI. DeepSeek also had to navigate U.S. export restrictions that limited access to the best AI computing chips, forcing the company to build its models with less-powerful chips. It's ignited a heated debate in American tech circles: How did a small Chinese company so dramatically surpass the best-funded players in the AI industry? And what does this mean for the field going forward? Meta's chief AI scientist Yann LeCun wrote in a Threads post that this development doesn't mean China is "surpassing the US in AI," but rather serves as evidence that "open source models are surpassing proprietary ones." He added that DeepSeek benefited from other open-weight models, including some of Meta's. "They came up with new ideas and built them on top of other people's work. Because their work is published and open source, everyone can profit from it," LeCun wrote. "That is the power of open research and open source." (Although many companies, including DeepSeek and Meta, claim their AI models are open source, they have not actually revealed their training data to the public.) 
OpenAI CEO Sam Altman also appeared to take a jab at DeepSeek last month, after some users noticed that V3 would occasionally confuse itself with ChatGPT. A day after V3's release, Altman wrote on X that "it is (relatively) easy to copy something that you know works. it is extremely hard to do something new, risky, and difficult when you don't know if it will work." Some figures online floated unsubstantiated claims that DeepSeek's success is a Chinese government "psyop," or psychological operation, casting suspicion on the small team's ability to "beat all of the top researchers in the world as a side project." Soumith Chintala, cofounder of PyTorch, the machine learning library developed by Meta AI, was among many this weekend who hit back at these allegations. "i'm comically impressed that people are coping on deepseek by spewing bizarre conspiracy theories -- despite deepseek open-sourcing and writing some of the most detail oriented papers ever," Chintala posted on X. "read. replicate. compete. don't be salty, just makes you look incompetent." Others in the tech and investment spheres joined in on the praise, expressing excitement about the implications of DeepSeek's success. "This is what makes the DeepSeek thing so funny. A bunch of grifters have been selling AI secret sauce for years -- spooky mystery juice that could never be fully explained," macroeconomist Philip Pilkington wrote on X. "Now a bunch of young guys just wrote a good algo, published it, and the circus tent burned down." Nat Friedman, the former CEO of Github, similarly posted: "The deepseek team is obviously really good. China is full of talented engineers. Every other take is cope. Sorry." DeepSeek's models tout bilingual proficiency, excelling in both Chinese and English. They do, however, appear subject to censorship or specific political leanings around topics deemed sensitive in China. 
When asked about the sovereignty of Taiwan, a self-ruling island democracy that Beijing claims as its territory, DeepSeek's R1 sometimes states the subject is "beyond my current scope." Other times, the model describes Taiwan as "an inalienable part of China's territory," adding: "We firmly oppose any form of 'Taiwan independence' separatist activities and are committed to achieving the complete reunification of the motherland through peaceful means." Hot on the heels of DeepSeek's latest models, other players in China's tech sector are already rolling out new contenders in the race for AI dominance. Alibaba on Sunday introduced its newest Qwen2.5-1M model, an upgrade from Qwen2.5-72B. Kimi AI, owned by Beijing-based company Moonshot AI, also announced the launch of its latest multimodal reasoning model Kimi k1.5 on Saturday, which it touts as comparable to OpenAI's o1.
[60]
This Chinese AI App Just Overtook ChatGPT
It turns out you may not need billions of dollars to invent a fancier Siri. A week after banning TikTok from app stores (no, it still isn't downloadable) and one day after threatening 25% tariffs on Colombia, the U.S. has taken a major blow in its ongoing trade wars -- overnight, Silicon Valley seems to have lost its AI dominance. The shakeup follows a release from fresh-faced Chinese AI startup DeepSeek, which on Jan. 20 updated its ChatGPT-like AI assistant with its open-source R1 reasoning model. According to Deepseek's testing, the R1 model matches OpenAI's o1 reasoning model on several metrics, all while being much, much cheaper to develop. The Wall Street Journal was the first to report on DeepSeek R1's ultra-low development cost, citing the one-year-old company's claims that it only took $5.6 million to develop the new model, vs. over $100 million from OpenAI for its equivalent. It took a little bit for the news to get out there, but DeepSeek has consequently risen to the top of the App Store, unseating ChatGPT as the most-downloaded free app. The sudden surge in attention has been hard on U.S. stocks, sending the Dow down by about 0.22%, the S&P down by 2%, and the Nasdaq down by 3.6%. More specifically, Google parent company Alphabet is down 2.89%, with Meta, Oracle, and other tech giants also seeing significant declines. Notably, data center and graphics card company Nvidia, which supplies much of the hardware powering AI development, is down 11.64%. Nvidia's stock drop in particular likely has to do with claims from DeepSeek that it only needed roughly 2,000 specialized Nvidia chips to train its latest AI model, whereas leading U.S. models tend to use closer to 16,000 chips. Said claims are still awaiting verification, but if true, would poke holes in the US' recent policy efforts to restrict the amount of U.S. chips Chinese developers can use. 
On the consumer side of things, DeepSeek promises cheaper access to higher-tier models than ChatGPT, which puts basic access to its o1 model behind a $20/month ChatGPT Plus subscription and unlimited access to the o1 model behind its pricey $200/month ChatGPT Pro plan. Unfortunately, the company seems to be suffering from success right now -- servers appear to be overloaded, and I'm currently not able to sign up for an account for testing. Presumably, as more people get through and get their hands on these models, it'll be easier to verify just how scared of DeepSeek U.S. companies should be. Still, the competition could prove to be a shot across the bow for U.S. AI developers, who, alongside President Trump, just announced the $500 billion "Stargate Project," an initiative to build out U.S. AI infrastructure starting with a $100 billion plan to build out data centers in Texas.
[61]
Silicon Valley in Shambles as Chinese Startup Creates Top-Tier AI Without Billions of Investment
A Chinese AI chatbot called DeepSeek has rocketed to the top of the charts on Apple's App Store -- beating the ever-popular ChatGPT, among other competitors. But what had Silicon Valley leaders breaking out in a cold sweat over the weekend were claims that the open-source large language model it's based on, DeepSeek V3, was trained with a fraction of the computing power they've been relying on, developed for what was reportedly less than $6 million. Western AI models, on the other hand, have sucked up billions of funding, with companies including OpenAI looking to spend hundreds of billions of dollars in the coming years to build out the infrastructure required to train these behemoth models and keep them running. The timing couldn't be worse. Just last week, US president Donald Trump announced an AI infrastructure deal, dubbed Stargate, that involved the ChatGPT maker -- alongside investment company SoftBank, tech giant Oracle, and others -- spending a planned $500 billion on AI datacenters over the next few years. Not to be outdone, Meta CEO Mark Zuckerberg announced on Friday that the company would be spending a record $60 billion on AI this year alone. DeepSeek bringing something impressive to market for far less raises an unnerving question for all those check writers: what if AI can actually be done for cheap, without all that epic infrastructure? AI chipmaker Nvidia, in particular, felt the ground shake when trading began on Monday, with shares dropping around eleven percent in early trading. Microsoft, which has invested billions of dollars in OpenAI, slid almost four percent. SoftBank and Oracle plummeted around nine and seven percent, respectively. DeepSeek claims its latest R1 model has a performance that's on par with OpenAI's o1 model, which was released last fall. 
If true -- to be clear, there are plenty of experts out there who claim the Chinese outfit may be fudging the numbers -- the latest AI chatbot could be a point of reckoning for the likes of OpenAI, who are riding an enormous wave of AI hype and lavish, multibillion-dollar deals. "DeepSeek's power implications for AI training punctures some of the [capital expenditure] euphoria which followed major commitments from Stargate and Meta last week," investment bank Jefferies analysts wrote in a note to investors, noting increasing "pressure on AI players to justify ever-increasing capex plans." Other analyst groups claimed the latest hype surrounding DeepSeek was overblown. "In short, we believe that 1) DeepSeek DID NOT 'build OpenAI for $5M'; 2) the models look fantastic but we don't think they are miracles; and 3) the resulting Twitterverse panic over the weekend seems overblown," Bernstein analysts wrote. Nonetheless, DeepSeek has sent ripples across the AI chatbot landscape and its leaders are broadcasting a clear message. "OpenAI is not a god, they won't necessarily always be at the forefront," the company's founder Liang Wenfeng told JP Morgan analysts. OpenAI CEO Sam Altman has so far remained silent on the matter. It's certainly a story to watch, as ungodly sums of money are being poured into building AI infrastructure in the US. Could the datacenters currently being built already be out of date as soon as they come online? Could OpenAI match DeepSeek's extremely lean operating model? How long would that take? For now, the Chinese company is certainly an outlier. "The number of companies who have $6 million to spend is vastly greater than the number of companies who have $100 million or $1 billion to spend," Page One Ventures investor Chris Nicholson told the New York Times. In short, DeepSeek has thrown down the gauntlet, and investors of US-based AI companies will be asking plenty of questions going forward.
[62]
DeepSeek: all the news about the startup that's shaking up AI stocks
DeepSeek is shaking up the AI industry with cost-efficient large-language models it claims can perform just as well as rivals from giants like OpenAI and Meta. The Chinese startup says its flagship R1 reasoning model is capable of achieving "performance comparable" to OpenAI's o1 equivalent, while the newly-released Janus Pro multimodal AI model can supposedly outperform Stable Diffusion and DALL-E 3. DeepSeek's ChatGPT competitor quickly soared to the top of the App Store, and the company is disrupting financial markets, with shares of Nvidia dipping 17 percent by 2PM on January 27th. The AI assistant is powered by the startup's "state-of-the-art" DeepSeek-V3 model, allowing users to ask questions, plan trips, generate text, and more. As downloads of DeepSeek's app spiked, the startup began restricting signups due to "malicious attacks."
[64]
Why everyone in AI is freaking out about DeepSeek
As of a few days ago, only the nerdiest of nerds (I say this as one) had ever heard of DeepSeek, a Chinese A.I. subsidiary of the equally evocatively named High-Flyer Capital Management, a quantitative analysis (or quant) firm that initially launched in 2015. Yet within the last few days, it's been arguably the most discussed company in Silicon Valley. That's largely thanks to the release of DeepSeek R1, a new large language model that performs "reasoning" similar to OpenAI's current best-available model o1 -- taking multiple seconds or minutes to answer hard questions and solve complex problems as it reflects on its own analysis in a step-by-step, or "chain of thought" fashion. Not only that, but DeepSeek R1 scored as high or higher than OpenAI's o1 on a variety of third-party benchmarks (tests to measure AI performance at answering questions on various subject matter), and was reportedly trained at a fraction of the cost (around $5 million), with far fewer graphics processing units (GPU) under a strict embargo imposed by the U.S., OpenAI's home turf. But unlike o1, which is available only to paying ChatGPT subscribers of the Plus tier ($20 per month) and more expensive tiers (such as Pro at $200 per month), DeepSeek R1 was released as a fully open source model, which also explains why it has quickly rocketed up the charts of AI code sharing community Hugging Face's most downloaded and active models. Also, thanks to the fact that it is fully open source, people have already fine-tuned and trained many variations of the model for different task-specific purposes such as making it small enough to run on a mobile device, or combining it with other open source models. Even if you want to use it for development purposes, DeepSeek's API costs are more than 90% cheaper than the equivalent o1 model from OpenAI. 
Most impressively of all, you don't even need to be a software engineer to use it: DeepSeek has a free website and mobile app even for U.S. users with an R1-powered chatbot interface very similar to OpenAI's ChatGPT. Except, once again, DeepSeek undercut or "mogged" OpenAI by connecting this powerful reasoning model to web search -- something OpenAI hasn't yet done (web search is only available on the less powerful GPT family of models at present). An open and shut irony There's a pretty delicious, or maybe disconcerting irony to this given OpenAI's founding goals to democratize AI to the masses. As NVIDIA Senior Research Manager Jim Fan put it on X: "We are living in a timeline where a non-US company is keeping the original mission of OpenAI alive - truly open, frontier research that empowers all. It makes no sense. The most entertaining outcome is the most likely." Or as X user @SuspendedRobot put it, referencing reports that DeepSeek appears to have been trained on question-answer outputs and other data generated by ChatGPT: "OpenAI stole from the whole internet to make itself richer, DeepSeek stole from them and give it back to the masses for free I think there is a certain british folktale about this" Meta in crisis as open source Llama falls behind? But Fan isn't the only one to sit up and take note of DeepSeek's success. The open source availability of DeepSeek R1, its high performance, and the fact that it seemingly "came out of nowhere" to challenge the former leader of generative AI, has sent shockwaves throughout Silicon Valley and far beyond, based on my conversations and readings of various engineers, thinkers, and leaders. If not "everyone" is freaking out about it as my hyperbolic headline suggests, it's certainly the talk of the town in tech and business circles. 
A message posted to Blind, the app for sharing anonymous gossip in Silicon Valley, has been making the rounds suggesting Meta is in crisis over the success of DeepSeek because of how quickly it surpassed Meta's own efforts to be the king of open source AI with its Llama models. 'This changes the whole game' X user @tphuang wrote compellingly: "DeepSeek has commoditized AI outside of very top-end. Lightbulb moment for me in 1st photo. R1 is so much cheaper than US labor cost that many jobs will get automated away over next 5 yrs," later noting why DeepSeek's R1 is more enticing to users than even OpenAI's o1: "3 huge issues w/ o1: 1) too slow 2) too expensive 3) lack of control for end user/reliance on OpenAI R1 solves all of them. A company can buy their own Nvidia GPUs, run these models. Don't have to worry about additional costs or slow/unresponsive OpenAI servers" @tphuang also posed a compelling analogy as a question: "Will DeepSeek be to LLM what Android became to OS world?" Web entrepreneur Arnaud Bertrand didn't mince words about the startling implications of DeepSeek's success, either, writing on X: "There's no overstating how profoundly this changes the whole game. And not only with regards to AI, it's also a massive indictment of the US's misguided attempt to stop China's technological development, without which Deepseek may not have been possible (as the saying goes, necessity is the mother of inventions)." The censorship issue However, others have sounded cautionary notes on DeepSeek's rapid rise, arguing that as a startup operated out of China, it is necessarily subject to that country's laws and content censorship requirements. Indeed, my own usage of DeepSeek on the iOS app here in the U.S. 
found it would not answer questions about Tiananmen Square, the site of the 1989 pro-democracy student protests and uprising, and subsequent violent crackdown by the Chinese military, resulting in at least 200, possibly thousands of deaths, earning it the nickname "Tiananmen Square Massacre" in Western media outlets. Ben Hylak, a former Apple human interface designer and co-founder of AI product analytics platform Dawn, posted on X how asking about this subject caused DeepSeek R1 to enter a circuitous loop. As a member of the press myself, I of course take freedom of speech and expression extremely seriously and it is arguably one of the most fundamental, inarguable causes I champion. Yet I would be remiss not to note that OpenAI's models and products including ChatGPT also refuse to answer a whole range of questions about even innocuous content -- especially pertaining to human sexuality and erotic/adult, NSFW subject matter. It's not an apples-to-apples comparison, of course. And there will be some for whom the resistance to relying on foreign technology makes them skeptical of DeepSeek's ultimate value and utility. But there's no denying its performance and low cost. And in a time when 16.5% of all U.S. goods are imported from China, it's hard for me to caution against using DeepSeek R1 on the basis of censorship concerns or security risks -- especially when the model code is freely available to download, take offline, use on-device in secure environments, and to fine-tune at will. The first but hardly the last Chinese AI model to shake the world It also won't be the last Chinese AI model to threaten the dominance of Silicon Valley giants -- even as they, like OpenAI, raise more money than ever for their ambitions to develop artificial general intelligence (AGI), programs that outperform humans at most economically valuable work. 
Just yesterday, another Chinese model from TikTok parent company Bytedance -- called Doubao-1.5-pro -- was released with performance matching OpenAI's non-reasoning GPT-4o model on third-party benchmarks, but again, at 1/50th the cost. Chinese models have gotten so good, so fast, even those outside the tech industry are taking note: The Economist magazine just ran a piece on DeepSeek's success and that of other Chinese AI efforts, and political commentator Matt Bruenig posted on X that: "I have been extensively using Gemini, ChatGPT, and Claude for NLRB document summary for nearly a year. Deepseek is better than all of them at it. The chatbot version of it is free. Price to use it's API is 99.5% below the price of OpenAI's API. [shrug emoji]" How does OpenAI respond? Little wonder OpenAI co-founder and CEO Sam Altman today said that the company was bringing its yet-to-be released second reasoning model family, o3, to ChatGPT even for free users. OpenAI still appears to be carving its own path with more proprietary and advanced models -- setting the industry standard. But the question becomes: with DeepSeek, ByteDance, and other Chinese AI companies nipping at its heels, how long can OpenAI remain in the lead at making and releasing new cutting-edge AI models? And if it and when it falls, how hard and how fast will its decline be? OpenAI does have another historical precedent going for it, though. If DeepSeek and Chinese AI models do indeed become to LLMs as Google's open source Android did to mobile -- taking the lion's share of the market for a while -- you only have to see how the Apple iPhone with its locked down, proprietary, all-in house approach managed to carve off the high-end of the market and steadily expand downward from there, especially in the U.S., to the point that it now owns nearly 60% of the domestic smartphone market. 
Still, for all those spending big bucks to use AI models from leading labs, DeepSeek shows the same capabilities may be available for much cheaper and with much greater control. And in an enterprise setting, that may be enough to win the ballgame.
[65]
DeepSeek 'punctures' tech spending plans, and what analysts are saying | TechCrunch
Chinese AI firm DeepSeek has emerged as a potential challenger to U.S. AI leaders, demonstrating breakthrough models that claim to offer performance comparable to leading chatbots at a fraction of the cost. The company's mobile app, released in early January, has also topped iPhone charts across major markets including the U.S., UK, and China. Founded in 2023 by Liang Wenfeng, former chief of AI-driven quant hedge fund High-Flyer, DeepSeek makes its models open-source and incorporates a reasoning feature that articulates its thinking before providing responses. Wall Street's reaction has been mixed. While Jefferies warns that DeepSeek's efficient approach "punctures some of the capex euphoria" following recent spending commitments from Meta and Microsoft -- each exceeding $60 billion this year -- Citi questions whether such results were achieved without advanced GPUs. Goldman Sachs sees broader implications, suggesting the development could reshape competition between established tech giants and startups by lowering barriers to entry. Here's how Wall Street analysts are reacting to DeepSeek, in their own words (emphasis mine): DeepSeek's power implications for AI training punctures some of the capex euphoria which followed major commitments from Stargate and Meta last week. With DeepSeek delivering performance comparable to GPT-4o for a fraction of the computing power, there are potential negative implications for the builders, as pressure on AI players to justify ever increasing capex plans could ultimately lead to a lower trajectory for data center revenue and profit growth. If smaller models can work well, it is potentially positive for smartphone. We are bearish on AI smartphone as AI has gained no traction with consumers. More hardware upgrade (adv pkg+fast DRAM) is needed to run bigger models on the phone, which will raise costs. AAPL's model is in fact based on MoE, but 3bn data parameters are still too small to make the services useful to consumers. 
Hence DeepSeek's success offers some hope but there is no impact on AI smartphone's near-term outlook. China is the only market that pursues LLM efficiency owing to chip constraint. Trump/Musk likely recognize the risk of further restrictions is to force China to innovate faster. Therefore, we think it likely Trump will relax the AI Diffusion policy. While DeepSeek's achievement could be groundbreaking, we question the notion that its feats were done without the use of advanced GPUs to fine tune it and/or build the underlying LLMs the final model is based on through the Distillation technique. While the dominance of the US companies on the most advanced AI models could be potentially challenged, that said, we estimate that in an inevitably more restrictive environment, US' access to more advanced chips is an advantage. Thus, we don't expect leading AI companies would move away from more advanced GPUs which provide more attractive $/TFLOPs at scale. We see the recent AI capex announcements like Stargate as a nod to the need for advanced chips. In short, we believe that 1) DeepSeek DID NOT "build OpenAI for $5M"; 2) the models look fantastic but we don't think they are miracles; and 3) the resulting Twitterverse panic over the weekend seems overblown. Our own initial reaction does not include panic (far from it). If we acknowledge that DeepSeek may have reduced costs of achieving equivalent model performance by, say, 10x, we also note that current model cost trajectories are increasing by about that much every year anyway (the infamous "scaling laws...") which can't continue forever. In that context, we NEED innovations like this (MoE, distillation, mixed precision etc) if AI is to continue progressing. And for those looking for AI adoption, as semi analysts we are firm believers in the Jevons paradox (i.e. 
that efficiency gains generate a net increase in demand), and believe any new compute capacity unlocked is far more likely to get absorbed due to usage and demand increase vs impacting long term spending outlook at this point, as we do not believe compute needs are anywhere close to reaching their limit in AI. It also seems like a stretch to think the innovations being deployed by DeepSeek are completely unknown by the vast number of top tier AI researchers at the world's other numerous AI labs (frankly we don't know what the large closed labs have been using to develop and deploy their own models, but we just can't believe that they have not considered or even perhaps used similar strategies themselves). We have not confirmed the veracity of these reports, but if they are accurate, and advanced LLM are indeed able to be developed for a fraction of previous investment, we could see generative AI run eventually on smaller and smaller computers (downsizing from supercomputers to workstations, office computers, and finally personal computers) and the SPE industry could benefit from the accompanying increase in demand for related products (chips and SPE) as demand for generative AI spreads. With the latest developments, we also see 1) potential competition between capital-rich internet giants vs. start-ups, given lowering barriers to entry, especially with recent new models developed at a fraction of the cost of existing ones; 2) from training to more inferencing, with increased emphasis on post-training (including reasoning capabilities and reinforcement capabilities) that requires significantly lower computational resources vs. pre-training; and 3) the potential for further global expansion for Chinese players, given their performance and cost/price competitiveness. 
We continue to expect the race for AI application/AI agents to continue in China, especially amongst To-C applications, where China companies have been pioneers in mobile applications in the internet era, e.g., Tencent's creation of the Weixin (WeChat) super-app. Amongst To-C applications, ByteDance has been leading the way by launching 32 AI applications over the past year. Amongst them, Doubao has been the most popular AI Chatbot thus far in China with the highest MAU (c.70mn), which has recently been upgraded with its Doubao 1.5 Pro model. We believe incremental revenue streams (subscription, advertising) and eventual/sustainable path to monetization/positive unit economics amongst applications/agents will be key. For the infrastructure layer, investor focus has centered around whether there will be a near-term mismatch between market expectations on AI capex and computing demand, in the event of significant improvements in cost/model computing efficiencies. For Chinese cloud/data center players, we continue to believe the focus for 2025 will center around chip availability and the ability of CSP (cloud service providers) to deliver improving revenue contribution from AI-driven cloud revenue growth, and beyond infrastructure/GPU renting, how AI workloads & AI related services could contribute to growth and margins going forward. We remain positive on long-term AI computing demand growth as a further lowering of computing/training/inference costs could drive higher AI adoption. See also Theme #5 of our key themes report for our base/bear scenarios for BBAT capex estimates depending on chip availability, where we expect aggregate capex growth of BBAT to continue in 2025E in our base case (GSe: +38% yoy) albeit at a slightly more moderate pace vs. a strong 2024 (GSe: +61% yoy), driven by ongoing investment into AI infrastructure. Above all, much is made of DeepSeek's research papers, and of their models' efficiency. 
It's unclear to what extent DeepSeek is leveraging High-Flyer's ~50k Hopper GPUs (similar in size to the cluster on which OpenAI is believed to be training GPT-5), but what seems likely is that they're dramatically reducing costs (inference costs for their V2 model, for example, are claimed to be 1/7 that of GPT-4 Turbo). Their subversive (though not new) claim - that started to hit the US AI names this week - is that "more investments do not equal more innovation." Liang: "Right now I don't see any new approaches, but big firms do not have a clear upper hand. Big firms have existing customers, but their cash-flow businesses are also their burden, and this makes them vulnerable to disruption at any time." And when asked about the fact that GPT-5 has still not been released: "OpenAI is not a god, they won't necessarily always be at the forefront."
[0]
DeepSeek R1 Vs. OpenAI: What Consumers Need To Know About AI's Latest Challenger - Microsoft (NASDAQ:MSFT), Meta Platforms (NASDAQ:META)
Consumers weigh affordability against advanced features in AI choices. The race among large language models has intensified with DeepSeek R1 emerging as a formidable competitor to established players like OpenAI's o1 and Meta Platforms Inc.'s META Llama 3.2. Each model brings distinct strengths to the table, with DeepSeek emphasizing cost-efficiency and performance, while its rivals focus on multimodal capabilities and advanced reasoning. What Happened: DeepSeek R1's recent launch has fueled comparisons with OpenAI o1 and Meta's Llama 3.2, particularly in terms of technical specifications and cost advantages, Digit.in reports. DeepSeek R1, leveraging a Mixture-of-Experts architecture, activates only a subset of its 671 billion parameters per query to balance efficiency and performance. In contrast, OpenAI o1 prioritizes multi-step reasoning with a 200K context window, and Meta's Llama 3.2 offers scalable multimodal capabilities optimized for edge deployments. Wedbush analyst Daniel Ives remains unconvinced by DeepSeek's market impact, noting that while it has gained popularity, major enterprises are unlikely to adopt it over more established solutions from U.S. tech giants. Also Read: MicroCloud Hologram Joins With DeepSeek For Holographic AI Facilities Why It Matters: DeepSeek R1 stands out with its affordability, being nearly 27.4 times cheaper than proprietary alternatives, making it attractive for cost-sensitive applications. However, OpenAI o1 continues to dominate complex problem-solving tasks such as coding and scientific analysis, thanks to its chain-of-thought processing and reinforcement learning techniques. Meta's Llama 3.2, on the other hand, caters to multimodal applications, integrating text and image processing for enhanced versatility, according to Digit.in. Despite DeepSeek's rapid rise, industry experts highlight that long-term adoption will depend on enterprise acceptance and broader infrastructure investments. 
Established players such as Nvidia Corp. NVDA and Microsoft Corp. MSFT continue to dominate due to their extensive ecosystem support and market reach. As competition in the AI space heats up, DeepSeek R1's emergence underscores the evolving landscape, offering a cost-effective alternative while facing stiff competition from technologically advanced and widely integrated solutions. This content was partially produced with the help of AI tools and was reviewed and published by Benzinga editors.
[6]
What Is DeepSeek? China Shocks AI Industry With 'Sputnik Moment'
Chinese AI company DeepSeek released an AI model that is sending shockwaves through the US tech industry due to its low cost and high performance. Tech investor Marc Andreessen calls it "AI's Sputnik moment," and US tech stocks tumbled this morning, with Nvidia the hardest hit, down 11%. The app surpassed ChatGPT on the Apple App Store as the most-downloaded free application. But after receiving a flood of registrations and "large-scale malicious attacks" -- possibly a DDoS attack -- DeepSeek has "temporarily" restricted access to those with a +86 mainland Chinese phone number. You may still be able to sign in with a Google account. We downloaded it from the app store before this, and found the interface nearly identical to ChatGPT. It currently has a knowledge cutoff of Oct. 2023. But the big game-changer is not the chatbot experience -- it's the open source model that powers it on the backend. It appears to be the highest-performing model available at an ultra-low price. Being open source, anyone with the right skills can download it and use it. The model, dubbed R1, came out on Jan. 20, a few months after DeepSeek released its first model in the fall of 2024. The company says its performance is "on par with OpenAI-o1." That's the model OpenAI debuted in Sept. 2024 with the ability to follow a chain of thought. The company called the approach a "new paradigm" that would bring AI models closer to human-level intelligence. Looking at the pricing specs for the DeepSeek R1 reasoner model and OpenAI's GPT-o1, you'll see R1 is a fraction of the cost. DeepSeek charges $2.19 per 1 million output tokens, compared to $60.00 for GPT-o1. Even OpenAI's mini version of GPT-o1, which is intended to be a lightweight, more affordable version, costs $12.00. Like DeepSeek, Meta has made open source its central AI strategy with its Llama models, which are currently free. 
"To people who see the performance of DeepSeek and think, 'China is surpassing the US in AI.' You are reading this wrong," says Meta's chief AI scientist Yann LeCun on Threads. "The correct reading is: 'Open source models are surpassing proprietary ones.'" Meta has set up four war rooms to analyze DeepSeek's technology, focusing on how the company reduced training costs and what data it may have used, according to The Information. Controversial Chips, Data Collection There is some controversy over the chips that power R1. DeepSeek founder Liang Wenfeng, a 40-year-old businessman, reportedly stocked up on Nvidia A100 chips before they were banned from export to China in Sept. 2022. DeepSeek claims R1 runs on lower-capacity Nvidia H800 chips, but it's difficult to know exactly what is going on behind closed doors, especially since A100 chips are still being snuck into China in violation of US sanctions. "Rather than weakening China's AI capabilities, the sanctions appear to be driving startups like DeepSeek to innovate in ways that prioritize efficiency, resource-pooling, and collaboration," says the MIT Technology Review. The constrained chip supply may have forced the company to place a premium on efficient computing from the start of development. More efficient computing also means less energy use -- a welcome innovation given AI's astronomical energy needs. Why has Nvidia's stock price tumbled, even though DeepSeek is still using its chips? Because its AI model does more with less. US tech companies are now coming under fire for a potentially bloated approach to AI development, requiring more and more chips, while being less efficient with them than DeepSeek appears to be. Last week, President Donald Trump announced a $500 billion investment in AI data centers with the Stargate Project. Data collection is another concern with DeepSeek. Like TikTok, the company is obligated to share user data with the Chinese government if it requests it. 
The privacy policy lists the data it collects, and how it will use it, including to "comply with our legal obligations." As an open source platform, that means US companies may feed it their data in addition to everyday chatbot users.
[24]
ETtech Explainer: What is DeepSeek, China's competitor to OpenAI?
After two years of American companies such as OpenAI, Google, Meta, and Amazon dominating the artificial intelligence (AI) discourse, a little-known and scrappy Chinese AI lab -- DeepSeek -- has entered the chat. What is DeepSeek? DeepSeek is a Chinese AI startup specialising in open-source large language models (LLMs). It recently garnered significant attention following the successful launch of its LLM, DeepSeek V3, which demonstrated impressive capabilities. When launched in December 2024, DeepSeek V3 demonstrated superior performance across benchmarks compared to leading models from OpenAI, Anthropic, and Meta. This includes excelling in coding, mathematical problem-solving, and even identifying code errors. A fortnight later, the company unveiled DeepSeek R1, which caused a stir in the tech world. It showcased advancements in reasoning and problem-solving that were on par with or better than many existing models. The R1 uses large-scale reinforcement learning (RL) to process data and create responses. Its performance is comparable to OpenAI's O1 model in areas such as mathematical ability, coding proficiency, and general knowledge comprehension. At the same time, it excels in creative writing, general question answering, editing, and summarisation tasks. Low-cost What has shocked Silicon Valley players is that it was built for a fraction of the cost required to develop the top-performing OpenAI and Meta models. The startup claims that it developed this outperforming, low-cost model within two months and with an investment of less than $6 million ($5.58 million, to be precise). This starkly contrasts the $100 million OpenAI reportedly spent on training its GPT-4 model. The Chinese breakout AI company has used graphics processing units (GPUs) regarded as last-generation in the US (such as Nvidia's H100 AI GPUs) to train its models. 
A significant reason is the export controls imposed by the Joe Biden administration in October 2022 on advanced computing chips, which sought to hinder China's advances in AI. While this may have limited Chinese companies' access to advanced GPUs, it didn't stop them from maximising the abilities of the chips they had, as DeepSeek has shown. Besides the scrappy 'garage mindset' that DeepSeek may embody, its sudden popularity is attributed to its affordability. Its R1 model is priced for developers, researchers, and organisations seeking AI solutions at $0.55 per million input tokens and $2.19 per million output tokens. For comparison, OpenAI charges $15 per million input tokens and $60 per million output tokens. DeepSeek's ability to dramatically reduce inference costs is causing concern among US tech giants. Perplexity AI CEO Aravind Srinivas commented on X, "DeepSeek has effectively replicated O1-mini and made it open-source." Microsoft CEO Satya Nadella also remarked that DeepSeek's arrival should be taken seriously. Open source fight DeepSeek has released several models, including DeepSeek Coder, DeepSeek LLM, and DeepSeek R1. These models are free of charge for researchers and commercial users. The company is gaining popularity for its innovative approach to AI development and its commitment to open source, which Elon Musk claims OpenAI was intended to embody. This development comes amid the Altman-led company's efforts to transition to a for-profit model, which resulted in a legal dispute with the CEO of Tesla and SpaceX. In the broader context, China is moving towards open sourcing. For instance, Alibaba Cloud has released over 100 open-source large language models (LLMs) as part of its Qwen 2.5 series. These models range from 7 billion to 72 billion parameters and support more than 29 languages. Versus US This occurs as the US-China "chip war" escalates into a significant geopolitical battleground. 
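Because input and output tokens are billed at different rates, the gap for a real job depends on its token mix. The sketch below prices one hypothetical workload (2 million input tokens and 500,000 output tokens, numbers chosen purely for illustration) under both rate cards. It assumes the OpenAI output price is $60 per million tokens, the figure quoted elsewhere in this roundup; the `Pricing` class is a helper made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per 1 million tokens, split by direction."""
    input_per_m: float
    output_per_m: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Total USD cost for a job with the given token counts."""
        total = input_tokens * self.input_per_m + output_tokens * self.output_per_m
        return total / 1_000_000


# Rates as quoted in the article; the o1 output rate is assumed to be per million.
DEEPSEEK_R1 = Pricing(input_per_m=0.55, output_per_m=2.19)
OPENAI_O1 = Pricing(input_per_m=15.00, output_per_m=60.00)

# Hypothetical workload, for illustration only.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 500_000

if __name__ == "__main__":
    r1 = DEEPSEEK_R1.cost(INPUT_TOKENS, OUTPUT_TOKENS)
    o1 = OPENAI_O1.cost(INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"R1: ${r1:.2f}, o1: ${o1:.2f}, ratio: {o1 / r1:.1f}x")
```

For this mix the job costs about $2.20 on R1 against $60.00 on o1. A different ratio of input to output tokens shifts the multiple only slightly, since both the input and output rates differ by a factor of about 27.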
Due to national security and technological dominance concerns, the Biden administration has imposed strict export controls on advanced semiconductor manufacturing equipment for China, placing Chinese tech firms at a disadvantage against their Western counterparts. There is also growing concern, particularly in the West, that Chinese LLMs are subject to government review, specifically concerning their adherence to "core socialist values," as reported by the Financial Times in July last year. The report added that regulators would test responses to politically sensitive topics such as the 1989 Tiananmen Square massacre, Taiwan's independence, and Chinese President Xi Jinping. The High-Flyers The company's CEO, Liang Wenfeng, founded DeepSeek in 2023 as an AI lab within his quantitative hedge fund. Wenfeng studied artificial intelligence as an undergraduate at China's Zhejiang University before establishing High-Flyer in 2015, a hedge fund focused on algorithmic trading. In 2021, High-Flyer relied exclusively on AI for its trading operations, which interestingly accounts for its possession of Nvidia chips. In April 2023, the fund announced the formation of a separate entity dedicated to artificial general intelligence (AGI) research. This new entity would operate independently from High-Flyer's financial business, even as High-Flyer remains DeepSeek's sole investor.
[48]
What Is China's DeepSeek and Why Is It Freaking Out the AI World?
DeepSeek, an AI startup just over a year old, stirred awe and consternation in Silicon Valley with its breakthrough artificial intelligence model that offered comparable performance to the world's best chatbots at seemingly a fraction of the cost. Created in China's Hangzhou, DeepSeek carries far-reaching implications for the global tech industry and supply chain, offering a counterpoint to the widespread belief that the future of AI will require ever-increasing amounts of power and energy to develop. DeepSeek is a Chinese startup founded in 2023 by Liang Wenfeng, the chief of AI-driven quant hedge fund High-Flyer. The company develops open-source AI models, and its eponymous mobile app surged to the top of the iPhone's download charts in the US after its release in early January.
[51]
DeepSeek: everything you need to know about the AI that dethroned ChatGPT
A year-old startup out of China is taking the AI industry by storm after releasing a chatbot which rivals the performance of ChatGPT while using a fraction of the power, cooling, and training expense of what OpenAI, Google, and Anthropic's systems demand. Here's everything you need to know about DeepSeek's V3 and R1 models and why the company could fundamentally upend America's AI ambitions. What is DeepSeek? DeepSeek (technically, "Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd.") is a Chinese AI startup that was originally founded as an AI lab for its parent company, High-Flyer, in April 2023. That May, DeepSeek was spun off into its own company (with High-Flyer remaining on as an investor) and also released its DeepSeek-V2 model. V2 offered performance on par with other leading Chinese AI firms, such as ByteDance, Tencent, and Baidu, but at a much lower operating cost. The company followed up with the release of V3 in December 2024. V3 is a 671 billion-parameter model that reportedly took less than 2 months to train. What's more, according to a recent analysis from Jefferies, DeepSeek had a "training cost of only US$5.6m (assuming $2/H800 hour rental cost). That is less than 10% of the cost of Meta's Llama." That's a tiny fraction of the hundreds of millions to billions of dollars that US firms like Google, Microsoft, xAI, and OpenAI have spent training their models. In a post on X dated December 26, 2024, DeepSeek (@deepseek_ai) announced: "🚀 Introducing DeepSeek-V3! Biggest leap forward yet: ⚡ 60 tokens/second (3x faster than V2!) 💪 Enhanced capabilities 🛠 API compatibility intact 🌍 Fully open-source models & papers" Benchmark tests put V3's performance on par with GPT-4o and Claude 3.5 Sonnet. 
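The Jefferies figure can be unpacked with simple arithmetic. The sketch below derives the GPU-hours implied by a US$5.6m training cost at $2 per H800-hour, then estimates how many GPUs would have to run concurrently to finish within the reported "less than 2 months". The 60-day duration is an assumption made here for illustration, and the function names are hypothetical; none of these derived numbers appear in the source.

```python
TRAINING_COST_USD = 5_600_000   # Jefferies estimate quoted in the article
RATE_PER_GPU_HOUR = 2.0         # assumed H800 rental rate from the same quote
ASSUMED_TRAINING_DAYS = 60      # assumption: "less than 2 months" taken as 60 days


def gpu_hours(total_cost_usd: float, rate_per_gpu_hour: float) -> float:
    """GPU-hours that a given budget buys at a flat hourly rate."""
    return total_cost_usd / rate_per_gpu_hour


def implied_gpu_count(total_gpu_hours: float, days: float) -> float:
    """Average number of GPUs running concurrently over the period."""
    return total_gpu_hours / (days * 24)


def implied_floor_for_comparison(cost_usd: float, fraction: float) -> float:
    """If cost is less than `fraction` of X, then X exceeds this value."""
    return cost_usd / fraction


if __name__ == "__main__":
    hours = gpu_hours(TRAINING_COST_USD, RATE_PER_GPU_HOUR)
    print(f"{hours:,.0f} GPU-hours")
    print(f"{implied_gpu_count(hours, ASSUMED_TRAINING_DAYS):,.0f} GPUs on average")
    print(f"${implied_floor_for_comparison(TRAINING_COST_USD, 0.10):,.0f} floor")
```

Under these assumptions the estimate works out to 2.8 million H800-hours, or roughly 1,900 GPUs running around the clock for 60 days, and the "less than 10%" comparison implies a Llama training cost above about $56 million. Note that the quoted cost is framed as a rental-rate calculation for the training run itself.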
A December 2024 Op-Ed in The Hill categorized DeepSeek's success as America's "Sputnik Moment." DeepSeek released its R1-Lite-Preview model in November 2024, claiming that the new model could outperform OpenAI's o1 family of reasoning models (and do so at a fraction of the price). The company estimates that the R1 model is between 20 and 50 times less expensive to run, depending on the task, than OpenAI's o1. DeepSeek subsequently released DeepSeek-R1 and DeepSeek-R1-Zero in January 2025. The R1 model, unlike its o1 rival, is open source, which means that any developer can use it. As such, V3 and R1 have exploded in popularity since their release, with DeepSeek's V3-powered AI Assistant displacing ChatGPT at the top of the app stores. Venture capitalist Marc Andreessen, in a recent social media post, called DeepSeek's chatbot "one of the most amazing and impressive breakthroughs I've ever seen" and a "profound gift to the world." What can DeepSeek do? As an open-source large language model, DeepSeek's chatbots can do essentially everything that ChatGPT, Gemini, and Claude can. That includes text, audio, image, and video generation. What's more, DeepSeek's newly released family of multimodal models, dubbed Janus Pro, reportedly outperforms DALL-E 3 as well as PixArt-alpha, Emu3-Gen, and Stable Diffusion XL, on a pair of industry benchmarks. DeepSeek-R1, rivaling o1, is specifically designed to perform complex reasoning tasks, while generating step-by-step solutions to problems and establishing "logical chains of thought," where it explains its reasoning process step-by-step when solving a problem. What DeepSeek's products can't do is talk about Tiananmen Square. Or the Yellow Umbrella protests. Or President Xi Jinping's likeness to Winnie the Pooh. 
Basically, if it's a subject considered verboten by the Chinese Communist Party, DeepSeek's chatbots will not address it or engage in any meaningful way. Who can use DeepSeek? As an open-source LLM, DeepSeek's model can be used by any developer for free. OpenAI charges $200 per month for the Pro subscription needed to access o1. DeepSeek's models are available on the web, through the company's API, and via mobile apps. You will need to sign up for a free account at the DeepSeek website in order to use it; however, the company has temporarily paused new sign-ups in response to "large-scale malicious attacks on DeepSeek's services." Existing users can sign in and use the platform as normal, but there's no word yet on when new users will be able to try DeepSeek for themselves. Why is DeepSeek suddenly such a big deal? Since the release of ChatGPT in November 2022, American AI companies have been laser-focused on building bigger, more powerful, more expansive, more power and resource-intensive large language models. Rather than seek to build more cost-effective and energy-efficient LLMs, companies like OpenAI, Microsoft, Anthropic, and Google instead saw fit to simply brute force the technology's advancement by, in the American tradition, throwing absurd amounts of money and resources at the problem. In 2024 alone, xAI CEO Elon Musk was expected to personally spend upwards of $10 billion on AI initiatives. OpenAI and its partners just announced a $500 billion Project Stargate initiative that would drastically accelerate the construction of green energy utilities and AI data centers across the US. Google plans to prioritize scaling the Gemini platform throughout 2025, according to CEO Sundar Pichai, and is expected to spend billions this year in pursuit of that goal. Meta announced in mid-January that it would spend as much as $65 billion this year on AI development. 
DeepSeek just showed the world that none of that is actually necessary -- that the "AI Boom" which has been helping spur the American economy in recent months and which has made GPU companies like Nvidia exponentially more wealthy than they were in October 2023, may be nothing more than a sham. It also calls into question just how much of a lead the US actually has in AI, despite repeatedly banning shipments of leading-edge GPUs to China over the past year. "The bottom line is the US outperformance has been driven by tech and the lead that US companies have in AI," Keith Lerner, an analyst at Truist, told CNN. "The DeepSeek model rollout is leading investors to question the lead that US companies have and how much is being spent and whether that spending will lead to profits (or overspending)." In short, DeepSeek just beat the American AI industry at its own game, showing that the current mantra of "growth at all costs" is no longer valid. "DeepSeek clearly doesn't have access to as much compute as U.S. hyperscalers and somehow managed to develop a model that appears highly competitive," Srini Pajjuri, semiconductor analyst at Raymond James, told CNBC. If a Chinese startup can build an AI model that works just as well as OpenAI's latest and greatest, and do so in under two months and for less than $6 million, then what use is Sam Altman anymore? "Time will tell if the DeepSeek threat is real -- the race is on as to what technology works and how the big Western players will respond and evolve," Michael Block, market strategist at Third Seven Capital, told CNN. "Markets had gotten too complacent on the beginning of the Trump 2.0 era and may have been looking for an excuse to pull back -- and they got a great one here."
[52]
Why is the AI world freaking out over China's DeepSeek?
DeepSeek, an AI startup just over a year old, has stirred awe and consternation in Silicon Valley with its breakthrough artificial intelligence model that offers comparable performance to the world's best chatbots at seemingly a fraction of the cost. Created in China's Hangzhou, DeepSeek carries far-reaching implications for the global tech industry and supply chain, offering a counterpoint to the widespread belief that the future of AI will require ever-increasing amounts of power and energy to develop.
[53]
China's DeepSeek Surprise
One week ago, a new and formidable challenger for OpenAI's throne emerged. A Chinese AI start-up, DeepSeek, launched a model that appeared to match the most powerful version of ChatGPT -- but, at least according to its creator, was a fraction of the cost to build. The program, called DeepSeek-R1, has incited plenty of concern: Ultrapowerful Chinese AI models are exactly what many leaders of American AI companies feared when they, and more recently President Donald Trump, have sounded alarms about a technological race between the United States and the People's Republic of China. This is a "wake up call for America," Alexandr Wang, the CEO of Scale AI, commented on social media. But at the same time, many Americans -- including much of the tech industry -- appear to be lauding this Chinese AI. As of this morning, DeepSeek had overtaken ChatGPT as the top free application on Apple's mobile-app store in the U.S. Researchers, executives, and investors have been heaping on praise. The new DeepSeek model "is one of the most amazing and impressive breakthroughs I've ever seen," the venture capitalist Marc Andreessen, an outspoken supporter of Trump, wrote on X. The program shows "the power of open research," Yann LeCun, Meta's chief AI scientist, wrote online. Indeed, the most notable feature of DeepSeek may be not that it is Chinese, but that it is relatively open. Unlike top American AI labs -- OpenAI, Anthropic, and Google DeepMind -- which keep their research almost entirely under wraps, DeepSeek has made the program's final code, as well as an in-depth technical explanation of the program, free to view, download, and modify. In other words, anybody from any country, including the U.S., can use, adapt, and even improve upon the program. That openness makes DeepSeek a boon for American start-ups and researchers -- and an even bigger threat to the top U.S. companies, as well as the government's national-security interests. 
To understand what's so impressive about DeepSeek, one has to look back to December, when OpenAI launched its own technical breakthrough: the full release of o1, a new kind of AI model that, unlike all the "GPT"-style programs before it, appears able to "reason" through challenging problems. o1 displayed leaps in performance on some of the most challenging math, coding, and other tests available, and sent the rest of the AI industry scrambling to replicate the new reasoning model -- which OpenAI disclosed very few technical details about. The start-up, and thus the American AI industry, were on top. (The Atlantic recently entered into a corporate partnership with OpenAI.) DeepSeek, less than two months later, not only exhibits those same "reasoning" capabilities apparently at much lower costs, but has spilled at least one way to match OpenAI's more covert methods to the rest of the world. The program is not entirely open-source -- its training data, for instance, and the fine details of its creation are not public -- but, unlike with ChatGPT, Claude, or Gemini, researchers and start-ups can still study the DeepSeek research paper and directly work with its code. OpenAI has enormous amounts of capital, computer chips, and other resources, and has been working on AI for a decade. In comparison, DeepSeek is a smaller team formed two years ago with far less access to essential AI hardware, because of U.S. export controls on advanced AI chips, but it has relied on various software and efficiency improvements to catch up. DeepSeek has reported that the final training run of a previous iteration of the model that R1 is built from, released in December, cost less than $6 million. Meanwhile, Dario Amodei, the CEO of Anthropic, has said that U.S. companies are already spending on the order of $1 billion to train future models.
Exactly how much the latest DeepSeek cost to build is uncertain -- some researchers and executives, including Wang, have cast doubt on just how cheap it could have been -- but the price for software developers to incorporate DeepSeek-R1 into their own products is roughly 95 percent cheaper than incorporating OpenAI's o1, as measured by the price of every "token" -- basically, every word -- the model generates. DeepSeek's success has abruptly forced a wedge between Americans most directly invested in outcompeting China and those who benefit from any access to the best, most reliable AI models. (It's a divide that echoes Americans' attitudes about TikTok -- China hawks versus content creators -- and China's other apps and platforms.) For the start-up and research community, DeepSeek is an enormous win. "A non-US company is keeping the original mission of OpenAI alive," Jim Fan, a top AI researcher at the chipmaker Nvidia and former OpenAI employee, wrote on X. "Truly open, frontier research that empowers all." But for America's top AI companies, and the nation's government, what DeepSeek represents is unclear. The stocks of many major tech firms -- including Nvidia, Alphabet, and Microsoft -- dropped this morning amid the excitement around the Chinese model. And Meta, which has branded itself as a champion of open-source models in contrast to OpenAI, now seems a step behind. (The company is reportedly panicking.) To some investors, all those massive data centers, billions of dollars of investment, or even the half-a-trillion-dollar AI-infrastructure joint venture from OpenAI, Oracle, and SoftBank, which Trump recently announced from the White House, could seem far less essential. Maybe bigger AI isn't better. 
For those who fear that AI will strengthen "the Chinese Communist Party's global influence," as OpenAI wrote in a recent lobbying document, this is legitimately concerning: The DeepSeek app refuses to answer questions about, for instance, the Tiananmen Square protests and massacre of 1989 (although the censorship may be relatively easy to circumvent). None of that is to say the AI boom is over, or will take a radically different form going forward. The next iteration of OpenAI's reasoning models, o3, appears far more powerful than o1 and will soon be available to the public. There are some signs that DeepSeek trained on ChatGPT outputs (outputting "I'm ChatGPT" when asked what model it is), although perhaps not intentionally -- if that's the case, it's possible that DeepSeek could only get a head start thanks to other high-quality chatbots. America's AI innovation is accelerating, and its major forms are beginning to take on a technical research focus other than reasoning: "agents," or AI systems that can use computers on behalf of humans. American tech giants could, in the end, even benefit. Satya Nadella, the CEO of Microsoft, framed DeepSeek as a win: More efficient AI means that use of AI across the board will "skyrocket, turning it into a commodity we just can't get enough of," he wrote on X today -- which, if true, would help Microsoft's profits as well. Still, the pressure is on OpenAI, Google, and their competitors to maintain their edge. With the release of DeepSeek, the nature of any U.S.-China AI "arms race" has shifted. Preventing AI computer chips and code from spreading to China evidently has not tamped the ability of researchers and companies located there to innovate. 
And the relatively transparent, publicly available version of DeepSeek, rather than leading American programs, could mean Chinese programs and approaches become global technological standards for AI -- akin to how the open-source Linux operating system is now standard for major web servers and supercomputers. Being democratic -- in the sense of vesting power in software developers and users -- is precisely what has made DeepSeek a success. If Chinese AI maintains its transparency and accessibility, despite emerging from an authoritarian regime whose citizens can't even freely use the web, it is moving in exactly the opposite direction of where America's tech industry is heading.
[54]
Chinese AI Startup DeepSeek Shakes up Silicon Valley, Wall Street | PYMNTS.com
Shares of Nvidia, the darling stock of the generative AI boom, are down 15.3% in midday trading on Monday (Jan. 27) as Wall Street grapples with the revelation that it could be overpaying for this emerging technology. Google, Microsoft and Amazon shares are also down. The culprit: Chinese AI startup DeepSeek's artificial intelligence (AI) models that perform on par with OpenAI's and Google's top models but at a fraction of the cost and with far fewer of Nvidia's GPUs. DeepSeek was founded by Liang Wenfeng, who also co-founded a quantitative hedge fund in China called High-Flyer. DeepSeek has released a family of models: V3 (AI chat) and R1 (reasoning models). The startup said R1 had cost $5.58 million and used 2,048 of Nvidia's H800 chips to train, according to a paper published by its researchers. Anthropic CEO Dario Amodei has said that AI models currently cost $100 million to train, but it could hit $100 billion. AI executives have also said training would need thousands of AI chips, principally those made by Nvidia. The sell-off comes as OpenAI, SoftBank, Oracle and MGX announced a project called Stargate last week that plans to spend $100 billion to half a trillion dollars to build AI infrastructure, mainly data centers. Those and other AI cost estimates are now in question. Shares of Microsoft were down 3.7%, Google fell 3% and Amazon gave up about 1%. "Artificial intelligence has reached a critical inflection point. The industry stands at a crossroads where escalating costs, environmental concerns, and innovation appear intertwined, threatening to stifle accessibility and adoption," Gokul Naidu, senior manager at SAP, told PYMNTS. "Enter DeepSeek-R1, the model that's turning heads in Silicon Valley and beyond for proving that high performance and affordability aren't mutually exclusive." "DeepSeek challenges the narrative that innovation must come at an unsustainable cost," Naidu said. 
"For businesses, this means AI could soon be accessible to small and medium enterprises, not just tech giants with deep pockets." DeepSeek could "catalyze a shift in Silicon Valley's approach to AI," Naidu said. "Historically, the tech epicenter has prioritized growth at all costs, often ignoring questions of efficiency. But DeepSeek forces a recalibration. With its dramatically lower cost structure, businesses and developers alike are reevaluating their priorities, focusing on efficiency-driven growth." DeepSeek is now the most downloaded app in the Apple App Store. OpenAI CEO Sam Altman pushed back in a post on X last month, when DeepSeek V3 first came out, saying, "It is (relatively) easy to copy something that you know works. it is extremely hard to do something new, risky, and difficult when you don't know if it will work. individual researchers rightly get a lot of glory for that when they do it! it's the coolest thing in the world." DeepSeek also reportedly is subject to Chinese censorship, refusing to answer questions about Taiwan, for example. (Developers can get around this by revising the open-source models.) Marc Andreessen, from VC firm Andreessen Horowitz, posted on X that "DeepSeek R1 is one of the most amazing and impressive breakthroughs I've ever seen -- and as open source, a profound gift to the world." Naidu said DeepSeek's open-source strategy "further amplifies its impact. OpenAI's proprietary approach, while effective, limits transparency and collaboration. By contrast, DeepSeek invites developers worldwide to contribute, experiment, and build on its platform. This open-source ethos democratizes AI capabilities, fostering a more inclusive innovative ecosystem." Meta Chief AI Scientist Yann LeCun said in a post on LinkedIn that people concerned that China is overtaking the U.S. in AI are misreading DeepSeek's impact. "The correct reading is: Open source models are surpassing proprietary ones. 
DeepSeek has profited from open research and open source. ... They came up with new ideas and built them on top of other people's work. Because their work is published and open source, everyone can profit from it. That is the power of open research and open source." Meta, whose strategy was to distribute open-source AI models, saw its shares up 1%. With open source, any developer can download and fine-tune, or retrain to customize, their AI models. However, developers typically still have to pay fees to access the model through APIs. While Meta doesn't charge to access its API, DeepSeek does. But Naidu said DeepSeek is also deeply discounted here: "DeepSeek charges $0.55 per million tokens, compared to OpenAI's o1 model, which demands $15 per million tokens. For businesses wary of runaway operational bills, this is a game-changer." Many developers build their own AI applications atop the foundation models from OpenAI, Google, Anthropic and others. To access and use these models, such as inputting prompts and getting responses, developers have to pay AI pricing based on token volume. (A thousand tokens translates to roughly 750 words.) Naidu also pointed out that DeepSeek was also able to get around President Joe Biden's export controls on advanced AI chips, which he recently expanded to carve out different levels of access for more than 120 countries. "DeepSeek's achievement is particularly impressive given its constraints," Naidu said. "As a Chinese company facing export restrictions, it was unable to access the latest Nvidia GPUs, such as the H200, and instead relied on older H800 GPUs. Yet, the R1 model still managed to rival -- or, in some tests, outperform -- more expensive counterparts." "This feat speaks volumes about the ingenuity behind DeepSeek's approach. It's not just about throwing money at the problem; it's about finding smarter, leaner ways to train and deploy AI systems," Naidu added. 
Naidu cited the following ripple effects from DeepSeek: "DeepSeek is more than a model -- it's a wake-up call for the entire AI industry," Naidu said. "It challenges entrenched assumptions about the cost of innovation and offers a path forward where cutting-edge technology is both affordable and sustainable. As we move deeper into 2025, the conversation around AI is no longer just about power -- it's about power at the right price. And in this race, DeepSeek might just have redefined the finish line."
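The per-token prices Naidu quotes can be turned into a rough cost comparison. The Python snippet below is a minimal sketch, not an official pricing calculator: it assumes the $0.55 and $15 per-million-token figures and the rule of thumb of roughly 750 words per 1,000 tokens quoted above apply uniformly, whereas real API price lists typically distinguish input from output tokens. The function and constant names are hypothetical.

```python
# Illustrative arithmetic only. Prices are the per-million-token figures
# quoted in the article (assumed flat; real pricing may split input/output).
DEEPSEEK_USD_PER_MILLION = 0.55
O1_USD_PER_MILLION = 15.00
WORDS_PER_1K_TOKENS = 750  # rule of thumb quoted in the article


def tokens_for_words(words: int) -> int:
    """Estimate the token count for a given number of words."""
    return round(words * 1000 / WORDS_PER_1K_TOKENS)


def cost_usd(tokens: int, usd_per_million: float) -> float:
    """Cost of processing `tokens` tokens at a flat per-million-token price."""
    return tokens / 1_000_000 * usd_per_million


def price_ratio(expensive: float, cheap: float) -> float:
    """How many times more the expensive price is than the cheap one."""
    return expensive / cheap


def savings_fraction(expensive: float, cheap: float) -> float:
    """Fraction of the expensive price saved by switching to the cheap one."""
    return 1 - cheap / expensive


if __name__ == "__main__":
    tokens = tokens_for_words(750_000)  # about one million tokens
    print(f"tokens: {tokens}")
    print(f"DeepSeek: ${cost_usd(tokens, DEEPSEEK_USD_PER_MILLION):.2f}")
    print(f"o1:       ${cost_usd(tokens, O1_USD_PER_MILLION):.2f}")
    print(f"ratio:    {price_ratio(O1_USD_PER_MILLION, DEEPSEEK_USD_PER_MILLION):.1f}x")
    print(f"savings:  {savings_fraction(O1_USD_PER_MILLION, DEEPSEEK_USD_PER_MILLION):.1%}")
```

Under these assumptions the quoted prices work out to a ratio of about 27 to 1, or a saving of roughly 96 percent per token.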
[55]
China releases a cheap, open rival to ChatGPT, thrilling some scientists and panicking Silicon Valley
China has released a cheap, open-source rival to OpenAI's ChatGPT, and it has some scientists excited and Silicon Valley worried. DeepSeek, the Chinese artificial intelligence (AI) lab behind the innovation, unveiled its free large language model (LLM) DeepSeek-V3 in late December 2024 and claims it was built in two months for just $5.58 million -- a fraction of the time and cost required by its Silicon Valley competitors. Following hot on its heels is an even newer model called DeepSeek-R1, released Monday (Jan. 20). In third-party benchmark tests, DeepSeek-V3 matched the capabilities of OpenAI's GPT-4o and Anthropic's Claude Sonnet 3.5 while outperforming others, such as Meta's Llama 3.1 and Alibaba's Qwen2.5, in tasks that included problem-solving, coding and math. Now, R1 has also surpassed ChatGPT's latest o1 model in many of the same tests. This impressive performance -- at a fraction of the cost of other models, alongside its semi-open-source nature -- has wowed AI experts and raised the specter of China's AI models surpassing their U.S. counterparts. "We should take the developments out of China very, very seriously," Satya Nadella, the CEO of Microsoft, a strategic partner of OpenAI, said at the World Economic Forum in Davos, Switzerland, on Jan. 22. AI systems learn using training data taken from human input, which enables them to generate output based on the probabilities of different patterns cropping up in that training dataset. For large language models, these data are text. For instance, OpenAI's GPT-3.5, which was released in 2023, was trained on roughly 570GB of text data from the repository Common Crawl -- which amounts to roughly 300 billion words -- taken from books, online articles, Wikipedia and other webpages.
Reasoning models, such as R1 and o1, are an upgraded version of standard LLMs that use a method called "chain of thought" to backtrack and reevaluate their logic, which enables them to tackle more complex tasks with greater accuracy. This has made reasoning models popular among scientists who are looking to integrate AI into their work. But unlike ChatGPT's o1, DeepSeek is an "open-weight" model that (although its training data remains proprietary) enables scientists to peer inside and modify its algorithm. Just as important is its reduced price for users -- 27 times less than o1. Besides its performance, the hype around DeepSeek comes from its cost efficiency; the model's shoestring budget is minuscule compared with the tens of millions to hundreds of millions that rival companies spent to train its competitors. In addition, U.S. export controls, which limit Chinese companies' access to the best AI computing chips, forced R1's developers to build smarter algorithms to compensate for their lack of computing power. How much this will translate into useful scientific and technical applications, or whether DeepSeek has simply trained its model to ace benchmark tests, remains to be seen -- but scientists and AI investors are watching closely.
[56]
Tech leaders respond to the rapid rise of DeepSeek
If you hadn't heard, there's a new AI star in town: DeepSeek, the subsidiary of Hong Kong-based quantitative analysis (quant) firm High-Flyer Capital Management, has sent shockwaves throughout Silicon Valley and the wider world with its release earlier this week of a new open source large reasoning model, DeepSeek R1, that matches OpenAI's most powerful available model o1 -- and at a fraction of the cost to users and to the company itself (when training it). While the advent of DeepSeek R1 has already reshuffled a consistently topsy-turvy, fast-moving, intensely competitive market for new AI models -- previous months saw OpenAI jockeying with Anthropic and Google for the most powerful proprietary models available, while Meta Platforms often came in with "close enough" open source rivals -- the difference this time is the company behind the hot model is based in China, the geopolitical "frenemy" of the U.S., and whose tech sector was widely viewed, until this moment, as inferior to that of Silicon Valley. As such, it's caused no shortage of hand-wringing and existentialism from U.S. and Western bloc techies, who are suddenly doubting OpenAI and the general big tech strategy of throwing more money and more compute (graphics processing units, GPUs, the powerful gaming chips typically used to train AI models) toward the problem of inventing ever more powerful models. Yet some Western tech leaders have had a largely positive public response to DeepSeek's rapid ascent.
Marc Andreessen, a co-inventor of the pioneering Mosaic web browser, co-founder of the Netscape browser company and current general partner at the famed Andreessen Horowitz (a16z) venture capital firm, posted on X today: "Deepseek R1 is one of the most amazing and impressive breakthroughs I've ever seen -- and as open source, a profound gift to the world [robot emoji, salute emoji]." Yann LeCun, the Chief AI Scientist for Meta's Fundamental AI Research (FAIR) division, posted on his LinkedIn account: "To people who see the performance of DeepSeek and think: 'China is surpassing the US in AI.' You are reading this wrong. The correct reading is: 'Open source models are surpassing proprietary ones.' DeepSeek has profited from open research and open source (e.g. PyTorch and Llama from Meta) They came up with new ideas and built them on top of other people's work. Because their work is published and open source, everyone can profit from it. That is the power of open research and open source." And even Mark "Zuck" Zuckerberg, Meta AI's founder and CEO, seemed to seek to counter the rise of DeepSeek with his own post on Facebook promising that a new version of Facebook's open source AI model family Llama would be "the leading state of the art model" when it is released sometime this year. As he put it: "This will be a defining year for AI. In 2025, I expect Meta AI will be the leading assistant serving more than 1 billion people, Llama 4 will become the leading state of the art model, and we'll build an AI engineer that will start contributing increasing amounts of code to our R&D efforts. To power this, Meta is building a 2GW+ datacenter that is so large it would cover a significant part of Manhattan. We'll bring online ~1GW of compute in '25 and we'll end the year with more than 1.3 million GPUs. We're planning to invest $60-65B in capex this year while also growing our AI teams significantly, and we have the capital to continue investing in the years ahead. 
This is a massive effort, and over the coming years it will drive our core products and business, unlock historic innovation, and extend American technology leadership. Let's go build!" He even shared a graphic showing the 2 gigawatt datacenter mentioned in his post overlaid on Manhattan. Clearly, even as he espouses a commitment to open source AI, Zuck is not convinced that DeepSeek's approach of optimizing for efficiency while leveraging far fewer GPUs than major labs is the right one for Meta, or for the future of AI. But with U.S. companies raising and/or spending record sums on new AI infrastructure that many experts have noted depreciate rapidly (due to hardware/chip and software advancements), the question remains which vision of the future will win out in the end to become the dominant AI provider for the world. Or maybe it will always be a multiplicity of models each with a smaller market share? Stay tuned, because this competition is getting closer and fiercer than ever.
[63]
What is DeepSeek, and why is it causing Nvidia and other stocks to slump?
A Chinese artificial intelligence company called DeepSeek is grabbing America's attention -- and sending a shock wave through Wall Street -- due to its new tech, which some experts say rivals that of OpenAI's ChatGPT. DeepSeek is also catching investors off guard because of the low development costs for its AI app, which Wedbush Securities analyst Dan Ives pegged at only $6 million. By comparison, OpenAI, Google and other major U.S. companies are on track to invest a total of roughly $1 trillion in AI over the coming years, according to Goldman Sachs. On Monday, DeepSeek's rollout roiled shares of AI stalwarts such as Nvidia, the high-flying manufacturer of advanced chips engineered for AI development, and Dutch company ASML, another chipmaker. The Chinese company's tech is raising questions about whether demand for Nvidia's chips could take a hit, as well as whether investors are overvaluing tech stocks that have been buoyed by the promise of AI, from Meta to Microsoft, experts said. "DeepSeek has taken the market by storm by doing more with less," said Giuseppe Sette, president at AI market research firm Reflexivity, in an email. "This shows that with AI the surprises will keep on coming in the next few years." DeepSeek's latest app comes just days after Mr. Trump announced a new $500 billion venture with ChatGPT maker OpenAI, Softbank and Oracle, dubbed Stargate, which he touted as ensuring "the future of technology" in the U.S. Nvidia shares tumbled 12.5% in early trading on Monday, while ASML shed 7.6%. The tech-heavy Nasdaq index slumped 3.4% in early trading, while the S&P 500 declined 1.8%. DeepSeek is a private Chinese company founded in July 2023 by Liang Wenfeng, a graduate of Zhejiang University, one of China's top universities, who funded the startup via his hedge fund, according to the MIT Technology Review. Liang has about $8 billion in assets, Ives wrote in a Jan. 27 research note. 
Liang, who had previously focused on applying AI to investing, had bought a "stockpile of Nvidia A100 chips," a type of tech that is now banned from export to China. Those chips became the basis of DeepSeek, the MIT publication reported. The company's AI app is available in Apple's App store, as well as online at its website. The service is free and as of Monday morning was the top download on Apple's store, although some people were having trouble signing up for the app. The company released its latest AI model on Jan. 20, which is causing Wall Street to reappraise the AI sector. "Last week DeepSeek launched a model that rivals OpenAI's ChatGPT and Meta's Llama 3.1 and was #1 on Apple's App Store over the weekend," Wedbush's Ives wrote. "DeepSeek built the model using reduced capability chips from Nvidia. which is impressive and thus has caused major agita for U.S. tech stocks with massive pressure on Nasdaq this morning." DeepSeek is an open-source large language model that relies on what is known as "inference-time computing," which Sette said in layman's terms means "they activate only the most relevant portions of their model for each query, and that saves money and computation power." Some experts praised DeepSeek's performance, with noted tech investor Marc Andreessen writing on X on Jan. 24, "DeepSeek R1 is one of the most amazing and impressive breakthroughs I've ever seen -- and as open source, a profound gift to the world." However, Ives said he's skeptical the service will gain ground with major U.S. businesses. "No U.S. Global 2000 is going to use a Chinese startup DeepSeek to launch their AI infrastructure and use cases," Ives wrote. "At the end of the day there is only one chip company in the world launching autonomous, robotics, and broader AI use cases and that is Nvidia."
[66]
'AI's Sputnik moment': China-based DeepSeek's open-source models may be a real threat to the dominance of OpenAI, Meta, and Nvidia
US-based companies have been at the forefront of the AI revolution, but now it looks like a new challenger has emerged. China-based DeepSeek is just over a year old, but the startup has released an open-source AI model called R1 that seems to have the rest of the industry worried. DeepSeek was founded in 2023 by Liang Wenfeng, the co-founder of High-Flyer, an AI-driven hedge fund (via Bloomberg). Despite being a mere fledgling in the AI world compared to some of the bigger names like OpenAI and Meta, the DeepSeek V3 family of AI models appears to be much more efficient than competitors' efforts like ChatGPT. Now R1, a new reasoning model fine-tuned from V3 training, also appears to be comparable in performance and accuracy to OpenAI's efforts. The Register has performed some early testing on R1 in comparison to OpenAI's o1, a similar "Chain of Thought" (CoT) model, and found that its capabilities seem remarkably similar -- with the former even beating out the OpenAI model in the MATH-500 test. All while costing much less to train, with a Chiphell forum user claiming that "DeepSeek-R1 is priced at only about 3.65% of OpenAI o1" per million outputs. That's potentially bad news for Nvidia, which has been raking in cash as a result of high sales of its AI accelerator hardware to run the latest models at scale, taking the company's valuation up to the highest by market cap in the world at time of writing. According to some analysts, Nvidia's stock price is on track to shed nearly $400 billion in the wake of R1's launch, although it appears to be holding relatively steady after a slump in early Monday trading. Dutch chip-maker ASML has seen its share price fall by more than 10% since R1's launch, while Siemens Energy dropped by 21%. US venture capitalist and advisor to Donald Trump, Marc Andreessen, made the comparison of R1's launch to the US and USSR space race, calling it "AI's Sputnik moment."
Sputnik was a satellite launched by the USSR in 1957, and is widely thought to have wrongfooted the US government on the Soviet Union's space capabilities, leading to a space race between the two nations. The elephant in the room, however, is trade sanctions. The US government has banned the export of high-end AI hardware to China, which some have speculated has led to a more efficient approach resulting in DeepSeek's new models. If so, that would suggest that the sanctions haven't had the effect that the US government may have hoped. However, being a Chinese product (and given that many of the leading AI companies are US-based), it seems possible that restrictions may follow on its use for US companies, and potentially even those using US equipment, in the same fashion as the current trade sanctions. Either way, the release of DeepSeek-R1 appears to have caused a significant rumble in the AI markets. Whatever happens next, it seems that the US may no longer hold all the keys to advanced AI -- and the world awaits its response.
[67]
DeepSeek Mania Shakes AI Industry to Its Core
Why a relatively unknown Chinese-developed AI model has turned the AI industry on its head. DeepSeek, a relatively unknown Chinese-developed AI model, is now the most popular app in the US Apple App Store based on hype associated with it releasing an AI model that outpaces OpenAI's and other companies' models on some benchmarks while being trained on older chips at a much lower cost. It has caused the stock of Nvidia, which has become one of the most valuable companies in history thanks to the AI boom, to tumble and is causing more widespread panic in the U.S. market. Marc Andreessen has called this "AI's Sputnik moment," referring to the first Russian satellite which got ahead of and accelerated the US space program. People in the AI space and those who follow it closely started freaking out when DeepSeek's newest model, DeepSeek R1, was released last week, and that freakout has now seemingly captured the entire world, impacting the stock market, causing people to wonder if American companies like OpenAI and Nvidia can really dominate the AI industry, if the AI bubble is finally popping, and if this is a sign of imminent Chinese world domination and censorship. DeepSeek is particularly notable because it is free, modifiable, and less expensive to run, which has experts worried about the viability of OpenAI's already unprofitable subscription products. I'm going to be upfront with you here and say that 404 Media does not provide any financial advice and that if I had definitive answers to any of these questions I'd be playing the stock market instead of blogging, but in a day when the takes are going to come fast and furious my take is this: The AI industry continues to develop very fast, it's hard to extrapolate how it's all going to unfold based on a single event, even if it's monumental, and the fact that DeepSeek comes from China, a perceived adversary to the United States/the West, is making hawks, xenophobes, and tankies foam at the mouth.
Let's take a deep breath and start with the biggest headline, which is that Nvidia stock dropped over 12 percent early this morning, its worst performance since 2020. GPU maker Nvidia became a trillion dollar company because it is largely making the chips that power the generative AI boom. These are not only the chips that people need to generate text, images, audio etc locally on their machines, but the massive training clusters of thousands of chips that these foundational models are trained on. In July, for example, Elon Musk proudly announced that xAI started training "the most powerful AI training cluster in the world," composed of 100,000 Nvidia H100s. For the most part, AI companies in the US have competed on the general idea that more data and more compute creates more advanced and more "intelligent" AI models and tools. One of the general strategies, therefore, has been for companies like OpenAI, Google, and Meta to feed more and more data into their models and to compete to build incredibly expensive and resource intensive data centers. But last year, we started to see some doubts about the existing consensus on AI scaling laws, which up until recently showed that the performance of AI models improved as the size of the model, data, and compute increased, with some people, including Open AI co-founder Ilya Sutskever and Andreessen saying that AI companies are starting to see diminishing returns. Despite these doubts bubbling up, AI companies were still competing for compute, which largely means access to Nvidia's chips. Musk wants to grow xAI's cluster to one million GPUs eventually, and the CEO of Broadcom recently said he predicts other companies will attempt to build similarly gigantic clusters. This demand for highly specialized and hard to produce hardware has made Nvidia incredibly valuable and critically important to building AI. 
Because the US government believes that the United States, not China, must be the world leader in AI, it has introduced export restrictions that forbid Nvidia from selling its most advanced chips to Chinese companies. This is part of why you see OpenAI, Oracle, and SoftBank proposing to throw $500 billion into projects like Stargate, a massive AI data infrastructure project that they may or may not have the money for. The main reason people are excited/scared/throwing up right now is that DeepSeek was developed and released under America's export restrictions that prevent Chinese companies from getting the latest and most powerful Nvidia chips. As Wired explained, DeepSeek was spun out from High-Flyer, a Chinese hedge fund that originally acquired GPUs to analyze financial data, before it invested its money and resources in developing AI. That a new player in this space was able to build an AI model without access to the latest and greatest Nvidia chips (though people in China have found ways to obtain them despite restrictions), using new, more efficient reinforcement learning strategies, has undermined the idea that companies like Nvidia or OpenAI have built a "moat" around their companies that will secure their lead in the AI race forever, and, by extension, has undermined the notion of American AI world supremacy. It also at least raises the possibility that a Chinese company has found a better, more efficient, and cheaper way to train AI models than any American company has discovered thus far. As others have pointed out, it's hard to say exactly what DeepSeek actually spent to make its model without trusting it blindly. The true cost may be hidden in ways we don't understand, and DeepSeek is definitely benefiting from building on top of the very expensive research (primarily from American companies) that came before it. 
But if AI companies can build competitive models at a fraction of the cost on a comparatively tiny number of lesser GPUs, then much of Nvidia's value and the billions of dollars AI companies are burning on training suddenly seem excessive and wasteful (even to AI boosters), hence the stock tumbling. Does this mean Nvidia, OpenAI, and other AI companies are doomed? Again, this is not financial advice but the market appears to be spasming based on vibes, and definitely before we have a great understanding of DeepSeek's impact. The most obvious rebuttal from Nvidia bag holders in this situation is that DeepSeek's newfound efficiencies will only benefit AI incumbents. If these new methods give DeepSeek great results with limited compute, the same methods will give OpenAI and other, more well-resourced AI companies even greater results on their huge training clusters, and it is possible that American companies will adapt to these new methods very quickly. Even if scaling laws really have hit the ceiling and giant training clusters don't need to be that giant, there's no reason I can see why other companies can't be competitive under this new paradigm. We should also probably hope that this is the case since it could lower the environmental impact of AI. I don't have a dog in this fight, but the argument I would add here is that this type of leapfrogging seems totally normal, and we have seen variations of it over the last couple of years. People love to prematurely dance on OpenAI's grave whenever a new and shiny model is released. Meta's Llama, France's Mistral, and Anthropic's Claude have all seemed like they're getting ahead at one point or another and are favored by different users for different uses, only for another model to be released by OpenAI or another company that leapfrogs the hot new technology and makes them seem old. 
The difference is that DeepSeek is from China and that a lot of people including the US government don't like the idea of China being dominant in any arena, let alone one as supposedly consequential as AI. This is obvious given the hysteria on social media right now, the markets, and the way people are talking about DeepSeek's censorship and the possibility that it could be tied to Chinese surveillance or the Chinese government in some way. Steven Heidel, who works at OpenAI, tweeted Sunday "americans sure love giving their data away to the CCP in exchange for free stuff," which has gone viral and served as the basis for discussion about DeepSeek as possible surveillance software, the "new TikTok," etc. What's particularly notable here is that DeepSeek has been released in a way that can be run locally without an internet connection. On various AI subreddits, where DeepSeek is all people have been talking about for days, some users are now suggesting that the conversation is being manipulated by "propaganda" from a few accounts. People have repeatedly shared screenshots on social media of DeepSeek refusing to engage with questions about Tiananmen Square and other topics subject to censorship in China, with the implication that this is the information ecosystem we'd live under if China was to dominate the AI race. Fair enough, I suppose, but as the developers of "uncensored" AI models have been shouting from the rooftops since the beginning: any AI model that the user can't control entirely is subject to censorship. OpenAI is a prude, and will refuse to engage users on a lot of topics, sometimes for reasons stated in OpenAI's policy, and sometimes for reasons we'll never understand because OpenAI is a black box. "Why should the open-source AI running on my computer, get to decide for itself when it wants to answer my question? This is about ownership and control. 
If I ask my model a question, I want an answer, I do not want it arguing with me," Eric Hartford, a developer of uncensored AI models, told me last year. If anything, DeepSeek maps a better AI future for those concerned about censorship because it was released as an "open weights" model, meaning people could modify it to talk about Tiananmen Square and whatever else they want. We do not know how this will all shake out, but the release of DeepSeek does seem to be a seismic moment for the AI industry. And it will certainly be used, rightly or wrongly, as a political cudgel to highlight the urgency of the competition for AI supremacy between the United States and China.
[68]
Is China having the last laugh? DeepSeek rattles Silicon Valley AI giants including Nvidia; here's all about the Hangzhou startup
DeepSeek is a China-based company whose AI chatbot has begun to disrupt the AI hype in the US stock market. This sudden rattling of AI giants like Nvidia could prove to be quite costly for the US markets. The United States claimed to be way ahead in the AI race between 2023 and 2024, and this is reflected in the massive surge of artificial intelligence-based stocks in the US stock market. Everything seemed to be going great for AI stocks on Wall Street, until 2025 came bearing some bad news. Starting off this wave of grief was a sudden dip in the AI hype that saw some select AI stocks take a major beating, and then came the biggest jolt in the form of DeepSeek. US-based artificial intelligence research companies were deemed to be titans in the field until DeepSeek's AI chatbot suddenly began getting the better of OpenAI's ChatGPT, which is considered one of the best AI language models yet, followed by Google's Gemini and Meta's Llama. Moreover, DeepSeek claims to have developed its AI model at a very low cost compared to its US competitors, which is now raising questions about the profitability of the AI market among US investors. This has resulted in stocks like Nvidia, Google, Amazon, and others suddenly slipping by a few points in trading right at the start of the week. DeepSeek, based in the eastern Chinese city of Hangzhou, can do almost everything that its competitors can, and its development reportedly took a lot less money, which means serious bad news for Silicon Valley companies. Moreover, according to a previous ET report, DeepSeek's AI chatbot recently surpassed ChatGPT to become the highest-rated application on Apple's App Store. This is being deemed a "Sputnik"-like moment, similar to the space race that went on between the US and the Soviet Union during the Cold War era. Is DeepSeek made in China? 
DeepSeek's fundamentals and development originated in China, and it is now becoming popular all across the world. Does Nvidia have anything to do with DeepSeek? Nvidia has a lot to do with DeepSeek, starting with the fact that the China-based company's AI models have been built and tested using Nvidia chips.
[69]
DeepSeek passes ChatGPT, is now the top free app in the US
The underlying AI model can reportedly compete with OpenAI's models while requiring a fraction of the training budget. Chinese AI platform DeepSeek has made waves in recent weeks thanks to the accompanying DeepSeek R1 AI model. Now, it turns out that DeepSeek's AI Assistant app has passed OpenAI's ChatGPT to become the top free app on the Apple App Store in the US. The app is currently listed in 10th place on the Play Store's free apps chart, but seems to be rising quickly. The company claims that this open-source model can compete with rival models from OpenAI and Meta while requiring only a fraction of the training budget. More specifically, the company asserts that training the model required just $6 million compared to $100 million for OpenAI's GPT-4. What's also notable is that the team behind DeepSeek reportedly claimed that they used older NVIDIA silicon to train the AI model. This is particularly noteworthy in light of the US barring the export of cutting-edge NVIDIA technology to China. It has also raised questions about the valuations of companies in the AI space. Android Authority contributor Dhruv Bhutani recently went hands-on with the DeepSeek AI Assistant and concluded it was an "excellent" alternative to ChatGPT for several reasons. He cited the higher-quality output, the mostly free nature of the platform (at least for now), and the ability to run it locally on your own computer.
[70]
Nvidia, Oracle in trouble as ChatGPT is losing AI battle against China's DeepSeek? Here is what experts are saying
Nvidia and Oracle shares are down as a Chinese artificial intelligence startup has left OpenAI's ChatGPT behind in one key aspect. Nvidia, Oracle and other artificial intelligence firms are experiencing an unexpected jolt: Chinese startup DeepSeek's AI Assistant on Monday became the top-rated free application available on Apple's App Store in the United States, overtaking rival ChatGPT, according to reports. While shares of AI chipmaker Nvidia fell 10 per cent, rival Oracle dropped 8 per cent and AI data analytics company Palantir lost 7 per cent in pre-market trading, the impact of DeepSeek will be deeper, experts have claimed. In Europe, ASML, which counts Taiwan's TSMC, Intel and Samsung as its customers, dropped almost 11 per cent, while in Japan, startup investor SoftBank Group slid more than 8 per cent. Last week SoftBank announced a $19 billion commitment to fund Stargate, a data-centre joint venture with OpenAI. DeepSeek offers the prospect of a viable, cheaper AI alternative, which has raised questions about the sustainability of the level of spending and investment on AI by Western companies, including Apple and Microsoft. Marc Andreessen, the Silicon Valley venture capitalist, said in a post on X on Sunday that DeepSeek's R1 model was AI's "Sputnik moment", referencing the former Soviet Union's launch of a satellite that marked the start of the space race in the late 1950s. "Deepseek R1 is one of the most amazing and impressive breakthroughs I've ever seen -- and as open source, a profound gift to the world," he said in a separate post. Masahiro Ichikawa, chief market strategist at Sumitomo Mitsui DS Asset Management said: "The idea that the most cutting-edge technologies in America, like Nvidia and ChatGPT, are the most superior globally, there's concern that this perspective might start to change." "I think it might be a bit premature," Ichikawa said. 
Little is known about the company behind DeepSeek, a small Hangzhou-based startup founded in 2023, when search engine giant Baidu released the first Chinese AI large-language model. Since then, dozens of Chinese tech companies large and small have released their own AI models, but DeepSeek is the first to be praised by the U.S. tech industry as matching or even surpassing the performance of cutting-edge U.S. models. Powered by the DeepSeek-V3 model, which its creators say "tops the leaderboard among open-source models and rivals the most advanced closed-source models globally", the artificial intelligence application has surged in popularity among U.S. users since it was released on Jan. 10, according to app data research firm Sensor Tower. The milestone highlights how DeepSeek has left a deep impression on Silicon Valley, upending widely held views about U.S. primacy in AI and the effectiveness of Washington's export controls targeting China's advanced chip and AI capabilities. AI models from ChatGPT to DeepSeek require advanced chips to power their training. The Biden administration has since 2021 widened the scope of bans designed to stop these chips from being exported to China and used to train Chinese firms' AI models. However, DeepSeek researchers wrote in a paper last month that the DeepSeek-V3 used Nvidia's H800 chips for training, spending less than $6 million. Although this detail has since been disputed, the claim that the chips used were less powerful than the most advanced Nvidia products Washington has sought to keep out of China, as well as the relatively cheap training costs, has prompted U.S. tech executives to question the effectiveness of tech export controls.
[71]
DeepSeek vs ChatGPT: China's AI shocks US, sparks panic in tech world; Why America is worried
China's DeepSeek AI has sent shockwaves through the US tech industry, sparking panic and causing a market collapse. With its low-cost, high-performance model, DeepSeek is threatening to overtake US giants like ChatGPT, causing major stock losses for AI companies. As DeepSeek rises, the US is left scrambling to maintain its dominance in the AI race.
[72]
DeepSeek's new AI assistant app is the top free app on the Apple App Store, ahead of ChatGPT
TL;DR: Chinese AI company DeepSeek's AI assistant is now the top free app on the Apple App Store, surpassing ChatGPT. Its R1 model, costing $5.6 million to train, competes with OpenAI's o1 model, which costs significantly more. Despite US sanctions, DeepSeek uses NVIDIA H800 GPUs for training. The R1 model, with 671 billion parameters, is available at prices 90-95% cheaper than OpenAI's o1 model. Chinese AI company DeepSeek has had some rather large success on the Apple App Store, with its AI assistant now the top free app on the App Store, beating out ChatGPT. In particular AI benchmarks, DeepSeek's R1 model matches or even beats OpenAI's new o1 model, and better yet... DeepSeek has said its R1 model cost just $5.6 million to train. This is compared to the hundreds of millions of dollars that other companies spend training their AI models. This is all with US sanctions restricting high-powered AI chips from hitting Chinese shores, leaving DeepSeek to play it differently when it came to powering its AI training on the cheap. DeepSeek is using NVIDIA H800 AI GPUs for some of its training, and DeepSeek founder and CEO Liang Wenfeng has recently said that US export restrictions are still bottlenecking the company. Curai CEO Neal Khosla has had some harsh comments, posting on X that "deepseek is a CCP state psyop + economic warfare to make American AI unprofitable. they are faking the cost was low to justify setting price low and hoping everyone switches to it damage AI competitiveness in the US. Don't take the bait". However, as X's Community Notes feature points out, Khosla's father is an OpenAI investor. DeepSeek's new R1 model is rather large at 671 billion parameters, with smaller "distilled" versions of R1 ranging from 1.5 billion parameters to 70 billion parameters. DeepSeek's smallest R1 model can run on a laptop, but the full-force R1 model requires much more hardware... 
but, it's available through DeepSeek's API at prices that are a game-changing 90-95% cheaper than OpenAI's o1 model.
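As a rough illustration of why the 1.5-billion-parameter distilled model fits on a laptop while the full 671-billion-parameter model does not, the sketch below estimates the memory needed just to hold the weights. The parameter counts come from the article; the 16-bit and 4-bit precisions are assumptions chosen for illustration, the helper name `weight_memory_gb` is hypothetical, and real deployments need additional memory beyond the weights.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Parameter counts quoted in the article.
MODELS = {
    "R1 distilled (1.5B)": 1.5e9,
    "R1 distilled (70B)": 70e9,
    "R1 full (671B)": 671e9,
}

for name, params in MODELS.items():
    # Assumed precisions: 16 bits per weight, and a 4-bit quantized variant.
    at_16_bit = weight_memory_gb(params, 16)
    at_4_bit = weight_memory_gb(params, 4)
    print(f"{name}: ~{at_16_bit:,.1f} GB at 16-bit, ~{at_4_bit:,.1f} GB at 4-bit")
```

Under these assumptions the smallest model's weights take about 3 GB at 16-bit, while the full model's weights alone exceed a terabyte, far beyond the memory of a typical laptop.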
[73]
What is unique about DeepSeek AI model? Features, cost, US ban - details here
DeepSeek, a Chinese AI startup, made a low-cost chatbot and is competing with other AI technologies. Chinese AI startup DeepSeek is now grabbing Silicon Valley's attention after demonstrating breakthrough artificial intelligence models that offer comparable performance to the world's best chatbots at a low cost, reported Bloomberg. Liang Wenfeng, the chief of AI-driven quant hedge fund High-Flyer, founded DeepSeek in 2023. The company develops AI models that the developer community at large can inspect and improve, according to Bloomberg. DeepSeek's mobile app surged to the top of the iPhone download charts in the US after its release in early January. What sets the model apart is that it articulates its reasoning before delivering a response to a prompt. DeepSeek claims its R1 release offers performance on par with OpenAI's latest and has granted licenses for individuals interested in developing chatbots using the technology to build on it, as per Bloomberg. The cost of training and developing DeepSeek's models is considered to be only a fraction of what's required for OpenAI or Meta Platforms Inc.'s best products. The company has revealed the cost. According to Bloomberg, DeepSeek has highlighted that its R1 model is performing just as well, if not better, than other leading models in several key benchmarks. The benchmarks include AIME 2024 for math tasks, MMLU for general knowledge, and AlpacaEval 2.0 for question-and-answer performance. Additionally, R1 ranks among the top models on the Chatbot Arena leaderboard, which is affiliated with UC Berkeley. The US has imposed a ban on exporting advanced technologies like GPU semiconductors to China as it aims to slow the country's progress in AI. It is a critical area in the US-China technology race. 
However, DeepSeek's recent advancements suggest that Chinese AI engineers have found ways to adapt, focusing on improving efficiency even with limited resources. DeepSeek's progress indicates that these trade restrictions may not be fully effective in halting China's AI development. Q1. Who is the founder of DeepSeek? A1. Liang Wenfeng founded DeepSeek in 2023. He is the chief of AI-driven quant hedge fund High-Flyer. Q2. Is the cost of technology to build DeepSeek lower? A2. Yes, the cost of training and developing DeepSeek's models is considered to be only a fraction of what's required for other chatbots.
[74]
DeepSeek displaces ChatGPT as the App Store's top app | TechCrunch
The mobile app for DeepSeek, a Chinese AI lab, skyrocketed to the number-one spot in app stores around the globe this weekend, topping the U.S.-based AI chatbot, ChatGPT. On iOS, DeepSeek is currently the No. 1 free app in the U.S. App Store and 51 other countries, according to mobile app analytics firm Appfigures. The rapid rise of DeepSeek in the app stores' Top Charts follows its meteoric rise in popularity this week resulting from the release of a series of open AI models that are competitive with leading offerings from OpenAI and Google. However, compared to other frontier AI models, DeepSeek claims its models were trained for just a fraction of the price with significantly worse AI chips. This represents new efficiency gains for AI model training, which sent Nvidia's stock price tumbling down as much as 17% on Monday, and has put the rest of the tech industry on high alert. On Friday, DeepSeek's mobile app had just a million downloads across both the App Store and Google Play. That number has since more than doubled, as of Monday morning, to 2.6 million downloads of DeepSeek's mobile app across both platforms. The app currently sits in the Top 10 list for free apps in 111 countries on the App Store, and 18 countries on Google Play, according to Appfigures. In addition, more than 80% of DeepSeek's total mobile app downloads have come in the past 7 days, according to analytics firm Sensor Tower. In that time frame, DeepSeek saw nearly 300% more app downloads than Perplexity, another leading consumer AI app. DeepSeek moved into its No. 1 spot on the U.S. App Store on Sunday, Jan. 26, up from No. 31 just a couple of days prior. Meanwhile, DeepSeek is the No. 14 Overall free app on Google Play, rapidly climbing up from No. 133 on Jan. 24. As a result, daily active users jumped both globally and in the U.S. by over 110% from Jan. 24-25 compared with the same period last week. 
While China is the largest mobile app market for DeepSeek today, it only represents 23% of its total downloads, according to Sensor Tower. The app's second and third largest markets are the United States, which makes up 15% of its total downloads, and Egypt, which makes up 6% of its total downloads. The popularity of DeepSeek's mobile app raises questions about the moat of popular consumer AI apps, such as ChatGPT, Gemini, and Perplexity. While hundreds of millions of people use ChatGPT and Gemini every month, DeepSeek proves that the consumer AI space is still volatile, and new competitors shouldn't be counted out. However, it's worth noting that reaching the No. 1 position on the App Store isn't calculated by app downloads alone. Apple's algorithm, though secretive, combines numerous factors, including the velocity of installs. That allows apps that gain installs quickly to skyrocket to the top of the charts, overtaking others that may have a larger total number of users or installs. It will be more telling to see how long DeepSeek holds its top position over time. The tech industry is still coming to terms with the techniques DeepSeek used to train its AI models, and what it means for the broader AI space. In light of DeepSeek's R1 model, leading AI model providers may be feeling pressured to release better models to prove their dominance, or justify the hefty price they're paying for compute. Meta has reportedly created several "war rooms" to analyze DeepSeek's training methods. DeepSeek's rise in popularity was potentially stifled by "large-scale malicious" attacks, the company reported on Monday, which forced it to limit customers outside of China from registering for the app.
[76]
How Chinese AI startup DeepSeek is competing with Silicon Valley giants
SAN FRANCISCO -- The day after Christmas, a small Chinese startup called DeepSeek unveiled a new artificial intelligence system that could match the capabilities of cutting-edge chatbots from companies such as OpenAI and Google. That alone would have been a milestone. But the team behind the system, called DeepSeek-V3, described an even bigger step. In a research paper explaining how they built the technology, DeepSeek's engineers said they used only a fraction of the highly specialized computer chips that leading AI companies relied on to train their systems. These chips are at the center of a tense technological competition between the United States and China. As the U.S. government works to maintain the country's lead in the global AI race, it is trying to limit the number of powerful chips, such as those made by Silicon Valley firm Nvidia, that can be sold to China and other rivals. But the performance of the DeepSeek model raises questions about the unintended consequences of the U.S. government's trade restrictions. The controls have forced researchers in China to get creative with a wide range of tools that are freely available on the internet. The DeepSeek chatbot answered questions, solved logic problems and wrote its own computer programs as capably as anything already on the market, according to the benchmark tests that U.S. AI companies have been using. And it was created on the cheap, challenging the prevailing idea that only the tech industry's biggest companies -- all of them based in the United States -- could afford to make the most advanced AI systems. The Chinese engineers said they needed only about $6 million in raw computing power to build their new system. That is about 10 times less than tech giant Meta spent building its latest AI technology. "The number of companies who have $6 million to spend is vastly greater than the number of companies who have $100 million or $1 billion to spend," said Chris V. 
Nicholson, an investor with the venture capital firm Page One Ventures, who focuses on AI technologies. Since OpenAI sparked the AI boom in 2022 with the release of ChatGPT, many experts and investors had concluded that no company could compete with the market leaders without spending hundreds of millions of dollars on specialized chips. The world's leading AI companies train their chatbots using supercomputers that use as many as 16,000 chips, if not more. DeepSeek's engineers, on the other hand, said they needed only about 2,000 specialized computer chips from Nvidia. The constraints on chips in China forced the DeepSeek engineers to "train it more efficiently so it could still be competitive," said Jeffrey Ding, an assistant professor at George Washington University who specializes in emerging technology and international relations. Earlier this month, the Biden administration issued new rules that aim to keep China from obtaining advanced AI chips through other countries. The rules build on multiple rounds of earlier restrictions that prevent Chinese companies from being able to buy or make cutting-edge computer chips. President Donald Trump has not yet indicated whether he will keep the rules or rescind them. The U.S. government has tried to keep advanced chips out of the hands of Chinese companies over concerns they could be used for military purposes. In response, some firms in China have stockpiled thousands of chips, while others sourced them from a thriving underground marketplace of smugglers. DeepSeek is run by a quantitative stock trading firm called High Flyer. By 2021, it had channeled its profits into acquiring thousands of Nvidia chips, which it used to train its earlier models. The company, which did not respond to requests for comment, has become known in China for scooping up talent fresh from top universities with the promise of high salaries and the ability to follow the research questions that most pique their interest. 
Zihan Wang, a computer engineer who worked on an earlier DeepSeek model, said the company also hires people without any computer science background to help the technology understand and be able to generate poetry and ace questions on the notoriously difficult Chinese college entrance examination. DeepSeek does not make any products for consumers, leaving its engineers to focus entirely on research. That means that its technology is not hemmed in by the strictest aspect of China's regulations on AI, which require consumer-facing technology to comply with the government's controls on information. The leading U.S. companies continue to advance the state of the art in AI. In December, OpenAI unveiled a new "reasoning" system called o3 that exceeds the performance of existing technologies, though it is not yet widely available outside the company. But DeepSeek continues to show that it is not far behind. This month, it released an impressive reasoning model of its own. A crucial part of this rapidly changing global market is an old idea: open source software. Like many other companies, DeepSeek has open sourced its latest AI system, meaning that it has shared the underlying code with other businesses and researchers. This allows others to build and distribute their own products using the same technologies. While employees at big Chinese technology companies are limited to collaborating with colleagues, "if you work on open source, you work with talent around the world," said Yineng Zhang, lead software engineer at Baseten in San Francisco who works on the open source SGLang project. He helps other people and companies build products using DeepSeek's system. The open source ecosystem for AI gathered steam in 2023 when Meta freely shared an AI system called LLama. Many assumed that this community would flourish only if the companies like Meta -- tech giants with massive data centers filled with specialized chips -- continued to open source their technologies. 
But DeepSeek and others have shown that they, too, can expand the powers of open source technologies. Many executives and pundits have argued that the big U.S. companies should not open source their technologies because they could be used to spread disinformation or cause other serious harm. Some U.S. lawmakers have explored the possibility of preventing or throttling the practice. But others argue that if regulators stifle the progress of open source technology in the United States, China will gain a significant edge. If the best open source technologies come from China, they argue, U.S. developers will build their systems atop those technologies. In the long run, that could put China at the heart of AI research and development. "The center of gravity of the open source community has been moving to China," said Ion Stoica, a professor of computer science at the University of California, Berkeley. "This could be a huge danger for the U.S.," because it allows China to accelerate the development of new technologies. Hours after his inauguration, Trump rescinded a Biden administration executive order that threatened to curb open source technologies. Stoica and his students recently built an AI system called Sky-T1 that rivals the performance of OpenAI's o1 system on certain benchmark tests. They needed only $450 in computing power. They did this by building on top of two open source technologies released by Chinese tech giant Alibaba. Their $450 system is not as powerful as OpenAI's technology or DeepSeek's new system. And the techniques they used are unlikely to yield systems that exceed the performance of the leading technologies. But the project showed that even operations with minuscule resources can build competitive systems. Reuven Cohen, a technology consultant in Toronto, has been using DeepSeek-V3 since late December. He says it is comparable to the latest systems from OpenAI, Google and San Francisco startup Anthropic -- and much cheaper to use. 
"DeepSeek is a way for me to save money," he said. "This is the kind of technology that someone like me wants to use."
[0]
Satya Nadella Said 'We Should Take Development Out Of China Very, Very Seriously.' Now, DeepSeek's R1 Has Emerged As A ChatGPT Killer, Outperforming OpenAI With 50x Lower Costs: Here's More - Alphabet (NASDAQ:GOOG), Broadcom (NASDAQ:AVGO)
Satya Nadella-led Microsoft Corp. MSFT has bet its AI ambitions on OpenAI, pouring billions of dollars into the Sam Altman-led startup. Now, days after he said the China threat needs to be taken seriously, DeepSeek, a China-based AI startup, has beaten OpenAI with 97% less computing power and 50 times lower prices. What Happened: DeepSeek R1 employs a hybrid architecture that incorporates large-scale reinforcement learning and chain-of-thought reasoning to improve response accuracy. It is available in two versions: DeepSeek-R1 and DeepSeek-R1-Zero, with the latter trained through reinforcement learning alone, without a supervised fine-tuning stage. The startup, based in Hangzhou, China, was founded by Liang Wenfeng in July 2023, with the goal of developing advanced AI capable of performing tasks at or beyond human-level proficiency. The model's affordability is a significant attraction, with pricing lower by up to 50 times than OpenAI's offerings. DeepSeek R1 has demonstrated comparable performance to OpenAI's models in various benchmarks, even outperforming them in certain coding tasks. Speaking at the World Economic Forum, Microsoft CEO Nadella said, "We should take the developments out of China very, very seriously." So far, DeepSeek has released six smaller versions of the R1 model, optimized for laptop use, with one surpassing OpenAI's o1-mini in specific tests. The launch of DeepSeek R1 has sparked discussions on social media, with users sharing experiences and comparisons with other AI models. The model's rapid development, competitive pricing, and open-source initiatives highlight China's growing influence in the global AI landscape. Why It Matters: The introduction of DeepSeek R1 has significant implications for the global AI market. 
The startup created its low-cost model in just two months with less than $6 million, showcasing China's potential to challenge major players like OpenAI, Alphabet Inc. GOOG GOOGL Google, and Meta Platforms Inc. META. The model has overtaken ChatGPT as the most downloaded app on Apple's US store, indicating its growing popularity. The launch has also impacted the stock market, with major technology stocks experiencing a selloff. Stocks of Nvidia Corp. NVDA and Broadcom Inc. AVGO saw significant declines as DeepSeek's model threatened U.S. technological leadership. Moreover, the development has raised questions about the necessity of massive GPU investments. DeepSeek's open-source approach and technical report suggest that advanced AI systems can be developed on substantially smaller budgets, challenging the current industry norms. Disclaimer: This content was partially produced with the help of AI tools and was reviewed and published by Benzinga editors.
[58]
How Chinese AI Startup DeepSeek Made a Model that Rivals OpenAI
When Chinese quant hedge fund founder Liang Wenfeng went into AI research, he took 10,000 Nvidia chips and assembled a team of young, ambitious talent. Two years later, DeepSeek exploded on the scene. On January 20, DeepSeek, a relatively unknown AI research lab from China, released an open source model that's quickly become the talk of the town in Silicon Valley. According to a paper authored by the company, DeepSeek-R1 beats the industry's leading models like OpenAI o1 on several math and reasoning benchmarks. In fact, on many metrics that matter -- capability, cost, openness -- DeepSeek is giving Western AI giants a run for their money. DeepSeek's success points to an unintended outcome of the tech cold war between the US and China. US export controls have severely curtailed the ability of Chinese tech firms to compete on AI in the Western way -- that is, infinitely scaling up by buying more chips and training for a longer period of time. As a result, most Chinese companies have focused on downstream applications rather than building their own models. But with its latest release, DeepSeek proves that there's another way to win: by revamping the foundational structure of AI models and using limited resources more efficiently. "Unlike many Chinese AI firms that rely heavily on access to advanced hardware, DeepSeek has focused on maximizing software-driven resource optimization," explains Marina Zhang, an associate professor at the University of Technology Sydney, who studies Chinese innovations. "DeepSeek has embraced open source methods, pooling collective expertise and fostering collaborative innovation. This approach not only mitigates resource constraints but also accelerates the development of cutting-edge technologies, setting DeepSeek apart from more insular competitors." So who is behind the AI startup? And why are they suddenly releasing an industry-leading model and giving it away for free? 
WIRED talked to experts on China's AI industry and read detailed interviews with DeepSeek founder Liang Wenfeng to piece together the story behind the firm's meteoric rise. DeepSeek did not respond to several inquiries sent by WIRED. Even within the Chinese AI industry, DeepSeek is an unconventional player. It started as Fire-Flyer, a deep-learning research branch of High-Flyer, one of China's best-performing quantitative hedge funds. Founded in 2015, the hedge fund quickly rose to prominence in China, becoming the first quant hedge fund to raise over 100 billion RMB (around $15 billion). (Since 2021, the number has dipped to around $8 billion, though High-Flyer remains one of the most important quant hedge funds in the country.) For years, High-Flyer had been stockpiling GPUs and building Fire-Flyer supercomputers to analyze financial data. Then, in 2023, Liang, who has a master's degree in computer science, decided to pour the fund's resources into a new company called DeepSeek that would build its own cutting-edge models -- and hopefully develop artificial general intelligence. It was as if Jane Street had decided to become an AI startup and burn its cash on scientific research. Bold vision. But somehow, it worked. "DeepSeek represents a new generation of Chinese tech companies that prioritize long-term technological advancement over quick commercialization," says Zhang. Liang told the Chinese tech publication 36Kr that the decision was driven by scientific curiosity rather than a desire to turn a profit. "I wouldn't be able to find a commercial reason [for founding DeepSeek] even if you ask me to," he explained. "Because it's not worth it commercially. Basic science research has a very low return-on-investment ratio. When OpenAI's early investors gave it money, they sure weren't thinking about how much return they would get. Rather, it was that they really wanted to do this thing."
[75]
Tech stocks tank as US AI dominance no longer a sure bet
Chinese startup DeepSeek rolls out open LLMs to rival Meta, OpenAI at fraction of cost
Share prices for some of the biggest American tech brands that crested the AI hype waves crashed this morning on the rocks of DeepSeek, a Chinese startup that last week released LLMs that challenge US dominance. As The Register revealed at the weekend, DeepSeek launched some openly available machine-learning models which perform favorably against US competitors OpenAI and Meta, according to benchmarks. And what's really set the cat amongst the investors is that they were trained using fewer Nvidia chips, or so DeepSeek claims. DeepSeek, founded in 2023 by Liang Wenfeng and financially supported by his quantitative hedge fund High Flyer, released DeepSeek V3 at the close of 2024, and has now rolled out R1, classified as a reasoning model optimized from V3. This led investors to question the viability of spending tens of billions of dollars on AI training, especially when the returns on investment still seem uncertain. This existential crisis hit stock valuations, with Nvidia down by as much as 14 percent in pre-market trading today, Microsoft falling 7 percent, and Meta down 5 percent. Nasdaq 100 futures were down 3.51 percent at the time of writing amid the searching questions that DeepSeek is seemingly posing. "Deepseek R1 is AI's Sputnik moment," said Marc Andreessen, partner in venture capitalist Andreessen Horowitz, which raised billions last year to spend on AI startups. As Reg readers no doubt know, this effectively indicates that something has to change so that US big tech can compete without spending so much. Microsoft, AWS, Google and others are ploughing billions into AI in the expectation that those bars come off. Meta said last week it is spending $60 billion on AI and there is also the Stargate initiative that is betting a collective $500 billion. 
As of September last year, the hyperscalers had spent $200 billion in capital expenditure since the start of 2023, and yet AI licenses were around a tenth of that. The bubble was expanding and expanding. Former Microsoft exec Steven Sinofsky said on X: "The current trajectory of AI if you read the news in the US is one of MASSIVE CapEx piled on top of even more MASSIVE CapEx. It is a race between Google, Meta, OpenAI/Microsoft, xAI, and to a lesser extent a few other super well-funded startups like Perplexity and Anthropic. "The past 5 years of AI have been bigger models, more data, more compute, and so on. Why? Because, I would argue, the innovation was driven by the cloud hyperscale companies and they were destined to take the approach of doing more of what they already did. They viewed data for training and huge models as their way of winning and their unique architectural approach." The "big scale solutions" are "consuming too much capital," he added on X. He reckons that beyond this, the current path is not sustainable. "It is a path that works against the history of computing, which is that resources needed become free, not more expensive. The market for computing simply doesn't accept solutions that cost more, especially consumption based pricing. We've seen Microsoft and Google do a bit of resetting with respect to pricing in the hopes of turning these massive CapEx efforts into direct revenue." The arrival of DeepSeek, which has built a scale out tool that is claimed to have been trained on 14.8 trillion tokens using 2,048 Nvidia H800s, amassing 2.788 million GPU hours, equating to a cost of circa $5.58 million, has been impactful in the financial world over the past few hours. 
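The training figures quoted above can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the $2 per GPU-hour rental rate is an assumption (it is the rate at which the quoted GPU-hour and dollar totals line up), the function and variable names are hypothetical, and the wall-clock estimate assumes all 2,048 chips ran in parallel for the whole job.

```python
# Back-of-the-envelope check of the training figures quoted in the article.
GPU_HOURS = 2_788_000         # 2.788 million H800 GPU-hours (quoted figure)
GPU_COUNT = 2_048             # Nvidia H800 chips used (quoted figure)
RATE_USD_PER_GPU_HOUR = 2.0   # ASSUMPTION: rental price per GPU-hour


def training_cost_usd(gpu_hours: float, rate: float) -> float:
    """Total compute cost if every GPU-hour is billed at a flat rate."""
    return gpu_hours * rate


def wall_clock_days(gpu_hours: float, gpu_count: int) -> float:
    """Elapsed days if all GPUs run in parallel for the entire job."""
    return gpu_hours / gpu_count / 24


cost = training_cost_usd(GPU_HOURS, RATE_USD_PER_GPU_HOUR)
days = wall_clock_days(GPU_HOURS, GPU_COUNT)

print(f"Estimated compute cost: ${cost / 1e6:.2f} million")
print(f"Estimated wall-clock time: {days:.0f} days")
```

Under the assumed rate, the total comes to about $5.58 million, and the GPU-hours spread across 2,048 chips work out to roughly 57 days, which is in line with the "two months" reported elsewhere. The figure covers raw compute only, not salaries, earlier experiments, or hardware purchases.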
"It seems as if there is a bit of reality dawning that China has not been sitting idle, even as these tariffs and investment restrictions on tech companies have been put in place," said Mitul Kotecha, Asia head of emerging markets macro and foreign exchange strategy at Barclays, as quoted by the Financial Times. ®
[77]
DeepSeek: The best ChatGPT alternative or a hotbed of dubious claims?
The internet seemingly has a new favorite AI, and it's not the latest ChatGPT model from industry untouchables OpenAI. Soaring to the top of Apple's App Store, Chinese artificial intelligence chatbot DeepSeek has now become the top-rated free app for productivity after a groundswell in popularity following the release of the DeepSeek-R1 "reasoning" model on January 20, overtaking OpenAI's ChatGPT in the process. Beyond App Store leaderboards, claims surrounding DeepSeek's development and capabilities may be even more impressive. The company behind the LLM (Large Language Model) claims it cost less than $6 million to train its DeepSeek-V3 model and used limited hardware compared to its American contemporaries while achieving similar results. However, while DeepSeek is proving popular with users and developers alike, mainly thanks to its favorable API pricing, all that glitters isn't gold when it comes to this app, and an air of controversy undercuts an otherwise successful launch of two highly capable AI models. DeepSeek was founded by Liang Wenfeng, a Chinese entrepreneur and co-founder of the High-Flyer hedge fund based in Hangzhou, Zhejiang, China. Originally, DeepSeek was intended to be an AGI (Artificial General Intelligence) research wing of High-Flyer, which has exclusively used AI in trading algorithms since 2021. However, since May 2023, DeepSeek has stood as its own company, with High-Flyer becoming one of its primary investors. The company's DeepSeek LLM (Large Language Model) debuted in November 2023 as the open-source DeepSeek Coder and was followed by DeepSeek-V2 in May 2024. The company launched its latest DeepSeek-V3 model in December 2024 and has since seen a swell of popularity, with its mobile app racking up over 1.6 million downloads. 
While the DeepSeek LLM is broadly similar to other popular chatbots like Google Gemini or ChatGPT, the app's free-to-use models are proving popular with users, and its developer-friendly API pricing is pushing it to the forefront of discussion. The old myth goes that during the space race of the 1960s, NASA spent millions in taxpayer dollars on developing a space pen after it realized that ball-point alternatives were incapable of writing in the zero-gravity environment of space. Meanwhile, their cosmonaut counterparts avoided such costs and headaches by simply using a pencil. While none of that is true, it's a parable of thriftiness and practicality that makes for an excellent story. However, mirroring the legend of the space pen, DeepSeek has seemingly managed to pull off a similar feat in cost-effectiveness and practicality through the development of its DeepSeek-V3 model, which it claims to have trained for less than $6 million, a fraction of the hundreds of millions spent by other companies pursuing similar outcomes (while achieving comparable levels of performance). Not only that, but DeepSeek's recent release of its DeepSeek-R1 "reasoning" model is designed to simulate logical thought by sacrificing the speed of a response for a more well-reasoned answer. It can achieve results equal to (if not better than) OpenAI's own "reasoning" model, o1 -- even as the company claims to be hamstrung by U.S. export restrictions on more powerful Nvidia GPUs. DeepSeek is free to use online via its web portal or on mobile (with both Android and iOS apps available). However, the impact that DeepSeek's emergence will have on the cost of AI for businesses, developers, and more could be most groundbreaking, with the company's API price model blowing the competition out of the water. 
While OpenAI currently charges $15 per million tokens (a unit of data that prompts are broken down into during the generation of a model's response), DeepSeek costs only 55 cents per million tokens, a phenomenal drop in charges for API users of up to 96 percent. This cost difference could be game-changing for many professional users involved with AI and poses a significant risk to OpenAI's potential income, with DeepSeek potentially now forcing the hands of other companies to lower their prices to remain competitive. DeepSeek's impact is already being felt in the markets. Several semiconductor names are feeling the hit, including Nvidia. Following the release of DeepSeek's latest models, Nvidia's shares dropped 13.8% in pre-market trading, threatening to wipe out almost $500 billion from the company's market cap. However, the decline has since narrowed to 11%. If everything DeepSeek has to offer sounds too good to be true, that's potentially because some of DeepSeek's claims may be just that. The performance of DeepSeek's V3 and R1 models cannot be disputed. Still, many questions remain about the company's actual pricing, its use of hardware, the cost of its training, and the sourcing of its training data. The latter has already been the subject of some controversy. Several users reported that DeepSeek V3 would refer to itself as ChatGPT, potentially indicating that this model was trained on public data sets generated by OpenAI's GPT-4 model. Speaking to TechCrunch, Mike Cook, a research fellow at King's College London specializing in AI, backed these claims, stating, "Obviously, the model is seeing raw responses from ChatGPT." Cook highlights that this may not be an intentional action by DeepSeek but also points out that the practice of training models on data generated by other models can be "very bad," likening it to "taking a photocopy of a photocopy" in the sense that the quality of outputs will degrade each time. 
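The API price comparison quoted above ($15 versus 55 cents per million tokens) can be worked through as a short calculation. This is a minimal sketch using only those two quoted prices; the function names are hypothetical, and real API bills depend on details such as input versus output token rates that the article does not break down.

```python
# Compare the two per-million-token prices quoted in the article.
OPENAI_USD_PER_M_TOKENS = 15.00    # quoted OpenAI price
DEEPSEEK_USD_PER_M_TOKENS = 0.55   # quoted DeepSeek price


def percent_reduction(old_price: float, new_price: float) -> float:
    """Percentage drop when moving from old_price to new_price."""
    return (old_price - new_price) / old_price * 100


def price_ratio(old_price: float, new_price: float) -> float:
    """How many times more expensive old_price is than new_price."""
    return old_price / new_price


reduction = percent_reduction(OPENAI_USD_PER_M_TOKENS, DEEPSEEK_USD_PER_M_TOKENS)
ratio = price_ratio(OPENAI_USD_PER_M_TOKENS, DEEPSEEK_USD_PER_M_TOKENS)

print(f"Price reduction: {reduction:.1f}%")
print(f"Price ratio: {ratio:.1f}x")
```

On these two figures the reduction is about 96.3 percent, matching the "up to 96 percent" cited, and the quoted OpenAI price is roughly 27 times the quoted DeepSeek price.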
It's also possible that by adopting generated training data, DeepSeek will inherit any of the same biases of the original model, adding to the chatbot's own biases, which enforce strict censorship by law of anti-Communist Party of China (CCP) narratives, including the events of the Tiananmen Square incident of 1989, Hong Kong protests, the ownership of Taiwan, China's treatment of the Uighur people, or the occupation of Tibet. This form of censorship only degrades trust in the platform, and founder Liang Wenfeng's ties to the CCP only heighten concerns about how user data may be used or how Chinese authorities could misappropriate the platform in the future. Writing for Biometric Update, Anthony Kimery, former Editor-in-Chief and co-founder of Homeland Security Today, highlighted how the platform could "support disinformation campaigns aimed at destabilizing U.S. institutions." DeepSeek's claims that it developed its models on less advanced hardware are also being questioned. Citi analyst Atif Malik states, "While DeepSeek's achievement could be groundbreaking, we question the notion that its feats were done without the use of advanced GPUs to fine-tune it and/or build the underlying LLMs the final model is based on through the distillation technique." Malik's questioning could have further weight, as while DeepSeek claims that its V3 model was trained using Nvidia H800 GPUs, a recent interview with Scale AI's founder Alexandr Wang on CNBC saw the company's CEO suggest "DeepSeek has about fifty thousand H100s." The very same GPUs that were blocked from export to China by the Biden administration in 2023, with Wang continuing, "they can't talk about obviously because it is against the export controls that [the] United States has put in place." DeepSeek is a proven hit that will give companies like OpenAI something to consider when retaining its sizable user base in the face of stiff competition. 
However, it remains to be seen if the new car smell still lingering on DeepSeek's latest models is masking the odor of misinformation surrounding how it developed its models and whether or not its pricing is sustainable in the long term. Given the U.S.' recent reaction to TikTok, it's hard to imagine that a company like DeepSeek will go without serious scrutiny for much longer, especially as its models risk upsetting the apple cart on President Trump's plans to keep the United States as the "world capital of AI."
[78]
This Chinese AI Startup is giving tough competition to Google, OpenAI, other Silicon Valley giants
A Chinese artificial intelligence startup has hogged headlines after unveiling a new AI system. The startup, called DeepSeek, unveiled a new artificial intelligence system that could match the capabilities of cutting-edge chatbots from companies such as OpenAI and Google, as per a report. That alone would have been a milestone. But the team behind the system, called DeepSeek-V3, described an even bigger step. In a research paper explaining how they built the technology, DeepSeek's engineers said they used only a fraction of the highly specialized computer chips that leading AI companies relied on to train their systems, NYT News Service reported. These chips are at the center of a tense technological competition between the United States and China. As the U.S. government works to maintain the country's lead in the global AI race, it is trying to limit the number of powerful chips, such as those made by Silicon Valley firm Nvidia, that can be sold to China and other rivals, according to the NYT News Service report. But the performance of the DeepSeek model raises questions about the unintended consequences of the U.S. government's trade restrictions. The controls have forced researchers in China to get creative with a wide range of tools that are freely available on the internet. The DeepSeek chatbot answered questions, solved logic problems and wrote its own computer programs as capably as anything already on the market, according to the benchmark tests that U.S. AI companies have been using. And it was created on the cheap, challenging the prevailing idea that only the tech industry's biggest companies -- all of them based in the United States -- could afford to make the most advanced AI systems. The Chinese engineers said they needed only about $6 million in raw computing power to build their new system. That is about 10 times less than tech giant Meta spent building its latest AI technology. 
Since OpenAI sparked the AI boom in 2022 with the release of ChatGPT, many experts and investors had concluded that no company could compete with the market leaders without spending hundreds of millions of dollars on specialized chips. The world's leading AI companies train their chatbots using supercomputers that use as many as 16,000 chips, if not more. DeepSeek's engineers, on the other hand, said they needed only about 2,000 specialized computer chips from Nvidia. The constraints on chips in China forced the DeepSeek engineers to "train it more efficiently so it could still be competitive," said Jeffrey Ding, an assistant professor at George Washington University who specializes in emerging technology and international relations. Earlier this month, the Biden administration issued new rules that aim to keep China from obtaining advanced AI chips through other countries. The rules build on multiple rounds of earlier restrictions that prevent Chinese companies from being able to buy or make cutting-edge computer chips. President Donald Trump has not yet indicated whether he will keep the rules or rescind them. The U.S. government has tried to keep advanced chips out of the hands of Chinese companies over concerns they could be used for military purposes. In response, some firms in China have stockpiled thousands of chips, while others sourced them from a thriving underground marketplace of smugglers. DeepSeek is run by a quantitative stock trading firm called High Flyer. By 2021, it had channeled its profits into acquiring thousands of Nvidia chips, which it used to train its earlier models. The company, which did not respond to requests for comment, has become known in China for scooping up talent fresh from top universities with the promise of high salaries and the ability to follow the research questions that most pique their interest. 
Zihan Wang, a computer engineer who worked on an earlier DeepSeek model, said the company also hires people without any computer science background to help the technology understand and be able to generate poetry and ace questions on the notoriously difficult Chinese college entrance examination. DeepSeek does not make any products for consumers, leaving its engineers to focus entirely on research. That means that its technology is not hemmed in by the strictest aspect of China's regulations on AI, which require consumer-facing technology to comply with the government's controls on information.
Q1. Which company is the leading AI chip manufacturer?
A1. Nvidia is the leading AI chip manufacturer.
Q2. What is the DeepSeek model?
A2. The DeepSeek chatbot answered questions, solved logic problems and wrote its own computer programs as capably as anything already on the market, according to the benchmark tests that U.S. AI companies have been using.
[79]
What Is DeepSeek? What to Know About the Chinese AI Startup That Spurred a Stock Selloff
DeepSeek's AI assistant recently topped the list of free iPhone apps on Apple's (AAPL) app store. (It wasn't in the top 10 on Alphabet's (GOOGL) Google Play store, though it is listed there.) In Apple's store, the app ranked higher than ChatGPT; not so on Google. Here's what you need to know about DeepSeek -- and why it's having a big impact on markets. And if you're wondering if it wasn't that long ago that another Chinese app was topping the download charts, you're not wrong. DeepSeek, a Chinese startup founded by hedge fund manager Liang Wenfeng, was founded in 2023 in Hangzhou, China, the tech hub home to Alibaba (BABA) and many of China's other high-flying tech giants. The company says its latest AI model released earlier this month offers performance that is on par with that of OpenAI's ChatGPT. It was also reportedly developed in just two months for under $6 million, at a time when the U.S. is limiting China's access to its most sophisticated chips and American AI leaders like OpenAI, Anthropic, and Meta Platforms (META) are spending billions of dollars on development. The rapid ascension of DeepSeek has investors worried it could threaten assumptions about how much competitive AI models cost to develop, as well as the kind of infrastructure needed to support them, with wide-reaching implications for the AI marketplace and Big Tech shares. The tech-heavy Nasdaq was down more than 3% in afternoon trading Monday as investors dragged a host of stocks with ties to AI, from chip to energy firms, downwards. Meanwhile, non-tech sectors like consumer staples rose Monday, marking a reconsideration of the market's momentum in recent months. Several analysts raised doubts about the longevity of the market's reaction Monday, suggesting that the day's pullback could offer investors a chance to pick up AI names set for a rebound. Bernstein's Stacy Rasgon called the reaction "overblown" and maintained a "buy" rating for Nvidia's stock price. 
Citi analysts, who said they expect AI companies to continue buying its advanced chips, also maintained a buy rating on Nvidia. Wedbush analysts, who voiced skepticism that any major U.S. companies would use a Chinese startup like DeepSeek to build their AI infrastructure, said "launching a competitive LLM model for consumer use cases is one thing... launching broader AI infrastructure is a whole other ballgame and nothing with DeepSeek makes us believe anything different."
[80]
DeepSeek Releases Open-Source AI Image Generator as American Stocks Continue to Crater
Silicon Valley's Chinese competitor has released another free AI model. DeepSeek, the Chinese startup that has managed to make a mockery of Silicon Valley's capital-bloated AI oligarchy, has done it again. On Monday morning, the company announced the release of yet another open-source AI system, this one an image generator that -- the company claimed -- could best OpenAI's DALL-E and Stability AI's Stable Diffusion generators. The model -- Janus-Pro-7B -- was announced in a technical paper shared on DeepSeek's GitHub page Monday. It is an update of Janus, a simpler model that was released last October. Janus promises to bring multi-modal automation to new heights, and can conduct both image generation and image analysis, the company said. The paper claims that DeepSeek's model outperforms both DALL-E and Stable Diffusion on multiple AI benchmarks. The new model release follows news that the app powered by the startup's R1 model, which was released on January 20, has surpassed ChatGPT in downloads on the Apple app store. That would be news enough on its own, but DeepSeek also claims it was able to create R1 -- which is an open-source "reasoning" model -- in only two months with what it claims was only $6 million. Given that the U.S.'s top AI company, OpenAI, is worth over a hundred billion dollars and is closed-source, you can see why America's tech oligarchs might be freaking out about this. On the same morning that DeepSeek released its image-generator, the company said it was suffering from "large-scale malicious attacks" on its network, Reuters reported. As a result, the company said it would temporarily limit user registrations, though service would not be impacted for existing users. DeepSeek's releases have sent shockwaves through the U.S. stock market, and it isn't difficult to see why. America has increasingly sought to structure its entire economy around the AI industry, and Wall Street has poured billions into the companies selling this technology. 
Most recently, the Trump administration announced "Stargate," a $500 billion effort to create "AI infrastructure" by building data centers across the U.S. The gross amount of power and capital that has flowed into the small coterie of tech companies behind this technology is truly obscene. And yet, somehow, a Chinese company that appears to have a smidgeon of Big Tech's resources was able to create a comparable product in less time and fly to the top of the mobile downloads charts in a matter of weeks. As such, investors have rushed to sell off tech stocks and a lot of people are saying stuff like "RIP ChatGPT" and "DeepSeek could be an extinction-level event for venture capital firms." Before we write OpenAI's obituary just yet, however, it should be noted that commentators are predicting that DeepSeek's innovations could very well deepen America's commitment to the AI industry. As we all know, America always needs to be the best at everything, and the fact that it's been one-upped in the AI arms race can mean only one thing: it's time to double down.
[81]
The release of DeepSeek R1 shatters long-held assumptions about AI
The release of DeepSeek R1 -- an open-source artificial intelligence large-language model -- has caught the world by surprise and shattered long-held assumptions about AI development. According to venture capitalist Nic Carter, the release of the AI model, which was developed in China, dispelled notions that the country would only produce closed-source AI, and has eroded Silicon Valley's perceived advantages over global competitors. Carter added that DeepSeek is evidence that OpenAI does not have an unbeatable moat and that assumptions about scaling, value accrual in AI models, and development costs were also dispelled by the development. China's rapid deployment of the AI model contradicts previous reports that the country was six months behind the United States in AI development and creates further competition in the global AI arms race. The United States imposed an AI chip embargo on China in October 2022 -- preventing the export of select high-performance computing chips developed by US companies Nvidia and AMD. Despite the ban, Chinese firms turned to domestic manufacturing or used cloud-computing platforms such as Amazon Web Services to sidestep the restrictions and access these high-performance chips. China also relaxed its AI development regulations in August 2023 to encourage innovation, which included scrapping financial penalties for AI firms that deviate from the industry regulations.
The AI global arms race is underway
President Donald Trump has vowed to make the United States the AI capital of the world and sees both crypto and AI as central to continued US hegemony. Speaking at the World Economic Forum's Davos summit, the President said that deregulating the energy sector would fuel US ambitions to dominate the AI industry. On Jan. 22, President Trump announced the $500 billion "Stargate" AI infrastructure initiative, which included investments from OpenAI, tech company Oracle, and SoftBank. 
The goal of the project is to build AI and high-performance computing data centers in the United States, which the President claimed would create 100,000 jobs. United Kingdom Prime Minister Keir Starmer announced similar plans to develop AI infrastructure in the country in January 2025, including fast-tracking the development of data centers to remain competitive on the global stage.
[82]
China's DeepSeek AI is hitting Nvidia where it hurts
A chatbot made by Chinese artificial intelligence startup DeepSeek has rocketed to the top of Apple's App Store charts in the US this week, dethroning OpenAI's ChatGPT as the most downloaded free app. The eponymous AI assistant is powered by DeepSeek's open-source models, which the company says can be trained at a fraction of the cost using far fewer chips than the world's leading models. The claim has riled financial markets, sending Nvidia's shares down over 12 percent in pre-market trading,
[83]
Chinese DeepSeek is an existential threat to OpenAI and Google
Two things ensured that 2025 started with a bang. One, Donald Trump's rash of shock-and-awe pronouncements that threatened to upend the existing world order, and two, a Chinese AI model called DeepSeek, which threatens to have a similar effect on the world of AI. Built as a 'side project' by the Chinese hedge fund High Flyer, founded by Liang Wenfeng, the model has shocked Silicon Valley by outcompeting the best OpenAI and Google models like
[84]
How small Chinese AI start-up DeepSeek shocked Silicon Valley
A small Chinese artificial intelligence lab stunned the world this week by revealing the technical recipe for its cutting-edge model, turning its reclusive leader into a national hero who has defied US attempts to stop China's high-tech ambitions. DeepSeek, founded by hedge fund manager Liang Wenfeng, released its R1 model on Monday, explaining in a detailed paper how to build a large language model on a bootstrapped budget that can automatically learn and improve itself without human supervision. US companies including OpenAI and Google DeepMind pioneered developments in reasoning models, a relatively new field of AI research that is attempting to make models match human cognitive capabilities. In December, the San Francisco-based OpenAI released the full version of its o1 model but kept its methods secret. DeepSeek's R1 release sparked a frenzied debate in Silicon Valley about whether better resourced US AI companies, including Meta and Anthropic, can defend their technical edge. Meanwhile, Liang has become a focal point of national pride at home. This week, he was the only AI leader selected to attend a publicised meeting of entrepreneurs with the country's second-most powerful leader, Li Qiang. The entrepreneurs were told to "concentrate efforts to break through key core technologies." In 2021, Liang started buying thousands of Nvidia graphic processing units for his AI side project while running his quant trading fund High-Flyer. Industry insiders viewed it as the eccentric actions of a billionaire looking for a new hobby. "When we first met him, he was this very nerdy guy with a terrible hairstyle talking about building a 10,000-chip cluster to train his own models. We didn't take him seriously," said one of Liang's business partners. "He couldn't articulate his vision other than saying: I want to build this, and it will be a game change. We thought this was only possible from giants like ByteDance and Alibaba," the person added. 
Liang's status as an outsider in the AI field was an unexpected source of strength. At High-Flyer, he built a fortune by using AI and algorithms to identify patterns that could affect stock prices. His team became adept at using Nvidia chips to make money trading stocks. In 2023, he launched DeepSeek, announcing his intention to develop human-level AI. "Liang built an exceptional infrastructure team that really understands how the chips worked," said one founder at a rival LLM company. "He took his best people with him from the hedge fund to DeepSeek." After Washington banned Nvidia from exporting its most powerful chips to China, local AI companies have been forced to find innovative ways to maximise the computing power of a limited number of onshore chips -- a problem Liang's team already knew how to solve. "DeepSeek's engineers know how to unlock the potential of these GPUs, even if they are not state of the art," said one AI researcher close to the company. Industry insiders say DeepSeek's singular focus on research makes it a dangerous competitor because it is willing to share its breakthroughs rather than protect them for commercial gains. DeepSeek has not raised money from outside funds or made significant moves to monetise its models. "DeepSeek is run like the early days of DeepMind," said one AI investor in Beijing. "It is purely focused on research and engineering." Liang, who is personally involved in DeepSeek's research, uses proceeds from his hedge fund trading to pay top salaries for the best AI talent. Along with TikTok-owner ByteDance, DeepSeek is known for giving the highest remuneration available to AI engineers in China, with staff based in offices in Hangzhou and Beijing. "DeepSeek's offices feel like a university campus for serious researchers," said the business partner. "The team believes in Liang's vision: to show the world that the Chinese can be creative and build something from zero." 
DeepSeek and High-Flyer did not respond to a request for comment. Liang has styled DeepSeek as a uniquely "local" company, staffed with PhDs from top Chinese schools, Peking, Tsinghua and Beihang universities rather than experts from US institutions. In an interview with the domestic press last year, he said his core team "did not have people who returned from overseas. They are all local . . . We have to develop the top talent ourselves". DeepSeek's identity as a purely Chinese LLM company has won it plaudits at home. DeepSeek claimed it used just 2,048 Nvidia H800s and $5.6mn to train a model with 671bn parameters, a fraction of what OpenAI and Google spent to train comparably sized models. Ritwik Gupta, AI policy researcher at the University of California, Berkeley, said DeepSeek's recent model releases demonstrate that "there is no moat when it comes to AI capabilities". "The first person to train models has to expend lots of resources to get there," he said. "But the second mover can get there cheaper and more quickly." Gupta added that China had a much larger talent pool of systems engineers than the US who understand how to get the best use of computing resources to train and run models more cheaply. Industry insiders say that even though DeepSeek has shown impressive results with limited resources, it remains an open question whether it can continue to be competitive as the industry evolves. Returns at High-Flyer, its big backer, lagged behind in 2024, which one person close to Liang blamed on the founder's attention being mostly focused on DeepSeek. Its US rivals are not standing still. They are building mega "clusters" of Nvidia's next-generation Blackwell chips, creating the computing power that threatens to once again create a performance gap with Chinese rivals. This week, OpenAI said it was creating a joint venture with Japan's SoftBank, dubbed Stargate, with plans to spend at least $100bn on AI infrastructure in the US. 
Elon Musk's xAI is massively expanding its Colossus supercomputer to contain more than 1mn GPUs to help train its Grok AI models. "DeepSeek has one of the largest advanced computing clusters in China," said Liang's business partner. "They have enough capacity for now, but not much longer."
[85]
How DeepSeek triggered Nasdaq's 4.5% drop, Dow's 430-point slide, and S&P's 2.3% loss
Stock futures took a sharp dive on Monday, driven by concerns over a potential bubble in artificial intelligence stocks. The cause? Chinese startup DeepSeek has launched a highly efficient AI model at a fraction of the cost of U.S.-based competitors, raising questions about the sustainability of the billions being funneled into AI development.
This week's tech stock plunge fueled by AI concerns
Futures tied to the Dow Jones Industrial Average dropped 430 points, or 1%. The Nasdaq 100 futures were the hardest hit, shedding 4.5%, while S&P 500 futures tumbled 2.3%. Nvidia, a leader in AI technology, saw a premarket drop of 12%, while other major tech players like AMD, Broadcom, and Microsoft faced similar declines. DeepSeek, a relatively new AI startup based in Hangzhou, China, released its open-source AI language model, claiming it outperformed OpenAI's models in several tests. Launched for under $6 million, DeepSeek's cost-effective approach is fueling fears that the U.S. AI investment cycle may be overhyped, given the massive capital allocations announced by companies like Microsoft and Meta, which are investing billions into AI development. DeepSeek's model has stirred wide speculation, with some investors rethinking their positions. The company's advanced AI excels despite utilizing cheaper chips and less data than industry-standard models. The result? A more efficient, lower-cost alternative to the massively resource-heavy AI products from U.S. giants. This development casts doubt on the trajectory of AI-driven stock growth. Even derivative sectors tied to AI saw significant losses. Power providers like Constellation Energy and Vistra witnessed declines of 14% and 11%, respectively. 
The market sentiment surrounding AI is further complicated by the looming earnings reports from 'Big Tech' companies, including Meta, Microsoft, and Apple, and a Federal Reserve policy meeting poised to make waves in the coming week. DeepSeek's disruptive impact can't be ignored. The startup's ability to offer its AI model at a fraction of the price of leading competitors highlights the question of whether the current AI boom might be artificially inflated. What's more, DeepSeek's model operates under China's stringent censorship protocols, which may undermine its credibility in some areas. Markets will likely remain on edge as traders brace for upcoming reports and the fallout from DeepSeek's emerging prominence in the AI space.
U.S. futures sank Monday as a Chinese startup's cheaper AI model rattled Wall Street, triggering a steep selloff in technology shares and spurring doubts about American leadership in artificial intelligence.
Futures tumble as AI doubts rise
Dow Jones Industrial Average futures dropped around 430 points, or 1%, while the S&P 500 shed roughly 2.3%. Nasdaq 100 futures plunged 4.5%, led by losses in tech and chipmaking stocks. Another report showed the Dow off by about 380 points, reflecting volatile sentiment. Nvidia slumped as much as 12% in premarket trading, while Broadcom fell 13% and AMD slid 6%. Investors also pulled back from Microsoft, down 7%, and Meta Platforms, down 4%. The downturn follows DeepSeek's release of its open-source AI model, which reportedly outperforms OpenAI's comparable system in certain benchmarks at a fraction of the cost. The startup claims its large-language model, launched last year for less than $6 million, challenges the hefty capital expenditures of U.S. tech giants. 
JPMorgan analyst Sandeep Deshpande noted concerns about "the huge capex announcements of Microsoft, which is spending $80bn in '25, while Meta recently announced investments between $60bn and $65bn," adding that DeepSeek's resource-efficient approach "is posing thoughts to investors that the AI investment cycle may be over-hyped and a more efficient future is possible." Markets are also bracing for earnings from major players Apple, Tesla, Meta, and Microsoft, all scheduled to release quarterly results this week. Analysts say guidance on AI-related spending will draw particular attention in light of the DeepSeek development. Trade war worries added an extra layer of unease after President Donald Trump threatened to impose 25% tariffs on Colombian goods, then put the duties on hold following a last-minute agreement. The episode underscored concerns that the administration might resort to tariffs to achieve policy aims.
DeepSeek's cost-cutting model stirs debate
Based in Hangzhou and spun off in March 2023 from High-Flyer Quant, DeepSeek first launched an open-source language model in November 2023. Its latest product, R1, debuted last week. The company says R1's performance matches OpenAI's o1 on certain tests while relying on less powerful chips, a byproduct of U.S. export controls that have forced Chinese AI startups to prioritize efficiency. Venture capitalist Marc Andreessen called R1 "AI's Sputnik moment," pointing to its potential to disrupt pricing in a space dominated by U.S. heavyweights. Investors kept a close watch on the Federal Reserve, which begins its first policy meeting of 2025 on Tuesday. Most analysts expect interest rates to remain unchanged, with CME Group's FedWatch Tool showing over a 99% chance of a pause. The outlook comes as all three major U.S. indexes ended last week on a positive note, marking their second-straight winning stretch. The S&P 500 hit a new intraday high on Friday after posting an all-time closing record a day earlier. 
DeepSeek's reliance on Chinese regulations has prompted scrutiny of its censorship practices. The platform reportedly refuses to discuss topics like the Tiananmen Square protests, stating such queries lie "beyond [its] current scope." Even so, DeepSeek's iPhone app rocketed to the top of the U.S. App Store's free downloads on Sunday, overtaking ChatGPT.
[86]
DeepSeek Introduces Open-Source AI Model to Compete with Industry Giants Like OpenAI
DeepSeek is a Chinese AI research lab that has just launched its open-source model, DeepSeek-R1. The model can compete with big industry players such as OpenAI in areas including mathematical reasoning, code generation, and cost efficiency. This marks a new stage in the global AI race. Founded by Liang Wenfeng in 2023, DeepSeek originated from the deep-learning division of Fire-Flyer, a branch of the High-Flyer hedge fund. In contrast to most Chinese companies, DeepSeek is not affiliated with big tech companies like Baidu or Alibaba. According to Liang, his goal is driven by scientific curiosity: he wants to create the most advanced AI solutions rather than pursue financial gain. High-Flyer was founded in 2015. Initially, it focused on the analysis of financial data using high-performance computing. Liang redirected the company toward AI research, prioritizing innovation over immediate financial returns.
[87]
DeepSeek tops Apple app store, surpasses ChatGPT in downloads By Investing.com
Investing.com -- Chinese artificial intelligence firm DeepSeek has claimed the top spot on the U.S. Apple Inc (NASDAQ:AAPL) App Store's free app charts, overtaking OpenAI's ChatGPT, the longstanding leader in generative AI applications. DeepSeek's app, powered by its cutting-edge R1 model, has rapidly gained popularity for its ability to generate human-like responses and perform complex tasks with unprecedented efficiency. The R1 model is one of the most advanced AI systems globally, offering enhanced accuracy, faster response times, and improved multilingual support. DeepSeek's breakthrough has also triggered a reevaluation of the AI landscape. Analysts suggest its competitive pricing and robust performance could disrupt market dynamics, forcing incumbents to adapt. DeepSeek officially launched its app on January 13, offering free access to advanced AI models. Meanwhile, ChatGPT, a staple in the app charts since its debut, remains widely used but has been challenged by DeepSeek's surge.
[88]
Chinese start-ups like DeepSeek are challenging global AI giants
The writer is professor of law at the Gould School of Law, University of Southern California, and author of 'High Wire: How China Regulates Big Tech and Governs Its Economy' Last month, a Chinese start-up called DeepSeek astonished the international tech community with its latest open-source artificial intelligence model. DeepSeek-V3 delivers a performance comparable to that of better-funded US rivals such as OpenAI. This week it impressed once again with R1, its foray into AI reasoning. DeepSeek is not a one-off. Since the middle of last year, Chinese tech companies such as Alibaba, Tencent, ByteDance, Moonshot and 01.ai have been steadily narrowing the gap with US peers, matching their capabilities and surpassing them in cost efficiency. China's achievements in efficiency are no accident. They are a direct response to the escalating export restrictions imposed by the US and its allies. By limiting China's access to advanced AI chips, the US has inadvertently spurred its innovation. To reduce reliance on high-end chips from overseas, Chinese AI companies have experimented with novel approaches in algorithms, architecture and training strategies. Many have embraced a "mixture-of-experts" approach, focusing on smaller AI models trained on specific data. These can deliver powerful results while reducing computing resources. DeepSeek-V3 embodies the success of this resourceful approach. According to its technical report, the model was trained using a data centre powered by Nvidia H800 GPUs -- a less advanced chip than Nvidia's latest releases. Despite this, DeepSeek completed training in just two months at a cost of $5.5mn -- a fraction of the sums reportedly spent by US companies such as OpenAI. DeepSeek has also drastically reduced inference costs, earning it the nickname the "Pinduoduo of AI", a reference to the cost-slashing business model of the popular Chinese discount ecommerce giant. This breakthrough has profound implications. 
It challenges the widely held assumption that cutting-edge AI requires vast amounts of computational power and many billions of dollars. DeepSeek demonstrates how software ingenuity can offset hardware constraints. It also lays bare the limits of US export controls designed to slow China's AI progress. While these measures may deliver short-term disruptions, their impact diminishes over time as China innovates to adapt. The inconvenient truth for US policymakers is that strict export controls have forced Chinese tech companies to become more self-reliant, spurring breakthroughs that might not have occurred otherwise. Not everyone agrees, however. Gregory Allen, director of the Wadhwani AI Center, at the Center for Strategic and International Studies think-tank, contends that China would have advanced regardless of US restrictions. Certainly, the Chinese government has long aspired to make the country into an AI superpower. But its record in resource allocation is poor. State-led initiatives often result in inefficiency, corruption and waste -- a reality starkly illustrated by high-profile scandals and slow progress in the country's semiconductor industry. What American AI restrictions inadvertently achieved was a powerful commercial incentive for China's private sector to step into the gap. And while the US often dominates in groundbreaking research, Chinese companies excel in execution, affordability and product diffusion. In ecommerce, electric vehicles, solar panels and batteries, they have demonstrated an impressive ability to scale up. The rise of DeepSeek is a vivid reminder that constraints can fuel innovation. For Donald Trump's new administration, this raises an uncomfortable question: will ever-tighter sanctions on China accelerate the very progress they are designed to suppress?
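The training-cost claim above can be sanity-checked with simple arithmetic. The sketch below is a back-of-envelope calculation, not DeepSeek's own accounting. It takes the $5.5mn figure quoted in the piece and assumes a cluster of 2,048 H800 GPUs (the count reported in other coverage in this roundup) and a flat rental price of $2 per GPU-hour (an assumed rate, used purely for illustration). The function and constant names are hypothetical.

```python
# Back-of-envelope check of the reported DeepSeek-V3 training cost.
# The GPU count and hourly rate are assumptions for illustration only.

REPORTED_COST_USD = 5_500_000    # "$5.5mn", as quoted in the article
ASSUMED_GPU_COUNT = 2_048        # assumption: H800 count reported in other coverage
ASSUMED_USD_PER_GPU_HOUR = 2.0   # assumption: illustrative rental price


def implied_gpu_hours(cost_usd: float, usd_per_gpu_hour: float) -> float:
    """GPU-hours that a given budget buys at a flat hourly rate."""
    return cost_usd / usd_per_gpu_hour


def implied_training_days(gpu_hours: float, gpu_count: int) -> float:
    """Wall-clock days if every GPU runs around the clock."""
    return gpu_hours / gpu_count / 24


gpu_hours = implied_gpu_hours(REPORTED_COST_USD, ASSUMED_USD_PER_GPU_HOUR)
days = implied_training_days(gpu_hours, ASSUMED_GPU_COUNT)

print(f"Implied GPU-hours: {gpu_hours:,.0f}")
print(f"Implied training time: {days:.1f} days")
```

Under these assumptions the implied run is a little under two months, which is consistent with the timeline the article cites. Changing either assumption moves the result proportionally, so the check shows only that the quoted cost and duration are mutually plausible, not that the cost figure is complete.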
[89]
How Chinese A.I. Start-Up DeepSeek Is Competing With OpenAI and Google
The day after Christmas, a small Chinese start-up called DeepSeek unveiled a new A.I. system that could match the capabilities of cutting-edge chatbots from companies like OpenAI and Google. That alone would have been a milestone. But the team behind the system, called DeepSeek-V3, described an even bigger step. In a research paper explaining how they built the technology, DeepSeek's engineers said they used only a fraction of the highly specialized computer chips that leading A.I. companies relied on to train their systems. These chips are at the center of a tense technological competition between the United States and China. As the U.S. government works to maintain the country's lead in the global A.I. race, it is trying to limit the number of powerful chips, like those made by Silicon Valley firm Nvidia, that can be sold to China and other rivals. But the performance of the DeepSeek model raises questions about the unintended consequences of the American government's trade restrictions. The controls have forced researchers in China to get creative with a wide range of tools that are freely available on the internet. The DeepSeek chatbot answered questions, solved logic problems and wrote its own computer programs as capably as anything already on the market, according to the benchmark tests that American A.I. companies have been using. And it was created on the cheap, challenging the prevailing idea that only the tech industry's biggest companies -- all of them based in the United States -- could afford to make the most advanced A.I. systems. The Chinese engineers said they needed only about $6 million in raw computing power to build their new system. That is about 10 times less than the tech giant Meta spent building its latest A.I. technology. 
"The number of companies who have $6 million to spend is vastly greater than the number of companies who have $100 million or $1 billion to spend," said Chris V. Nicholson, an investor with the venture capital firm Page One Ventures, who focuses on A.I. technologies. Since OpenAI sparked the A.I. boom in 2022 with the release of ChatGPT, many experts and investors had concluded that no company could compete with the market leaders without spending hundreds of millions of dollars on specialized chips. The world's leading A.I. companies train their chatbots using supercomputers that use as many as 16,000 chips, if not more. DeepSeek's engineers, on the other hand, said they needed only about 2,000 specialized computer chips from Nvidia. The constraints on chips in China forced the DeepSeek engineers to "train it more efficiently so it could still be competitive," said Jeffrey Ding, an assistant professor at George Washington University who specializes in emerging technology and international relations. Earlier this month, the Biden administration issued new rules that aim to keep China from obtaining advanced A.I. chips through other countries. The rules build on multiple rounds of earlier restrictions that prevent Chinese companies from being able to buy or make cutting-edge computer chips. President Trump has not yet indicated whether he will keep the rules or rescind them. The U.S. government has tried to keep advanced chips out of the hands of Chinese companies over concerns they could be used for military purposes. In response, some firms in China have stockpiled thousands of chips, while others sourced them from a thriving underground marketplace of smugglers. DeepSeek is run by a quantitative stock trading firm called High Flyer. By 2021, it had channeled its profits into acquiring thousands of Nvidia chips, which it used to train its earlier models. 
The company, which did not respond to requests for comment, has become known in China for scooping up talent fresh from top universities with the promise of high salaries and the ability to follow the research questions that most pique their interest. Zihan Wang, a computer engineer who worked on an earlier DeepSeek model, said the company also hires people without any computer science background to help the technology understand and be able to generate poetry and ace questions on the notoriously difficult Chinese college entrance examination. DeepSeek does not make any products for consumers, leaving its engineers to focus entirely on research. That means that its technology is not hemmed in by the strictest aspect of China's regulations on A.I., which require consumer-facing technology to comply with the government's controls on information. The leading American companies continue to advance the state of the art in A.I. In December, OpenAI unveiled a new "reasoning" system called o3 that exceeds the performance of existing technologies, though it is not yet widely available outside the company. But DeepSeek continues to show that it is not far behind. This month, it released an impressive reasoning model of its own. (The New York Times has sued OpenAI and its partner, Microsoft, accusing them of copyright infringement of news content related to A.I. systems. OpenAI and Microsoft have denied those claims.) A crucial part of this rapidly changing global market is an old idea: open source software. Like many other companies, DeepSeek has open sourced its latest A.I. system, meaning that it has shared the underlying code with other businesses and researchers. This allows others to build and distribute their own products using the same technologies. 
While employees at big Chinese technology companies are limited to collaborating with colleagues, "if you work on open source, you work with talent around the world," said Yineng Zhang, lead software engineer at Baseten in San Francisco who works on the open source SGLang project. He helps other people and companies build products using DeepSeek's system. The open source ecosystem for A.I. gathered steam in 2023 when Meta freely shared an A.I. system called Llama. Many assumed that this community would flourish only if companies like Meta -- tech giants with massive data centers filled with specialized chips -- continued to open source their technologies. But DeepSeek and others have shown that they, too, can expand the powers of open source technologies. Many executives and pundits have argued that the big U.S. companies should not open source their technologies because they could be used to spread disinformation or cause other serious harm. Some U.S. lawmakers have explored the possibility of preventing or throttling the practice. But others argue that if regulators stifle the progress of open source technology in the United States, China will gain a significant edge. If the best open source technologies come from China, they argue, U.S. developers will build their systems atop those technologies. In the long run, that could put China at the heart of A.I. research and development. "The center of gravity of the open source community has been moving to China," said Ion Stoica, a professor of computer science at the University of California, Berkeley. "This could be a huge danger for the U.S.," because it allows China to accelerate the development of new technologies. Hours after his inauguration, President Trump rescinded a Biden administration executive order that threatened to curb open source technologies. Dr. Stoica and his students recently built an A.I. 
system called Sky-T1 that rivals the performance of OpenAI's latest system, called OpenAI o1, on certain benchmark tests. They needed only $450 in computing power. They did this by building on top of two open source technologies released by the Chinese tech giant Alibaba. Their $450 system is not as powerful as OpenAI's technology or DeepSeek's new system. And the techniques they used are unlikely to yield systems that exceed the performance of the leading technologies. But the project showed that even operations with minuscule resources can build competitive systems. Reuven Cohen, a technology consultant in Toronto, has been using DeepSeek-V3 since late December. He says it is comparable to the latest systems from OpenAI, Google and the San Francisco start-up Anthropic -- and much cheaper to use. "DeepSeek is a way for me to save money," he said. "This is the kind of technology that someone like me wants to use."
[90]
What Is DeepSeek, the New Chinese OpenAI Rival?
A new Chinese AI model, created by the Hangzhou-based startup DeepSeek, has stunned the American AI industry by outperforming some of OpenAI's leading models, displacing ChatGPT at the top of the iOS app store, and usurping Meta as the leading purveyor of so-called open source AI tools. All of which has raised a critical question: despite American sanctions on Beijing's ability to access advanced semiconductors, is China catching up with the U.S. in the global AI race? At a supposed cost of just $6 million to train, DeepSeek's new R1 model, released last week, was able to match the performance on several math and reasoning metrics by OpenAI's o1 model -- the outcome of tens of billions of dollars in investment by OpenAI and its patron Microsoft. The Chinese model is also cheaper for users. Access to its most powerful versions costs some 95% less than OpenAI and its competitors. The upshot: the U.S. tech industry is suddenly faced with a potentially cheaper and more powerful challenger, unnerving investors, who sold off American tech stocks on Monday morning. Yet not everyone is convinced. Some American AI researchers have cast doubt on DeepSeek's claims about how much it spent, and how many advanced chips it deployed to create its model. Few, however, dispute DeepSeek's stunning capabilities. "Deepseek R1 is AI's Sputnik moment," wrote prominent American venture capitalist Marc Andreessen on X, referring to the moment in the Cold War when the Soviet Union managed to put a satellite in orbit ahead of the United States. So, what is DeepSeek and what could it mean for U.S. tech supremacy? DeepSeek was founded less than two years ago by the Chinese hedge fund High Flyer as a research lab dedicated to pursuing Artificial General Intelligence, or AGI. A spate of open source releases in late 2024 put the startup on the map, including the large language model "v3", which outperformed all of Meta's open-source LLMs and rivaled OpenAI's closed-source GPT-4o. 
At the time, Liang Wenfeng, the CEO, reportedly said that he had hired young computer science researchers with a pitch to "solve the hardest questions in the world" -- critically, without aiming for profits. Early signs were promising: his products were so efficient that DeepSeek's 2024 releases sparked a price war within the Chinese AI industry, forcing competitors to slash prices. This year, that price war looks set to reach across the Pacific Ocean. Yet DeepSeek's AI looks different from its U.S. competitors in one important way. Despite their high performance on reasoning tests, DeepSeek's models are constrained by China's restrictive policies regarding criticism of the ruling Chinese Communist Party (CCP). DeepSeek R1 refuses to answer questions about the massacre at Tiananmen Square, Beijing, in 1989, for example. "Sorry, that's beyond my current scope. Let's talk about something else," the model said when queried by TIME. At a moment when Google, Meta, Microsoft, Amazon and dozens of their competitors are preparing to spend further tens of billions of dollars on new AI infrastructure, DeepSeek's success has raised a troubling question: Could Chinese tech firms potentially match, or even surpass, their technical prowess while spending significantly less? Meta, which plans to spend $65 billion on AI infrastructure this year, has already set up four "war rooms" to analyze DeepSeek's models, seeking to find out how the Chinese firm had managed to train a model so cheaply and use the insights to improve its own open source Llama models, tech news site The Information reported over the weekend. In the financial markets, Nvidia's stock price dipped more than 15% on Monday morning on fears that fewer AI chips may be necessary to train powerful AI than previously thought. Other American tech stocks were also trading lower. "While [DeepSeek R1] is good news for users and the global economy, it is bad news for U.S. 
tech stocks," says Luca Paolini, chief strategist at Pictet Asset Management. "It may result in a nominal downsizing of capital investment in AI and pressure on margins, at a time when valuation and growth expectations are very stretched." But American tech hasn't lost -- at least not yet. For now, OpenAI's "o1 Pro" model is still considered the most advanced in the world. The performance of DeepSeek R1, however, does suggest that China is much closer to the frontier of AI than previously thought, and that open-source models have just about caught up to their closed-source counterparts. Perhaps even more worrying for companies like OpenAI and Google, whose models are closed source, is how much -- or rather, how little -- DeepSeek is charging consumers to access its most advanced models. OpenAI charges $60 per million "tokens", or segments of words, outputted by its most advanced model, o1. By contrast DeepSeek charges $2.19 for the same number of tokens from R1 -- nearly 30 times less. "It erodes the industrial base, it erodes the margin, it erodes the incentive for further capital investment into western [AI] scaling from private sources," says Edouard Harris, the chief technology officer of Gladstone AI, an AI firm that works closely with the U.S. government. DeepSeek's success was all the more explosive because it seemed to call into question the effectiveness of the U.S. government's strategy to constrain China's AI ecosystem by restricting the export of powerful chips, or GPUs, to Beijing. If DeepSeek's claims are accurate, it means China has the ability to create powerful AI models despite those restrictions, underlining the limits of the U.S. strategy. DeepSeek has claimed it is constrained by access to chips, not cash or talent, saying it trained its models v3 and R1 using just 2,000 second-tier Nvidia chips. "Money has never been the problem for us," DeepSeek's CEO, Liang Wenfeng, said in 2024. "Bans on shipments of advanced chips are the problem." 
(Current U.S. policy makes it illegal to export to China the most advanced types of AI chips, the likes of which populate U.S. datacenters used by OpenAI and Microsoft.) But are those claims true? "My understanding is DeepSeek has 50,000 H100s," Scale AI CEO Alexandr Wang recently told CNBC in Davos, referring to the highest-powered Nvidia GPU chips currently on the market. "They can't talk about [them], because it is against the export controls that the U.S. has put in place." (An H100 cluster of that size would cost in the region of billions of dollars.) In a sign of how seriously the CCP is taking the technology, Liang, DeepSeek's CEO, met with China's premier Li Qiang in Beijing last Monday. In that meeting, Liang reportedly told Li that DeepSeek needs more chips. "DeepSeek only has access to a few thousand GPUs, and yet they're pulling this off," says Jeremie Harris, CEO of Gladstone AI. "So this raises the obvious question: what happens when they get an allocation from the Chinese Communist Party to proceed at full speed?" Even though China might have achieved a startling level of AI capability with fewer chips, experts say more computing power will always remain a strategic advantage. On that front, the U.S. remains far ahead. "It's never a bad thing to have more of it," says Dean Ball, a research fellow at George Mason University. "No matter how much you have of it, you will always use it." The short answer: from Washington's perspective, in uncertain waters. In the closing days of the Biden Administration, outgoing National Security Adviser Jake Sullivan warned that the speed of AI advancement was "the most consequential thing happening in the world right now." 
And just days into his new job, President Trump announced a new $500 billion venture, backed by OpenAI and others, to build the infrastructure vital for the creation of "artificial general intelligence" -- the next leap forward in AI, with systems advanced enough to make new scientific breakthroughs and reason in ways that have so far remained in the realm of science fiction. And although questions remain about the future of U.S. chip restrictions on China, Washington's priorities were apparent in President Trump's AI executive order, also signed during his first week in office, which declared that "it is the policy of the United States to sustain and enhance America's global AI dominance in order to promote human flourishing, economic competitiveness, and national security." Maintaining this dominance will mean, at least in part, understanding exactly what Chinese tech firms are doing -- as well as protecting U.S. intellectual property, experts say. "There's a good chance that DeepSeek and many of the other big Chinese companies are being supported by the [Chinese] government, in more than just a monetary way," says Edouard Harris of Gladstone AI, who also recommended that U.S. AI companies harden their security measures. Since December, OpenAI's new o1 and o3 models have smashed records on advanced reasoning tests designed to be difficult for AI models to pass. DeepSeek R1 does something similar, and in the process exemplifies what many researchers say is a paradigm shift: instead of scaling the amount of computing power used to train the model, researchers scale the amount of time (and thus, computing power and electricity) the model uses to think about a response to a query before answering. 
It is this scaling of what researchers call "test-time compute" that distinguishes the new class of "reasoning models," such as DeepSeek R1 and OpenAI's o1, from their less sophisticated predecessors. Many AI researchers believe there's plenty of headroom left before this paradigm hits its limit. Some AI researchers hailed DeepSeek's R1 as a breakthrough on the same level as DeepMind's AlphaZero, a 2017 model that became superhuman at the board games Chess and Go by purely playing against itself and improving, rather than observing any human games. That's because R1 wasn't "pretrained" on human-labeled data in the same way as other leading LLMs. Instead, DeepSeek's researchers found a way to allow the model to bootstrap its own reasoning capabilities essentially from scratch. "Rather than explicitly teaching the model on how to solve a problem, we simply provide it with the right incentives, and it autonomously develops advanced problem-solving strategies," they claim. The finding is significant because it suggests that powerful AI capabilities might emerge more rapidly and with less human effort than previously thought, with just the application of more computing power. "DeepSeek R1 is like GPT-1 of this scaling paradigm," says Ball. Ultimately, China's recent AI progress, instead of usurping U.S. strength, might in fact be the beginning of a reordering -- a step, in other words, toward a future where, instead of a hegemonic power, there are many competing centers of AI power. "China will still have their own superintelligence(s) no more than a year later than the US, absent [for example] a war," wrote Miles Brundage, a former OpenAI policy staffer, on X. "So unless you want (literal) war, you need to have a vision for navigating multipolar AI outcomes."
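As a rough check on the pricing gap quoted above ($60 versus $2.19 per million output tokens), the short Python sketch below recomputes the ratio and the percentage saving from those two figures alone. The function names and the 5-million-token workload are illustrative assumptions, not taken from either company's documentation, and real bills also depend on input-token prices that the article does not quote.

```python
# Prices per million output tokens, as quoted in the article.
O1_USD_PER_MILLION = 60.00
R1_USD_PER_MILLION = 2.19


def price_ratio(expensive: float, cheap: float) -> float:
    """How many times more the expensive option costs per token."""
    return expensive / cheap


def percent_savings(expensive: float, cheap: float) -> float:
    """Percentage saved by choosing the cheap option."""
    return (1 - cheap / expensive) * 100


def cost_usd(tokens: int, usd_per_million: float) -> float:
    """Cost of generating `tokens` output tokens at a given price."""
    return tokens / 1_000_000 * usd_per_million


ratio = price_ratio(O1_USD_PER_MILLION, R1_USD_PER_MILLION)
savings = percent_savings(O1_USD_PER_MILLION, R1_USD_PER_MILLION)

# Hypothetical workload: 5 million output tokens.
workload_tokens = 5_000_000
o1_cost = cost_usd(workload_tokens, O1_USD_PER_MILLION)
r1_cost = cost_usd(workload_tokens, R1_USD_PER_MILLION)

print(f"o1 costs about {ratio:.1f}x as much per output token")
print(f"R1 is about {savings:.1f}% cheaper")
print(f"5M output tokens: o1 ${o1_cost:.2f} vs R1 ${r1_cost:.2f}")
```

On these two prices the ratio comes out a little over 27, which squares with the "nearly 30 times" description and with the roughly 95% discount cited earlier in the piece.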
[91]
What is DeepSeek?
An AI-powered chatbot by the Chinese company DeepSeek has quickly become the most downloaded free app on Apple's store, following its January release in the US. The app's sudden popularity, as well as DeepSeek's reportedly low costs compared to those of US-based AI companies, have thrown financial markets into a spin. Silicon Valley venture capitalist Marc Andreessen has hailed DeepSeek as "one of the most amazing and impressive breakthroughs" in AI. The company says its latest AI models are on par with industry-leading models in the US - like ChatGPT - at a fraction of the cost. Researchers behind the app have said it only took $6m (£4.8m) to build it, much less than the billions spent by AI companies in the US.
[92]
Business - Chinese chatbot DeepSeek leaps to top spot on App Store, overtaking ChatGPT
DeepSeek not only has a cute whale as its logo, but is fast becoming a whale of a player in the AI game. The Chinese chatbot has leapt to the top of the iPhone App Store downloads leaderboard in the US, overtaking ChatGPT, and in France it's currently sitting in second place. It's free to use and open source, with the Chinese company saying it used cheaper computer chips and less data than its American rival OpenAI. Tech giants like Nvidia, Meta and Alphabet have poured hundreds of billions of dollars into artificial intelligence, but now the supply chain everyone has been investing in looks like it has serious competition, and the news has spooked tech stocks worldwide. We take a closer look.
[95]
Calm down: DeepSeek R1 is great, but ChatGPT's product advantage is far from over
Just a week ago -- on January 20, 2025 -- Chinese AI startup DeepSeek unleashed a new, open source AI model called R1 that might have initially been mistaken for one of the ever-growing mass of nearly interchangeable rivals that have sprung up since OpenAI debuted ChatGPT (powered by its own GPT-3.5 model, initially) more than two years ago. But that impression quickly proved unfounded, as DeepSeek's mobile app has in that short time rocketed up the charts of the Apple App Store in the U.S. to dethrone ChatGPT for the number one spot and caused a massive market correction as investors dumped stock in formerly hot computer chip makers such as Nvidia, whose graphics processing units (GPUs) have been in high demand for use in massive superclusters to train new AI models and serve them up to customers on an ongoing basis (a modality known as "inference"). Venture capitalist Marc Andreessen, echoing sentiments of other tech workers, wrote on the social network X last night: "Deepseek R1 is AI's Sputnik moment," comparing it to the pivotal October 1957 launch of the first artificial satellite in history, Sputnik 1, by the Soviet Union -- sparking the "space race" between that country and the U.S. to dominate space travel and reach the moon. Sputnik's launch galvanized the U.S. to invest heavily in research and development of spacecraft and rocketry. The analogy is not perfect, since in this case heavy investment was not needed to create DeepSeek R1, quite the contrary (more on this below), but it does seem to signify a major turning point in the global AI marketplace: for the first time, an AI product from China has become the most popular in the world. But before we jump on the DeepSeek hype train, let's take a step back and examine the reality. 
As someone who has extensively used OpenAI's ChatGPT -- both on web and mobile platforms -- and followed AI advancements closely, I believe that while DeepSeek R1's achievements are noteworthy, it's not time to dismiss ChatGPT or U.S. AI investments just yet. And please note, I am not being paid by OpenAI to say this -- I've never taken money from the company and don't plan on it. What DeepSeek R1 does well DeepSeek R1 is part of a new generation of large "reasoning" models that does more than answer user queries: it reflects on its own analysis while it is producing a response, attempting to catch errors before serving them to the user. And DeepSeek R1 matches or surpasses OpenAI's own earlier reasoning model o1, released in September 2024, in several areas. For instance, on the MATH-500 benchmark, which assesses high-school-level mathematical problem-solving, DeepSeek R1 achieved a 97.3% accuracy rate, slightly outperforming OpenAI o1's 96.4%. In terms of coding capabilities, DeepSeek R1 scored 49.2% on the SWE-bench Verified benchmark, edging out OpenAI o1's 48.9%. Moreover, financially, DeepSeek R1 offers substantial cost savings. The model was developed with an investment of under $6 million, a fraction of the expenditure reportedly associated with training models like OpenAI's o1 -- estimated to be multiple billions. DeepSeek was essentially forced to become more efficient with scarce and older GPUs thanks to a U.S. export restriction on the tech's sales to China. Additionally, DeepSeek provides API access at $0.14 per million tokens, significantly undercutting OpenAI's rate of $7.50 per million tokens. DeepSeek R1's massive efficiency gain, cost savings and equivalent performance to the top U.S. AI model has caused Silicon Valley and the wider business community to freak out over what appears to be a complete upending of the AI market, geopolitics, and known economics of AI model training. 
While DeepSeek's gains are revolutionary, the pendulum is swinging too far toward it right now There's no denying that DeepSeek R1's cost-effectiveness is a significant achievement. But let's not forget that DeepSeek itself owes much of its success to U.S. AI innovations, going back to the initial 2017 transformer architecture developed by Google AI researchers (which started the whole LLM craze). DeepSeek R1 was trained on synthetic data question and answers and specifically, according to the paper released by its researchers, on the supervised fine-tuned "dataset of DeepSeek-V3," the company's previous (non-reasoning) model, which was found to have many indicators of being generated with OpenAI's GPT-4o model itself! It seems pretty clear cut to say that without GPT-4o to provide this data, and without OpenAI's own release of the first commercial reasoning model o1 back in September 2024, which created the category -- DeepSeek R1 would almost certainly not exist. Furthermore, OpenAI's success required vast amounts of GPU resources, paving the way for breakthroughs that DeepSeek has undoubtedly benefited from. The current investor panic about U.S. chip and AI companies feels premature and overblown. ChatGPT's vision and image generation capabilities are still hugely important and valuable in workplace and personal settings -- DeepSeek R1 doesn't have any yet While DeepSeek R1 has impressed with its visible "Chain-of-Thought" reasoning -- a kind of stream of consciousness wherein the model displays text as it analyzes the user's prompt and seeks to answer it -- and efficiency in text- and math-based workflows, it lacks several features that make ChatGPT a more robust and versatile tool today. No Image or Vision Capabilities The official DeepSeek R1 website and mobile app does let users upload photos and file attachments, but it can only extract text from them using optical character recognition (OCR), one of the earliest computing technologies dating back to 1959. 
This pales in comparison to ChatGPT's vision capabilities, where the user can upload images without any text whatsoever and have ChatGPT analyze, describe the image, or provide further information based on what it sees and the user's text prompts. ChatGPT allows users to upload photos, analyze visual material, and provide detailed insights or actionable advice. For example, when I needed guidance on repairing my bike or maintaining my air conditioning unit, ChatGPT's ability to process images proved invaluable. DeepSeek R1 simply cannot do this yet. No Image Generation. The absence of generative image capabilities is another major limitation. As someone who frequently generates AI images using ChatGPT (such as for this article's own header) powered by OpenAI's underlying DALL-E 3 model, the ability to create detailed and stylistic images with ChatGPT is a game-changer. This feature is essential for many creative and professional workflows, and DeepSeek has yet to demonstrate comparable functionality, though today the company did release an open source vision model, Janus Pro, which it says outperforms DALL-E 3 and Stable Diffusion 3 alongside other industry-leading image generation models, on third-party benchmarks. No Voice Mode DeepSeek R1 also lacks a voice interaction mode, a feature that has become increasingly important for accessibility and convenience. ChatGPT's voice mode allows for natural, conversational interactions, making it a superior choice for hands-free use or for users with different accessibility needs. Be excited for DeepSeek's future potential -- but also be wary of its challenges Yes, DeepSeek R1 can -- and likely will -- add voice and vision capabilities in the future. But doing so is no small feat. Integrating image generation, vision analysis, and voice capabilities requires substantial development resources and, ironically, many of the same high-performance GPUs that investors are now undervaluing. 
Deploying these features effectively and in a user-friendly way is another challenge entirely. DeepSeek R1's accomplishments are impressive and signal a promising shift in the global AI landscape. However, it's crucial to keep the excitement in check. For now, ChatGPT remains the better-rounded and more capable product, offering a suite of features that DeepSeek simply cannot match. Let's appreciate the advancements while recognizing the limitations and the continued importance of U.S. AI innovation and investment.
[96]
How a top Chinese AI model overcame US sanctions
"This could be a truly equalizing breakthrough that is great for researchers and developers with limited resources, especially those from the Global South," says Hancheng Cao, an assistant professor in information systems at Emory University. DeepSeek's success is even more remarkable given the constraints facing Chinese AI companies in the form of increasing US export controls on cutting-edge chips. But early evidence shows that these measures are not working as intended. Rather than weakening China's AI capabilities, the sanctions appear to be driving startups like DeepSeek to innovate in ways that prioritize efficiency, resource-pooling, and collaboration. To create R1, DeepSeek had to rework its training process to reduce the strain on its GPUs, a variety released by Nvidia for the Chinese market that have their performance capped at half the speed of its top products, according to Zihan Wang, a former DeepSeek employee and current PhD student in computer science at Northwestern University. DeepSeek R1 has been praised by researchers for its ability to tackle complex reasoning tasks, particularly in mathematics and coding. The model employs a "chain of thought" approach similar to that used by ChatGPT o1, which lets it solve problems by processing queries step by step. Dimitris Papailiopoulos, principal researcher at Microsoft's AI Frontiers research lab, says what surprised him the most about R1 is its engineering simplicity. "DeepSeek aimed for accurate answers rather than detailing every logical step, significantly reducing computing time while maintaining a high level of effectiveness," he says.
[97]
Scale AI CEO says China has quickly caught the U.S. with the DeepSeek open-source model
"The United States is going to need a huge amount of computational capacity, a huge amount of infrastructure," Wang said, later adding, "We need to unleash U.S. energy to enable this AI boom." Earlier this week, Trump announced a joint venture with OpenAI, Oracle and SoftBank to invest billions of dollars in U.S. AI infrastructure. The project, Stargate, was unveiled at the White House by Trump, SoftBank CEO Masayoshi Son, Oracle co-founder Larry Ellison and OpenAI CEO Sam Altman. Key initial technology partners will include Microsoft, Nvidia and Oracle, as well as semiconductor company Arm. They said they would invest $100 billion to start and up to $500 billion over the next four years. In the interview Thursday, Wang said he believes that it'll take two to four years to reach artificial general intelligence, or AGI, a widely-cited but vaguely-defined benchmark used in the AI sector to denote a branch of AI pursuing technology that equals or surpasses human intellect on a wide range of tasks. AGI is a hotly debated topic, with some leaders saying we're close to attaining it and some saying it's not possible at all. Wang said his own definition of AGI is "powerful AI systems that are able to use a computer just like you or I could... and basically be a remote worker in the most capable way." Anthropic, the Amazon-backed AI startup founded by ex-OpenAI research executives, ramped up its technology development throughout the past year, and in October, the startup said that its AI agents were able to use computers like humans can to complete complex tasks. Anthropic's Computer Use capability allows its technology to interpret what's on a computer screen, select buttons, enter text, navigate websites and execute tasks through any software and real-time internet browsing, the startup said. 
The tool can "use computers in basically the same way that we do," Jared Kaplan, Anthropic's chief science officer, told CNBC in an interview at the time. He said it can do tasks with "tens or even hundreds of steps." OpenAI reportedly plans to introduce a similar feature soon. When asked which U.S. AI startups are leading the AI race right now, Wang said that models each have their own strengths -- for instance, OpenAI's models are great at reasoning, while Anthropic's are great at coding. "The space is becoming more competitive, not less competitive," he said.
[0]
Meta leadership worried as Chinese AI DeepSeek sets new benchmark- The Information By Investing.com
Investing.com -- Meta Platforms Inc (NASDAQ:META) is on high alert as DeepSeek, a conversational AI developed by a Chinese hedge fund, challenges its dominance in artificial intelligence, The Information reported on Sunday citing two Meta employees familiar with internal discussions. Meta's AI leaders, including infrastructure director Mathew Oldham, have expressed concerns that the upcoming version of Meta's Llama AI may fall short compared to DeepSeek's groundbreaking performance, the report stated. DeepSeek has quickly gained attention for outperforming established AI systems like OpenAI's GPT models and Meta's Llama, while operating at a fraction of their costs. Its development raises significant capital expenditure (capex) questions for AI companies investing heavily in expensive infrastructure to support their models. Meta, like its competitors, has spent billions on AI advancements, including scaling computational resources and optimizing model training. However, DeepSeek's ability to achieve similar or better results with lower costs has raised concerns about the sustainability of such heavy investments. Smaller firms aiming to compete in the AI space may face further challenges, as DeepSeek sets a new standard for cost-effective innovation. The competition underscores the shifting landscape of AI development, where performance is no longer the sole metric of success. Cost efficiency and accessibility are becoming increasingly critical, putting pressure on industry leaders like Meta, NVIDIA Corporation (NASDAQ:NVDA), Microsoft Corporation (NASDAQ:MSFT), Alphabet (NASDAQ:GOOG) and other tech leaders.
[93]
DeepSeek-R1: Redefining AI Innovation and Transforming Global Dynamics: By Ritesh Jain
Artificial intelligence (AI) has long been a battleground for technological supremacy, dominated by a few powerful players. The arrival of DeepSeek-R1, a revolutionary large language model (LLM) developed in China, marks a pivotal shift in this landscape. Unlike the resource-intensive, proprietary systems pioneered by US-based giants like OpenAI and Google, DeepSeek-R1 offers a cost-efficient and open-source alternative, challenging the exclusivity that has defined AI innovation. This development represents more than a technical milestone -- it is a movement toward inclusivity in AI. By lowering barriers and democratizing access, DeepSeek-R1 paves the way for a future where AI becomes a global public good, transforming industries and empowering nations that have long been sidelined. For decades, the development of AI has required immense computational power, billions of dollars in investments, and proprietary frameworks that exclude smaller nations and organizations. DeepSeek-R1 challenges this exclusivity. Developed by the Hangzhou-based research lab DeepSeek, this model achieves 97% task accuracy at just 10% of the cost of traditional LLMs. Its reinforcement learning (RL)-based architecture, combined with an MIT open-source license, ensures that advanced AI capabilities are accessible to a much broader audience. The implications are profound. In healthcare, DeepSeek-R1 can enable rural hospitals in India to provide early diagnoses and personalized treatment plans without needing expensive infrastructure. In Brazil, small logistics companies can optimize supply chains with insights that were previously available only to well-funded tech giants. Emerging startups across Africa can now build tailored AI solutions for local challenges without facing prohibitive costs. These are not just hypothetical scenarios -- they are tangible opportunities that redefine the role AI can play in bridging global inequities. 
DeepSeek's rise is driven by a deliberate, research-first approach that prioritizes meaningful innovation over quick commercialization. Key milestones in its journey include the release of DeepSeek-Coder, an open-source coding model supporting 86 programming languages, and DeepSeek-V2, which slashed training and inference costs with its Mixture-of-Experts architecture. By January 2025, the launch of DeepSeek-R1 showcased the lab's commitment to excellence, delivering high performance in coding and mathematical tasks while cutting operational costs by 95%. These achievements have earned DeepSeek comparisons to "AI Pinduoduo," a reference to its unmatched cost-efficiency and focus on accessibility. Unlike many proprietary systems, DeepSeek invites collaboration, enabling governments, startups, and researchers worldwide to build on its foundation. The significance of DeepSeek-R1 extends far beyond its technical capabilities. Its open-source approach represents a bold departure from the proprietary ecosystems that have traditionally dominated AI. This openness reflects a broader global trend toward shared innovation, as seen in initiatives like Meta's Llama 2, Europe's modular Mistral AI, and India's AI Stack, which democratizes access to AI tools for startups and researchers. DeepSeek-R1 amplifies these movements, signaling a future where collaboration -- not monopolization -- drives progress. By empowering diverse stakeholders, it challenges the notion that AI is the exclusive domain of the few. While DeepSeek-R1's technology is transformative, its full potential depends on the readiness of supporting ecosystems. This includes the availability of high-quality, localized datasets to ensure models remain unbiased and relevant to local contexts. Nations must also invest in talent development, creating pipelines of skilled professionals who can design, deploy, and scale AI solutions. 
Programs like Singapore's AI Singapore Programme offer a strong blueprint for such efforts. Equally important is policy and governance. Open-source models, while powerful, are vulnerable to misuse. Regulatory frameworks, such as Europe's AI Act, must ensure responsible deployment while fostering innovation. DeepSeek's success hinges not only on its technical strengths but also on how effectively ecosystems around the world integrate and govern this technology. The real-world applications of DeepSeek-R1 are already reshaping industries: In healthcare, predictive analytics powered by DeepSeek enable earlier and more accurate diagnoses, improving patient outcomes while reducing costs. In finance, its fraud detection capabilities enhance trust and security in digital transactions, addressing a growing global concern. The retail sector benefits from hyper-personalized shopping experiences, driven by DeepSeek's advanced consumer behavior analysis. In logistics, optimization algorithms streamline supply chains, cutting delivery times and reducing operational expenses. These examples underscore a simple truth: AI isn't just about technology -- it's about solving real-world problems at scale. DeepSeek-R1 is not just a technological achievement; it is a geopolitical statement. For decades, the United States has leveraged its dominance in AI as a strategic advantage, shaping global power dynamics. DeepSeek challenges this status quo, offering nations in the Global South a viable alternative to proprietary Western systems. This shift raises critical questions. Will nations align with China's model of open innovation, or will they remain tethered to Western frameworks? Can the global community prioritize collaboration, or will this divide deepen competition in the AI race? The answers to these questions will define the future of not just AI but global technological leadership. While DeepSeek-R1 democratizes access to advanced AI, its open-source nature introduces challenges. 
Increased accessibility can lead to misuse, whether through disinformation campaigns, cybercrime, or unethical surveillance. Addressing these risks requires collaborative regulation. Initiatives like the Global Partnership on AI and ethical oversight mechanisms, such as those employed by Google's DeepMind, provide pathways to balance innovation with accountability. DeepSeek-R1 represents more than a technological breakthrough -- it is a call to action. To truly democratize AI, we must invest in ecosystems that enable adoption, foster international collaborations, and establish ethical frameworks to ensure transparency and security. At its core, the question is not whether AI will shape our future -- it is how inclusive and equitable that future will be. The story of DeepSeek-R1 is a reminder that the winners in the AI race will not be those with the largest budgets or the most advanced systems. The true winners will be those who create ecosystems where AI serves humanity, driving innovation, equity, and sustainability. DeepSeek-R1 has shown us what is possible when purpose drives progress. Now, it is up to us -- leaders, policymakers, businesses, and individuals -- to ensure that AI becomes a tool for the greater good.
[94]
China's DeepSeek Tops iPhone Downloads and Spurs Asia Stocks
DeepSeek, the eponymous AI assistant app from a Chinese startup, rocketed to the top of Apple Inc.'s iPhone download charts, stirring doubts in Silicon Valley about the strength of America's lead in AI. The app's underlying artificial intelligence model is widely seen as competitive with OpenAI and Meta Platforms Inc.'s latest. Its claim that it cost much less to train and develop triggered share moves across Asia's supply chain.
[98]
Scale AI's 28-Year-Old Billionaire CEO Warns About This Scarily Good Chinese Startup
The A.I. race between the two countries "is one of the most important issues of today," according to Wang. For around a decade, the U.S. has retained its A.I. dominance over China, according to Alexandr Wang, the 28-year-old founder and CEO of Scale AI, a $13 billion startup. But with the Christmas release of an "earth-shattering" A.I. model from the Chinese startup DeepSeek, the gap between the two countries is narrowing dangerously, Wang said in an interview with CNBC on Jan. 23. Earlier this week, DeepSeek released a second A.I. model that demonstrates reasoning capabilities rivaling those from top U.S. companies like OpenAI. In addition to shocking researchers with its performance, the Chinese startup's rapid progress has raised questions about the effectiveness of A.I. chip export controls intended to curb China's access to the advanced graphics processing units (GPUs) underpinning A.I. tools. "The A.I. race and the A.I. war between the U.S. and China is one of the most important issues of today," said Wang, who recently took out a full-page advertisement in The Washington Post urging the Trump administration to protect America's lead in the emerging technology. The advertisement pointed readers to a letter from Wang that claimed China's A.I. models are quickly catching up and called upon the federal government to pour more money into compute and data and unleash an energy plan to support the A.I. boom, among other recommendations. Wang is one of the world's youngest self-made billionaires. 
He was born and raised in Los Alamos, New Mexico, as the son of weapons physicists who worked at the famed Los Alamos National Laboratory. After a brief stint in Silicon Valley working for fintech company Addepar and Q&A website Quora, Wang began studying machine learning at the Massachusetts Institute of Technology before dropping out to establish Scale AI, which uses contract work to provide troves of accurately labeled data for A.I. training efforts. Scale AI was valued at $13.8 billion last year and counts the U.S. Department of Defense and OpenAI among its clients. Wang is longtime friends with OpenAI CEO Sam Altman, having roomed with him during the Covid-19 pandemic. He launched Scale AI in 2016 within Y Combinator, the startup accelerator formerly helmed by Altman. How does DeepSeek's new release compare to frontier models in the U.S.? Scale AI also recently partnered with the Center for A.I. Safety to release "Humanity's Last Exam," which they described as the toughest benchmark test yet for A.I. systems. While no model has been able to achieve more than 10 percent on the test so far, Wang said DeepSeek's new reasoning model, DeepSeek-R1, has topped the leaderboard. "Their model is actually the top-performing, or roughly on par with the best American models," he told CNBC. Because of tough U.S. export controls on GPU sales to China, DeepSeek claims it has managed to create powerful models with far less computing power than American systems use. DeepSeek-V3, the model released in December, was trained on around 2,000 A.I. chips from Nvidia (NVDA), according to DeepSeek researchers. Meta (META)'s Llama 3.1 model, for reference, was trained on 16,000 GPUs. Wang isn't convinced. "The Chinese labs have more H100s than people think," he said, referring to a type of Nvidia GPU that isn't legally available in China. 
"My understanding is that DeepSeek has about 50,000 H100s -- which they can't talk about, obviously, because it is against the export controls that the United States has put in place." To continue pushing its frontier models, the U.S. must unlock more computational capacity and infrastructure, according to Wang, who believes the A.I. market is on track to be worth $1 trillion once A.I. achieves the benchmark of artificial general intelligence (A.G.I.). While A.I. leaders have offered up various timelines and meanings for A.G.I., Wang's definition refers to systems that "could basically be a remote worker in the most capable way," an achievement he believes will be accomplished in the next two to four years.
[99]
9to5Neural: ChatGPT Operator, Claude Citations, Trump AI EO - 9to5Mac
Welcome to 9to5Neural. AI moves fast. We help you keep up. In our inaugural edition, we're exploring the start of the next frontier for OpenAI, Anthropic's thoughtful solution to a common AI critique, and presidential AI executive order ping-pong. Let's start making sense of the latest in AI news. OpenAI recently released the 18K gold Apple Watch Edition of ChatGPT. ChatGPT Pro is a $200/month subscription that makes Tim Cook wish Apple had that kind of recurring revenue per customer. Starting today, ChatGPT Pro also gives AI enthusiasts a major new reason to subscribe beyond higher request limits. Meet Operator. OpenAI calls it "a research preview of an agent that can use its own browser to perform tasks for you." From meme creation to ordering groceries and filling out forms, OpenAI dubs Operator one of its first agents that will execute tasks you give it. Today we're releasing Operator, an agent that can go to the web to perform tasks for you. Using its own browser, it can look at a webpage and interact with it by typing, clicking, and scrolling. It is currently a research preview, meaning it has limitations and will evolve based on user feedback. Operator won't always be behind a $200/month paywall. OpenAI plans to open access to this AI tool for Plus, Team, and Enterprise paid users in the future. For now, Operator is available to all ChatGPT Pro customers in the U.S. at operator.chatgpt.com. OpenAI says Operator is powered by its new Computer-Using Agent (CUA) technology. Powering Operator is Computer-Using Agent (CUA), a model that combines GPT-4o's vision capabilities with advanced reasoning through reinforcement learning. CUA is trained to interact with graphical user interfaces (GUIs) -- the buttons, menus, and text fields people see on a screen -- just as humans do. This gives it the flexibility to perform digital tasks without using OS- or web-specific APIs. [...] 
While CUA is still early and has limitations, it sets new state-of-the-art benchmark results, achieving a 38.1% success rate on OSWorld for full computer use tasks, and 58.1% on WebArena and 87% on WebVoyager for web-based tasks. These results highlight CUA's ability to navigate and operate across diverse environments using a single general action space. I guess this is as good of a time as any to announce that I am stepping down from 9to5Neural to spend more time with my family. All future editions of 9to5Neural will be brought to you by Operator. I have full faith in the Computer-Using Agent to translate AI news for humanity going forward. Wait, no, I spoke too soon. Apparently there's an issue with our ChatGPT Pro subscription. I'm back in the saddle! But seriously, Operator is clearly a big deal. We'll look back at January 2025 as a milestone in AI advancement. Computer-Using Agent technology may also satisfy AI skeptics who keep asking when ChatGPT-5 is coming. The other big OpenAI story this week? Stargate. Or as Sam Altman said on X, "big. beautiful. buildings." What's Stargate? Basically a big computer brain in Texas. OpenAI detailed the initiative this week: The Stargate Project is a new company which intends to invest $500 billion over the next four years building new AI infrastructure for OpenAI in the United States. We will begin deploying $100 billion immediately. This infrastructure will secure American leadership in AI, create hundreds of thousands of American jobs, and generate massive economic benefit for the entire world. This project will not only support the re-industrialization of the United States but also provide a strategic capability to protect the national security of America and its allies. The initial equity funders in Stargate are SoftBank, OpenAI, Oracle, and MGX. SoftBank and OpenAI are the lead partners for Stargate, with SoftBank having financial responsibility and OpenAI having operational responsibility. 
Masayoshi Son will be the chairman. Arm, Microsoft, NVIDIA, Oracle, and OpenAI are the key initial technology partners. The buildout is currently underway, starting in Texas, and we are evaluating potential sites across the country for more campuses as we finalize definitive agreements. As part of Stargate, Oracle, NVIDIA, and OpenAI will closely collaborate to build and operate this computing system. Behind every ambitious AI firm is an ambitious billionaire, of course, and the billionaires are fighting on X over Stargate finances. Elon Musk, whose xAI firm has no involvement in Stargate, responded to the announcement on X, saying "they don't actually have the money." Musk added that he has it on good authority that SoftBank has "well under $10B secured." Altman, on the other hand, is confident the parties involved have funding secured. Meanwhile, the OpenAI boss says he fell into the non-playable character trap regarding Trump (now that Trump has made his character playable, referring to Stargate). Frankly, I'm much more bullish on the prospects of ChatGPT Operator than I am on the relationship complexities of the billionaires. Meanwhile, Anthropic, which has always had a more measured approach to AI safety, is launching a promising new tool for its Claude chatbot called Citations. Today, we're launching Citations, a new API feature that lets Claude ground its answers in source documents. Claude can now provide detailed references to the exact sentences and passages it uses to generate responses, leading to more verifiable, trustworthy outputs. [...] Previously, developers relied on complex prompts that instruct Claude to include source information, often resulting in inconsistent performance and significant time investment in prompt engineering and testing. With Citations, users can now add source documents to the context window, and when querying the model, Claude automatically cites claims in its output that are inferred from those sources. 
Our internal evaluations show that Claude's built-in citation capabilities outperform most custom implementations, increasing recall accuracy by up to 15%. Anthropic points to relevant use cases including customer support queries and document summarization tasks. Best take? Kyle B. Russel on X, no citations needed. Claude 3.5 Sonnet and Claude 3.5 Haiku are ready for Citations starting today, and Anthropic has documentation ready for your exploration. Following that brief break from presidential politics, let's return to the American policy on AI. President Trump continued his marathon executive order signing race on Thursday, replacing the Biden administration's executive order on AI policy with the Trump administration's executive order on AI policy. In case you've forgotten, Biden's EO on AI focused on artificial intelligence safety, infrastructure standards, mitigating job disruption, and watermarking AI content for transparency. Per the AP report, Trump's AI executive order revokes past government policies that "act as barriers to American AI innovation," adding that the U.S. must "develop AI systems that are free from ideological bias or engineered social agendas." Aside from the broad policy directive, President Trump's AI EO authorizes the "development of an AI action plan within 180 days," per the AP, which will be headed by Special Advisor for AI and Crypto David Sacks, the ex-PayPal executive appointed by Trump. Going forward, tech companies will no longer need to disclose to the government the development of AI models that cross a certain power threshold. Meanwhile, AI competition isn't just happening among American firms. This week, Chinese AI firm DeepSeek released its R1 model family into the wild. What's unique about R1 is that the model can run locally with performance comparable to OpenAI's ChatGPT 4o model. 
Local models tend to trail models that operate off-machine, making this developmental model and DeepSeek worth watching. The catch? R1 naturally has a state-approved view of world history when it comes to topics like the 1989 Tiananmen Square protest and massacre or Taiwan's independence. You know, just in case the stakes for who wins the AI race weren't clear already.
Chinese AI startup DeepSeek's R1 model tops app charts, rivaling OpenAI's ChatGPT with impressive performance at a fraction of the cost, sparking debates on AI development and US-China tech competition.
Chinese AI startup DeepSeek has taken the tech world by storm with its R1 reasoning model, unseating OpenAI's ChatGPT as the most downloaded free app on the U.S. App Store [1]. This unexpected development has sent ripples through the global AI industry, challenging the perceived dominance of American tech giants.
Founded in 2023 by Liang Wenfeng, a 40-year-old engineer and hedge fund co-founder, DeepSeek has achieved remarkable results with a fraction of the resources used by its American counterparts [1]. The company claims to have developed its R1 model for just $5.6 million, compared to the billions spent by OpenAI, Meta, and others [2][3].
DeepSeek's success stems from its innovative approach, blending advanced Nvidia A100 chips with lower-end alternatives and reworking its training process to reduce strain on GPUs [1][2]. This strategy has allowed the company to adapt to U.S. export restrictions on advanced chips to China, demonstrating the potential for efficiency and resource-pooling in AI development [2].
Despite resource constraints, DeepSeek's R1 model reportedly matches or exceeds the performance of industry leaders like OpenAI's GPT-4 and Anthropic's Claude 3.5 Sonnet on certain AI benchmarks [3][4]. The model excels in tasks related to physics, science, and mathematics, employing a step-by-step approach to problem-solving [3].
Notably, DeepSeek has released an open-source version of R1 under an MIT license, allowing for commercial use without restrictions [3]. This move has been praised by industry experts, including Meta's VP and chief AI scientist Yann LeCun, who sees it as a win for open-source development [2].
DeepSeek's rise has significant implications for the AI industry and the ongoing tech competition between the United States and China. Silicon Valley venture capitalist Marc Andreessen described it as "AI's Sputnik moment," drawing parallels to the 1957 Soviet satellite launch that shocked the United States [1][4].
The success of DeepSeek has led to market fluctuations, with shares of AI-related companies like Nvidia, Microsoft, and Meta experiencing declines [4][5]. This development has raised questions about the sustainability of high-cost AI models and the potential for more cost-effective approaches to reshape the industry [1][2].
DeepSeek's achievement comes amid ongoing U.S. sanctions on advanced chip exports to China, highlighting the adaptability of Chinese tech firms in the face of restrictions [2][4]. While some analysts believe that U.S. access to more advanced chips will maintain an advantage in the long run, others see DeepSeek's success as a potential game-changer in the global AI race [4][5].
As the AI landscape continues to evolve, DeepSeek's accomplishment serves as a wake-up call to Silicon Valley and policymakers alike. It underscores the potential for innovation to emerge from unexpected sources and challenges assumptions about the resources required for cutting-edge AI development [1][3][5].
Reference
[1]
[2]
[5]
Chinese AI company DeepSeek unveils a highly efficient large language model, DeepSeek-V3, trained at a fraction of the cost of Western counterparts, raising questions about the effectiveness of US chip export restrictions.
2 Sources
Chinese AI startup DeepSeek's cost-effective AI model sparks a global tech stock selloff, challenging assumptions about AI development costs and US technological superiority.
71 Sources
Chinese AI startup DeepSeek releases DeepSeek V3, an open-weight AI model with 671 billion parameters, outperforming leading open-source models and rivaling proprietary systems in various benchmarks.
7 Sources
DeepSeek R1, a new open-source AI model, demonstrates advanced reasoning capabilities comparable to proprietary models like OpenAI's GPT-4, while offering significant cost savings and flexibility for developers and researchers.
21 Sources
DeepSeek, a Chinese AI company, has launched a powerful and cost-effective AI model that rivals ChatGPT, causing significant market reactions and reassessment of US tech dominance in AI.
2 Sources
© 2025 TheOutpost.AI All rights reserved