8 Sources
[0]
Microsoft probing if DeepSeek-linked group improperly obtained OpenAI data
Microsoft and OpenAI are investigating whether data output from OpenAI's technology was obtained in an unauthorised manner by a group linked to Chinese artificial intelligence start-up DeepSeek, according to people familiar with the matter. Microsoft's security researchers in the fall observed individuals they believe may be linked to DeepSeek exfiltrating a large amount of data using the OpenAI application programming interface, or API, said the people, who asked not to be identified because the matter is confidential. Software developers can pay for a license to use the API to integrate OpenAI's proprietary artificial intelligence models into their own applications.
[0]
Microsoft probing if DeepSeek-linked group improperly obtained OpenAI data
Microsoft Corp. and OpenAI are investigating whether data output from OpenAI's technology was obtained in an unauthorized manner by a group linked to Chinese artificial intelligence startup DeepSeek, according to people familiar with the matter. Microsoft's security researchers in the fall observed individuals they believe may be linked to DeepSeek exfiltrating a large amount of data using the OpenAI application programming interface, or API, said the people, who asked not to be identified because the matter is confidential. Software developers can pay for a license to use the API to integrate OpenAI's proprietary artificial intelligence models into their own applications. Microsoft, an OpenAI technology partner and its largest investor, notified OpenAI of the activity, the people said. Such activity could violate OpenAI's terms of service or could indicate the group acted to remove OpenAI's restrictions on how much data they could obtain, the people said. DeepSeek earlier this month released a new open-source artificial intelligence model called R1 that can mimic the way humans reason, upending a market dominated by OpenAI and US rivals such as Google and Meta Platforms Inc. The Chinese upstart said R1 rivaled or outperformed leading US developers' products on a range of industry benchmarks, including for mathematical tasks and general knowledge -- and was built for a fraction of the cost. The potential threat to the US firms' edge in the industry sent technology stocks tied to AI, including Microsoft, Nvidia Corp., Oracle Corp.
and Google parent Alphabet Inc., tumbling on Monday, erasing a total of almost $1 trillion in market value. OpenAI didn't respond to a request for comment, and Microsoft declined to comment. DeepSeek and hedge fund High-Flyer, where DeepSeek was started, didn't immediately respond to requests for comment via email. David Sacks, President Donald Trump's artificial intelligence czar, said Tuesday there's "substantial evidence" that DeepSeek leaned on the output of OpenAI's models to help develop its own technology. In an interview with Fox News, Sacks described a technique called distillation whereby one AI model uses the outputs of another for training purposes to develop similar capabilities. "There's substantial evidence that what DeepSeek did here is they distilled knowledge out of OpenAI models and I don't think OpenAI is very happy about this," Sacks said, without detailing the evidence. In a statement responding to Sacks' comments, OpenAI didn't directly address his comments about DeepSeek. "We know PRC based companies -- and others -- are constantly trying to distill the models of leading US AI companies," an OpenAI spokesperson said in the statement, referring to the People's Republic of China. "As the leading builder of AI, we engage in countermeasures to protect our IP, including a careful process for which frontier capabilities to include in released models, and believe as we go forward that it is critically important that we are working closely with the US government to best protect the most capable models from efforts by adversaries and competitors to take US technology."
[0]
Microsoft Investigates DeepSeek-Tied Group's Access to OpenAI Data | PYMNTS.com
Microsoft and OpenAI are reportedly investigating whether OpenAI's data was improperly accessed by a group tied to DeepSeek. As Bloomberg News reported Tuesday (Jan. 28), sources familiar with the matter say Microsoft security researchers in the fall discovered people who could be tied to DeepSeek withdrawing a large amount of data using OpenAI's application programming interface (API). Microsoft, OpenAI's partner and biggest investor, notified the startup of the activity, the sources said. This activity could be a violation of OpenAI's terms of service or could indicate the group attempted to remove OpenAI's limits on how much data they could obtain, the sources said. The news follows the recent release of DeepSeek's open-source artificial intelligence (AI) model R-1, which the Chinese company says can rival or surpass those developed or backed by America's tech giants, while being constructed for a sliver of the cost. To be more precise, the company said it was able to train a foundation model for only $5.58 million on 2,048 Nvidia H800 chips. OpenAI and its rival Anthropic have estimated the costs at $100 million to a billion dollars using several thousand Nvidia chips. R-1's arrival caused U.S. tech stocks to stumble, wiping out $1.5 trillion in market value from companies such as Nvidia, Google and Microsoft. DeepSeek showed that "small companies, individual developers, and even researchers are now able to harness the power of AI without breaking the bank," Roy Benesh, CTO at eSIMple, told PYMNTS Tuesday. "This can lead to the development of new ideas and technologies due to the increased competitiveness in the field. This can alter the current state of affairs by providing new options for customers where older established AI companies are likely to charge less and improve their technology faster." 
The Bloomberg report also notes comments Tuesday from David Sacks, the White House's AI czar, who claimed there is "substantial evidence" that DeepSeek used the output of OpenAI's models to develop its own technology. Speaking to Fox News, Sacks described a technique known as distillation in which one AI model uses the outputs of another to craft similar capabilities. While not directly acknowledging Sacks' comments on DeepSeek, OpenAI told Bloomberg that it was aware that China-based companies are "constantly trying to distill the models of leading US AI companies."
[0]
Microsoft is investigating if DeepSeek illegally accessed OpenAI data to train its AI model
In context: AI startup DeepSeek stunned the world with the release of its R1 AI model, which can mimic human reasoning at a level comparable to the best OpenAI models to date. While the company has received widespread acclaim for its achievement, it is now reportedly under investigation by Microsoft and OpenAI for allegedly accessing OpenAI's data output illegally to train its AI model. According to unnamed sources cited by Bloomberg, the probe started last fall after Microsoft's security researchers discovered that a group linked to DeepSeek had accessed a large amount of data through OpenAI's API. While developers can legally pay for a license to use this data in their applications, Microsoft suspects that DeepSeek's actions violated OpenAI's terms of service. Following the discovery, Microsoft informed OpenAI that it had detected an attempt by a DeepSeek-affiliated group to bypass restrictions on how much data a single party can access. The two companies then launched a joint investigation into the incident, which is now being treated as a potential data breach. An OpenAI spokesperson declined to confirm the alleged data theft but told Reuters that the company employs cutting-edge "countermeasures" to protect its intellectual property. They added that OpenAI is working closely with the U.S. government to prevent its AI models from falling into the hands of foreign adversaries and competitors. However, the White House's AI and crypto czar, David Sacks, was less restrained in his response to the report. In an interview with Fox News, he claimed there was "substantial evidence" that DeepSeek had accessed OpenAI's data in an unethical and possibly illegal manner. Microsoft refused to comment, while DeepSeek could not be reached for its reaction on the controversy. With the launch of its all-conquering AI model, DeepSeek appears poised to challenge OpenAI, Google, and Meta in the field of AI research. 
However, if the allegations turn out to be accurate, it could spell trouble for the Chinese AI firm, which saw its new app overtake ChatGPT to become the top free app on both the App Store and Play Store in the U.S. this week. DeepSeek also received unexpected praise from President Trump this week when he described the release of the R1 LLM as a "positive" development and an "asset" for the global tech sector. He noted that if DeepSeek delivers on its promise of accelerating AI training at a lower cost, "that's good (and) I view that as a positive."
[6]
OpenAI: DeepSeek Used Our Models to Train Their Own
With the release of its R1 model, China-based DeepSeek has become the talk of the AI world, and its biggest rival isn't too happy about it. OpenAI tells the Financial Times that it has found some evidence of DeepSeek using OpenAI models to develop its chatbot. The ChatGPT creator accuses DeepSeek of "distillation," a process in which developers use outputs obtained from larger, more capable models to improve the efficiency of smaller models. It helps developers achieve similar results at a much lower cost. Although this practice is common in the industry, extracting data from OpenAI to create a direct rival violates its terms of service, a source tells the FT. Security researchers from Microsoft, a major OpenAI investor, spotted data being extracted from OpenAI's API in the fall and informed OpenAI, according to Bloomberg. They now believe DeepSeek could have been responsible, although both companies are still investigating. President Trump's AI and crypto czar, David Sacks, told Fox News: "There's substantial evidence that what DeepSeek did here is they distilled the knowledge out of OpenAI's models, and I don't think OpenAI is very happy about this." An OpenAI spokesperson tells Bloomberg that the company is aware China-based companies and others are "trying to distill the models of leading US AI companies." "As the leading builder of AI, we engage in countermeasures to protect our IP, including a careful process for which frontier capabilities to include in released models, and believe as we go forward that it is critically important that we are working closely with the US government to best protect the most capable models from efforts by adversaries and competitors to take US technology," the spokesperson added. The statement might strike some as ironic, given that OpenAI is fighting off multiple lawsuits from media organizations, authors, and other artists for using their content without permission to train OpenAI's models. 
DeepSeek is currently limiting registration for new users after receiving a flood of registrations and "large-scale malicious attacks." However, PCMag ran some quick tests to see how DeepSeek compares to ChatGPT.
[0]
OpenAI Says DeepSeek May Have Improperly Harvested Its Data
OpenAI says it is reviewing evidence that the Chinese start-up DeepSeek broke its terms of service by harvesting large amounts of data from its A.I. technologies. The San Francisco-based start-up, which is now valued at $157 billion, said that DeepSeek may have used data generated by OpenAI technologies to teach similar skills to its own systems. This process, called distillation, is common across the A.I. field. But OpenAI's terms of service say that the company does not allow anyone to use data generated by its systems to build technologies that compete in the same market. "We know that groups in the P.R.C. are actively working to use methods, including what's known as distillation, to replicate advanced U.S. A.I. models," OpenAI spokeswoman Liz Bourgeois said in a statement emailed to The New York Times, referring to the People's Republic of China. "We are aware of and reviewing indications that DeepSeek may have inappropriately distilled our models, and will share information as we know more," she said. "We take aggressive, proactive countermeasures to protect our technology and will continue working closely with the U.S. government to protect the most capable models being built here." DeepSeek did not immediately respond to a request for comment. DeepSeek spooked Silicon Valley tech companies and sent the U.S. financial markets into a tailspin earlier this week after releasing A.I. technologies that matched the performance of anything else on the market. The prevailing wisdom had been that the most powerful systems could not be built without billions of dollars in specialized computer chips, but DeepSeek said it had created its technologies using far fewer resources. Like any other A.I. company, DeepSeek built its technologies using computer code and data corralled from across the internet. A.I. companies lean heavily on a practice called open sourcing, freely sharing the code that underpins their technologies -- and reusing code shared by others.
They see this as a way of accelerating technological development. They also need massive amounts of online data to train their A.I. systems. These systems learn their skills by pinpointing patterns in text, computer programs, images, sounds and videos. The leading systems learn their skills by analyzing just about all of the text on the internet. Distillation is often used to train new systems. If a company takes data from proprietary technology, the practice may be legally problematic. But it is often allowed by open source technologies. OpenAI is now facing more than a dozen lawsuits accusing it of illegally using copyrighted internet data to train its systems. This includes a lawsuit brought by The New York Times against OpenAI and its partner Microsoft. The suit contends that millions of articles published by The Times were used to train automated chatbots that now compete with the news outlet as a source of reliable information. Both OpenAI and Microsoft deny the claims. A Times report also showed that OpenAI has used speech recognition technology to transcribe the audio from YouTube videos, yielding new conversational text that would make an A.I. system smarter. Some OpenAI employees discussed how such a move might go against YouTube's rules, three people with knowledge of the conversations said. An OpenAI team, including the company's president, Greg Brockman, transcribed more than one million hours of YouTube videos, the people said. The texts were then fed into a system called GPT-4, which was widely considered one of the world's most powerful A.I. models and was the basis of the latest version of the ChatGPT chatbot.
[7]
OpenAI Says AI Industry Disruptor DeepSeek May Have Copied Its Work as Rivals Race to Catch Up
Now, competitor OpenAI is saying that the more inexpensive-to-make DeepSeek used its work, while another competitor, Alibaba, is claiming it has developed an even smarter model. China-based startup DeepSeek became an AI standout this week by creating an AI model believed to be on par with leading models from U.S. startups -- at a fraction of the cost. In a research paper released last month, DeepSeek said it developed its AI for under $6 million in only two months, a far cry from the $100 million it takes U.S. startups to train AI -- and that's on the lower end of the spectrum, according to Anthropic CEO Dario Amodei. It quickly rose to the top of the app store charts, challenging the U.S.'s position as the world's leader in AI. The release set off a race for AI dominance and shook Big Tech stocks, causing AI chipmaker Nvidia to lose almost $600 billion in market value in one day and prompting new competitor claims -- from having an even better model to allegations of theft. According to White House AI and Crypto Czar David Sacks, DeepSeek's arrival shows that Chinese companies are "hot on our heels" but that the U.S. maintains its leadership in AI. He says DeepSeek's AI is on par with OpenAI's o1 model, which came out about four months ago. "We basically have somewhere between a three and six-month lead on them [Chinese companies]," Sacks said. "But they are catching up very, very fast." OpenAI and Microsoft are investigating whether DeepSeek used large amounts of OpenAI training data without permission for its own AI. OpenAI told The Financial Times earlier this week that it had proof that DeepSeek used its large AI models to create its own through a process called distillation, in which one AI model learns from another like a student learning from a teacher. Sacks backed up OpenAI's claims in an interview with Fox Business on Tuesday. "There's substantial evidence that what DeepSeek did here is they distilled the knowledge out of OpenAI's models," Sacks said.
"I think one of the things you're going to see over the next few months is our leading AI companies taking steps to try and prevent distillation." Other industry leaders say DeepSeek's success is due to the collaborative nature of open-source AI models. DeepSeek "came up with new ideas and built them on top of other people's work," Meta's chief AI scientist Yann LeCun stated in a Threads post on Saturday. "Because their work is published and open source, everyone can profit from it." Chinese e-commerce company Alibaba is claiming that it has developed an even smarter model than DeepSeek's. Alibaba on Wednesday released a new AI model called Qwen 2.5 Max edition that the company says scored better than AI from Meta, OpenAI, and DeepSeek in leading benchmark tests, per Bloomberg. "Qwen 2.5-Max outperforms ... almost across the board [OpenAI's] GPT-4o, DeepSeek-V3 and [Meta's] Llama-3.1-405B," Alibaba's cloud division stated in an announcement on its official WeChat account, according to Reuters.
[8]
Did DeepSeek Copy ChatGPT? Trump Adviser Thinks So
Did the upstart Chinese tech company DeepSeek copy ChatGPT to make the artificial intelligence technology that shook Wall Street this week? That's what ChatGPT maker OpenAI is suggesting, along with U.S. President Donald Trump's top AI adviser. Neither has disclosed specific evidence of intellectual property theft, but the comments could fuel a reexamination of some of the assumptions that led to a panic in the U.S. over DeepSeek's advancements. "There's substantial evidence that what DeepSeek did here is they distilled the knowledge out of OpenAI's models," David Sacks, Trump's AI adviser, told Fox News on Tuesday. "And I don't think OpenAI is very happy about this." DeepSeek and the hedge fund it grew out of, High-Flyer, didn't immediately respond to emailed questions Wednesday, the start of China's extended Lunar New Year holiday. OpenAI said in a statement that China-based companies "are constantly trying to distill the models of leading U.S. AI companies" but didn't call out DeepSeek specifically. OpenAI's official terms of use ban the technique known as distillation that enables a new AI model to learn by repeatedly querying a bigger one that's already been trained. OpenAI said it has banned and revoked access of accounts attempting to distill its models. The San Francisco company said it pursues countermeasures to protect its intellectual property and will work "closely with the U.S. government to best protect the most capable models from efforts by adversaries and competitors to take U.S. technology." OpenAI has itself been accused of copyright theft in lawsuits from media organizations, book authors and others in cases that are still working through courts in the U.S. and elsewhere. Even before DeepSeek news rattled markets Monday, many who were trying out the company's AI model noticed a tendency for it to declare that it was ChatGPT or refer to OpenAI's terms and policies. 
"If you ask it what model are you, it would say, 'I'm ChatGPT,' and the most likely reason for that is that the training data for DeepSeek was harvested from millions of chat interactions with ChatGPT that were just fed directly into DeepSeek's training data," said Gregory Allen, a former U.S. Defense Department official who now directs the Wadhwani AI Center at the Center for Strategic and International Studies. Such declarations are not necessarily an indication of IP theft -- chatbots are prone to fabricating information. But DeepSeek, despite describing its technology as "open-source," doesn't disclose the data it used to train its model. "I think that there's a pretty obvious reason for that choice, which is that they harvested ChatGPT for training data," Allen said. Much about the new model has perplexed analysts poring through the startup's public research papers about its new model, R1, and its precursors. Among the details that startled Wall Street was DeepSeek's assertion that the cost to train the flagship v3 model behind its AI assistant was only $5.6 million, a stunningly low number compared to the multiple billions of dollars spent to build ChatGPT and other popular chatbots. The $5.6 million number only included actually training the chatbot, not the costs of earlier-stage research and experiments, the paper said. But the number -- and DeepSeek's relatively cheap prices for developers -- called into question the huge amounts of money and electricity pouring into AI development in the U.S. DeepSeek was also working under constraints: U.S. export controls on the most powerful AI chips. It said it relied on a relatively low-performing AI chip from California chipmaker Nvidia that the U.S. hasn't banned for sale in China. But in 2022, a social media post from High-Flyer said it had amassed a cluster of 10,000 more powerful Nvidia chips that are now banned in China. Copyright 2025 The Associated Press. All rights reserved. 
Microsoft and OpenAI are probing whether a group associated with Chinese AI startup DeepSeek improperly accessed OpenAI's data through its API, potentially violating terms of service and raising concerns about AI competition and data security.
Microsoft and OpenAI are investigating whether a group linked to Chinese AI startup DeepSeek obtained OpenAI's data output without authorization. According to people familiar with the matter cited by Bloomberg, Microsoft's security researchers in the fall of 2024 observed individuals they believe may be linked to DeepSeek extracting a large amount of data through OpenAI's application programming interface (API), and Microsoft notified OpenAI of the activity [1].
The investigation centers on whether the activity violated OpenAI's terms of service or indicates that the group acted to remove OpenAI's restrictions on how much data it could obtain. David Sacks, the White House's AI czar, said there is "substantial evidence" that DeepSeek relied on distillation, a technique in which one AI model is trained on the outputs of another to develop similar capabilities, though he did not detail that evidence [2].
The controversy follows DeepSeek's release of R1, an open-source AI model that the company says rivals or outperforms leading US developers' products on a range of industry benchmarks. DeepSeek also says it built its technology at a fraction of its competitors' costs, reporting that it trained a foundation model for $5.58 million on 2,048 Nvidia H800 chips [3].
R1's release rattled the US tech market: AI-related stocks including Microsoft, Nvidia, Oracle, and Google parent Alphabet tumbled, erasing almost $1 trillion in market value [1]. The sell-off reflected the potential threat that a low-cost challenger poses to the edge held by established US AI companies.
In a statement responding to Sacks' comments, OpenAI did not directly address DeepSeek but said that companies based in China and elsewhere are "constantly trying to distill the models of leading US AI companies" [4]. In a separate statement to The New York Times, OpenAI spokeswoman Liz Bourgeois said the company is reviewing indications that DeepSeek "may have inappropriately distilled our models," and that it takes "aggressive, proactive countermeasures" to protect its technology while working closely with the US government to protect its most capable models [5].
The episode highlights unresolved questions in the AI industry about data access, intellectual property protection, and international competition. It also raises questions about the ethics and legality of training one AI system on data generated by another: distillation is common across the field, but OpenAI's terms of service prohibit using its outputs to build competing technologies [5].
As the investigation unfolds, the AI community and policymakers will be closely watching its outcomes, which could have significant implications for future AI development, international collaborations, and regulatory frameworks in the rapidly evolving field of artificial intelligence.
Summarized by Navi
[0] Australian Financial Review | Microsoft probing if DeepSeek-linked group improperly obtained OpenAI data