2 Sources
2 Sources
[1]
Tech groups struggle to solve AI's big security flaw
The world's top artificial intelligence groups are stepping up efforts to solve a critical security flaw in their large language models that can be exploited by cyber criminals. Google DeepMind, Anthropic, OpenAI and Microsoft are among those trying to prevent so-called indirect prompt injection attacks, where a third party hides commands in websites or emails designed to trick the AI model into revealing unauthorised information, such as confidential data. "AI is being used by cyber actors at every chain of the attack right now," said Jacob Klein, who leads the threat intelligence team at AI start-up Anthropic. AI groups are using a variety of techniques, including hiring external testers and using AI-powered tools, to detect and reduce malicious uses of their powerful technologies. But experts warned the industry had not yet solved how to stop indirect prompt injection attacks. Part of the problem is LLMs are designed to follow instructions, and currently do not distinguish between legitimate commands from users and input that should not be trusted. This is also the reason why AI models are prone to jailbreaking, where users can prompt LLMs to disregard their safeguards. Klein said Anthropic works with external testers to make its Claude model more resistant to indirect prompt injection attacks. They also have AI tools to detect when they might be happening. "When we find a malicious use, depending on confidence levels, we may automatically trigger some intervention or it may send it to human review," he added. Google DeepMind uses a technique called automated red teaming, where the company's internal researchers constantly attack its Gemini model in a realistic way to uncover potential security weaknesses. In May, the UK's National Cyber Security Centre warned that this flaw posed an increased threat, as it risks exposing millions of companies and individuals that use LLMs and chatbots to sophisticated phishing attacks and scams. LLMs also have another major vulnerability, where outsiders can create back doors and cause the models to misbehave by inserting malicious material into data that is then used in AI training. These so-called "data poisoning attacks" are easier to conduct than scientists previously believed, according to new research published last month by AI start-up Anthropic, the UK's AI Security Institute and the Alan Turing Institute. While these vulnerabilities pose big risks, experts argue that AI is also helping to boost company's defences against cyber attacks. For years, attackers have had a slight advantage, in that they only needed to find one weakness, while defenders had to protect everything, said Microsoft's corporate vice-president and deputy chief information security officer Ann Johnson. "Defensive systems are learning faster, adapting faster, and moving from reactive to proactive," she added. The race to solve flaws in AI models comes as cyber security is emerging as one of the top concerns for companies seeking to adopt AI tools into their business. A recent Financial Times analysis of hundreds of corporate filings and executive transcripts at S&P 500 companies last year found the most commonly cited worry was cyber security, which was mentioned as a risk by more than half of the S&P 500 in 2024. Hacking experts said the advancement of AI in recent years has already boosted the multibillion-dollar cyber crime industry. It has provided amateur hackers with cheap tools to write harmful software, as well as systems for professional criminals to better automate and scale up their operations. LLMs allow hackers to quickly generate new malicious code that has not been detected yet, which makes it harder to defend against, said Jake Moore, global cyber security adviser at cyber security group ESET. A recent study by researchers at MIT found that 80 per cent of ransomware attacks they examined used AI, and in 2024, phishing scams and deepfake-related fraud linked to the technology saw a 60 per cent increase. AI tools are also being used by hackers to collect information on victims online. LLMs can scour the web efficiently for personal data on someone's public accounts, images or even find audio clips of someone speaking. These could be used to conduct sophisticated social engineering attacks for financial crimes, said Paul Fabara, Visa's chief risk and client services officer. Vijay Balasubramaniyan, chief executive and co-founder of Pindrop, a cyber security firm specialising in voice fraud, said generative AI has made creating realistic sounding deepfakes much easier and quicker than before. "Back in 2023, we'd see one deepfake attack per month across the entire customer base. Now we're seeing seven per day per customer," he added. Companies are particularly vulnerable to these kinds of attacks, said ESET's Moore. AI systems can collate information from the public internet, such as employees' LinkedIn posts, to find out what kind of programs and software companies use day to day, and then use that to find vulnerabilities. Anthropic recently intercepted a sophisticated actor using the company's language models for "vibe hacking", where the person had automated many of the processes of a large-scale attack. The bad actor used Claude Code to automate reconnaissance, harvest victims' credentials and infiltrate systems. The person had targeted 17 organisations to extort up to $500,000 from them. Cyber experts said companies need to stay vigilant in monitoring for new threats and consider restricting how many people have access to sensitive datasets and AI tools that are prone to attacks. "It doesn't take much to be a crook nowadays," said Visa's Fabara. "You get a laptop, $15 to download the cheap bootleg version of gen AI in the dark web and off you go."
[2]
Tech Giants Tackle Major AI Security Threat | PYMNTS.com
As the report notes, these attacks happen when a third party hides commands inside a website or email to trick artificial intelligence (AI) models into turning over unauthorized information. "AI is being used by cyber actors at every chain of the attack right now," said Jacob Klein, who heads the threat intelligence team at Anthropic. According to the report, companies are doing things like hiring external testers and using AI-powered tools to ferret out and prevent malicious uses of their technology. However, experts caution that the industry still hasn't determined how to stop indirect prompt injection attacks. At issue is the fact that AI large language models (LLMs) are designed to obey instructions, and in their present state do not distinguish between legitimate user commands and input that should not be trusted. This is also why AI models are vulnerable to jailbreaking, where users can prompt LLMs to ignore their safeguards, the report added. Klein said Anthropic works with outside testers to help its Claude model resist indirect prompt injection attacks. They also have AI tools to tell when the attacks might be occurring. "When we find a malicious use, depending on confidence levels, we may automatically trigger some intervention or it may send it to human review," he added. Both Google and Microsoft have addressed the threat of the attacks and their efforts to stop them on their company blogs. Meanwhile, research by PYMNTS Intelligence looks at the role AI plays in preventing cyberthreats. More than half (55%) of the chief operating officers surveyed by PYMNTS late last year said their companies had begun employing AI-based automated cybersecurity management systems. That's a threefold increase in the matter of months. These systems use generative AI (GenAI) to uncover fraudulent activities, spot anomalies and offer threat assessments in real time, making them more effective than standard reactive security measures. The move from reactive to proactive security strategies is a critical part of this transformation. "By integrating AI into security frameworks, COOs are improving threat detection and enhancing their organizations' overall resilience," PYMNTS wrote earlier this year. "GenAI is viewed as a vital tool for minimizing the risk of security breaches and fraud, and it is becoming an essential component of strategic risk management in large organizations."
Share
Share
Copy Link
Leading AI companies including Google DeepMind, Anthropic, OpenAI, and Microsoft are intensifying efforts to combat indirect prompt injection attacks and data poisoning vulnerabilities in their large language models, while cybercriminals increasingly leverage AI for sophisticated attacks.
The world's leading artificial intelligence companies are confronting a significant security crisis as cybercriminals exploit fundamental flaws in large language models. Google DeepMind, Anthropic, OpenAI, and Microsoft are among the major tech firms intensifying efforts to address indirect prompt injection attacks, where malicious actors embed hidden commands in websites or emails to manipulate AI models into revealing unauthorized information
1
.
Source: Financial Times News
"AI is being used by cyber actors at every chain of the attack right now," warned Jacob Klein, who leads the threat intelligence team at AI startup Anthropic
1
. The core issue stems from how LLMs are designed to follow instructions without distinguishing between legitimate user commands and potentially malicious input that should not be trusted2
.Tech companies are deploying various defensive measures to combat these vulnerabilities. Anthropic works with external testers to strengthen its Claude model's resistance to indirect prompt injection attacks while utilizing AI-powered detection tools. "When we find a malicious use, depending on confidence levels, we may automatically trigger some intervention or it may send it to human review," Klein explained
1
.Google DeepMind employs automated red teaming techniques, where internal researchers continuously attack the Gemini model to identify potential security weaknesses. This proactive approach represents the industry's shift from reactive to preventive security measures
1
.The advancement of AI technology has significantly boosted the multibillion-dollar cybercrime industry, providing amateur hackers with accessible tools to create harmful software while enabling professional criminals to automate and scale their operations. Recent MIT research revealed that 80 percent of examined ransomware attacks utilized AI, while phishing scams and deepfake-related fraud linked to the technology increased by 60 percent in 2024
1
.
Source: PYMNTS
Vijay Balasubramaniyan, CEO of voice fraud specialist Pindrop, highlighted the dramatic escalation in deepfake attacks: "Back in 2023, we'd see one deepfake attack per month across the entire customer base. Now we're seeing seven per day per customer"
1
.Related Stories
Despite the risks, companies are increasingly adopting AI-powered cybersecurity solutions. PYMNTS Intelligence research indicates that more than half (55%) of chief operating officers surveyed have begun implementing AI-based automated cybersecurity management systems, representing a threefold increase in recent months
2
.These systems leverage generative AI to detect fraudulent activities, identify anomalies, and provide real-time threat assessments, proving more effective than traditional reactive security measures. Microsoft's Ann Johnson noted that "defensive systems are learning faster, adapting faster, and moving from reactive to proactive"
1
.Cybersecurity has emerged as the primary concern for companies adopting AI tools, with a Financial Times analysis revealing that more half of S&P 500 companies cited it as a risk in 2024
1
.Summarized by
Navi
[1]