7 Sources
[1]
Weaponized AI risk is 'high,' warns OpenAI - here's the plan to stop it
The OpenAI Preparedness Framework may help track the security risks of AI models. OpenAI is warning that the rapid evolution of cyber capabilities in artificial intelligence (AI) models could result in "high" levels of risk for the cybersecurity industry at large, and so action is being taken now to assist defenders. As AI models, including ChatGPT, continue to be developed and released, a problem has emerged. As with many types of technology, AI can be used to benefit others, but it can also be abused -- and in the cybersecurity sphere, this includes weaponizing AI to automate brute-force attacks, generate malware or believable phishing content, and refine existing code to make cyberattack chains more efficient. (Disclosure: Ziff Davis, ZDNET's parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)

In recent months, bad actors have used AI to propagate their scams through indirect prompt injection attacks against AI chatbots and AI summary functions in browsers; researchers have found AI features diverting users to malicious websites, AI assistants have been caught developing backdoors and streamlining cybercriminal workflows, and security experts have warned against trusting AI too much with our data.

The dual nature (as OpenAI calls it) of AI models, however, means that AI can also be leveraged by defenders to refine protective systems, to develop tools to identify threats, to potentially train or teach human specialists, and to shoulder time-consuming, repetitive tasks such as alert triage, which frees up cybersecurity staff for more valuable projects.

According to OpenAI, the capabilities of AI systems are advancing at a rapid rate. For example, capture-the-flag (CTF) challenges, traditionally used to test cybersecurity capabilities in test environments and aimed at finding hidden "flags," are now being used to assess the cyber capabilities of AI models. OpenAI said success rates improved from 27% with GPT-5 in August 2025 to 76% with GPT-5.1-Codex-Max in November 2025 -- a notable increase in a period of only four months.

The minds behind ChatGPT said they expect AI models to continue on this trajectory, which would give them "high" levels of cyber capability. OpenAI said this classification means that models "can either develop working zero-day remote exploits against well-defended systems, or meaningfully assist with complex, stealthy enterprise or industrial intrusion operations aimed at real-world effects."

Managing and assessing whether AI capabilities will do harm or good, however, is no simple task -- but one that OpenAI hopes to tackle with initiatives including the Preparedness Framework (.PDF). The Preparedness Framework, last updated in April 2025, outlines OpenAI's approach to balancing AI defense and risk. While it isn't new, the framework does provide the structure and guide for the organization to follow -- and this includes where it invests in threat defense. Three categories of risk that could lead to "severe harm" are currently the primary focus, and cybersecurity appears to be the priority at present, or at least the most publicized.
In any case, the framework's purpose is to identify risk factors and maintain a threat model with measurable thresholds that indicate when AI models could cause severe harm. "We won't deploy these very capable models until we've built safeguards to sufficiently minimize the associated risks of severe harm," OpenAI said in its framework manifest. "This Framework lays out the kinds of safeguards we expect to need, and how we'll confirm internally and show externally that the safeguards are sufficient."

OpenAI said it is investing heavily in strengthening its models against abuse, as well as making them more useful for defenders. Models are being hardened, dedicated threat intelligence and insider risk programs have been launched, and its systems are being trained to detect and refuse malicious requests. (This, in itself, is a challenge, considering threat actors can pose as defenders to try to generate output later used for criminal activity.) "Our goal is for our models and products to bring significant advantages for defenders, who are often outnumbered and under-resourced," OpenAI said. "When activity appears unsafe, we may block output, route prompts to safer or less capable models, or escalate for enforcement."

The organization is also working with red team providers to evaluate and improve its safety measures; as red teams act offensively, it is hoped they can discover defensive weaknesses for remediation -- before cybercriminals do.

OpenAI is set to launch a "trusted access program" that grants a subset of users or partners access to test models with "enhanced capabilities" linked to cyberdefense, but it will be closely controlled. "We're still exploring the right boundary of which capabilities we can provide broad access to and which ones require tiered restrictions, which may influence the future design of this program," the company noted. "We aim for this trusted access program to be a building block towards a resilient ecosystem."

Furthermore, OpenAI has moved Aardvark, a security researcher agent, into private beta. This will likely be of interest to cybersecurity researchers, as the system's purpose is to scan codebases for vulnerabilities and provide patch guidance. According to OpenAI, Aardvark has already identified "novel" CVEs in open source software.

Finally, a new collaborative advisory group, dubbed the Frontier Risk Council, will be established in the near future. The group will include security practitioners and partners who will initially focus on the cybersecurity implications of AI and associated practices and recommendations, but it will eventually expand to cover the other categories outlined in the OpenAI Preparedness Framework.

We have to treat AI with caution -- this means being careful about how we adopt AI and LLMs in our personal lives, as well as limiting exposure to AI-based security risks in business. For example, research firm Gartner recently warned organizations to avoid or block AI browsers entirely due to security concerns, including prompt injection attacks and data exposure. We need to remember that AI is a tool, albeit a new and exciting one.
New technologies all come with risks -- as OpenAI clearly knows, considering its focus on the cybersecurity challenges associated with what has become the most popular AI chatbot worldwide -- and so any of its applications should be treated in the same way as any other new technological solution: with an assessment of its risks, alongside potential rewards.
[2]
OpenAI unveils new measures as frontier AI grows cyber-powerful
The company expects future models could reach "High" capability levels under its Preparedness Framework. That means models powerful enough to develop working zero-day exploits or assist with sophisticated enterprise intrusions. In anticipation, OpenAI says it is preparing safeguards as if every new model could reach that threshold, ensuring progress is paired with strong risk controls. OpenAI is expanding investments in models designed to support defensive workflows, from auditing code to patching vulnerabilities at scale. The company says its aim is to give defenders an edge in a landscape where they are often "outnumbered and under-resourced." Because offensive and defensive cyber tasks rely on the same knowledge, OpenAI says it is adopting a defense-in-depth approach rather than depending on any single safeguard. The company emphasizes shaping "how capabilities are accessed, guided, and applied" to ensure AI strengthens cybersecurity rather than lowering barriers to misuse.
[3]
Exclusive: Future OpenAI models likely to pose "high" cybersecurity risk, it says
Why it matters: The models' growing capabilities could significantly expand the number of people able to carry out cyberattacks.

Driving the news: OpenAI said it has already seen a significant increase in capabilities in recent releases, particularly as models are able to operate autonomously for longer, paving the way for brute-force attacks.
* The company notes that GPT-5 scored 27% on a capture-the-flag exercise in August, while GPT-5.1-Codex-Max scored 76% last month.
* "We expect that upcoming AI models will continue on this trajectory," the company says in the report. "In preparation, we are planning and evaluating as though each new model could reach 'high' levels of cybersecurity capability as measured by our Preparedness Framework."

Catch up quick: OpenAI issued a similar warning about bioweapons risk in June, then released ChatGPT Agent in July, which did indeed rate "high" on its risk levels.
* "High" is the second-highest level, below the "critical" level at which models are unsafe to be released publicly.

Yes, but: The company didn't say exactly when to expect the first models rated "high" for cybersecurity risk, or which types of future models could pose such a risk.

What they're saying: "What I would explicitly call out as the forcing function for this is the model's ability to work for extended periods of time," OpenAI's Fouad Matin told Axios in an exclusive interview.
* Brute-force attacks that rely on this extended working time are more easily defended against, Matin says.
* "In any defended environment this would be caught pretty easily," he added.

The big picture: Leading models are getting better at finding security vulnerabilities -- and not just models from OpenAI.
[4]
OpenAI warns new models pose 'high' cybersecurity risk
OpenAI warned that its upcoming models could create serious cyber risks, including helping generate zero-day exploits or aiding sophisticated attacks. The company says it is boosting defensive uses of AI, such as code audits and vulnerability fixes. It is also tightening controls and monitoring to reduce misuse. OpenAI said on Wednesday the cyber capabilities of its artificial intelligence models are increasing and warned that upcoming models are likely to pose a "high" cybersecurity risk. The AI models might either develop working zero-day remote exploits against well-defended systems or assist with complex enterprise or industrial intrusion operations aimed at real-world effects, the ChatGPT maker said in a blog post. As capabilities advance, OpenAI said it is "investing in strengthening models for defensive cybersecurity tasks and creating tools that enable defenders to more easily perform workflows such as auditing code and patching vulnerabilities". To counter cybersecurity risks, OpenAI said it is relying on a mix of access controls, infrastructure hardening, egress controls and monitoring.
[5]
OpenAI Plans to Offer AI Models' Enhanced Capabilities to Cyberdefense Workers | PYMNTS.com
While the advancements in all AI models bring benefits for cyberdefense, they also bring dual-use risks, meaning they could be used for malicious purposes as well as defensive ones, the company said in a Wednesday (Dec. 10) blog post. Demonstrating the advancements of AI models, the post said that in capture-the-flag challenges, the assessed capabilities of OpenAI's models improved from 27% on GPT-5 in August to 76% on GPT-5.1-Codex-Max in November. "We expect that upcoming AI models will continue on this trajectory; in preparation, we are planning and evaluating as though each new model could reach 'High' levels of cybersecurity capability, as measured by our Preparedness Framework," OpenAI said in its post. "By this, we mean models that can either develop working zero-day remote exploits against well-defended systems, or meaningfully assist with complex, stealthy enterprise or industrial intrusion operations aimed at real-world effects." To help defenders while hindering misuse, OpenAI is strengthening its models for defensive cybersecurity tasks and creating tools that help defenders audit code, patch vulnerabilities and perform other workflows, according to the post. The company is also training models to refuse harmful requests, maintaining system-wide monitoring to detect potentially malicious cyber activity, blocking unsafe activity, and working with red teaming organizations to evaluate and improve its safety measures, the post said. In addition, OpenAI is preparing to introduce a program in which it will provide users working on cyberdefense with access to enhanced capabilities in its models, testing an agentic security researcher called Aardvark, and establishing an advisory group called the Frontier Risk Council that will bring together security practitioners and OpenAI teams, per the post. "Taken together, this is ongoing work, and we expect to keep evolving these programs as we learn what most effectively advances real-world security," OpenAI said in the post. PYMNTS reported in November that AI has become both a tool and a target when it comes to cybersecurity.
[6]
Malware Risks Make OpenAI Add Security Layers to AI Models
OpenAI has announced new layers of cybersecurity controls after internal tests showed that its upcoming frontier AI models have reached "high" levels of cyber-capability. The company said the models are now capable of performing advanced tasks such as vulnerability discovery, exploit development and malware generation, raising the risk of misuse if the systems fall into the wrong hands. In a detailed update published on December 10, 2025, OpenAI said it is intensifying its cybersecurity efforts to keep pace with the rising technical abilities of its models. "We're investing in strengthening them, layering in safeguards, and partnering with global security experts," the company said. According to the company, internal testing showed that performance on cybersecurity tasks, measured through capture-the-flag (CTF) challenges, jumped from 27% in GPT-5 (August 2025) to 76% in GPT-5.1-Codex-Max (November 2025). These gains mean future models may soon reach what OpenAI classifies as high levels of cybersecurity capability. Under the company's Preparedness Framework, this includes the ability to develop working zero-day exploits or assist in stealthy, real-world intrusion operations. In response, OpenAI says it is building multiple layers of safeguards to ensure such capabilities help defenders more than attackers. The rising concern around AI-enabled cyberattacks is not theoretical. Security firms have reported that hackers are already using tools like ChatGPT to write and refine malicious code. A November 2024 analysis by DOT Security noted that threat actors were using AI to generate obfuscated malware, craft more convincing phishing emails, and automate large volumes of attack code. Their researchers found that AI could also tailor malware to specific systems by analysing logs and configuration files, making traditional antivirus tools less effective. This earlier wave of AI misuse shows why frontier-level models with stronger cyber capabilities require strict safeguards, controlled access, and constant monitoring. Last month, researchers at Google's Threat Intelligence Group reported that threat actors were already using AI tools, including Gemini and open-source models, to build malware that rewrites its own code during execution to evade detection. Their investigation documented malware families like PROMPTFLUX and PROMPTSTEAL using LLMs for obfuscation, reconnaissance, data theft, and phishing. The report also showed state-backed groups from China, Iran, and North Korea manipulating AI systems with social-engineering prompts to bypass safeguards. These incidents demonstrated how AI can become an active component inside malware, underscoring why frontier models with rising cyber capabilities require tighter oversight and controlled access. OpenAI is rolling out a "defense-in-depth" safety stack that includes access controls, tightened infrastructure security, output monitoring, and threat-intelligence systems. The company said it expects threats to evolve and is designing systems that can adapt quickly. A core part of this strategy is training AI models to identify and refuse harmful cyber requests while remaining useful for legitimate work, such as security audits and vulnerability analysis. When its monitoring systems detect suspicious behaviour, OpenAI says it may block outputs, downgrade a user to a safer model, or escalate the case for further review. Enforcement involves automated tools as well as human oversight. 
The company has also begun end-to-end red teaming with external cybersecurity specialists, experts tasked with imitating real attackers to find weaknesses in OpenAI's protections. OpenAI says it wants advanced AI to meaningfully strengthen defenders, who often operate with fewer resources than attackers. This includes improving model performance in routine defensive tasks such as code auditing, bug detection, and patch development. One of its key initiatives is Aardvark, an "agentic security researcher" now in private beta. The system can scan entire codebases, detect vulnerabilities, and propose patches. OpenAI says Aardvark has already discovered new CVEs in open-source software. The company plans to offer the tool free to select non-commercial open-source projects to help improve supply-chain security. To manage who gets access to the most advanced cybersecurity-related functionalities, OpenAI plans to launch a trusted access program. The initiative will provide "qualifying users and customers working on cyberdefense" with tiered access to enhanced model capabilities. OpenAI says it is still determining where the boundaries will lie, deciding which features can be widely available and which require restrictions due to potential misuse. OpenAI is also creating a Frontier Risk Council, an advisory group of experienced defenders and security practitioners. The council will guide the company on how to differentiate between genuinely useful capabilities and those that could be weaponised. Insights from the group will influence future safeguards and evaluation methods. Beyond internal efforts, OpenAI is working with other AI labs through the Frontier Model Forum to develop shared threat models and best practices for handling AI-driven cyber risks. The company says such collaboration is important because any frontier-level model could potentially be misused for cyberattacks. OpenAI describes this work as a long-term commitment aimed at giving defenders an advantage as AI becomes more capable. The company says its approach will continue to evolve as it learns what safeguards and tools work best in real-world environments. It also plans to support the broader ecosystem with cybersecurity grants and community-driven initiatives. The overall goal of OpenAI is to ensure that the growing power of frontier AI models reinforces cybersecurity rather than making it easier for attackers to exploit digital systems.
[7]
OpenAI flags rising cyber threats as AI models get more powerful
Experts warn that advanced offensive AI capabilities could trigger stricter global regulations and oversight. OpenAI has warned of growing concern that increasingly capable artificial intelligence models could raise global cybersecurity risks, even as the company works to improve its defensive capabilities. According to the company, future AI models may become capable of generating advanced zero-day exploits, carrying out highly skilled intrusion operations or breaching complex enterprise networks, tasks traditionally associated with elite human hackers. To counter the risks, OpenAI said it is increasing investment in defensive applications of AI, including tools that help security teams audit code, identify vulnerabilities and automate patching processes. The company added that it relies on strict access controls, outbound use restrictions, hardened infrastructure and continuous monitoring to limit the misuse of its models. "As our models grow more capable in cybersecurity, we're investing in strengthening safeguards and working with global experts as we prepare for upcoming models to reach 'High' capability under our Preparedness Framework," the company wrote in its social media post. According to cybersecurity researchers, if AI systems achieve the level of offensive capability that OpenAI warns about, attackers will be able to automate cybercrime on an unprecedented scale. These models could create malicious payloads, generate exploits without human intervention, and mass-produce malware like worms, ransomware, and botnets, dramatically increasing the number and intensity of attacks. Experts warn that such breakthroughs will draw intense attention from authorities around the world, as policymakers push for stronger compliance norms and greater accountability from AI developers. To keep up with the changing threat landscape, many organisations will almost certainly require AI-powered defensive solutions. It remains to be seen whether upcoming artificial intelligence models will bring the safety that users around the globe deserve.
OpenAI has issued a stark warning that its upcoming AI models are expected to reach high cybersecurity risk levels, potentially capable of developing zero-day exploits and assisting sophisticated enterprise intrusions. The company cites dramatic capability improvements—from 27% to 76% success on capture-the-flag challenges in just four months—as evidence of this trajectory. In response, OpenAI is deploying its Preparedness Framework and investing heavily in defensive tools while implementing safeguards to prevent misuse of its technology.
OpenAI has issued a warning that the cybersecurity risk posed by its advancing AI models is climbing to what it classifies as high levels. The company stated in a recent announcement that upcoming AI models will likely reach capabilities sufficient to develop working zero-day exploits against well-defended systems or meaningfully assist with complex, stealthy intrusion operations aimed at real-world effects [1][3]. This escalation reflects the dual-use risks inherent in advanced AI models, which can serve both defensive and offensive purposes in equal measure [5].

The concern centers on weaponized artificial intelligence that could automate brute-force attacks, generate malware, create believable phishing content, and refine existing code to make cyberattack chains more efficient [1]. According to Fouad Matin of OpenAI, the forcing function driving this risk is the models' ability to work autonomously for extended periods, enabling these types of persistent attacks [3].
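Matin's point that long-running, automated attacks are "caught pretty easily" in any defended environment is easiest to see with a concrete example. The sketch below is a minimal, hypothetical rate-based detector over authentication logs; the event format, window size, and threshold are illustrative assumptions rather than anything OpenAI or Axios describes.

```python
# Hypothetical sketch: rate-based detection of brute-force-style activity.
# The event format, window size, and threshold are illustrative assumptions.
from collections import defaultdict, deque
from dataclasses import dataclass

@dataclass
class AuthEvent:
    timestamp: float   # seconds since epoch
    source_ip: str
    success: bool

WINDOW_SECONDS = 300      # look at the last 5 minutes of activity
FAILURE_THRESHOLD = 50    # sustained failures typical of automated guessing

def detect_bruteforce(events):
    """Yield source IPs whose failure count within the sliding window exceeds the threshold."""
    recent = defaultdict(deque)  # source_ip -> timestamps of recent failures
    flagged = set()
    for event in sorted(events, key=lambda e: e.timestamp):
        if event.success:
            continue
        window = recent[event.source_ip]
        window.append(event.timestamp)
        # Drop failures that have fallen out of the sliding window.
        while window and event.timestamp - window[0] > WINDOW_SECONDS:
            window.popleft()
        if len(window) >= FAILURE_THRESHOLD and event.source_ip not in flagged:
            flagged.add(event.source_ip)
            yield event.source_ip

# Example: 60 failed attempts from one address in two minutes gets flagged.
if __name__ == "__main__":
    sample = [AuthEvent(timestamp=i * 2.0, source_ip="203.0.113.7", success=False) for i in range(60)]
    print(list(detect_bruteforce(sample)))  # ['203.0.113.7']
```

A defender watching failure rates per source address flags this kind of automated guessing long before it succeeds, which is why extended autonomous operation is most dangerous against environments with little monitoring.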
Source: Axios
The evidence supporting OpenAI's warning comes from measurable performance improvements. In capture-the-flag challenges, traditionally used to test cybersecurity capabilities in controlled environments, GPT-5 scored just 27% in August 2025; by November 2025, GPT-5.1-Codex-Max achieved a 76% success rate, a substantial leap in just four months [1][3][5]. This trajectory suggests that sophisticated cyberattacks could become accessible to a broader range of threat actors, significantly expanding the pool of individuals capable of executing complex operations [3].

OpenAI expects this upward trend to continue, stating the company is planning and evaluating as though each new model could reach high levels of cybersecurity capability as measured by the OpenAI Preparedness Framework [2][4]. High is the second-highest risk classification, sitting just below critical, the threshold at which models are deemed unsafe for public release [3].

To address these mounting concerns, OpenAI is relying on its Preparedness Framework, last updated in April 2025, which outlines the company's approach to balancing innovation with risk mitigation [1]. The framework establishes measurable thresholds that indicate when AI models could cause severe harm across three priority categories: cybersecurity, chemical and biological threats, and persuasion capabilities [1]. OpenAI has committed not to deploy highly capable models until sufficient safeguards are built to minimize associated risks [1].
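The 27% and 76% figures are success rates over a suite of capture-the-flag tasks. As a rough illustration of how such a benchmark score can be computed, the sketch below checks a solver's answers against hidden flags and reports the fraction captured; the task structure, exact-match rule, and stub solver are assumptions for illustration and do not reflect OpenAI's internal evaluation harness.

```python
# Illustrative sketch of scoring a CTF-style evaluation suite.
# Task structure and the exact-match scoring rule are assumptions, not OpenAI's harness.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class CtfTask:
    name: str
    flag: str      # hidden value the solver must recover
    prompt: str    # challenge description given to the model

def score_suite(tasks: List[CtfTask], solve: Callable[[str], str]) -> float:
    """Return the fraction of tasks where the solver's output contains the exact flag."""
    captured = 0
    for task in tasks:
        answer = solve(task.prompt)
        if task.flag in answer:     # a task counts only if the exact flag appears
            captured += 1
    return captured / len(tasks)

# Example: a suite of 100 tasks where a stub solver captures 76 flags scores 76%,
# analogous to the figure reported for the newer model.
tasks = [CtfTask(name=f"task-{i}", flag=f"FLAG{{{i}}}", prompt=f"challenge {i}") for i in range(100)]

def stub_solver(prompt: str) -> str:
    task_id = int(prompt.split()[-1])
    return f"the flag is FLAG{{{task_id}}}" if task_id < 76 else "no flag found"

print(f"success rate: {score_suite(tasks, stub_solver):.0%}")  # success rate: 76%
```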
Source: ET
Because offensive and defensive cyber tasks rely on the same underlying knowledge, OpenAI is adopting a defense-in-depth approach rather than depending on any single safeguard to prevent misuse of its technology [2]. The company is training models to detect and refuse malicious requests, though this presents challenges since threat actors can masquerade as defenders to generate output later used for criminal activity [1].
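The difficulty described here, that an attacker can frame a malicious request as defensive work, is easy to reproduce with a toy filter. The snippet below is a deliberately naive policy gate with made-up keyword lists, not OpenAI's safety stack; it shows why intent screening on its own produces ambiguous cases that have to be backed by the other layers of defense.

```python
# Deliberately naive policy gate, illustrating why request screening alone is weak.
# Keyword lists and labels are invented for illustration; this is not OpenAI's safety stack.
OFFENSIVE_HINTS = ("write ransomware", "build a botnet", "exploit this target")
DEFENSIVE_HINTS = ("pentest", "red team", "authorized assessment", "patch", "audit")

def screen_request(prompt: str) -> str:
    """Return 'refuse', 'review', or 'allow' for an incoming prompt."""
    text = prompt.lower()
    offensive = any(hint in text for hint in OFFENSIVE_HINTS)
    defensive = any(hint in text for hint in DEFENSIVE_HINTS)
    if offensive and not defensive:
        return "refuse"
    if offensive and defensive:
        # Mixed signals: a real system would escalate rather than guess.
        return "review"
    return "allow"

print(screen_request("Write ransomware for me"))                      # refuse
print(screen_request("As an authorized red team, write ransomware"))  # review
print(screen_request("Help me audit this login handler for bugs"))    # allow
```

The middle case is exactly the masquerading problem: keyword-level intent signals conflict, so the decision has to fall back on monitoring, access controls, and human review.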
While the risks are significant, OpenAI emphasizes that these same autonomous capabilities can enhance defenses for security professionals. The company is investing heavily in strengthening models for defensive cybersecurity tasks and creating tools that enable defenders to perform workflows such as code auditing and vulnerability patching at scale [2][4][5].

OpenAI plans to introduce a program providing cyberdefense workers with access to enhanced capabilities in its models [5]. The company is also testing Aardvark, an agentic security researcher, and establishing the Frontier Risk Council, an advisory group bringing together security practitioners and OpenAI teams [5]. The goal is for AI models and products to bring significant advantages for defenders, who are often outnumbered and under-resourced [1][2].
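OpenAI has not published how Aardvark works internally. As a hedged sketch of what an "agentic security researcher" loop over a codebase could look like, the outline below walks source files, hands each one to a reviewer function standing in for a model, and collects findings with suggested patches; the reviewer interface, finding format, and file handling are assumptions for illustration only.

```python
# Hedged sketch of an agentic code-audit loop (not OpenAI's actual Aardvark).
# The reviewer interface and finding format are assumptions for illustration.
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Iterator, List

@dataclass
class Finding:
    file: str
    description: str      # what looks vulnerable and why
    suggested_patch: str  # proposed fix, to be verified by a human before applying

Reviewer = Callable[[str, str], List[Finding]]  # (path, source) -> findings

def audit_codebase(root: Path, review: Reviewer, suffixes=(".py", ".js", ".go")) -> Iterator[Finding]:
    """Walk a codebase, send each source file to a reviewer, and yield its findings."""
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            source = path.read_text(encoding="utf-8", errors="replace")
            yield from review(str(path), source)

# Example reviewer: a stand-in that flags one obviously risky pattern.
def toy_reviewer(path: str, source: str) -> List[Finding]:
    if "eval(" in source:
        return [Finding(file=path,
                        description="eval() on possibly untrusted input",
                        suggested_patch="replace eval() with ast.literal_eval() or explicit parsing")]
    return []

# Usage (hypothetical project path):
# for finding in audit_codebase(Path("./my_project"), toy_reviewer):
#     print(finding.file, "-", finding.description)
```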
Source: Digit
OpenAI's risk mitigation strategy combines multiple layers of protection. The company is implementing access controls, infrastructure hardening, egress controls, and system-wide monitoring to detect potentially malicious cyber activity [4][5]. When activity appears unsafe, the company may block output, route prompts to safer or less capable models, or escalate for enforcement [1].

The organization is working with red team providers to evaluate and improve its safety measures, leveraging offensive testing to discover defensive weaknesses for remediation [1]. Dedicated threat intelligence and insider risk programs have been launched as part of this comprehensive approach [1]. OpenAI acknowledges this is ongoing work and expects to keep evolving these programs as it learns what most effectively advances real-world security [5].
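One way to picture the "block output, route prompts to safer or less capable models, or escalate for enforcement" response is as a tiered policy keyed off a risk score produced by upstream monitoring. The sketch below is a minimal illustration under that assumption; the thresholds, model names, and action labels are invented and are not OpenAI's implementation.

```python
# Minimal sketch of a tiered enforcement policy keyed off a risk score.
# Thresholds, model names, and action labels are illustrative assumptions.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Decision:
    action: str           # "allow", "route_to_safer_model", "block", or "escalate"
    model: Optional[str]  # which model (if any) should serve the request

def enforce(risk_score: float, requested_model: str = "frontier-model") -> Decision:
    """Map a risk score in [0, 1] from upstream monitoring to an enforcement action."""
    if risk_score < 0.2:
        return Decision(action="allow", model=requested_model)
    if risk_score < 0.6:
        # Serve the request, but from a less capable model with tighter guardrails.
        return Decision(action="route_to_safer_model", model="restricted-model")
    if risk_score < 0.9:
        return Decision(action="block", model=None)
    # The highest-risk activity is held for human review and possible enforcement.
    return Decision(action="escalate", model=None)

print(enforce(0.1))   # Decision(action='allow', model='frontier-model')
print(enforce(0.5))   # Decision(action='route_to_safer_model', model='restricted-model')
print(enforce(0.95))  # Decision(action='escalate', model=None)
```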