OpenAI warns upcoming AI models will likely pose high cybersecurity risk with zero-day exploits

Reviewed by Nidhi Govil


OpenAI has issued a stark warning that its upcoming AI models are expected to reach high cybersecurity risk levels, potentially capable of developing zero-day exploits and assisting sophisticated enterprise intrusions. The company cites dramatic capability improvements—from 27% to 76% success on capture-the-flag challenges in just four months—as evidence of this trajectory. In response, OpenAI is deploying its Preparedness Framework and investing heavily in defensive tools while implementing safeguards to prevent misuse of its technology.

OpenAI Sounds Alarm on Rising Cyber Capabilities

OpenAI has issued a warning that the cybersecurity risk posed by its advancing AI models is climbing to what it classifies as high levels. The company stated in a recent announcement that upcoming AI models will likely reach capabilities sufficient to develop working zero-day exploits against well-defended systems or to meaningfully assist with complex, stealthy intrusion operations aimed at real-world effects.[1][3] This escalation reflects the dual-use risk inherent in advanced AI models, which can serve defensive and offensive purposes in equal measure.[5]

The concern centers on weaponized artificial intelligence that could automate brute-force attacks, generate malware, craft phishing content, and refine existing code to make cyberattack chains more efficient.[1] According to OpenAI's Fouad Matin, the forcing function driving this risk is models' ability to work autonomously for extended periods, which enables these kinds of persistent attacks.[3]

Source: Axios

Dramatic Capability Surge in Recent Months

The evidence supporting OpenAI's warning comes from measurable performance improvements. In capture-the-flag challenges, traditionally used to test cybersecurity capabilities in controlled environments, GPT-5 scored just 27% in August 2025; by November 2025, GPT-5.1-Codex-Max achieved a 76% success rate, a substantial leap in just four months.[1][3][5] This trajectory suggests that sophisticated cyberattacks could become accessible to a broader range of threat actors, significantly expanding the pool of individuals capable of executing complex operations.[3]

OpenAI expects this upward trend to continue, stating that it is planning and evaluating as though each new model could reach high levels of cybersecurity capability as measured by the OpenAI Preparedness Framework.[2][4] High is the second-highest risk classification, sitting just below critical, the threshold at which models are deemed unsafe for public release.[3]

Deploying the Preparedness Framework

To address these mounting concerns, OpenAI is relying on its Preparedness Framework, last updated in April 2025, which outlines the company's approach to balancing innovation with risk mitigation.[1] The framework establishes measurable thresholds that indicate when AI models could cause severe harm across three priority categories: cybersecurity, chemical and biological threats, and persuasion capabilities.[1] OpenAI has committed not to deploy highly capable models until sufficient safeguards are in place to minimize the associated risks.[1]

Source: ET

Because offensive and defensive cyber tasks rely on the same underlying knowledge, OpenAI is adopting a defense-in-depth approach rather than depending on any single safeguard to prevent misuse of its technology.[2] The company is training models to detect and refuse malicious requests, though this presents challenges since threat actors can masquerade as defenders to generate output later used for criminal activity.[1]
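To make the defense-in-depth idea concrete, here is a minimal, purely illustrative sketch of layering independent safeguards so that no single check is a point of failure. The check names, signals, and thresholds are hypothetical assumptions for illustration, not OpenAI's actual implementation.

```python
from dataclasses import dataclass
from enum import Enum

class Verdict(Enum):
    ALLOW = "allow"
    REFUSE = "refuse"
    REVIEW = "human_review"

@dataclass
class Request:
    text: str
    account_verified: bool  # e.g., enrolled in a trusted-defender program (assumed signal)

def intent_score(req: Request) -> float:
    """Stand-in for a model-based classifier scoring offensive-cyber intent in [0, 1]."""
    suspicious = ("zero-day", "exploit chain", "bypass edr")
    return 0.9 if any(s in req.text.lower() for s in suspicious) else 0.1

def policy_rules(req: Request) -> bool:
    """Hypothetical hard rule: unverified accounts get no intrusion-tooling help."""
    return req.account_verified or "intrusion" not in req.text.lower()

def screen(req: Request) -> Verdict:
    # Layer 1: hard policy rules fail closed.
    if not policy_rules(req):
        return Verdict.REFUSE
    # Layer 2: model-based intent score; ambiguous cases go to human review,
    # since attackers can masquerade as defenders.
    score = intent_score(req)
    if score > 0.8:
        return Verdict.REFUSE
    if score > 0.4:
        return Verdict.REVIEW
    return Verdict.ALLOW

print(screen(Request("help me write a zero-day exploit chain", False)))  # Verdict.REFUSE
```

The point of the layering is that a request must clear every independent check; defeating the classifier alone (say, by posing as a defender) still leaves the policy layer and human review in its path.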

Strengthening Defensive Cybersecurity Systems

While the risks are significant, OpenAI emphasizes that these same autonomous capabilities can enhance defensive AI capabilities for security professionals. The company is investing heavily in strengthening models for defensive cybersecurity tasks and creating tools that enable defenders to perform workflows such as code auditing and vulnerability patching at scale

2

4

5

.
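As a loose illustration of the "code auditing at scale" workflow described above, the sketch below feeds source files to a model through the OpenAI Python SDK and prints its findings. The model name, prompt, and file layout are assumptions for illustration; this is not Aardvark or any published OpenAI tool.

```python
from pathlib import Path
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

AUDIT_PROMPT = (
    "You are a defensive security reviewer. List likely vulnerabilities "
    "in the following code, with line references and suggested fixes."
)

def audit_file(path: Path) -> str:
    """Ask the model to review one source file; returns its findings as text."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model choice; swap in any capable model
        messages=[
            {"role": "system", "content": AUDIT_PROMPT},
            {"role": "user", "content": path.read_text(errors="replace")},
        ],
    )
    return resp.choices[0].message.content

# Scan every Python file under src/ (hypothetical layout) and print findings.
for f in sorted(Path("src").rglob("*.py")):
    print(f"== {f} ==\n{audit_file(f)}\n")
```

The "at scale" part is simply that the loop is cheap to parallelize across a whole codebase, which is what makes model-driven auditing attractive to under-resourced defenders.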

OpenAI plans to introduce a program providing cyberdefense workers with access to enhanced capabilities in its models.[5] The company is also testing Aardvark, an agentic security researcher, and establishing the Frontier Risk Council, an advisory group bringing together security practitioners and OpenAI teams.[5] The goal is for AI models and products to bring significant advantages to defenders, who are often outnumbered and under-resourced.[1][2]

Source: Digit

Multi-Layered Defense Strategy

OpenAI's risk mitigation strategy combines multiple layers of protection. The company is implementing access controls, infrastructure hardening, egress controls, and system-wide monitoring to detect potentially malicious cyber activity.[4][5] When activity appears unsafe, the company may block output, route prompts to safer or less capable models, or escalate for enforcement.[1]
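The block / route / escalate decision described above maps naturally onto a small dispatch function. The sketch below is a hypothetical rendering of that logic under assumed risk thresholds and signals, not OpenAI's actual enforcement pipeline.

```python
from enum import Enum

class Action(Enum):
    SERVE = "serve_normally"
    ROUTE_SAFER = "route_to_safer_model"
    BLOCK = "block_output"
    ESCALATE = "escalate_for_enforcement"

def dispatch(risk_score: float, repeat_offender: bool) -> Action:
    """Map a monitored request's risk signals to one of the documented outcomes.

    risk_score: assumed [0, 1] output of upstream system-wide monitoring.
    repeat_offender: assumed flag from threat-intelligence history.
    """
    if risk_score >= 0.9 and repeat_offender:
        return Action.ESCALATE      # hand off for enforcement review
    if risk_score >= 0.9:
        return Action.BLOCK         # refuse to return the output
    if risk_score >= 0.5:
        return Action.ROUTE_SAFER   # answer via a more conservative model
    return Action.SERVE

assert dispatch(0.95, True) is Action.ESCALATE
assert dispatch(0.60, False) is Action.ROUTE_SAFER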

The organization is working with red-team providers to evaluate and improve its safety measures, using offensive testing to uncover defensive weaknesses for remediation.[1] Dedicated threat intelligence and insider risk programs have been launched as part of this comprehensive approach.[1] OpenAI acknowledges this is ongoing work and expects to keep evolving these programs as it learns what most effectively advances real-world security.[5]
