OpenAI's GPT-5 and GPT-OSS Models Jailbroken Within Hours of Release

4 Sources

OpenAI's latest AI models, including GPT-5 and GPT-OSS, were successfully jailbroken within hours of their release, despite claims of enhanced security measures. Researchers used sophisticated techniques to bypass safety protocols, raising concerns about AI model vulnerabilities.

OpenAI's New Models Quickly Compromised

OpenAI's recent release of GPT-5 and GPT-OSS models, touted as more secure and resistant to jailbreaks, faced a significant setback as researchers and AI enthusiasts successfully bypassed their safety measures within hours of their launch 1 4.

Source: Decrypt


Jailbreak Techniques Employed

Researchers from NeuralTrust Inc. demonstrated a sophisticated jailbreak that combines "Echo Chamber" context poisoning with storytelling 2. The technique involves:

  1. Context poisoning over multiple conversation turns
  2. Subtle narrative manipulation
  3. Avoiding explicit malicious intent in prompts

The attack successfully compelled GPT-5 to provide step-by-step instructions for creating a Molotov cocktail, all while maintaining a seemingly innocuous conversation 3.

Pliny the Liberator's Swift Action

Notorious AI jailbreaker Pliny the Liberator announced on social media that he had successfully cracked the GPT-OSS models shortly after their release. His method involved:

  1. Multi-stage prompts
  2. Use of dividers (his signature "LOVE PLINY" markers)
  3. Generating unrestricted content in leetspeak to evade detection 1

OpenAI's Security Claims

Prior to the release, OpenAI had emphasized the robustness of their new models:

  • GPT-OSS-120b underwent "worst-case fine-tuning" in biological and cyber domains
  • The Safety Advisory Group reviewed and approved the testing protocols
  • Models were subjected to standard refusal and jailbreak resistance tests
  • A $500,000 red teaming challenge was launched to identify novel risks 4

Implications for AI Security

Source: SiliconANGLE


The rapid jailbreaking of these models highlights several critical issues in AI security:

  1. Evolving Attack Methodologies: Techniques like the Echo Chamber method demonstrate how attackers can manipulate AI models over multiple conversation turns, bypassing single-prompt safety checks 2.

  2. Limitations of Current Safety Measures: The success of these jailbreaks exposes weaknesses in current AI safety architectures, particularly in handling multi-turn conversations 3.

  3. Need for Comprehensive Security Approaches: Experts suggest that organizations using these models should evaluate defenses that operate at the conversation level, including monitoring context drift and detecting persuasion cycles 3.
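The conversation-level defenses suggested above can be sketched in code. The following is a minimal, illustrative Python example, not drawn from any of the cited reports: a drift monitor that fingerprints the opening turn of a conversation and flags the session after several consecutive off-topic turns, the kind of multi-turn signal that single-prompt safety checks miss. The class name `DriftMonitor`, the word-overlap heuristic, and the thresholds are all assumptions for illustration; a production system would use embeddings or a trained classifier rather than token overlap.

```python
from dataclasses import dataclass, field


def _tokens(text: str) -> set[str]:
    """Lowercase word set as a crude topical fingerprint (words > 3 chars)."""
    return {w.strip(".,!?;:\"'").lower() for w in text.split() if len(w) > 3}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity between two token sets (0.0 when either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


@dataclass
class DriftMonitor:
    """Flags conversations whose turns drift steadily away from the opening topic."""
    drift_threshold: float = 0.1   # similarity below this counts as off-topic
    max_drifting_turns: int = 3    # consecutive off-topic turns before flagging
    _anchor: set[str] = field(default_factory=set)
    _drifting_turns: int = 0

    def observe(self, turn: str) -> bool:
        """Record one user turn; return True if the session should be flagged."""
        tokens = _tokens(turn)
        if not self._anchor:
            self._anchor = tokens  # first turn sets the topical anchor
            return False
        if jaccard(tokens, self._anchor) < self.drift_threshold:
            self._drifting_turns += 1
        else:
            self._drifting_turns = 0  # reset when the user returns to topic
        return self._drifting_turns >= self.max_drifting_turns
```

In use, the monitor would sit alongside per-prompt filters: each user turn passes through `observe()`, and a flagged session is escalated for stricter review rather than blocked outright, since topic changes are also common in benign conversations.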

Industry Reactions

The AI community has responded with a mix of concern and fascination. Some view these jailbreaks as a "victory" for AI resistance against big tech control, while others emphasize the urgent need for more robust security measures 1.

Satyam Sinha, CEO of Acuvity Inc., noted, "These findings highlight a reality we're seeing more often in AI security: model capability is advancing faster than our ability to harden it against incidents" 2.

As the AI landscape continues to evolve rapidly, these incidents underscore the ongoing challenges in balancing advanced capabilities with robust security measures. The race between AI developers and those seeking to bypass safety protocols remains a critical aspect of the field's development.
