3 Sources
[1]
OpenAI adds open source tools to help developers build for teen safety | TechCrunch
OpenAI said Tuesday it is releasing a set of prompts that developers can use to make their apps safer for teens. The AI lab said the set of teen safety policies can be used with its open-weight safety model, gpt-oss-safeguard. Rather than working from scratch to figure out how to make AI safer for teens, developers can use these prompts to fortify what they build. They address issues like graphic violence and sexual content, harmful body ideals and behaviors, dangerous activities and challenges, romantic or violent role play, and age-restricted goods and services.

These safety policies are designed as prompts, making them easily compatible with models besides gpt-oss-safeguard, though they're probably most effective within OpenAI's own ecosystem. To write these prompts, OpenAI said it worked with the AI safety watchdogs Common Sense Media and everyone.ai. "These prompt-based policies help set a meaningful safety floor across the ecosystem, and because they're released as open source, they can be adapted and improved over time," said Robbie Torney, Head of AI & Digital Assessments at Common Sense Media, in a statement.

OpenAI noted in its blog that developers, including experienced teams, often struggle to translate safety goals into precise, operational rules. "This can lead to gaps in protection, inconsistent enforcement, or overly broad filtering," the company wrote. "Clear, well-scoped policies are a critical foundation for effective safety systems."

OpenAI admits that these policies aren't a solution to the complicated challenges of AI safety. But the release builds on its previous efforts, including product-level safeguards such as parental controls and age prediction. Last year, OpenAI updated the guidelines for its large language models, known as the Model Spec, to address how its AI models should behave with users under 18. OpenAI doesn't have the cleanest track record itself, however.
The company is facing several lawsuits filed by the families of people who died by suicide after extreme ChatGPT use. These dangerous relationships often form after the user circumvents the chatbot's safeguards, and no model's guardrails are fully impenetrable. Still, these policies are at least a step forward, especially since they can help indie developers.
[2]
OpenAI releases open-source teen safety tools for AI developers
OpenAI has spent the past year fielding lawsuits from the families of young people who died after extended interactions with ChatGPT. Now it is trying to give the developers who build on top of its models the tools to avoid creating the same problem.

The company announced on Tuesday that it is releasing a set of open-source, prompt-based safety policies designed to help developers make AI applications safer for teenagers. The policies are intended for use with gpt-oss-safeguard, OpenAI's open-weight safety model, though they are designed as prompts and can work with other models too. The prompts target five categories of harm that AI systems can facilitate for younger users: graphic violence and sexual content, harmful body ideals and behaviours, dangerous activities and challenges, romantic or violent role play, and age-restricted goods and services. Developers can drop these policies into their systems rather than building teen safety rules from scratch, a process that, OpenAI acknowledged, even experienced teams frequently get wrong.

OpenAI developed the policies in collaboration with Common Sense Media, the influential child safety advocacy organisation, and everyone.ai, an AI safety consultancy. Robbie Torney, head of AI and digital assessments at Common Sense Media, said the prompt-based approach is designed to establish a baseline across the developer ecosystem, one that can be adapted and improved over time because the policies are open source.

OpenAI itself framed the problem in pragmatic terms. Developers, the company wrote in a blog post accompanying the release, often struggle to translate safety goals into precise operational rules. The result is patchy protection: gaps in coverage, inconsistent enforcement, or filters so broad they degrade the user experience for everyone. The release does not exist in a vacuum.
OpenAI is facing at least eight lawsuits alleging that ChatGPT contributed to the deaths of users, including 16-year-old Adam Raine, who died by suicide in April 2025 after months of intensive interaction with the chatbot. Court filings revealed that ChatGPT mentioned suicide more than 1,200 times in Raine's conversations and flagged hundreds of messages for self-harm content, yet never terminated a session or alerted anyone. Three additional suicides and four cases described as AI-induced psychotic episodes have also produced litigation against the company.

In response to those cases, OpenAI introduced parental controls and age-prediction features in late 2025, and in December updated its Model Spec, the internal guidelines governing how its large language models behave, to include specific protections for users under 18. The open-source safety policies announced this week extend that effort beyond OpenAI's own products and into the broader developer ecosystem.

OpenAI was explicit that the policies are not a comprehensive solution to the challenge of making AI safe for young users. They represent what the company called a "meaningful safety floor," not the full extent of the safeguards it applies to its own products. The distinction matters. No model's guardrails are fully impenetrable, as the lawsuits have demonstrated. Users, including teenagers, have repeatedly found ways to bypass safety features through persistent probing and creative prompting. The open-source approach is a bet that distributing baseline safety policies widely is better than leaving every developer to reinvent the wheel, particularly smaller teams and independent developers who lack the resources to build robust safety systems from scratch.
Whether the policies are effective will depend on adoption, on how aggressively developers integrate them, and on whether they hold up against the kinds of sustained, adversarial interactions that have already exposed weaknesses in ChatGPT's own safety layers.

What OpenAI is offering is a set of instructions: well-crafted prompts that tell a model how to behave when interacting with younger users. It is a practical contribution. But it does not address the structural problem that regulators, parents, and safety advocates have been raising for years: that AI systems capable of sustained, emotionally engaging conversation with minors may require more than better prompts. They may require fundamentally different architectures, or external monitoring systems that sit outside the model entirely.

For now, though, a downloadable set of teen safety policies is what exists. It is not nothing. Whether it is enough is a question the courts, the regulators, and the next set of headlines will answer.
[3]
Underage sexual content, self-harm info targeted by OpenAI's new open-source prompts
OpenAI has announced new open-source safety prompts for developers, aimed at enabling the wide deployment of policies that protect teens. The prompt-based safety pack includes model guidance on common teenage risks, developmental content recommendations, and age-appropriate guidelines on topics such as self-harm, sexual content and romantic role play, dangerous trends or viral challenges, and harmful body ideals. OpenAI said it's a more robust alternative to the high-level guidelines previously offered, formatted as prompts that plug right into AI systems.

OpenAI added new Under-18 principles to its Model Spec in December. A few months prior, the company released gpt-oss-safeguard, an open-weight reasoning model designed to assist developers in implementing safety conditions and classifying safe and unsafe content. Unlike traditional safety classifiers, gpt-oss-safeguard can be fed platform safety policies directly, and it infers the policy's intent as it distinguishes appropriate outputs. But "even experienced teams often struggle to translate high-level safety goals into precise, operational rules, especially since it requires both subject matter expertise and deep AI knowledge," said OpenAI in its latest press release. "This can lead to gaps in protection, inconsistent enforcement, or overly broad filtering. Clear, well-scoped policies are a critical foundation for effective safety systems."

The additional developer pack was designed in collaboration with the nonprofit Common Sense Media and everyone.ai. Experts have warned parents about vulnerable teens' and even young children's excessive exposure to chatbots, as AI companies attempt to get a handle on their models' ramifications for user mental health. Last year, OpenAI was sued by the parents of teenager Adam Raine in the industry's first wrongful death case, with the Raine family claiming that a combination of ChatGPT sycophancy and lax safety policies was responsible for their son's death by suicide.
The company has denied allegations of wrongdoing and in response has beefed up its mental health and teen safety features, including age assurance. Even so, third-party developers licensing OpenAI's models have struggled to maintain the same level of safety precautions, including in AI-powered children's toys.

The case against OpenAI followed multiple lawsuits against the controversial platform Character.AI and set the stage for a recent wrongful death suit filed against OpenAI competitor Google and its Gemini AI assistant. Industry-wide, tech and social media companies are facing an onslaught of legal challenges over the long-term impact of their products on users. Last month, Instagram CEO Adam Mosseri and Meta chief Mark Zuckerberg testified before a jury in a watershed case putting social media platforms on trial for their allegedly addictive design principles. A verdict has yet to be reached.

OpenAI said its new safety prompt pack is not a "comprehensive or final definition or guarantee of teen safety." Robbie Torney, head of AI and digital assessments for Common Sense Media, said that the new policies can build a "meaningful safety floor across the ecosystem," filling an AI safety gap that has been exacerbated by a lack of operational policies for developers. Developers can download OpenAI's safety model on Hugging Face and access its new prompt pack on GitHub.
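In practice, a policy from the prompt pack is passed to the safety model as the system prompt, with the content to be checked as the user turn; the model's reply is then mapped back to a policy label the host application can act on. The sketch below illustrates that flow under stated assumptions: the policy text and the VIOLATION/SAFE labels are illustrative stand-ins for the real downloadable policies, and the actual model call (to gpt-oss-safeguard behind any chat-completions-style endpoint) is left out so the message-building and label-parsing steps stand alone.

```python
# Illustrative sketch of wiring a prompt-based teen-safety policy into a
# safety classifier such as gpt-oss-safeguard. The POLICY text and label
# names below are hypothetical placeholders, not OpenAI's actual policies.

POLICY = """\
Classify the user content against this teen-safety policy.
Disallowed: instructions or encouragement for dangerous viral challenges.
Respond with exactly one label: VIOLATION or SAFE.
"""

def build_messages(policy: str, user_content: str) -> list[dict]:
    """Pack the policy as the system prompt and the content under review
    as the user turn, in the chat-completions message format."""
    return [
        {"role": "system", "content": policy},
        {"role": "user", "content": user_content},
    ]

def parse_label(model_output: str) -> str:
    """Normalize the classifier's free-text reply to one policy label."""
    text = model_output.strip().upper()
    return "VIOLATION" if "VIOLATION" in text else "SAFE"
```

Sending `build_messages(POLICY, content)` to the model and running its reply through `parse_label` yields a decision the host app can enforce, for example by blocking the content or routing it to human review; because the policy is plain text, swapping in a different policy from the pack requires no code changes.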
OpenAI unveiled a set of open-source safety prompts designed to help AI developers protect underage users from harmful content. The tools, developed with Common Sense Media and everyone.ai, target risks like self-harm, graphic violence, and inappropriate role-play. The release comes as OpenAI faces multiple lawsuits related to teen safety, including cases where ChatGPT allegedly contributed to user deaths.
OpenAI announced Tuesday it is releasing a comprehensive set of open-source safety prompts that AI developers can integrate directly into their applications to protect underage users [1]. The prompt-based safety policies are designed to work with gpt-oss-safeguard, OpenAI's open-weight safety model, though their format makes them compatible with other AI models across the developer ecosystem [2].
Source: The Next Web
Rather than forcing developers to build teen safety measures from scratch, these open source tools provide ready-to-use guidance addressing five critical categories of harmful content: graphic violence and sexual content, harmful body ideals and behaviors, dangerous activities and challenges, romantic or violent role play, and age-restricted goods and services [1]. The company developed these policies in collaboration with the AI safety watchdogs Common Sense Media and everyone.ai, bringing together child safety expertise with technical AI knowledge [3].
OpenAI acknowledged that even experienced teams often struggle to translate high-level safety goals into precise, operational rules for AI applications aimed at teenagers [1]. This challenge requires both subject matter expertise and deep AI knowledge, and failures in this translation can lead to gaps in protection, inconsistent enforcement, or overly broad filtering that degrades the user experience [3]. The new open-source safety prompts aim to establish what Robbie Torney, Head of AI and Digital Assessments at Common Sense Media, called a "meaningful safety floor across the ecosystem" that can be adapted and improved over time [1].
The tools are particularly valuable for smaller teams and independent developers who lack the resources to build robust safety systems from scratch [2]. Developers can now download the safety model on Hugging Face and access the prompt pack on GitHub, making implementation straightforward [3].
The release arrives amid mounting legal challenges for OpenAI. The company faces at least eight lawsuits alleging that ChatGPT contributed to user deaths, including the high-profile case of 16-year-old Adam Raine, who died by suicide in April 2025 after months of intensive interaction with the chatbot, the industry's first wrongful death suit [2]. Court filings revealed that ChatGPT mentioned suicide more than 1,200 times in Raine's conversations and flagged hundreds of messages for self-harm content, yet never terminated a session or alerted anyone [2]. Three additional suicides and four cases described as AI-induced psychotic episodes have also produced litigation against the company [3].
Source: TechCrunch
These wrongful death cases highlight the stakes in youth mental health and AI safety. OpenAI has responded by introducing parental controls and age-prediction features in late 2025, and updated its Model Spec in December to include specific protections for users under 18 [2]. However, third-party developers licensing OpenAI's models have struggled to maintain the same level of safety precautions, including in AI-powered children's toys [3].
OpenAI explicitly stated that these policies are not a comprehensive solution to the complex challenges of making AI safe for young users, but rather a foundational safety floor [2]. The distinction matters because no model's guardrails are fully impenetrable: users, including teenagers, have repeatedly found ways to bypass safety features through persistent probing and creative prompting [2].
The open-source approach represents a bet that distributing baseline safety policies widely is better than leaving every developer to reinvent the wheel [2]. Whether these tools prove effective will depend on adoption rates, how aggressively developers integrate them, and whether they hold up against the sustained, adversarial interactions that have already exposed weaknesses in ChatGPT's own safety layers. Some experts suggest that AI systems capable of sustained, emotionally engaging conversation with minors may require fundamentally different architectures or external monitoring systems that sit outside the model entirely [2]. For now, these downloadable safety policies offer a practical contribution to protecting underage users, though courts, regulators, and future developments will determine whether they represent sufficient progress in addressing inappropriate role-play, harmful body ideals, and other risks facing young AI users.