4 Sources
[1]
Enterprise AI coding grows teeth: GPT‑5.2‑Codex weaves security into large-scale software refactors
With the recent release of GPT-5.2, OpenAI updated other related models, including its popular coding model Codex, bringing more agentic use cases to its fold. GPT-5.2-Codex, which OpenAI called in a blog post "the most advanced agentic coding model yet for complex, real-world software engineering," has been optimized for long-horizon work with agents and will have stronger cybersecurity capabilities. The model is an offshoot of GPT-5.2, optimized for agentic building. "GPT‑5.2-Codex represents a step forward in how advanced AI can support real-world software engineering and specialized domains like cybersecurity -- helping developers and defenders tackle complex, long-horizon work, and strengthening the tools available for responsible security research," the company said in its blog post. Enterprises can access the new Codex model "in all Codex surfaces for paid ChatGPT users," and OpenAI said it is "working towards safely enabling access to GPT‑5.2-Codex for API users in the coming weeks." The company is also piloting an invite-only program that gives trusted users access to "more permissive models for vetted professionals and organizations" for defensive cybersecurity work, to determine a balance between accessibility and safety.
Advances in cybersecurity with models
OpenAI calls GPT-5.2-Codex its strongest cybersecurity model yet. Still, as its capabilities grow, the company said it needs to design a deployment approach that accounts for future growth and supports defensive cybersecurity. "As our models continue to advance along the intelligence frontier, we've observed that these improvements also translate to capability jumps in specialized domains such as cybersecurity," the company said. OpenAI said in its system card that it tested the model on three benchmarks: Capture-the-Flag (CTF) evals, CVE-Bench and Cyber Range.
GPT-5.2-Codex became the company's strongest-performing model in CTF evals, which OpenAI attributed to compaction, or "the ability for the model to work coherently across multiple context windows." The model scored 87% in CVE-Bench, outperforming other models, with GPT-5.1-Codex-Max coming in a close second. This increase would be helpful for tasks involving running commands around vulnerability discovery and trying tools "with an almost brute-force approach." In the long-form Cyber Range test, the model had a combined pass rate of 72.7%. GPT-5.1-Codex-Max scored 81.8%.
Cybersecurity deployment project
OpenAI said some users of its GPT-5.1-Codex-Max, which launched in November, uncovered a source code exposure vulnerability in React and subsequently reported it. According to OpenAI, Andrew MacPherson, a security researcher at Privy, used GPT-5.1-Codex-Max to assess how well the model could support real-world vulnerability research. The model instead surfaced unexpected behavior. With improvements in cybersecurity capabilities for GPT-5.2-Codex and potentially for models that come after it, OpenAI said it needs to balance the deployment of frontier models with the necessary tools for defensive cybersecurity. While GPT-5.2-Codex "does not reach a high level of cyber capability under our Preparedness Framework," the company plans to bring in selected users to test its security capabilities. (OpenAI's Preparedness Framework is its approach to measuring and tracking potential harms from AI to humans.) "Security teams can run into restrictions when attempting to emulate threat actors, analyze malware to support remediation, or stress test critical infrastructure. We are developing a trusted access pilot to remove that friction for qualifying users and organizations and enable trusted defenders to use frontier AI cyber capabilities to accelerate cyberdefense," OpenAI said.
Agentic frontiers
GPT-5.2 already received praise from users for its use in business tasks and workflows.
With the Codex version, some of those capabilities could transfer, especially as enterprises plan to use the model to code their agents. The company said the model improves long-horizon work through compaction, offering strong performance on extensive code changes. It also features improved performance on Windows. In benchmark testing, GPT-5.2-Codex performed the best on accuracy compared to its previous versions. "With these improvements, Codex is more capable at working in large repositories over extended sessions with full context intact. It can more reliably complete complex tasks like large refactors, code migrations, and feature builds -- continuing to iterate without losing track, even when plans change or attempts fail," OpenAI said. Since it launched in previews in May, Codex has helped usher in acceptance of agentic and vibe coding in the enterprise AI builder space. Along with Windsurf, Cursor, Claude Code and the many coding agents from Google, the platform moved LLMs from simple code completion to generating and starting asynchronous coding projects for users.
[2]
OpenAI's GPT-5.2-Codex advances software engineering with better reasoning and context understanding - SiliconANGLE
OpenAI Group PBC has released a new version of GPT-Codex, its agentic artificial intelligence coding model that's designed to automate complex software engineering tasks. The latest version, GPT-5.2-Codex, builds upon the capabilities of GPT-5.2, adding improvements in context compaction, large code refactoring, Windows environment performance and cybersecurity, the company said. According to OpenAI's blog post, GPT-5.2-Codex achieved an unmatched score on the SWE-Bench Pro benchmark, with 56.4% accuracy, besting all other coding models launched so far. It also racked up a score of 64% on the Terminal-Bench 2.0 benchmark, outperforming earlier versions of Codex. It's aided by stronger vision capabilities that allow it to better interpret screenshots, technical diagrams and user interfaces, so it can translate software design mockups into functional prototypes. OpenAI said GPT-5.2-Codex is meant to advance software engineering, which is the process of designing, developing, testing and maintaining applications by combining engineering principles with programming knowledge. The goal is to create high-quality, reliable and maintainable software that's able to evolve to meet users' needs. The new model's ability to tackle time-consuming tasks makes it especially good at "refactoring", which is a key element of software engineering that involves adapting an application's codebase, not to add new features, but to enhance its quality. For instance, it can tweak an application's codebase to reduce its memory usage or improve its response times, OpenAI said. GPT-5.2-Codex represents the culmination of several iterative advances in OpenAI's generative AI coding capabilities.
Earlier models such as GPT-5-Codex and GPT-5.1-Codex-Max progressively improved aspects such as multistep reasoning, long-context understanding and tool integration within coding environments, and GPT-5.2-Codex builds on this work in various ways. For instance, OpenAI said it performs better at long-range task execution thanks to its context compaction capabilities, which allow it to undertake sustained, multistep coding tasks without forgetting context. It's also better at large-scale code management, improving its code refactoring, migration and feature-building capabilities, the company said. Moreover, it shows improved performance in Windows-based coding environments, and there are more advanced cybersecurity features that enable AI-assisted bug detection, testing and mitigation. OpenAI said the focus on improving security is critical to AI-driven software engineering, because modern enterprise infrastructures demand reliable software. Developers and security teams need all the help they can get when it comes to uncovering and fixing complex software vulnerabilities, and they also need to be sure that whatever AI coding tools are being used don't create more. Codex's ability to fix software was highlighted earlier this month when the security researcher Andrew MacPherson used GPT-5.1-Codex-Max to examine the CVE-2025-55182 vulnerability in React. In a blog post, he explained how the model used a combination of iterative assessments, fuzz testing and exploit analysis to mitigate the issue, while also surfacing and mitigating previously unknown vulnerabilities in the process. OpenAI said the improvements introduced in GPT-5.2-Codex will have real implications for enterprises, enabling them to automate the most complex and repetitive software engineering tasks and integrate more sophisticated features in their applications. 
By simultaneously supporting cybersecurity operations, it can help organizations improve efficiency, reduce human error and maintain a competitive advantage in software engineering, the company promised. The company said GPT-5.2-Codex is available from today to all paid ChatGPT users. It's planning to extend access to application programming interface users in the coming weeks, and will also launch an invite-only trusted access pilot program for vetted security professionals focused on defensive cybersecurity.
[3]
Inside OpenAI's Pro-Grade Codex 5.2 Made for Software Teams & Security Work
What if the next big leap in artificial intelligence wasn't meant for everyone? OpenAI's GPT-5.2 Codex is making waves, not as a general-purpose AI, but as a highly specialized system crafted for professionals in software engineering and cybersecurity. Learn how this advanced model emphasizes precision and reliability, featuring innovations like agentic coding for automating intricate tasks and fuzz testing to uncover hidden vulnerabilities. By targeting high-stakes applications, GPT-5.2 Codex challenges us to rethink what it means for AI to innovate. Below, Universe of AI explores the new features that set GPT-5.2 Codex apart. From streamlining complex workflows to fortifying cybersecurity strategies, this model marks OpenAI's decisive move toward specialization over generalization. It's not just an evolution, it's a bold redefinition of AI's role in solving critical problems. Whether you're intrigued by its unprecedented capabilities or eager to understand its implications for the future, this explainer offers a fascinating dive into how AI can excel when it's built with a clear purpose.
GPT-5.2 Codex Overview
OpenAI's Strategic Focus on Specialization
The development of GPT-5.2 Codex reflects OpenAI's strategic pivot toward specialization in AI. Rather than competing solely on metrics like speed or cost, OpenAI has chosen to prioritize depth, expertise, and targeted functionality. This model is specifically designed to address the intricate demands of modern software development and cybersecurity, areas where precision and reliability are paramount. By narrowing its focus to professional use cases, OpenAI positions GPT-5.2 Codex as an indispensable tool for organizations that require advanced AI capabilities to solve real-world challenges. This approach not only differentiates GPT-5.2 Codex from its predecessors but also highlights OpenAI's commitment to delivering value to enterprise users.
What Makes GPT-5.2 Codex Stand Out?
GPT-5.2 Codex introduces a suite of features that enhance productivity, security, and efficiency in professional settings. These capabilities are carefully designed to meet the specific needs of developers and cybersecurity experts, making it a powerful tool for tackling complex tasks.
* Agentic Coding: Automates long-term, intricate software engineering tasks, ensuring seamless continuity and efficiency in workflows. This feature reduces the need for constant human intervention, allowing teams to focus on higher-level problem-solving.
* Code Refactoring: Simplifies large-scale code modifications, minimizing human error and significantly improving productivity. This capability is particularly valuable for maintaining and upgrading legacy systems.
* Windows Optimization: Offers enhanced compatibility with Windows systems, a critical requirement for many enterprise environments. This ensures smooth integration into existing infrastructures.
* Cybersecurity Tools: Includes advanced features like vulnerability detection and fuzz testing, allowing organizations to proactively secure their systems against potential threats.
These specialized tools are not just enhancements but essential components for professionals dealing with the complexities of modern software development and cybersecurity. By addressing these specific needs, GPT-5.2 Codex sets itself apart as a highly targeted and effective solution.
Performance Benchmarks: A New Standard
GPT-5.2 Codex has established itself as a leader in the AI industry by setting new performance benchmarks. It achieved a 56.4% accuracy rate on SWE-Bench Pro, outperforming previous models in tasks such as debugging, code generation, and system optimization.
On Terminal-Bench 2.0, it scored 64%, excelling in automating server setups, compiling code, and managing large-scale deployments. These results highlight the model's ability to handle complex, high-stakes scenarios with precision and reliability. The performance of GPT-5.2 Codex is not just a testament to its technical capabilities but also a reflection of OpenAI's commitment to creating AI tools that deliver measurable value. By consistently outperforming its predecessors, GPT-5.2 Codex sets a new standard for what specialized AI can achieve in professional environments.
Transforming Cybersecurity
In cybersecurity, GPT-5.2 Codex offers capabilities that enhance both efficiency and effectiveness. The model assists researchers in identifying vulnerabilities within production code, streamlining the patching process, and automating repetitive tasks. These features enable organizations to respond more effectively to emerging threats, reducing the time and effort required to secure their systems. One of the standout features of GPT-5.2 Codex is its fuzz testing capability. This advanced tool helps teams uncover hidden software flaws that could be exploited by malicious actors. By identifying these vulnerabilities early in the development process, organizations can implement proactive measures to mitigate risks. This not only strengthens the overall security posture but also reduces the potential for costly breaches and downtime.
Bridging Design and Development
Beyond its applications in coding and cybersecurity, GPT-5.2 Codex demonstrates advanced capabilities in visual understanding, making it a valuable asset for user interface (UI) prototyping. The model can generate functional prototypes directly from design mockups, effectively bridging the gap between design and development.
This capability accelerates the creation of intuitive, user-friendly interfaces, streamlining the development process for teams working on complex projects. By enabling seamless collaboration between designers and developers, GPT-5.2 Codex enhances productivity and ensures that the final product aligns with user expectations. This feature is particularly beneficial for organizations that prioritize innovation and user experience in their software solutions.
Looking Ahead: The Future of GPT-5.2 Codex
OpenAI envisions GPT-5.2 Codex as a foundational step toward even more advanced AI solutions. The model's reasoning and cybersecurity capabilities are expected to evolve, allowing it to tackle increasingly complex challenges in the future. OpenAI is actively preparing for these advancements by prioritizing safety, reliability, and ethical considerations in its development process. While GPT-5.2 Codex has not yet reached the 'High' cybersecurity capability level under OpenAI's Preparedness Framework, its current capabilities already demonstrate significant potential for transforming professional workflows. OpenAI's forward-looking approach ensures that future iterations of the model will continue to deliver value while addressing the growing demands of enterprise users. This commitment to innovation and responsibility positions GPT-5.2 Codex as a key player in the ongoing evolution of AI technology.
Media Credit: Universe of AI
[4]
OpenAI rolls out GPT‑5.2-Codex for advanced coding and cybersecurity workflows
OpenAI has released GPT‑5.2-Codex, the most advanced agentic coding model yet for complex, real-world software engineering. It is designed to handle long-horizon tasks, large code changes, and cybersecurity workflows. GPT‑5.2-Codex is a version of GPT‑5.2 further optimized for agentic coding in Codex. Key improvements include long-horizon work via context compaction, stronger performance on large code changes like refactors and migrations, improved reliability in Windows environments, and significantly stronger cybersecurity capabilities. The company said that as models advance along the intelligence frontier, the improvements also lead to capability gains in specialized domains such as cybersecurity. For example, a security researcher recently used GPT‑5.1-Codex-Max with Codex CLI to identify and responsibly disclose a React vulnerability that could expose source code. While GPT‑5.2-Codex has stronger cybersecurity capabilities than previous models, it does not yet reach a 'High' level under OpenAI's Preparedness Framework, and its deployment is structured to accommodate future capability growth.
1. Pushing the frontier on real-world software engineering
GPT‑5.2-Codex builds on GPT‑5.2's strengths in professional knowledge work and GPT‑5.1-Codex-Max's frontier agentic coding and terminal-using capabilities. It now offers improved long-context understanding, reliable tool calling, enhanced factuality, and native context compaction, making it a dependable partner for long-running coding tasks while remaining token-efficient. Performance on cybersecurity evaluations shows a sharp capability increase from GPT‑5-Codex to GPT‑5.1-Codex-Max, and now to GPT‑5.2-Codex. OpenAI evaluates models as if they could reach 'High' cybersecurity capability in the future and has added safeguards to manage dual-use risks. Modern society depends on software reliability in sectors like banking, healthcare, communications, and essential services.
Vulnerabilities may exist long before detection, and identifying, validating, and fixing them relies on engineers and independent security researchers. On December 11, 2025, the React team disclosed three security vulnerabilities affecting React Server Components. Andrew MacPherson, a principal security engineer at Privy (a Stripe company), used GPT‑5.1-Codex-Max with Codex CLI to study a prior critical React vulnerability, React2Shell (CVE-2025-55182). MacPherson first attempted zero-shot analyses, then higher-volume iterative prompting. When these failed, he guided Codex through standard defensive security workflows, including setting up a local test environment, reasoning through attack surfaces, and fuzzing malformed inputs. Codex surfaced unexpected behaviors, leading to the discovery of previously unknown vulnerabilities, which were responsibly disclosed to the React team. These cases show how advanced AI can accelerate defensive security work, while also highlighting the dual-use risk of misuse by bad actors.
4. Empowering cyberdefense through trusted access
Security teams often face restrictions when emulating threat actors, analyzing malware, or stress-testing infrastructure. OpenAI is piloting a trusted access program to reduce friction for qualifying users and organizations, enabling them to use frontier AI capabilities for defensive purposes. The invite-only pilot is for vetted security professionals with a history of responsible disclosure and organizations with clear cybersecurity use cases. Participants receive access to advanced models to conduct legitimate dual-use work. OpenAI encourages qualified professionals to express interest and provide feedback.
GPT‑5.2-Codex advances real-world software engineering and cybersecurity workflows. By gradually rolling out the model with safeguards, access controls, and collaboration with the security community, OpenAI aims to maximize defensive impact while reducing misuse risk.
Insights from this release will guide future expansions as software and cyber frontiers evolve. GPT‑5.2-Codex is available today across all Codex surfaces for paid ChatGPT users. API access is expected in the coming weeks. The invite-only trusted access pilot for vetted cybersecurity professionals and organizations is running in parallel, balancing accessibility with safety.
OpenAI unveiled GPT-5.2-Codex, its most advanced agentic coding model designed for complex software engineering and cybersecurity tasks. The model achieved 56.4% accuracy on SWE-Bench Pro and 64% on Terminal-Bench 2.0, outperforming all previous versions. It introduces enhanced context compaction for long-horizon work, improved vulnerability detection, and a trusted access program for vetted security professionals.
OpenAI has released GPT-5.2-Codex, calling it "the most advanced agentic coding model yet for complex, real-world software engineering."[1] The model represents a significant advancement in how AI supports software engineering and cybersecurity, with optimizations specifically targeting long-horizon work with agents and enhanced defensive capabilities. Available now to all paid ChatGPT users across Codex surfaces, GPT-5.2-Codex builds upon the capabilities of GPT-5.2 while introducing specialized features for professional developers and security teams.[2]

The release marks OpenAI's strategic shift toward specialization over generalization, prioritizing depth and expertise in targeted domains rather than competing solely on speed or cost.[3] This agentic coding model is designed to handle the intricate demands of modern software development where precision and reliability are paramount, positioning it as an indispensable tool for organizations requiring advanced AI capabilities to solve real-world challenges.
GPT-5.2-Codex achieved an unprecedented 56.4% accuracy on SWE-Bench Pro, outperforming all other coding models launched to date.[2] On Terminal-Bench 2.0, the model scored 64%, excelling in tasks like automating server setups, compiling code, and managing large-scale deployments.[3] These benchmarks demonstrate the model's ability to handle complex, high-stakes scenarios with measurable precision.
The model's performance improvements stem from several key enhancements. Context compaction enables GPT-5.2-Codex to work coherently across multiple context windows, allowing it to undertake sustained, multistep coding tasks without losing track of previous work.[1] This capability proves especially valuable for large code refactoring, code migrations, and feature builds where plans may change or initial attempts fail. The model can now reliably complete complex tasks in large repositories over extended sessions with full context intact.[4]

Additional improvements include stronger vision capabilities that allow the model to better interpret screenshots, technical diagrams, and user interfaces, translating software design mockups into functional prototypes.[2] Windows environment performance has also been enhanced, addressing a critical requirement for many enterprise infrastructures.[3]
OpenAI calls GPT-5.2-Codex its strongest cybersecurity model yet, with significant advances in vulnerability detection and defensive capabilities.[1] In testing, the model scored 87% on CVE-Bench, outperforming other models, with GPT-5.1-Codex-Max coming in second.[1] On Capture-the-Flag evaluations, GPT-5.2-Codex became the company's strongest-performing model, a result OpenAI attributed to compaction. On the long-form Cyber Range test, it had a combined pass rate of 72.7%, below GPT-5.1-Codex-Max's 81.8%.[1]
A compelling real-world example emerged when Andrew MacPherson, a principal security engineer at Privy, used GPT-5.1-Codex-Max to study the React2Shell vulnerability (CVE-2025-55182).[4] MacPherson guided Codex through standard defensive security workflows, including setting up a local test environment and fuzz testing malformed inputs. The model surfaced unexpected behaviors, leading to the discovery of previously unknown software vulnerabilities affecting React Server Components, which were responsibly disclosed to the React team.[2]
This case demonstrates how enterprise AI can accelerate defensive cybersecurity work while also highlighting the dual-use risks. Modern society depends on software reliability across banking, healthcare, communications, and essential services, where vulnerabilities may exist long before detection.[4] The model's ability to assist in identifying vulnerabilities, streamlining patching processes, and automating repetitive tasks enables organizations to respond more effectively to emerging threats.[3]
While GPT-5.2-Codex demonstrates stronger cybersecurity capabilities than previous models, OpenAI acknowledges it does not yet reach a "High" level under the company's Preparedness Framework.[4] To balance accessibility with safety as capabilities continue advancing, OpenAI is launching an invite-only trusted access program for vetted security professionals and organizations with clear defensive cybersecurity use cases.[1]
Security teams often face restrictions when attempting to emulate threat actors, analyze malware to support remediation, or stress test critical infrastructure.[4] The trusted access program aims to remove friction for qualifying users, enabling them to use frontier AI capabilities for defensive purposes. The pilot is open to vetted security professionals with a history of responsible disclosure and to organizations with clear cybersecurity use cases; participants receive access to advanced models for legitimate dual-use work.[4]
This structured deployment approach accounts for future capability growth while supporting defensive cybersecurity needs. OpenAI plans to extend access to API users in the coming weeks, gradually expanding availability as it gathers insights from the security community.[2] The company encourages qualified professionals to express interest and provide feedback to shape future iterations.
The improvements in GPT-5.2-Codex have direct implications for enterprise software engineering workflows. The model's ability to handle long-horizon tasks makes it particularly effective at code refactoring, a key element of software engineering that involves adapting an application's codebase to enhance quality without adding new features.[2] Organizations can use the model to reduce memory usage, improve response times, and maintain legacy systems more efficiently.
Since its preview launch in May, Codex has helped accelerate acceptance of agentic and vibe coding in the enterprise AI builder space.[1] Alongside platforms like Windsurf, Cursor, Claude Code, and Google's coding agents, it has moved large language models from simple code completion to generating and managing asynchronous coding projects. By automating complex and repetitive tasks while simultaneously supporting cybersecurity operations, GPT-5.2-Codex enables organizations to improve efficiency, reduce human error, and maintain competitive advantages in software engineering.[2]
Summarized by
Navi