30 Sources
[1]
OpenAI launches new MacOS app for agentic coding | TechCrunch
AI is already having a seismic impact on how software is written, with much of the grunt work of programming now performed by swarms of agents and subagents. But as developers experiment with new interfaces and form factors for human-AI collaboration, it's become hard for even the most advanced AI labs to keep up. The current trend is for agentic software development -- systems where AI agents can work independently on coding tasks -- epitomized by the Claude Code and Cowork apps. In the meantime, OpenAI has been gradually building out its Codex tool, which launched as a command line tool last April and expanded to a web interface one month later. Now, OpenAI is taking a major step towards catching up. On Monday, the company launched a new MacOS app for Codex, integrating many of the agentic practices that have become popular in the past year. The new app is designed to work with multiple agents in parallel, integrating agent skills and other state-of-the-art workflows. The launch also comes less than two months after the launch of GPT-5.2-Codex, OpenAI's most powerful coding model, which the company hopes will be enough to tempt over Claude Code users. "If you really want to do sophisticated work on something complex, 5.2 is the strongest model by far," CEO Sam Altman told reporters on a press call. "However, it's been harder to use, so taking that level of model capability and putting it in a more flexible interface, we think is going to matter quite a bit." While Altman's confidence in GPT-5.2 is understandable, coding benchmarks tell a more complicated story. GPT-5.2 does hold the top spot on TerminalBench (a test measuring how well AI handles command-line programming tasks), at least as of press time. But agents from Gemini 3 and Claude Opus have logged roughly equivalent scores -- lower, but within the margin of error of the benchmark. 
Results from SWE-bench, another coding benchmark that tests AI's ability to fix real-world software bugs, are similar, showing no clear advantage for GPT-5.2. However, agentic use cases have been difficult to benchmark effectively, and state-of-the-art models can vary significantly in user experience. The Codex app also comes with a range of new features that OpenAI says will help it achieve parity or, in some cases, outpace the various Claude apps. The Codex app will allow for automations that can be set to run in the background on an automatic schedule, with results placed in a queue to be reviewed when the user returns. Users can also select different personalities for the agent -- from pragmatic to empathetic -- depending on their working style. But for the company, the biggest selling point is the sheer speed of development that's made possible by AI. "You can use this from a clean sheet of paper, brand new, to make a really quite sophisticated piece of software in a few hours," Altman said. "As fast as I can type in new ideas, that is the limit of what can get built."
[2]
OpenAI's new GPT-5.3-Codex is 25% faster and goes way beyond coding now - what's new
"Underspecified" prompts now produce richer, more usable results. OpenAI today announced the launch of GPT-5.3-Codex, which OpenAI claims is the most capable agentic coding model yet released. Interestingly enough, Anthropic, maker of Claude Code, is also releasing a new powerful model at exactly the same time. Anthropic's model, Opus 4.6, is a more general-purpose model, but Anthropic is well known for its Claude Code offering. As with Anthropic's Opus 4.6, OpenAI describes GPT-5.3 as a "frontier model." This is a term the AI industry uses to describe models at the bleeding edge of performance and capability. According to OpenAI, the new 5.3 Codex runs 25% faster, enabling longer-running tasks. OpenAI's GPT-5.3-Codex is being announced just days after OpenAI's announcement and release of a Mac app dedicated to Codex. Astute watchers of OpenAI's GPT models will notice the 5.3 version number on this release. OpenAI's current hottest GPT release is GPT-5.2, not GPT-5.3. Although nothing has been announced or even hinted at, I'm guessing we're not too far away from a general GPT-5.3 release in the next few days or weeks. I'm not sure if this is incredibly exciting or deeply disturbing. GPT-5.3-Codex is OpenAI's "first model that was instrumental in creating itself." The Codex team used Codex to debug its own training, manage its own deployment, and diagnose test results. OpenAI's blog post says, "With GPT-5.3-Codex, Codex goes from an agent that can write and review code to an agent that can do nearly anything developers and professionals can do on a computer." Let's just hope it doesn't decide that humans are superfluous. 
Acknowledging that developers, designers, product managers, and data scientists do a lot more in their jobs than just code, GPT-5.3-Codex is designed to support "all of the work in the software lifecycle -- debugging, deploying, monitoring, writing PRDs, editing copy, user research, tests, metrics, and more." The company reports that it can help you build whatever you want to build, specifically mentioning building slide decks and spreadsheets. The company says that this model is designed to be steered mid-task, with continuous interaction and no loss of context. The new Codex Mac app (and, presumably, an upcoming Windows app) can definitely help in maintaining interaction with the model as it works on projects for you. The span of operation is also much longer. Codex can run processes that take more than one day. As a test, OpenAI built two web-based games. Using the "skill" feature introduced with the Mac app, testers used a web gaming development skill to build the two games over millions of tokens. The blog post specifically highlighted how GPT-5.3-Codex better understands intent. If you're making a "day-to-day website," the blog post reports that "Simple or underspecified prompts now default to sites with more functionality and sensible defaults, giving you a stronger starting canvas to bring your ideas to life." In other words, if you ask Codex to "make me something pretty," it will. I had some significant success asking Codex 5.2 to redesign some parts of my WordPress security plugin to make the administration screens more attractive and compelling. Some of the design features that existed in earlier Codex versions have been improved upon in this latest release. Other examples cited by the blog include dynamic pricing displays and automated testimonial carousels. 
One of the subtexts I picked up in the technical briefing I participated in last week for the Mac Codex app was the desire on the part of some of its customers for faster response times. I know that I often set Codex or Claude Code off on a programming task and then switch to some other activity while waiting for the process to complete. While my experience has been that agentic coding speeds up my development process from months to days, that wait is still a bit painful. So increasing the performance, and thereby the output speed of the models, is always a win. According to the blog post, "GPT-5.3-Codex sets a new industry high on SWE-Bench Pro and Terminal Bench." The company says assignments take fewer tokens, increasing efficiency. OpenAI also says, "GPT-5.3-Codex sets a new industry high on SWE-Bench Pro and Terminal Bench, and shows strong performance on OSWorld and GDPVal." These are benchmarks the company uses to measure coding, agentic operations, and real-world capabilities. OpenAI has a published Preparedness Framework, which it uses for measuring and protecting against severe harm from frontier AI capabilities. In the blog post announcing GPT-5.3-Codex, the company says that this new model is the first one it classifies as "high capability" for cybersecurity tasks. It's been trained to identify software vulnerabilities, with expanded safeguards and monitoring. Concurrently, the company is launching Trusted Access for Cyber, a pilot program to accelerate cyber defense research, and is donating $10M in API credit grants to support cybersecurity research. If your organization is engaged in "good faith security research," you can apply for API credits through the company's Cybersecurity Grant Program. 
Supporting these actions, the blog post reports, "We're taking a precautionary approach and deploying our most comprehensive cybersecurity safety stack to date. Our mitigations include dual-use safety training, automated monitoring, trusted access for advanced capabilities, and enforcement pipelines, including threat intelligence." GPT-5.3-Codex is available now with paid ChatGPT plans across the Codex app, CLI, IDE extension, and web. API access is planned, and the company expects to deliver 25% faster interactions. The Codex usage promotion we reported earlier this week for free ChatGPT users still stands, but they'll be limited to GPT-5.2-Codex for now. What do you think about GPT-5.3-Codex and the direction OpenAI is taking with more autonomous, long-running coding agents? Have you tried Codex or similar tools like Claude Code in real development work? If so, did the speed or autonomy change how you work? How do you feel about an AI model helping debug, deploy, and even participate in its own development? Do the cybersecurity safeguards and "frontier model" positioning reassure you, or raise new concerns? Let us know in the comments below.
[3]
ChatGPT 5.3 Codex Model Helped to Build Itself
ChatGPT's latest model, built for developers and coders, was "instrumental in creating itself," according to OpenAI. The latest update, ChatGPT 5.3 Codex, is aimed at developers who want tools to write and develop code faster than ever. The blog post from OpenAI says, "The Codex team used early versions to debug its own training, manage its own deployment, and diagnose test results and evaluations -- our team was blown away by how much Codex was able to accelerate its own development." ChatGPT wasn't the primary development tool for this latest model, but GPT 5.3 Codex suggested refinements and improvements throughout the process. OpenAI says it helped to accelerate research, analyze interaction quality, and propose fixes for issues. OpenAI says 5.3 Codex began working on improvements to itself, starting from its "early versions." The company also says many researchers and engineers now say their jobs have "fundamentally" changed in the last couple of months because of these changes. In an interview with Ars Technica in December, OpenAI's product lead for Codex, Alexander Embiricos, alluded to improvements in the model's feedback loop. He said, "I think the vast majority of Codex is built by Codex, so it's almost entirely just being used to improve itself." OpenAI now says its new Codex 5.3 runs 25% faster than previous versions while using fewer resources. You can now access the model through paid ChatGPT plans, with API access coming soon. OpenAI's CEO, Sam Altman, has been clear in the past that it aims to develop AI-powered researchers to help it improve its own models. For example, he said in October last year that he wanted to build an AI research intern by September 2026 and a "true automated" AI researcher by March 2028. Disclosure: Ziff Davis, PCMag's parent company, filed a lawsuit against OpenAI in April 2025, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.
[4]
OpenAI launches Codex app to gain ground in AI coding race
SAN FRANCISCO, Feb 2 (Reuters) - OpenAI is launching a mobile app for its coding tool, Codex, in hopes of seizing momentum -- and customers -- from its rivals in the AI code-generation space. OpenAI said it designed its Codex app to make it easy for users to simultaneously manage multiple artificial-intelligence agents over a long period of time. The app can also then use code to do things like gather and analyze information, OpenAI officials said during a briefing with reporters. Coding is arguably the most successful application for AI models in recent years. Coding tools are key to helping AI startups attract business customers and the space has grown increasingly competitive. OpenAI lags behind its rivals in this area, particularly the AI startup Anthropic, which has dominated the coding market with its Claude Code tool. Anthropic says Claude Code reached $1 billion in revenue, on an annualized basis, in the six months after it was made available to the public. The Codex app, released on Monday, is part of OpenAI's redoubled efforts to gain ground. OpenAI said it's also tried to make the app more user-friendly to help make these advanced capabilities more accessible to the masses. Many observers of the coding space say the code-generation tools aren't yet good enough to entirely replace human tech workers. But these tools do make their work significantly faster. "The models just don't run out of dopamine," CEO Sam Altman said, relaying a recent conversation with a colleague. "They keep trying, they don't run out of motivation." Reporting by Deepa Seetharaman in San Francisco; Editing by Mark Porter
[5]
OpenAI launches Codex app as enterprises weigh autonomous AI coding tools
The app gives developers a centralized workspace to manage multiple AI coding agents across projects without losing task context, OpenAI said. OpenAI has launched a standalone Codex app to manage multiple AI coding agents across projects, pushing beyond chat-based code generation as enterprises weigh the impact of more autonomous tools on development workflows and governance. The move comes as OpenAI faces intensifying competition from rivals such as Anthropic and GitHub. Last month, Anthropic introduced Cowork, a research-preview feature that extends Claude Code beyond programming into broader enterprise workflows. "The Codex app provides a focused space for multi-tasking with agents," OpenAI said in a statement, noting that agents "run in separate threads organized by projects," allowing developers to switch between tasks without losing context.
[6]
OpenAI's Codex just got its own Mac app - and anyone can try it for free now
Switching between IDE, terminal, and app keeps context across tools. OpenAI today announced a new Mac app dedicated to working with its Codex AI coding agent. This is different from the general-purpose ChatGPT app that OpenAI has been shipping for a while. The new coding app is intended to be something of a command center, not only for directing coding agents, but also for managing multiple coding agents across projects and tasks that run for long periods of time. In a briefing I attended with OpenAI executives a few days before the launch, OpenAI cofounder and CEO Sam Altman said, "There's obviously been a huge shift in software agents over the last few months as the models have crossed a threshold of real utility. 5.2, in particular, is a model that many of us have found can do extremely complex things, and we realized we started to feel limited by the interface." The new Mac app is intended to help mitigate that limitation. OpenAI described Codex as expanding from single-agent coding operations to being at the core of multi-agent software lifecycles. Rather than pairing with a developer on a single edit, OpenAI sees programmers as coordinating teams of agents across all of the design, build, ship, and ongoing maintenance stages of work. (Disclosure: Ziff Davis, ZDNET's parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.) The latest version of Codex, GPT-5.2-Codex, only launched in mid-December. Yet, according to the company, Codex usage has nearly doubled since that time, with more than a million developers using it in the last month. I used it. In fact, with just the $20-per-month ChatGPT Plus plan, I found a mystery bug. Subsequently, I also used GPT-5.2-Codex on that same Plus plan to add two fairly significant features to my security product and ship a major update. 
In the briefing, Altman stated that GPT-5.2-Codex "is the fastest adopted model that we have ever made." He also reported on an extreme level of momentum using this tool, with usage growing more than 20 times since last August. In addition to independent developers like me, the company reported that major customers include Cisco, Ramp, Virgin Atlantic, Vanta, Duolingo, and Gap. The new app is designed to run multiple tasks in parallel, and help users supervise agents across longer end-to-end work. The company is also introducing skills, similar to those in Claude Code. These are meant to allow developers to define repeatable workflows like fetching logs and fixing tests, along with coding-adjacent tasks like summarizing threads and completing tickets. Right now, the app is Mac only. However, I expect that there will be a Windows app eventually. OpenAI shipped the ChatGPT app in Mac-only form, but some months later launched a Windows version. In addition to the app itself, OpenAI announced a new plan mode for Codex that allows for a read-only review (meaning the AI won't muck with your code) and selectable personalities. Personally, I've had just about enough personality from the human programmers I've managed, so I'd prefer a nice, personality-free personality in my coding agent. Recently, OpenAI also announced that Codex has an IDE extension for use in the JetBrains IDEs. Readers may recall that back in June I moved off of PhpStorm, my favorite JetBrains development environment. I moved to VS Code simply because the AI tools were more available for that environment. It's nice to see JetBrains IDE availability for those of us who prefer it over VS Code. During the briefing, I asked about the relationship between the various coding options now offered with Codex. 
We can code in our favorite IDE, in the terminal, and now in this new app. I wanted to know whether it would be possible to get the same richness of experience in the app that we have in the IDE, with our entire codebase in front of us. Sam told me, "I was astonished by this, but I did a fairly big project in a few days earlier this week and over the weekend." He said, "I did not open an IDE during the process. Not a single time. I did look at some code, but I was not doing it the old-fashioned way." He further told me, speaking about this level of coding ability from the AI, "I did not think that was going to be happening by now. I never had that experience before." I was given access to a preview build of the Mac app. I found I was easily able to work in my codebase. In fact, it knew about my most recent work in the IDE. It just automatically picked up from where I last left off. It seems that we'll be able to switch between whatever coding tools we want, and Codex will maintain context. Codex does have a cloud tool, which lives in GitHub. In fact, that's how Codex initially launched. When I first tested it in that context, I didn't like it. But once I started working with Codex on my own computer using the IDE extensions, I found that my productivity skyrocketed. The problem with running an AI on a local computer, especially with write access, is that it can be a security concern. In the briefing, one of OpenAI's Codex developers posed the worry from a design perspective, saying, "If the agent runs on the user's computer, how do we make that as safe and secure as possible?" The key is limiting sandbox access to only approved folders. In my case, that's my project folder. The tool also has permissioned network access and remembers approvals over time. 
The new Mac app adds a sandbox mode and lets developers set approval levels, including Untrusted, On failure, On request, and Never (meaning the app is never permitted to ask for elevated permissions). The OpenAI development team described its current coding agents as now reliable enough to let it push entire workflows into the new app. A number of the OpenAI developers said that, in just the last few weeks, they've started working almost entirely in the Mac app. They haven't even needed to use their IDEs or terminal windows. OpenAI currently makes Codex available with limited usage in the $20-per-month ChatGPT Plus plan. It offers much more extensive resource availability in its $200-per-month ChatGPT Pro plan. While I ran into roadblocks building new products (full days of coding) in the $20-per-month plan, I never ran into a roadblock when I was using the $200-per-month plan. But companies have asked OpenAI for something more. It's not just about how many tokens can be allocated. For those companies, it's about speed of computation. You can wait quite a few minutes after each major AI command. OpenAI said that it's looking at creating higher compute-intensive, speed-intensive tiers for companies willing to pay for pure speed. These would allow for very long context coding work and include high-speed infrastructure options. To encourage users to give Codex a try, the company announced, "For a limited time we're including Codex with ChatGPT Free and Go, and we're doubling the rate limits on Plus, Pro, Business, Enterprise, and Edu plans." So feel free to give Codex a try. What about you? Are you planning to try the new Codex app when it launches, or do you prefer sticking with an IDE extension like VS Code or JetBrains? 
Have you used GPT-5.2-Codex (or other coding agents) for real features and bug hunts yet? Did AI tools actually save you time, or did they create cleanup and tech-debt headaches later? Comment below.
[7]
OpenAI brings its Codex coding app to Mac, with new multi-agent abilities included
Since last spring, OpenAI has offered Codex. What started life as the company's response to Claude Code is becoming something more sophisticated with the release of a new dedicated macOS app. At its most basic form, Codex is a programming agent capable of writing code for users, but now it can also manage multiple AI assistants that can work together to complete more complex tasks. OpenAI gives an example of how this could work in practice. The company used Codex to create a Mario Kart-like racing game, complete with a selection of different playable cars, eight tracks and a collection of powerups players can use against the competition. For a single AI agent, generating a game from scratch, with all the needed visual assets, would be a tough ask, but Codex was able to complete the task because it could delegate the work of making the game to different models with complementary capabilities. For example, it turned to GPT Image for the visual assets, while a separate model simultaneously coded the web game. "It took on the roles of designer, game developer and QA tester to validate its work by actually playing the game," OpenAI says of the process. If that sounds complicated, OpenAI has tried to make it more approachable with a section of the app titled Skills. The feature bundles "instructions, resources, and scripts so Codex can reliably connect to tools, run workflows, and complete tasks according to your team's preferences," the company explains. "The Codex app includes a dedicated interface to create and manage skills. You can explicitly ask Codex to use specific skills, or let it automatically use them based on the task at hand." As you might imagine, Codex can also automate repetitive tasks. A dedicated Automations section of the app allows you to schedule tasks, which the software will complete in the background. 
"At OpenAI, we've been using Automations to handle the repetitive but important tasks, like daily issue triage, finding and summarizing CI failures, generating daily release briefs, checking for bugs, and more," the company said. The release of the Codex macOS app comes as AI startups explore what a group of AI agents working in parallel can accomplish. At the start of the year, Anysphere, the company behind Cursor, found it was possible to build a working web browser from scratch using such an approach, though it did encounter problems along the way. For a limited time, OpenAI is making Codex available to ChatGPT Free and Go users so they can see what's possible with this new software. At the same time, the company is doubling rates for Plus and Pro subscribers.
[8]
OpenAI launches standalone Codex app for Apple computers
The new app has a simple interface and is designed to serve as a "command center" that makes it easy for software developers to manage multiple AI agents at once, OpenAI said. An AI agent is a tool that can independently complete tasks, like writing code, on behalf of a user. AI coding assistants have exploded in popularity over the last year, and OpenAI said more than 1 million developers have used Codex in the past month. The company initially launched Codex in April, and it made the product generally available in October. The new Codex app is part of OpenAI's ongoing effort to lure users and market share away from rivals like Anthropic and Cursor, which have their own buzzy offerings for developers. OpenAI CEO Sam Altman said the Codex app is "the most loved internal product we've ever had" during a briefing with reporters on Friday. "It's been totally an amazing thing for us to be using recently at OpenAI," Altman said. "I've been staying up late at night with excitement, building all sorts of things myself."
[9]
GPT 5.3 Codex, OpenAI's new agentic coding model, helped create itself
The big picture: As the race for AI supremacy intensifies, both OpenAI and Anthropic unveiled upgraded models this week. Anthropic's Claude Opus 4.6 marks a significant evolution in how AI tackles complex workplace tasks while OpenAI's GPT-5.3 Codex is the company's most advanced agentic coding model, capable of contributing to its own development. GPT-5.3 Codex merges the advanced coding capabilities of GPT-5.2 Codex with the reasoning and professional knowledge of GPT-5.2 into a single, unified model that is 25 percent faster than its predecessors. According to OpenAI, the model even contributed to its own development, as early versions were used to debug training processes, manage deployment, and analyze test results. The release represents a major leap for Codex, evolving it from an agent capable of writing and reviewing basic code into one of the most powerful coding models available - capable of performing nearly any task a developer can handle. It is also significantly faster than previous versions, achieving record-breaking scores on SWE-Bench Pro and Terminal-Bench. To test GPT-5.3 Codex's web development skills and long-running agentic capabilities, OpenAI had the model create two "highly functional (and) complex" games. The first is an upgraded version of an existing racing title, Voxel Velocity, while the second is a brand-new diving game, Dive In, which lets players explore reefs, collect fish, and navigate underwater hazards. To build the games, the Codex team leveraged the model's recently introduced "develop web game" skill in the Mac app. The team provided pre-selected, generic follow-up prompts, such as "fix the bug" and "improve the game." GPT-5.3 Codex reportedly executed these tasks autonomously over the course of several days, producing the desired results without human intervention. 
Both games are currently playable on OpenAI's official website, offering users a firsthand look at GPT-5.3 Codex's capabilities. The release also highlights the potential for fully self-generating AI, raising concerns about job displacement in the tech sector and accelerating a future in which human programmers could become increasingly redundant. Beyond creating fully functional video games, GPT-5.3 Codex can also produce polished, ready-to-publish websites. The model is designed to assist coders, designers, product managers, and data scientists with a wide range of software tasks, including debugging, deploying, monitoring, writing PRDs, editing copy, conducting user research, running tests, analyzing metrics, and more. Trained on and deployed using Nvidia's GB200 NVL72 systems, GPT-5.3 Codex is now available to paid ChatGPT subscribers across all platforms - including web, mobile app, CLI, and IDE extension - with API access expected to launch soon.
[10]
OpenAI's Codex app: When your IDE gets a brain
OpenAI's new desktop coding hub is less magic wand, more manic coworker. OpenAI has given software developers a new desktop toy, and judging by the early reactions, it might feel like someone finally handed coders the Swiss Army knife they've been dreaming about or the kind of gadget that makes them wonder if they're working with a robot coworker now. The company rolled out the Codex app for macOS, a focused interface for managing AI coding agents, designed to let developers do more than just "generate a few lines of code." Instead, Codex can juggle multiple tasks in parallel, run background workflows, and act on instructions that span hours or even days. At its core, the new app is a response to a shift that's been quietly happening for the past year: AI isn't just helping write short snippets anymore. It's taking on whole coding projects, running tests, dealing with pull requests, and even undertaking the kind of repetitive maintenance tasks that make developers groan. Containers, threads, isolated worktrees, and integrated Git tools support all of this seamlessly within the app, so engineers don't need to jump between terminals, IDEs, and cloud consoles just to keep a fleet of AI agents moving forward. OpenAI's own announcement makes this shift clear: existing tools were built for real-time interaction or single tasks, but today's coding workflows are multifaceted and sprawling. Instead of coaxing a model to generate code line by line, the Codex app lets you orchestrate agents that work independently on different parts of a codebase. Each agent runs in its own thread and its own worktree, which means you can explore multiple ideas without fear of one bot overwriting another's progress. If you want Codex to review diffs, make comments inline, or even open suggested changes in your local editor, it can. If you want it to run in the background while you sleep, it can do that too. 
There's even support for what OpenAI calls "skills" and "automations": reusable workflows that let Codex go beyond raw code generation into things like gathering information, problem solving, carrying out scheduled tasks, or managing routine reviews. In early internal use cases, Codex has built complete applications, playing the roles of designer, developer, and QA tester in a single string of prompts, an example that hints at what happens when AI stops being an assistant and starts feeling a bit like a team member with infinite coffee. It's worth noting that the current release is macOS only, which prompted the usual chorus of developer grumbling about Windows and Linux support. Early adopters on Reddit (and elsewhere) joked that the lack of cross-platform availability feels oddly quaint for an AI tool, even as they praised how much the Codex app can do compared to traditional CLI workflows. OpenAI is also sweetening the deal: for a limited time, Codex will be accessible to free and Go users, and rate limits for paid Plus, Pro, Business, Enterprise, and Edu plans have been doubled across all surfaces where Codex runs, whether that's the app, the CLI, IDE extensions, or cloud threads. The company's broader strategy here is unmistakable. AI coding assistants are one of the most competitive battlegrounds in generative AI right now, and rivals like Anthropic's Claude Code have already reported strong revenue metrics in this space. The launch of a desktop command center for Codex feels like OpenAI saying, "We're in this to win," not just to ship features. So what does this actually mean for developers? It doesn't mean humans are about to be replaced overnight. Codex is powerful, but like all AI tools, it still makes mistakes and needs human oversight, especially when the stakes are high and production quality matters. 
Seen through that lens, the Codex app is less a magic wand and more a powerful collaboration layer, one that could reshape how engineering teams operate without pretending to fully automate the craft of building software. In practice, that suggests a near future where managing AI agents becomes as normal as managing packages or Git branches. Instead of the terminal being the center of your workflow, Codex and its agents might take that spot, responding to prompts, running scheduled tasks, and even adopting configurable "personalities" that fit how you prefer to work. At a time when the debate around AI productivity often centers on displacement or disruption, the Codex app points in a more nuanced direction: what happens when coding tools become partners rather than assistants? The answer won't be immediate, and the work still demands human judgement, but for many developers the first glimpse of that future arrived today.
[11]
OpenAI: New coding model GPT-5.3-Codex helped build itself
OpenAI released a new coding model today, GPT-5.3-Codex. The company said the new model has improved "reasoning and professional knowledge capabilities" and will operate 25 percent faster than its predecessor. Anthropic, one of OpenAI's chief rivals, also released a new coding model today, Claude Opus 4.6. Earlier this week, OpenAI also launched the Codex app for macOS, a new app interface for managing multiple AI agents at once. Crucially, OpenAI says that the new GPT-5.3-Codex model is its "first model that was instrumental in creating itself." Anthropic said something similar about Claude Cowork recently, and engineers at OpenAI and Anthropic say almost all their coding is now done by AI. In a blog post announcing GPT-5.3-Codex, OpenAI wrote, "The Codex team used early versions to debug its own training, manage its own deployment, and diagnose test results and evaluations -- our team was blown away by how much Codex was able to accelerate its own development." Why does it matter if an AI model helped create itself? People who believe in the technological singularity, or "the singularity," talk about a tipping point at which technology becomes self-improving, leading to an uncontrolled explosion of technological advancement. And now we have some real-world examples of AI improving itself. (At least, according to the AI companies behind the announcements.) We don't know exactly how much GPT-5.3-Codex was involved in its own development, but the news does reveal just how advanced frontier AI models have become, particularly in writing code. "With GPT-5.3-Codex, Codex goes from an agent that can write and review code to an agent that can do nearly anything developers and professionals can do on a computer," the OpenAI blog post reads. GPT-5.3-Codex is available now in the Codex app.
[12]
OpenAI launches a Codex desktop app for macOS to run multiple AI coding agents in parallel
OpenAI on Monday released a new desktop application for its Codex artificial intelligence coding system, a tool the company says transforms software development from a collaborative exercise with a single AI assistant into something more akin to managing a team of autonomous workers. The Codex app for macOS functions as what OpenAI executives describe as a "command center for agents," allowing developers to delegate multiple coding tasks simultaneously, automate repetitive work, and supervise AI systems that can run for up to 30 minutes independently before returning completed code. "This is the most loved internal product we've ever had," Sam Altman, OpenAI's chief executive, told VentureBeat in a press briefing ahead of Monday's launch. "It's been totally an amazing thing for us to be using recently at OpenAI." The release arrives at a pivotal moment for the enterprise AI market. According to a survey of 100 Global 2000 companies published last week by venture capital firm Andreessen Horowitz, 78% of enterprise CIOs now use OpenAI models in production, though competitors Anthropic and Google are gaining ground rapidly. Anthropic posted the largest share increase of any frontier lab since May 2025, growing 25% in enterprise penetration, with 44% of enterprises now using Anthropic in production. The timing of OpenAI's Codex app launch -- with its focus on professional software engineering workflows -- appears designed to defend the company's position in what has become the most contested segment of the AI market: coding tools. Why developers are abandoning their IDEs for AI agent management The Codex app introduces a fundamentally different approach to AI-assisted coding. While previous tools like GitHub Copilot focused on autocompleting lines of code in real-time, the new application enables developers to "effortlessly manage multiple agents at once, run work in parallel, and collaborate with agents over long-running tasks." 
Alexander Embiricos, the product lead for Codex, explained the evolution during the press briefing by tracing the product's lineage back to 2021, when OpenAI first introduced a model called Codex that powered GitHub Copilot. "Back then, people were using AI to write small chunks of code in their IDEs," Embiricos said. "GPT-5 in August last year was a big jump, and then 5.2 in December was another massive jump, where people started doing longer and longer tasks, asking models to do work end to end. So what we saw is that developers, instead of working closely with the model, pair coding, they started delegating entire features." The shift has been so profound that Altman said he recently completed a substantial coding project without ever opening a traditional integrated development environment. "I was astonished by this...I did this fairly big project in a few days earlier this week and over the weekend. I did not open an IDE during the process. Not a single time," Altman said. "I did look at some code, but I was not doing it the old-fashioned way, and I did not think that was going to be happening by now." How skills and automations extend AI coding beyond simple code generation The Codex app introduces several new capabilities designed to extend AI coding beyond writing lines of code. Chief among these are "Skills," which bundle instructions, resources, and scripts so that Codex can "reliably connect to tools, run workflows, and complete tasks according to your team's preferences." The app includes a dedicated interface for creating and managing skills, and users can explicitly invoke specific skills or allow the system to automatically select them based on the task at hand. 
OpenAI has published a library of skills for common workflows, including tools to fetch design context from Figma, manage projects in Linear, deploy web applications to cloud hosts like Cloudflare and Vercel, generate images using GPT Image, and create professional documents in PDF, spreadsheet, and Word formats. To demonstrate the system's capabilities, OpenAI asked Codex to build a racing game from a single prompt. Using an image generation skill and a web game development skill, Codex built the game by working independently using more than 7 million tokens with just one initial user prompt, taking on "the roles of designer, game developer, and QA tester to validate its work by actually playing the game." The company has also introduced "Automations," which allow developers to schedule Codex to work in the background on an automatic schedule. "When an Automation finishes, the results land in a review queue so you can jump back in and continue working if needed." Thibault Sottiaux, who leads the Codex team at OpenAI, described how the company uses these automations internally: "We've been using Automations to handle the repetitive but important tasks, like daily issue triage, finding and summarizing CI failures, generating daily release briefs, checking for bugs, and more." The app also includes built-in support for "worktrees," allowing multiple agents to work on the same repository without conflicts. "Each agent works on an isolated copy of your code, allowing you to explore different paths without needing to track how they impact your codebase." OpenAI battles Anthropic and Google for control of enterprise AI spending The launch comes as enterprise spending on AI coding tools accelerates dramatically. 
According to the Andreessen Horowitz survey, average enterprise AI spend on large language models has risen from approximately $4.5 million to $7 million over the last two years, with enterprises expecting growth of another 65% this year to approximately $11.6 million. Leadership in the enterprise AI market varies significantly by use case. OpenAI dominates "early, horizontal use cases like general purpose chatbots, enterprise knowledge management and customer support," while Anthropic leads in "software development and data analysis, where CIOs consistently cite rapid capability gains since the second half of 2024." When asked during the press briefing how Codex differentiates from Anthropic's Claude Code, which has been described as having its "ChatGPT moment," Sottiaux emphasized OpenAI's focus on model capability for long-running tasks. "One of the things that our models are extremely good at -- they really sit at the frontier of intelligence and doing reliable work for long periods of time," Sottiaux said. "This is also what we're optimizing this new surface to be very good at, so that you can start many parallel agents and coordinate them over long periods of time and not get lost." Altman added that while many tools can handle "vibe coding front ends," OpenAI's 5.2 model remains "the strongest model by far" for sophisticated work on complex systems. "Taking that level of model capability and putting it in an interface where you can do what Thibault was saying, we think is going to matter quite a bit," Altman said. "That's probably the, at least listening to users and sort of looking at the chatter on social, that's the single biggest differentiator." The surprising bottleneck on AI progress: how fast humans can type The philosophical underpinning of the Codex app reflects a view that OpenAI executives have been articulating for months: that human limitations -- not AI capabilities -- now constitute the primary constraint on productivity. 
In a December appearance on Lenny's Podcast, Embiricos described human typing speed as "the current underappreciated limiting factor" to achieving artificial general intelligence. The logic: if AI can perform complex coding tasks but humans can't write prompts or review outputs fast enough, progress stalls. The Codex app attempts to address this by enabling what the team calls an "abundance mindset" -- running multiple tasks in parallel rather than perfecting single requests. During the briefing, Embiricos described how power users at OpenAI work with the tool. "Last night, I was working on the app, and I was making a few changes, and all of these changes are able to run in parallel together. And I was just sort of going between them, managing them," Embiricos said. "Behind the scenes, all these tasks are running on something called Git worktrees, which means that the agents are running independently, and you don't have to manage them." In the Sequoia Capital podcast "Training Data," Embiricos elaborated on this mindset shift: "The mindset that works really well for Codex is, like, kind of like this abundance mindset and, like, hey, let's try anything. Let's try anything even multiple times and see what works." He noted that when users run 20 or more tasks in a day or an hour, "they've probably understood basically how to use the tool." Building trust through sandboxes: how OpenAI secures autonomous coding agents OpenAI has built security measures into the Codex architecture from the ground up. The app uses "native, open-source and configurable system-level sandboxing," and by default, "Codex agents are limited to editing files in the folder or branch where they're working and using cached web search, then asking for permission to run commands that require elevated permissions like network access." Embiricos elaborated on the security approach during the briefing, noting that OpenAI has open-sourced its sandbox technology. 
"Codex has this sandbox that we're actually incredibly proud of, and it's open source, so you can go check it out," Embiricos said. The sandbox "basically ensures that when the agent is working on your computer, it can only make writes in a specific folder that you want it to make writes into, and it doesn't access network without permission." The system also includes a granular permission model that allows users to configure persistent approvals for specific actions, avoiding the need to repeatedly authorize routine operations. "If the agent wants to do something and you find yourself annoyed that you're constantly having to approve it, instead of just saying, 'All right, you can do everything,' you can just say, 'Hey, remember this one thing -- I'm actually okay with you doing this going forward,'" Embiricos explained. Altman emphasized that the permission architecture signals a broader philosophy about AI safety in agentic systems. "I think this is going to be really important. I mean, it's been so clear to us using this, how much you want it to have control of your computer, and how much you need it," Altman said. "And the way the team built Codex such that you can sensibly limit what's happening and also pick the level of control you're comfortable with is important." He also acknowledged the dual-use nature of the technology. "We do expect to get to our internal cybersecurity high moment of our models very soon. We've been preparing for this. We've talked about our mitigation plan," Altman said. "A real thing for the world to contend with is going to be defending against a lot of capable cybersecurity threats using these models very quickly." The same capabilities that make Codex valuable for fixing bugs and refactoring code could, in the wrong hands, be used to discover vulnerabilities or write malicious software -- a tension that will only intensify as AI coding agents become more capable. 
From Android apps to research breakthroughs: how Codex transformed OpenAI's own operations Perhaps the most compelling evidence for Codex's capabilities comes from OpenAI's own use of the tool. Sottiaux described how the system has accelerated internal development. "A Sora Android app is an example of that where four engineers shipped in only 18 days internally, and then within the month we give access to the world," Sottiaux said. "I had never noticed such speed at this scale before." Beyond product development, Sottiaux described how Codex has become integral to OpenAI's research operations. "Codex is really involved in all parts of the research -- making new data sets, investigating its own training runs," he said. "When I sit in meetings with researchers, they all send Codex off to do an investigation while we're having a chat, and then it will come back with useful information, and we're able to debug much faster." The tool has also begun contributing to its own development. "Codex also is starting to build itself," Sottiaux noted. "There's no screen within the Codex engineering team that doesn't have Codex running on multiple, six, eight, ten, tasks at a time." When asked whether this constitutes evidence of "recursive self-improvement" -- a concept that has long concerned AI safety researchers -- Sottiaux was measured in his response. "There is a human in the loop at all times," he said. "I wouldn't necessarily call it recursive self-improvement, a glimpse into the future there." Altman offered a more expansive view of the research implications. "There's two parts of what people talk about when they talk about automating research to a degree where you can imagine that happening," Altman said. "One is, can you write software, extremely complex infrastructure, software to run training jobs across hundreds of thousands of GPUs and babysit them. And the second is, can you come up with the new scientific ideas that make algorithms more efficient." 
He noted that OpenAI is "seeing early but promising signs on both of those." The end of technical debt? AI agents take on the work engineers hate most One of the more unexpected applications of Codex has been addressing technical debt -- the accumulated maintenance burden that plagues most software projects. Altman described how AI coding agents excel at the unglamorous work that human engineers typically avoid. "The kind of work that human engineers hate to do -- go refactor this, clean up this code base, rewrite this, write this test -- this is where the model doesn't care. The model will do anything, whether it's fun or not," Altman said. He reported that some infrastructure teams at OpenAI that "had sort of like, given up hope that you were ever really going to long term win the war against tech debt, are now like, we're going to win this, because the model is going to constantly be working behind us, making sure we have great test coverage, making sure that we refactor when we're supposed to." The observation speaks to a broader theme that emerged repeatedly during the briefing: AI coding agents don't experience the motivational fluctuations that affect human programmers. As Altman noted, a team member recently observed that "the hardest mental adjustment to make about working with these sort of like AI coding teammates, unlike a human, is the models just don't run out of dopamine. They keep trying. They don't run out of motivation. They don't get, you know, they don't lose energy when something's not working. They just keep going and, you know, they figure out how to get it done." What the Codex app costs and who can use it starting today The Codex app launches today on macOS and is available to anyone with a ChatGPT Plus, Pro, Business, Enterprise, or Edu subscription. Usage is included in ChatGPT subscriptions, with the option to purchase additional credits if needed. 
In a promotional push, OpenAI is temporarily making Codex available to ChatGPT Free and Go users "to help more people try agentic workflows." The company is also doubling rate limits for existing Codex users across all paid plans during this promotional period. The pricing strategy reflects OpenAI's determination to establish Codex as the default tool for AI-assisted development before competitors can gain further traction. More than a million developers have used Codex in the past month, and usage has nearly doubled since the launch of GPT-5.2-Codex in mid-December, building on more than 20x usage growth since August 2025. Customers using Codex include large enterprises like Cisco, Ramp, Virgin Atlantic, Vanta, Duolingo, and Gap, as well as startups like Harvey, Sierra, and Wonderful. Individual developers have also embraced the tool: Peter Steinberger, creator of OpenClaw, built the project entirely with Codex and reports that since fully switching to the tool, his productivity has roughly doubled across more than 82,000 GitHub contributions. OpenAI's ambitious roadmap: Windows support, cloud triggers, and continuous background agents OpenAI outlined an aggressive development roadmap for Codex. The company plans to make the app available on Windows, continue pushing "the frontier of model capabilities," and roll out faster inference. Within the app, OpenAI will "keep refining multi-agent workflows based on real-world feedback" and is "building out Automations with support for cloud-based triggers, so Codex can run continuously in the background -- not just when your computer is open." The company also announced a new "plan mode" feature that allows Codex to read through complex changes in read-only mode, then discuss with the user before executing. "This means that it lets you build a lot of confidence before, again, sending it to do a lot of work by itself, independently, in parallel to you," Embiricos explained. 
Additionally, OpenAI is introducing customizable personalities for Codex. "The default personality for Codex has been quite terse. A lot of people love it, but some people want something more engaging," Embiricos said. Users can access the new personalities using the /personality command. Altman also hinted at future integration with ChatGPT's broader ecosystem. "There will be all kinds of cool things we can do over time to connect people's ChatGPT accounts and leverage sort of all the history they've built up there," Altman said. Microsoft still dominates enterprise AI, but the window for disruption is open The Codex launch occurs as most enterprises have moved beyond single-vendor strategies. According to the Andreessen Horowitz survey, "81% now use three or more model families in testing or production, up from 68% less than a year ago." Despite the proliferation of AI coding tools, Microsoft continues to dominate enterprise adoption through its existing relationships. "Microsoft 365 Copilot leads enterprise chat though ChatGPT has closed the gap meaningfully," and "Github Copilot is still the coding leader for enterprises." The survey found that "65% of enterprises noted they preferred to go with incumbent solutions when available," citing trust, integration, and procurement simplicity. However, the survey also suggests significant opportunity for challengers: "Enterprises consistently say they value faster innovation, deeper AI focus, and greater flexibility paired with cutting edge capabilities that AI native startups bring." OpenAI appears to be positioning Codex as a bridge between these worlds. "Codex is built on a simple premise: everything is controlled by code," the company stated. "The better an agent is at reasoning about and producing code, the more capable it becomes across all forms of technical and knowledge work." The company's ambition extends beyond coding. 
"We've focused on making Codex the best coding agent, which has also laid the foundation for it to become a strong agent for a broad range of knowledge work tasks that extend beyond writing code." When asked whether AI coding tools could eventually move beyond early adopters to become mainstream, Altman suggested the transition may be closer than many expect. "Can it go from vibe coding to serious software engineering? That's what this is about," Altman said. "I think we are over the bar on that. I think this will be the way that most serious coders do their job -- and very rapidly from now." He then pivoted to an even bolder prediction: that code itself could become the universal interface for all computer-based work. "Code is a universal language to get computers to do what you want. And it's gotten so good that I think, very quickly, we can go not just from vibe coding silly apps but to doing all the non-coding knowledge work," Altman said. At the close of the briefing, Altman urged journalists to try the product themselves: "Please try the app. There's no way to get this across just by talking about it. It's a crazy amount of power." For developers who have spent careers learning to write code, the message was clear: the future belongs to those who learn to manage the machines that write it for them.
[13]
OpenAI's Latest AI Was Created Using "Itself," Company Claims
On Thursday, OpenAI announced GPT-5.3-Codex, a new coding model that's purportedly 25 percent faster than its predecessor, along with other impressive benchmarks. The real head-turner, however, is another claim the Sam Altman-led company made about its development. GPT-5.3-Codex, supposedly, is its first model "that was instrumental in creating itself," with its team "blown away" by the results. Is the Singularity nearly upon us? Is this the long-awaited sign of recursive self-improvement, a point at which the machines are finally able to continually rewrite their own code to transform themselves into superior beings? Not quite. In less sensational wording, OpenAI restates the AI's role as being used to "accelerate its own development." What that looked like, according to the blog post, is that "the Codex team used early versions to debug its own training, manage its own deployment, and diagnose test results and evaluations." In other words, GPT-5.3-Codex helped the human coders with some of their tasks -- impressive, maybe, but hardly a sign of humankind's immediate obsolescence. The imaginations of AI boosters, nonetheless, went into overdrive. The responses to a post on the r/singularity subreddit about the news were filled with a mixture of doom-tinged hype, hype-tinged doom, and a good deal of gallows humor. "I hope everyone remembers how good a mid level manager I was before the machines came," one user wrote. The sentiment extended to X. "Holy moly -- so it begins!" tweeted another user, who runs an AI newsletter. Taken together, it's a sign of how the discourse around AI is still dominated by sci-fi concepts and rhetoric and distorted by hype. AI companies have been more than occasionally guilty of stoking the flames. Last month, the head of Anthropic's Claude Code Boris Cherny claimed that "pretty much" 100 percent of the company's code is now AI-generated using its own model. What this actually looks like behind the scenes, though, remains hazy. 
And it's still far removed from the idea of AI models building new models completely autonomously. Anthropic, regardless, isn't alone. With the new Codex release, Altman -- who's been on an emotional rollercoaster as of late -- added to the pile with a sob story about how his coding tool was so good that it made him depressed. "I built an app with Codex last week. It was very fun," Altman tweeted. "Then I started asking it for ideas for new features and at least a couple of them were better than I was thinking of. I felt a little useless and it was sad." Who said the "vibe" in "vibe-coding" had to be good?
[14]
OpenAI's new model leaps ahead in coding capabilities -- but raises unprecedented cybersecurity risks | Fortune
OpenAI believes it has finally pulled ahead in one of the most closely watched races in artificial intelligence: AI-powered coding. Its newest model, GPT-5.3-Codex, represents a solid advance over rival systems, showing markedly higher performance on coding benchmarks and reported results than earlier generations of both OpenAI's and Anthropic's models -- suggesting a long-sought edge in a category that could reshape how software is built. But the company is rolling out the model with unusually tight controls and delaying full developer access as it confronts a harder reality: the same capabilities that make GPT-5.3-Codex so effective at writing, testing, and reasoning about code also raise serious cybersecurity concerns. In the race to build the most powerful coding model, OpenAI has run headlong into the risks of releasing it. GPT-5.3-Codex is available to paid ChatGPT users, who can use the model for everyday software development tasks such as writing, debugging, and testing code through OpenAI's Codex tools and ChatGPT interface. But for now, the company is not opening unrestricted access for high-risk cybersecurity uses, and OpenAI is not immediately enabling full API access that would allow the model to be automated at scale. Those more sensitive applications are being gated behind additional safeguards, including a new trusted-access program for vetted security professionals, reflecting OpenAI's view that the model has crossed a new cybersecurity risk threshold. The company's blog post accompanying the model release on Thursday said that while it does not have "definitive evidence" the new model can fully automate cyber attacks, "we're taking a precautionary approach and deploying our most comprehensive cybersecurity safety stack to date. Our mitigations include safety training, automated monitoring, trusted access for advanced capabilities, and enforcement pipelines including threat intelligence." 
OpenAI CEO Sam Altman posted on X about the concerns, saying that GPT-5.3-Codex is "our first model that hits 'high' for cybersecurity on our preparedness framework," an internal risk classification system OpenAI uses for model releases. In other words, this is the first model OpenAI believes is good enough at coding and reasoning that it could meaningfully enable real-world cyber harm, especially if automated or used at scale.
[15]
OpenAI says new coding model helped build itself
Artificial intelligence is entering the era of self-improvement. On Thursday afternoon, OpenAI released a new cutting-edge coding model that the company said assisted in its own creation. "GPT-5.3-Codex is our first model that was instrumental in creating itself," OpenAI stated in a blog post accompanying the model's release. "The Codex team used early versions to debug its own training, manage its own deployment, and diagnose test results and evaluations -- our team was blown away by how much Codex was able to accelerate its own development." As an AI-powered coding model, Codex allows users to build complex software programs and applications using instructions written in natural language. While AI researchers and engineers used to write individual lines of code, coding models such as Codex, Anthropic's Claude Code and Cursor outsource line-by-line writing to AI, while humans are only required to provide high-level instructions. These new AI systems and coding assistants are revolutionizing how developers write software, but they're also changing how AI companies themselves are developing the next generation of AI models and coding assistants. This tightening feedback loop, in which AI systems help design their successors, is changing the business of AI and breaking down walls between internal teams, while also raising questions about safety and the role of human engineers. The new coding model released Thursday afternoon, entitled GPT-5.3-Codex, builds on OpenAI's GPT-5.2-Codex model and combines insights from the AI company's GPT-5.2 model, which excels on non-coding tasks like text analysis and reasoning. OpenAI says the new model runs 25% faster than the previous version and achieves state-of-the-art performance while using fewer computing resources. OpenAI also said the model is the first to be designated as "high-capability" for cybersecurity-related tasks as defined by its internal risk-management assessments and guidelines. 
Increasingly capable coding models have been projected to ease hackers' or bad actors' ability to conduct cyberattacks, while also potentially boosting cyber defenses. Alex Embiricos, who leads the Codex team's product division, said the increasing speed and capabilities of coding agents are supercharging AI development. "If you look at where we were and the amount of progress we've had in the past year on our models' coding ability, it's crazy," Embiricos told NBC News. "I think the velocity there has been super high." "Researchers are using Codex themselves," he said. "A researcher will do a training run, and they'll be using Codex to 'babysit' or monitor that training run, or they'll use Codex to analyze some data about the training run, or they'll use Codex to clean up a data set or something like that." Embiricos said developers and engineers across OpenAI are using Codex not only to write new code, but also to validate and evaluate code to ensure it is effective and efficient. Beyond creating the underlying models that generate code, AI is also being harnessed to create the interfaces that customers and developers outside the company use to write code. On Monday, OpenAI released a Codex desktop app that is meant to assist in managing multiple largely autonomous AI computing tasks simultaneously. The app, meant to enable users to more easily create code and orchestrate multiple autonomous personal coding assistants, or AI coding agents, has already been downloaded more than 500,000 times. Ed Bayes, who leads product design for Codex, said the rise of Codex and coding models had completely changed how he designs products like the app. "I spend 90% of my time in code now. A year ago, it was flipped, and it was maybe 10%," Bayes told NBC News, reminiscing about using graphic design software or laying out design prototypes using Photoshop. "It's radically changing the way that people are able to build and prototype." 
Bayes said the new app, meant to both appeal to seasoned coding veterans and welcome less experienced coders with its intuitive layout and natural-language inputs, was itself a product of increasingly powerful Codex models. "A year ago, you couldn't build software in such a short period of time without it looking like it was built in such a short period of time. But now, engineers are becoming designers. Designers are becoming engineers. I think these walls are starting to kind of fall down." AI engineers and researchers at companies such as OpenAI and rival Anthropic have long sought to apply AI systems to the problem of developing AI models themselves. In October, OpenAI CEO Sam Altman stated the company aimed to have "an automated AI research intern by September" and a "true automated AI researcher by March of 2028." Many experts credit AI's improving software-writing abilities to the fact that coding tasks can be easily graded or evaluated by an AI system. If an AI system can easily confirm that a certain task is completed, that positive feedback can inform the system's future development. This quick feedback cycle is critical to current training methods for AI systems, which are highly dependent on so-called reinforcement learning, and eases automation of research tasks. Experts have predicted that AI systems capable of autonomously improving themselves, otherwise known as recursive self-improvement, might vastly increase the speed at which AI development takes place, potentially shortening years of complex research into weeks or even days. Some foresee this leading to a sort of explosion in the intelligence of AI systems, while others caution that such iteration might work to increase capabilities in some domains -- like software engineering -- while AI systems will continue to struggle on cognitive or physical abilities. Early last week, Anthropic CEO Dario Amodei told NBC News that the self-improving phenomenon is also starting to take shape at Anthropic. 
"We essentially have Claude designing the next version of Claude itself, not completely, not in all ways, but in many ways, that loop starts to close very fast."
[16]
OpenAI introduces Frontier agent management platform, GPT-5.3-Codex - SiliconANGLE
OpenAI Group PBC today introduced a platform called Frontier that companies can use to build and manage artificial intelligence agents. The offering debuted alongside a new version of the company's programming-optimized model. GPT-5.3-Codex, as the algorithm is called, is better than its predecessor at not only code generation but also general productivity tasks. Frontier enables workers to create AI agents in a chat box similar to the ChatGPT interface. The first step is to enter a natural language description of what tasks an agent should perform and how. From there, workers can integrate the agent with the applications that it will use to automate manual work. OpenAI says that Frontier can connect to customer relationship management platforms, data warehouses and a range of other services. Workers may optionally equip an agent with skills. Those are user-created extensions that make OpenAI-powered agents better at performing multi-step tasks. A skill can, for example, include pre-packaged Kubernetes configuration scripts that an agent can use to create container clusters. OpenAI says that Frontier-connected agents "build memories" of the tasks they perform. Those memories enable them to improve their output quality over time. According to the company, administrators can monitor an agent's performance in a dashboard that visualizes metrics such as the number of customer support tickets that it processes and its success rate. Frontier also includes other observability features. The platform creates an audit log of every task completed by an agent to ease quality evaluations. For example, the audit log of a technical support interaction might include visualizations that quantify the agent's politeness and output accuracy. OpenAI plans to provide customers with access to so-called forward deployed engineers who can help them develop Frontier best practices. 
Additionally, it will work with fellow AI companies to help them build Frontier-powered features. The initial partner roster includes sales automation provider Clay Labs Inc., medical notetaking specialist Ambience Healthcare Inc. and several other startups. GPT-5.3-Codex, the coding-focused model that OpenAI debuted today in conjunction with Frontier, generates responses 25% faster than its predecessor. The algorithm set new records on the SWE-Bench Pro and Terminal-Bench programming benchmarks. The former test includes coding questions that span four programming languages, while the latter evaluates AI models' ability to use the command line. OpenAI says that GPT-5.3-Codex also lends itself to a range of other use cases. The company evaluated the model using a benchmark called OSWorld that includes a large number of online research and file editing tasks. GPT-5.3-Codex scored 64.7%, 26.5 percentage points more than its predecessor.
[17]
OpenAI launches Codex App to bring its coding models, used to build OpenClaw, to more users | Fortune
OpenAI's move follows rival Anthropic's debut of a similar application for its popular Claude coding assistant, called Claude Cowork, in January. Like the new Codex App, Claude Cowork is designed for non-software developers to start using Claude to build AI agents that work across the software they already have installed on their computer. OpenAI's Codex tool has already been credited with some productivity gains. Independent developer Peter Steinberger, creator of the now viral AI tool OpenClaw (which spawned "social network for AI agents" Moltbook), said in a post on X that his productivity had roughly doubled since switching to OpenAI's Codex. Steinberger has said he built the entire OpenClaw tool with OpenAI's Codex, despite publicly saying Anthropic's Claude Opus is the "best general-purpose agent" and the model recommended for use in OpenClaw. Codex is also already being used within OpenAI to build new features and products. According to the company, a four-person engineering team built and shipped the Sora for Android app in just 28 days using Codex. With the Codex App, OpenAI appears to be trying to shore up users in the developer tools market. The company says that over a million developers have used Codex in the past month, including teams at startups like Harvey and Sierra as well as large enterprises like Cisco. But OpenAI is in an increasingly cutthroat fight with Anthropic and Google to be the platform on which software developers -- and increasingly others too -- build new software applications. Anthropic has reportedly been outpacing OpenAI in enterprise adoption -- though OpenAI disputes some of the data -- and has positioned itself as an enterprise AI company first and a consumer one second. Claude Code is now used by major companies including Uber, Netflix, Spotify, Salesforce, Accenture, and Snowflake, according to Anthropic. OpenAI is betting that developers want centralized control over agent workflows. 
The new app lets users run multiple AI agents at once across different projects, automate repetitive tasks, and monitor what the agents are doing. The company currently recommends using the GPT-5.2-Codex model for coding projects and GPT-5.2 for analysis and writing tasks, with adjustable thinking intensity and personality options.
[18]
GPT 5.3 Codex is here and it debugged its own training says OpenAI
OpenAI released GPT-5.3-Codex, a new coding model with improved reasoning and professional knowledge capabilities that operates 25 percent faster than its predecessor. Anthropic launched rival model Claude Opus 4.6 on the same day. Earlier this week, OpenAI introduced the Codex app for macOS to manage multiple AI agents simultaneously. OpenAI describes GPT-5.3-Codex as its first model instrumental in creating itself. The company explains that developers leveraged early versions of the model during its development process. These versions assisted in specific technical tasks essential to building the final product. In a blog post announcing the model, OpenAI states: "The Codex team used early versions to debug its own training, manage its own deployment, and diagnose test results and evaluations -- our team was blown away by how much Codex was able to accelerate its own development." Debugging training involves identifying and correcting errors in the model's learning algorithms. Managing deployment covers coordinating the rollout of the model across servers and infrastructure. Diagnosing test results requires analyzing performance metrics from benchmarks and evaluations to refine outputs. Anthropic recently made a comparable assertion about its Claude Cowork model. Engineers at both OpenAI and Anthropic report that artificial intelligence now performs almost all of their coding work. This shift indicates heavy reliance on AI tools for software development tasks at these organizations. The OpenAI blog post further details the model's expanded abilities. It specifies: "With GPT‑5.3‑Codex, Codex goes from an agent that can write and review code to an agent that can do nearly anything developers and professionals can do on a computer." Writing code entails generating functional scripts and programs. Reviewing code includes checking for bugs, inefficiencies, and adherence to standards. 
The broader scope encompasses tasks like data analysis, file management, and interface interactions typically handled by human experts on computing devices. The announcements highlight self-improving models as real-world examples relevant to discussions of the technological singularity. These discussions define the singularity as a tipping point where technology becomes self-improving, leading to an uncontrolled explosion of technological advancement.
[19]
OpenAI Just Launched a New Vibe-Coding App. Here's How to Start Building With It Today
OpenAI just lobbed a grenade at vibe-coding startups like Cursor and Windsurf. The company behind ChatGPT has announced the Codex MacOS App, its take on an integrated development environment (IDE) that's been customized to make full use of OpenAI models. For non-technical people, you can think of it as OpenAI's purpose-built vibe coding app. App developers use IDEs to write code and view changes to code, but over the last few years, several tech startups have released upgraded IDEs with generative AI capabilities, enabling them to write code from prompts. This has led to the rise of "vibe coding," a new method of programming in which even non-developers can create software using nothing but their own words. Now, OpenAI has released its own IDE, but unlike others, you can't actually write code in it. But thanks to AI, you don't need to actually write code in order to make apps. Instead, this new Codex app focuses on providing a hub where you can run multiple AI agents simultaneously to accomplish multiple coding tasks at once, like directing one agent to handle the user interface of an app you want to create while another sets up the backend.
[20]
GPT-5.3-Codex Arrives as OpenAI's First AI Model That Helped Build Itself
It is OpenAI's first model rated "high capability" in cybersecurity
OpenAI released GPT-5.3-Codex, a new artificial intelligence (AI) model for agentic coding tasks, on Thursday. The San Francisco-based AI giant said that the new model will allow Codex to do more, including developing complex video games and applications from scratch. Calling it the company's most capable agentic coding model, OpenAI said it can handle full workflows, debug entire codebases, research requirements, and deploy changes. Interestingly, GPT-5.3-Codex is also the AI giant's first AI model that played a key role in its own development.
GPT-5.3-Codex Released
In a post, the AI giant detailed its latest frontier model for agentic coding. It is currently available with all the paid ChatGPT plans globally across mobile and desktop apps, command-line interface (CLI), integrated development environment (IDE) extension, and web. The company will soon offer the model via OpenAI's application programming interface (API) as well. The latest model fuses the coding frontier performance of GPT-5.2-Codex with the reasoning and professional knowledge depth of GPT-5.2, creating a single, unified system. The model is 25 percent faster, allowing it to tackle long-horizon projects that involve research, tool usage and intricate execution steps. Crucially, users can steer it mid-task, which was not possible earlier. As a result, users can get progress updates, ask questions, suggest course corrections, or debate approaches without the agent dropping context. The model even helped build itself. Early versions assisted the Codex team in debugging training runs, managing deployment and diagnosing evaluation results. OpenAI says the team was impressed by how much acceleration this self-assistance provided during development. On performance, OpenAI shared benchmark scores based on internal evaluations. 
On SWE-Bench Pro, a tough real-world software engineering test across multiple languages, it hits 56.8 percent accuracy, edging out GPT-5.2-Codex at 56.4 percent and GPT-5.2 at 55.6 percent. Terminal-Bench 2.0 sees a jump to 77.3 percent from 64.0 percent on the prior Codex variant. In OSWorld-Verified, which measures agent performance in visual desktop environments for productivity tasks, it reaches 64.7 percent compared to 38.2 percent for GPT-5.2-Codex. GPT-5.3-Codex can build complex web games from underspecified prompts, iterating autonomously over millions of tokens. One demo shared in the announcement post showed a racing game complete with maps, items and racers. It also generates more production-ready websites, automatically handling features like discount displays or testimonial carousels. Outside pure coding, it supports the full software lifecycle, including writing PRDs, editing copy, user research, building slide decks, analyzing spreadsheets and monitoring systems. OpenAI has also focused on the safety guardrails of the AI model. GPT-5.3-Codex is the first model OpenAI classifies as High capability under its Preparedness Framework for cybersecurity tasks. It deploys a comprehensive cybersecurity safety stack, including safety training, automated monitoring, trusted access controls, and threat intelligence enforcement.
[21]
OpenAI launches Codex macOS app with parallel agentic coding
OpenAI launched a macOS app for its Codex coding tool on Monday, incorporating agentic practices that enable multiple agents to operate in parallel while integrating advanced workflows, less than two months after releasing GPT-5.2-Codex, its most powerful coding model. Artificial intelligence systems now handle substantial portions of software development tasks previously done manually, with swarms of agents and subagents performing the routine aspects of programming. Developers continue to test various interfaces and formats for human-AI collaboration, creating challenges for AI research organizations to match evolving practices. Agentic software development, where AI agents operate autonomously on coding assignments, represents the prevailing approach, as demonstrated by applications such as Claude Code and Cowork. OpenAI first introduced Codex as a command-line interface tool in April, followed by a web-based version one month afterward. The macOS application marks the company's effort to align with these agentic advancements. The app supports simultaneous operation of multiple agents, combining their individual capabilities with contemporary workflow methods. OpenAI positions this release to attract users from competing tools like Claude Code. GPT-5.2-Codex serves as OpenAI's leading coding model. During a press call, CEO Sam Altman stated, "If you really want to do sophisticated work on something complex, 5.2 is the strongest model by far." He added, "However, it's been harder to use, so taking that level of model capability and putting it in a more flexible interface, we think is going to matter quite a bit." Coding performance evaluations present a mixed picture for GPT-5.2. The model achieves the highest score on TerminalBench, which assesses AI performance on command-line programming tasks. Scores from Gemini 3 and Claude Opus trail slightly but fall within the benchmark's margin of error. 
On SWE-bench, which evaluates AI ability to resolve actual software bugs encountered in production environments, outcomes show no definitive superiority for GPT-5.2 over competitors. Benchmarking agentic applications proves challenging due to their complexity, and performance in practical scenarios differs markedly among leading models based on individual user interactions. The Codex macOS app includes several capabilities aimed at matching or surpassing features in Claude-based applications. Users can configure automations to execute in the background according to preset schedules, with outputs stored in a queue for later examination upon the user's return. Additionally, the app offers selectable agent personalities tailored to diverse preferences, spanning from pragmatic approaches focused on efficiency to empathetic styles that prioritize user alignment. OpenAI emphasizes the application's capacity to accelerate development processes significantly. Altman described this aspect, noting, "You can use this from a clean sheet of paper, brand new, to make a really quite sophisticated piece of software in a few hours." He continued, "As fast as I can type in new ideas, that is the limit of what can get built." These elements position the Codex macOS app within the competitive landscape of agentic coding tools, building on OpenAI's incremental expansions of the Codex platform since its initial command-line debut.
[22]
OpenAI Launches Codex App: Multi-Agent AI for Developers
On February 2, 2026, OpenAI announced the release of the Codex app for macOS, a specialized desktop interface that will transform the way engineers work together on lengthy software projects, manage AI agents, and execute concurrent activities. With this release, professional engineers and aspiring developers may now scale their productivity like never before thanks to a significant advancement in the integration of AI-assisted coding into actual development workflows. Fundamentally, the Codex app functions as a "command center for agents" - a potent hub where users can manage multiple AI agents, monitor project workflows, and oversee extensive, long-term operations that extend well beyond simple code development. According to OpenAI, existing tools such as traditional IDEs and command-line interfaces were not built to support this next wave of AI-assisted development, where agents operate autonomously for hours, days, or even weeks. The Codex app's capability to manage several agents at once is one of its unique features. Developers can switch between tasks without losing context because each agent operates in a separate thread, and threads are organized by project. Through the app's user-friendly interface, users may interact with an agent's output by examining changes, commenting on diffs, and even making manual adjustments in their editor of choice. Additionally, worktrees, a workflow innovation that allows many agents to work on separate copies of the same codebase without causing merge conflicts, are supported by the app. This facilitates the exploration of various implementation methodologies and the parallel experimentation with various features, which are difficult to accomplish with conventional single-threaded coding tools. The coding experience is changed from straightforward interactions with a single assistant to a well-coordinated arrangement of intelligent collaborators in this multi-agent environment. 
Each collaborator is able to do unique development tasks while staying in line with the project's overall objectives. Although Codex started out as a code-generating tool, it now serves as a more comprehensive productivity partner. The Codex app allows users to link the agent to external tools and workflows through a concept called skills. To enable the AI to communicate with tools such as design platforms, cloud deployment services, and project management systems, skills combine instructions, utilities, and scripts. Developers can use skills, for instance, to deploy completed apps to well-known cloud platforms like Vercel or Netlify, connect with issue trackers to track releases or triage issues, or retrieve design assets from tools like Figma and convert them into production-ready UI code. With these features, Codex becomes an independent workflow engine rather than just a coding assistant. Automations, which enable users to set up repetitive background actions, are another potent feature. Codex can run continuously because of automations, which relieve developers of time-consuming, repetitive tasks like creating progress summaries, automated testing, and daily issue triage. Security is the cornerstone upon which OpenAI has constructed the Codex app. Similar to the Codex CLI, the agent stack uses native, open-source system-level sandboxing. Agents can only edit files in their assigned directories by default; they cannot carry out tasks requiring elevated permissions unless specifically permitted. In order to automate these permissions for trusted tasks, teams may also set up rules. Codex continues to be a secure and reliable tool for businesses of all sizes, from individual developers to enterprise teams, thanks to this careful balancing act between power and control. The Codex app is available to macOS users right now. All ChatGPT subscription tiers - Plus, Pro, Business, Enterprise, and Edu - include it, while users can choose to buy more usage credits. 
Even users of ChatGPT Free and Go have temporary access to Codex to investigate its features on several platforms, such as the cloud, IDE extensions, and CLI. In the future, OpenAI intends to bring the program to Windows and improve workflows based on feedback from actual developers. Future developments will probably include autonomous cloud-based triggers, which will allow Codex to function flawlessly in the background, as well as ongoing advancements to further simplify multi-agent communication. The Codex app represents a significant leap forward in AI-driven software development. By enabling seamless multi-agent orchestration, deep tool integration through skills, and intelligent automation, OpenAI has created a platform that not only boosts developer productivity but also reimagines how complex software projects can be built, with AI as a trusted collaborator at every stage.
[23]
New OpenAI Codex App Launches : Multi-Agent Worktrees on macOS
What if you could delegate your most tedious tasks to a virtual assistant that not only understands your workflow but actively improves it? OpenAI has today launched its new Codex app for macOS, a release that promises to redefine how developers and technical teams collaborate with artificial intelligence. With features like multi-agent management and advanced task automation, Codex isn't just another productivity booster, it's a bold step toward a future where AI becomes an indispensable partner in tackling complex challenges. This isn't about replacing human ingenuity but amplifying it, and the possibilities are as exciting as they are fantastic. Learn how the OpenAI Codex app has been designed to improve workflows by automating repetitive tasks, streamlining team collaboration, and offering customizable interaction styles to suit diverse working preferences. Whether you're a developer juggling multiple projects or a technical professional looking to standardize processes, this app offers something uniquely tailored to your needs. But Codex isn't just about efficiency, it's about empowering users to focus on what truly matters: solving problems, innovating, and creating. As we unpack its core features and potential applications, you might find yourself rethinking what's possible in your own work. OpenAI has introduced the Codex app for macOS, a sophisticated tool designed to enhance collaboration between developers and artificial intelligence. By incorporating features such as multi-agent management, skill integration, and advanced task automation, Codex aims to optimize workflows, improve productivity, and address complex challenges in software development and other technical fields. This release marks a significant advancement in AI-powered tools, offering a versatile platform for professionals seeking efficiency and innovation in their work. The Codex app is equipped with a range of features tailored to meet the needs of developers and technical teams. 
These functionalities are designed to simplify processes, reduce repetitive tasks, and enhance collaboration. One of the standout features of Codex is its ability to manage multiple AI agents simultaneously. This functionality allows you to delegate tasks across agents, enabling parallel execution without losing context or efficiency. For instance, while one agent focuses on debugging code, another can generate documentation or run tests, all within the same project environment. Tasks are organized into structured worktrees, ensuring conflict-free collaboration on shared repositories. This approach not only enhances scalability but also ensures that even the most complex projects can be managed effectively over time. Codex introduces a robust skill integration system that goes beyond basic code generation. You can develop, manage, and reuse skills tailored to specific tasks, such as automating repetitive actions, solving technical problems, or generating overviews. These skills can be shared across teams, fostering collaboration and ensuring consistency in workflows. For example, a team working on a large-scale application can create a library of reusable skills to handle common processes, reducing errors and saving valuable time. This feature makes Codex a powerful tool for standardizing operations and improving team efficiency. The app's advanced automation capabilities allow you to schedule and execute repetitive tasks with minimal effort. By integrating reusable skills into these automations, Codex ensures that routine operations are completed efficiently and accurately. For example, tasks such as running tests, generating performance overviews, or updating documentation can be automated, freeing you to concentrate on strategic problem-solving and creative work. 
The ability to queue results for review further enhances the reliability and effectiveness of these automated processes. Recognizing the diverse working styles of developers, Codex offers customizable interaction modes to suit individual preferences. You can choose between pragmatic and conversational styles, tailoring the app's responses to your specific needs. The pragmatic mode delivers concise, task-focused outputs, ideal for quick decision-making, while the conversational mode provides detailed explanations and guidance for more in-depth understanding. This flexibility ensures that Codex adapts to your workflow, enhancing usability and making it a valuable tool for a wide range of technical professionals. Security is a fundamental aspect of Codex's design. The app employs system-level sandboxing to restrict agent access and protect sensitive data. Permissions for elevated commands can be configured, and project-specific rules can be established to ensure that agents operate within clearly defined boundaries. These security measures are particularly critical for developers working on confidential or high-stakes projects, providing peace of mind without compromising functionality. By prioritizing data protection, Codex enables you to focus on your work with confidence. The Codex app is currently available to macOS users with ChatGPT Plus, Pro, Business, Enterprise, and Edu subscriptions. Free and Go users can also access the app temporarily, although paid plans offer increased rate limits and additional features. This tiered subscription model ensures that Codex is accessible to a broad audience while catering to the specific needs of professional developers and organizations. By offering flexible options, OpenAI aims to make Codex a valuable tool for users at all levels of expertise. OpenAI has outlined ambitious plans to expand Codex's capabilities in the near future. 
Upcoming updates will include support for Windows, making the app accessible to a wider range of users. Additionally, the introduction of cloud-based triggers will enable agents to perform tasks continuously in the background, even when the app is not actively in use. These enhancements are designed to further integrate Codex into technical workflows, making sure it remains a vital tool for developers and other professionals. While Codex is primarily designed for software development, its potential applications extend far beyond this domain. The app's ability to reason about and generate code makes it a valuable resource for industries requiring technical expertise and complex problem-solving. Whether managing large-scale projects, automating repetitive tasks, or tackling individual challenges, Codex's advanced AI capabilities can help you achieve your objectives more efficiently. Its versatility positions it as a fantastic tool for knowledge work across various fields.
[24]
OpenAI Introduces Codex App With Agentic Coding for macOS
Developers can review agent diffs directly within project threads
OpenAI has released the Codex app for macOS, a dedicated desktop application designed to help developers manage and collaborate with multiple AI coding agents at once. The new app gives programmers a macOS-native interface where teams of artificial intelligence (AI) agents can run tasks in parallel, organise workflows by thread, and interact with code repositories directly, the company said in its announcement. Notably, in December, the AI giant added Skills to Codex, letting users get more capabilities via specialised prompt structures.
OpenAI Releases Codex App for macOS
In a blog post, the company announced and detailed the new agentic vibe coding platform for macOS. The Codex app serves as a central hub for agent-driven coding tasks that can operate simultaneously. Traditionally, AI coding tools handle one task at a time or require context switches between browser tabs, text editors and terminal windows. The new app is designed to let multiple AI agents work on different tasks in parallel within the same environment, which OpenAI said can speed up development cycles and reduce manual coordination. Codex itself is a cloud-based software engineering agent that OpenAI first unveiled in April 2025. It draws from large language models (LLMs) fine-tuned for coding to translate natural-language prompts into code, propose pull requests, fix bugs, generate tests, and help with exploratory work across repositories. The Codex model powers tools such as GitHub Copilot and the Codex CLI, and is capable of working on tasks across multiple programming languages, including Python, JavaScript, Go and others. The Codex app includes built-in support for threads, wherein each coding agent runs in a separate stream tied to specific project tasks. 
Developers can open multiple threads for features such as new module creation, debugging sessions, or code review suggestions, and track each agent's output without context overlap. This organisation helps keep changes distinct, reducing the risk of conflicts when working with the same repository across different tasks. The app also supports diff review and iteration within threads. Developers can inspect changes, comment on them, and refine code directly where the work was created, without leaving the context of the agent's output. Skills, defined as reusable workflows combining instructions, scripts, and resources, let teams define common tasks so agents can repeat them consistently across projects. OpenAI stated the Codex app is available for macOS, with support for Apple Silicon, and can be downloaded and installed directly on compatible systems. Users can sign in with their ChatGPT account or an OpenAI application programming interface (API) key to access project threads and agent workflows. The company is also working on bringing the app to additional platforms such as Windows and Linux in the coming months.
[25]
OpenAI launches Codex app to gain ground in AI coding race - The Economic Times
OpenAI is launching a mobile app for its coding tool, Codex, in hopes of seizing momentum -- and customers -- from its rivals in the AI code-generation space. OpenAI said it designed its Codex app to make it easy for users to simultaneously manage multiple artificial-intelligence agents over a long period of time. The app can also then use code to do things like gather and analyze information, OpenAI officials said during a briefing with reporters. Coding is arguably the most successful application for AI models in recent years. Coding tools are key to helping AI startups attract business customers and the space has grown increasingly competitive. OpenAI lags behind its rivals in this area, particularly the AI startup Anthropic, which has dominated the coding market with its Claude Code tool. Anthropic says Claude Code reached $1 billion in revenue, on an annualized basis, in the six months after it was made available to the public. The Codex app, released on Monday, is part of OpenAI's redoubled efforts to gain ground. OpenAI said it's also tried to make the app more user-friendly to help make these advanced capabilities more accessible to the masses. Many observers of the coding space say the code-generation tools aren't yet good enough to entirely replace human tech workers. But these tools do make their work significantly faster. "The models just don't run out of dopamine," CEO Sam Altman said, relaying a recent conversation with a colleague. "They keep trying, they don't run out of motivation."
[26]
OpenAI Seeks to Catch-up with Claude with a macOS Codex App
The question though is whether Sam Altman's company delayed recognizing that enterprise AI is more valuable than weekly individual unique users. When OpenAI first appeared as a harbinger of Artificial Intelligence (AI) three years ago, we were dazzled by how it helped us simplify our written work, right from emails to thank you notes, and from birthdays to news briefs. However, the true test followed a year later when the same AI tools began writing software. Three years down the line, agentic software development is the in thing, where agents work on coding tasks in relative independence. Claude Code and Cowork apps have captured headlines on this front, putting Anthropic ahead of its competitors in the enterprise-ready AI solutions market. Even Google and Microsoft appear bemused at this strategic shift. However, the company that started it all appears to be only playing catch-up now. OpenAI began work on its Codex tool and launched a command line tool last April before expanding it into a web interface a month later. Now the company has put another foot forward by launching a new macOS app for Codex that integrates several agentic practices that gained popularity. Can't blame OpenAI for the delay though, as when companies made a beeline towards creating enterprise-level solutions that would fulfil the need for grunt work among programmers, the demand for agents and sub-agents became so intense that keeping pace became tough. The new app from OpenAI works with multiple agents in parallel and integrates agent skills and other state-of-the-art workflows, the company says. Incidentally, the latest launch comes about two months after the company launched GPT-5.2 Codex, considered to be the most powerful to come out of Sam Altman's workshop. Now Altman is hoping that the app offers enough to users to move them over from Claude Code. In a media interaction, he said, "If you really want to do sophisticated work on something complex, 5.2 is the strongest model by far. However, it's been harder to use, so taking that level of model capability and putting it in a more flexible interface, we think is going to matter quite a bit." Of course, this could well be wishful thinking on his part. Why so? Because Anthropic is already a few streets ahead with its agentic tool Cowork. The company launched plug-ins that further automate specialised tasks within various departments of an enterprise where it gets used. This capability, which allows a company to extend Claude Code with bespoke functionalities across projects or teams, has been made available within Cowork, which was released on January 12 for Claude Max subscribers operating on the Apple ecosystem. From Altman's perspective, GPT-5.2 is the ultimate and reports suggest that it does hold the top spot on TerminalBench, which is used to measure how well AI tackles command-line programming tasks. But agents from Gemini 3 and Claude Opus have now received equivalent scores, suggesting that OpenAI does not have any significant advantage. As for the Codex app, OpenAI says its features would help it achieve parity and in some cases outpace various Claude apps. It allows automations that can be set up to run in the background on an automatic schedule. In addition, users can select different personalities for the agent depending on their working style. The verdict is still out on this functionality. As for Altman, he believes that his company has a winner on its hands due to the pace at which these developments have been made possible by AI. "You can use this from a clean sheet of paper, brand new, to make a really quite sophisticated piece of software in a few hours. As fast as I can type in new ideas, that is the limit of what can get built," he claims. For now, we can only wait and see whether Claude users agree.
[27]
OpenAI's GPT-5.3 Codex Signals Direct Challenge to Anthropic's Claude Opus 4.6
OpenAI has launched a new AI model called GPT-5.3 Codex, along with a new enterprise platform, Frontier. GPT-5.3 Codex is aimed at boosting Codex, OpenAI's agentic coding tool marketed for software developers. Frontier aims to help enterprises build, deploy and manage AI agents. Earlier versions of Codex mainly focused on helping developers write and review code. With the new GPT-5.3 Codex model, OpenAI says Codex can now do much more. According to the company, the model turns Codex into an agent that can do 'nearly anything developers and professionals do on a computer, expanding who can build software and how work gets done.' OpenAI stated in the blog post, "GPT-5.3 Codex performs better on several benchmarks and is 25 percent [faster]." The company also claimed, "New model can create highly functional complex games and apps from scratch over the course of days." As part of internal testing, the company used Codex to build two web-based games, consuming millions of tokens over extended runs. "[GPT-5.3-Codex] sets a new industry high on SWE-Bench Pro and Terminal-Bench, and shows strong performance on OSWorld and GDPval, four benchmarks we use to measure coding, agentic and real-world capabilities," OpenAI mentioned. Compared with GPT-5.2-Codex, it also better understands your intent when you ask it to build day-to-day websites.
[28]
OpenAI launches Codex app to gain ground in AI coding race
SAN FRANCISCO, Feb 2 (Reuters) - OpenAI is launching a mobile app for its coding tool, Codex, in hopes of seizing momentum -- and customers -- from its rivals in the AI code-generation space. OpenAI said it designed its Codex app to make it easy for users to simultaneously manage multiple artificial-intelligence agents over a long period of time. The app can also then use code to do things like gather and analyze information, OpenAI officials said during a briefing with reporters. Coding is arguably the most successful application for AI models in recent years. Coding tools are key to helping AI startups attract business customers and the space has grown increasingly competitive. OpenAI lags behind its rivals in this area, particularly the AI startup Anthropic, which has dominated the coding market with its Claude Code tool. Anthropic says Claude Code reached $1 billion in revenue, on an annualized basis, in the six months after it was made available to the public. The Codex app, released on Monday, is part of OpenAI's redoubled efforts to gain ground. OpenAI said it's also tried to make the app more user-friendly to help make these advanced capabilities more accessible to the masses. Many observers of the coding space say the code-generation tools aren't yet good enough to entirely replace human tech workers. But these tools do make their work significantly faster. "The models just don't run out of dopamine," CEO Sam Altman said, relaying a recent conversation with a colleague. "They keep trying, they don't run out of motivation." (Reporting by Deepa Seetharaman in San Francisco; Editing by Mark Porter)
[29]
OpenAI brings GPT 5.3 Codex model and Frontier platform: What it means for AI users
OpenAI has launched a new AI model called GPT-5.3 Codex, along with a new enterprise platform named Frontier. GPT-5.3 Codex is designed to boost Codex, OpenAI's agentic coding tool marketed for software developers. Meanwhile, Frontier aims to help enterprises build, deploy and manage AI agents. Earlier versions of Codex mainly focused on helping developers write and review code. With the new GPT-5.3 Codex model, OpenAI says Codex can now do much more. According to the company, the model turns Codex into an agent that can do 'nearly anything developers and professionals do on a computer, expanding who can build software and how work gets done.' OpenAI says GPT-5.3 Codex performs better on several benchmarks and is 25 percent faster than the previous version. The company also claims the model can create 'highly functional complex games and apps from scratch over the course of days.' One notable detail about GPT-5.3 Codex is how it was built. OpenAI says it is the first model that 'was instrumental in creating itself.' Early versions of the model were used by OpenAI staff to debug the system and check how well it was working. Alongside GPT-5.3 Codex, OpenAI unveiled Frontier, a platform made for businesses that want to use AI agents at scale. Frontier helps companies build, manage, and control AI agents, including ones created outside OpenAI. OpenAI says Frontier works like managing employees, with onboarding and feedback systems to help agents improve over time. Companies like HP, Oracle, State Farm, and Uber are already using it, though Frontier is currently limited to select users and will roll out more widely in the coming months.
For AI users, these launches hint at a future where AI agents play a bigger role in everyday work, from coding to enterprise operations.
[30]
OpenAI launches Codex macOS app agentic coding: Features, availability and other details
OpenAI has launched a new Codex app for macOS, a new interface designed to make working with AI coding agents easier, faster, and more collaborative. The app allows developers to manage multiple agents at once, run tasks in parallel, and oversee long-running projects, changing how software is built. Since its debut in April 2025, Codex has enabled developers to handle complex tasks from start to finish. The new macOS app acts as a central command centre, providing a workspace to supervise multiple AI agents, review their work, and collaborate more efficiently than traditional IDEs or terminal tools allow. The Codex app lets developers work with multiple agents simultaneously, each running in separate threads organised by project. Users can switch between tasks without losing context, review code changes, comment on diffs, and open files in their preferred editor for manual adjustments. This setup ensures developers remain fully in control while agents handle routine coding work. To prevent conflicts when multiple agents work on the same repository, the app includes support for worktrees, providing each agent with an isolated copy of the code. Developers can explore different approaches without worrying about breaking the main project. They can review changes locally or let agents continue progressing independently. The app includes a library of skills for tools and workflows that have become popular at OpenAI. Developers can convert Figma designs into production-ready UI code, manage bugs and releases in Linear, deploy apps to popular cloud platforms, generate images for UI mockups and product visuals, reference OpenAI API documentation, and create or edit PDFs, spreadsheets, and docx files.
The Codex macOS app is now available. Users with a ChatGPT Plus, Pro, Business, Enterprise or Edu subscription can use Codex across the CLI, web, IDE-extension and app with their ChatGPT login. For a limited time, ChatGPT Free and Go users can access Codex as well. OpenAI is also temporarily doubling rate limits for all paid plans.
OpenAI released a new MacOS app for Codex, its AI coding tool, alongside the GPT-5.3-Codex model that runs 25% faster than previous versions. The launch marks OpenAI's push to catch up with Anthropic's Claude Code, which has dominated the agentic coding market and reached $1 billion in annualized revenue within six months of public release.
OpenAI unveiled a new MacOS app for Codex on Monday, marking a significant step in the company's effort to catch up with competitors in the rapidly evolving AI coding landscape [1]. The launch comes as agentic coding, where AI agents work independently on programming tasks, has become the dominant trend in software development [1]. OpenAI faces stiff competition from Anthropic, whose Claude Code tool has captured significant market share and reportedly reached $1 billion in annualized revenue within six months of public availability [4].
The new Codex app provides developers with a centralized workspace to manage multiple AI coding agents across projects without losing task context [5]. According to OpenAI, agents run in separate threads organized by projects, allowing developers to switch between tasks seamlessly. Sam Altman, OpenAI's CEO, emphasized the app's ability to accelerate development: "You can use this from a clean sheet of paper, brand new, to make a really quite sophisticated piece of software in a few hours" [1].

Days after the MacOS app launch, OpenAI announced GPT-5.3-Codex, describing it as the most capable agentic coding model yet released [2]. The new model runs 25% faster than previous versions while using fewer resources, enabling longer-running tasks that can span more than one day [2][3]. OpenAI says the model sets a new industry high on SWE-Bench Pro and Terminal-Bench [2]. Its predecessor, GPT-5.2, also held the top spot on TerminalBench, though agents from Gemini 3 and Claude Opus logged roughly equivalent scores, within the benchmark's margin of error [1].
Remarkably, GPT-5.3-Codex was instrumental in creating itself, with the Codex team using early versions to debug its own training, manage deployment, and diagnose test results [3]. Alexander Embiricos, OpenAI's product lead for Codex, stated in December that "the vast majority of Codex is built by Codex, so it's almost entirely just being used to improve itself" [3]. This self-improvement capability aligns with Sam Altman's stated goal of building an AI research intern by September 2026 and a fully automated AI researcher by March 2028 [3].

GPT-5.3-Codex represents a shift from code generation to comprehensive workflow support, handling debugging, deployment, monitoring, writing PRDs, editing copy, user research, tests, and metrics [2]. OpenAI describes it as moving "from an agent that can write and review code to an agent that can do nearly anything developers and professionals can do on a computer" [2]. The model can build slide decks and spreadsheets, extending its utility beyond traditional programming tasks [2].
The new app adds features aimed at ease of use, including background automations that run on automatic schedules, with results queued for review [1]. Users can select different agent personalities, from pragmatic to empathetic, depending on their working style [1]. GPT-5.3-Codex also better understands intent when asked to build websites: simple or underspecified prompts now default to sites with more functionality and sensible defaults [2].
The launch positions OpenAI to compete more directly with Anthropic, which recently introduced Cowork, extending Claude Code beyond programming into broader enterprise workflows [5]. Sam Altman acknowledged the competitive pressure, stating that GPT-5.2 "is the strongest model by far" for sophisticated work, though "it's been harder to use" [1]. The new interface aims to make that power more accessible and so attract Claude Code users.

While many observers note that autonomous AI coding tools aren't yet good enough to entirely replace human tech workers, they significantly accelerate development work [4]. Altman, relaying a recent conversation with a colleague, described this endurance advantage: "The models just don't run out of dopamine. They keep trying, they don't run out of motivation" [4]. As enterprises weigh the impact of these autonomous AI coding tools on development workflows and governance, the competition between OpenAI and Anthropic will likely shape how organizations approach the software development lifecycle in the coming months.

Summarized by Navi