Curated by THEOUTPOST
On Sat, 17 May, 12:05 AM UTC
13 Sources
[1]
OpenAI introduces Codex, its first full-fledged AI agent for coding
We've been expecting it for a while, and now it's here: OpenAI has introduced an agentic coding tool called Codex in research preview. The tool is meant to allow experienced developers to delegate rote and relatively simple programming tasks to an AI agent that will generate production-ready code and show its work along the way. Codex is a unique interface (not to be confused with the Codex CLI tool introduced by OpenAI last month) that can be reached from the side bar in the ChatGPT web app. Users enter a prompt and then click either "code" to have it begin producing code, or "ask" to have it answer questions and advise. Whenever it's given a task, that task is performed in a distinct container that is preloaded with the user's codebase and is meant to accurately reflect their development environment. To make Codex more effective, developers can include an "AGENTS.md" file in the repo with custom instructions, for example to contextualize and explain the code base or to communicate standardizations and style practices for the project -- kind of a README.md but for AI agents rather than humans. Codex is built on codex-1, a fine-tuned variation of OpenAI's o3 reasoning model that was trained using reinforcement learning on a wide range of coding tasks to analyze and generate code, and to iterate through tests along the way. OpenAI's announcement post about Codex is filled with objection handling to tackle the common refrains against AI coding agents; based on older tools and models, many developers accurately point out that LLM coding tools (especially when used for vibe coding instead of just for code completion or as an advisor) have been known to produce scripts that don't follow standards, are opaque or difficult to debug, or are insecure. 
The fine tuning that led to codex-1 is meant to address these concerns in part, and it's also key that Codex shows its thinking and work every step of the way as it goes through its tasks (which can take anywhere from one to 30 minutes to complete). All that said, OpenAI notes that "it still remains essential for users to manually review and validate all agent-generated code before integration and execution." Codex is available in a research preview, but it's rolling out to all ChatGPT Pro, Enterprise, and Team users now. Plus and Edu support is coming at a later date. For now, "users will have generous access at no additional cost for the coming weeks" so that they "can explore what Codex can do," but OpenAI says it intends to introduce rate limits and a new pricing scheme later.
[2]
OpenAI launches Codex, an AI coding agent, in ChatGPT
OpenAI announced on Friday it's launching a research preview of Codex, the company's most capable AI coding agent yet. Codex is powered by codex-1, a version of the company's o3 AI reasoning model optimized for software engineering tasks. OpenAI says codex-1 produces "cleaner" code than o3, adheres more precisely to instructions, and will iteratively run tests on its code until passing results are achieved. The Codex agent runs in a sandboxed, virtual computer in the cloud. By connecting with GitHub, Codex's environment can come preloaded with your code repositories. OpenAI says the AI coding agent will take anywhere from one to 30 minutes to write simple features, fix bugs, answer questions about your codebase, and run tests, among other tasks. Codex can handle multiple software engineering tasks simultaneously, says OpenAI, and it doesn't limit users from accessing their computer and browser while it's running. Codex is rolling out starting today to subscribers to ChatGPT Pro, Enterprise, and Team. OpenAI says users will have "generous access" to Codex to start, but in the coming weeks, the company will implement rate limits for the tool. Users will then have the option to purchase additional credits to use Codex, an OpenAI spokesperson tells TechCrunch. OpenAI plans to expand Codex access to ChatGPT Plus and Edu users soon. AI tools for software engineers, also known as vibe coders, have surged in popularity in recent months. The CEOs of Google and Microsoft claim that roughly 30% of their companies' code is now written by AI. In February, Anthropic released its own agentic coding tool, Claude Code, and in April, Google updated its AI coding assistant, Gemini Code Assist, with more agentic abilities. All that vibe coding has made the businesses behind AI coding platforms some of the fastest-growing in tech. 
Cursor, among the most popular AI coding tools, reached annualized revenue of around $300 million in April and is reportedly raising new funds at a $9 billion valuation. Now, OpenAI wants a piece of the pie. The ChatGPT maker has reportedly closed on a deal to acquire Windsurf, the developer behind another popular AI coding platform, for $3 billion. The launch of Codex shows very clearly that OpenAI is building out its own AI coding tools, in addition. Users with access to Codex can find the tool in ChatGPT's sidebar, and assign the agent new coding tasks by typing a prompt and clicking the "Code" button. Users can also ask questions about their codebase and click the "Ask" button. Below the prompting bar, users can see other tasks they've assigned Codex to do, and monitor their progress. In a briefing ahead of Codex's launch, OpenAI's Agents Research Lead, Josh Tobin, told TechCrunch the company eventually wants its AI coding agents to act as "virtual teammates," completing tasks autonomously that take human engineers "hours or even days" to accomplish. OpenAI claims it's already using Codex internally to offload repetitive tasks, scaffold new features, and draft documentation. OpenAI Product Lead Alexander Embiricos says a lot of the safety work for the company's o3 model applies to Codex as well. In a blog post, OpenAI says Codex will reliably refuse requests to develop "malicious software." Furthermore, Codex operates in an air-gapped environment, with no access to the broader internet or external APIs. This limits how dangerous Codex could be in the hands of a bad actor -- but it may also hamper its usefulness. It's worth noting that AI coding agents, much like all generative AI systems today, are prone to mistakes. A recent study from Microsoft found that industry-leading AI coding models, such as Claude 3.7 Sonnet and o3-mini, struggled to reliably debug software. However, that doesn't seem to be dampening investor excitement in these tools. 
OpenAI is also updating Codex CLI, the company's recently launched open-source coding agent that runs in your terminal, with a version of its o4-mini model that's optimized for software engineering. That model is now the default in Codex CLI, and will be available in OpenAI's API for $1.50 per 1M input tokens (roughly 750,000 words, more than the entire Lord of the Rings book series) and $6 per 1M output tokens. Codex's launch marks OpenAI's latest effort to beef up ChatGPT with additional products besides the notorious chatbot. In the past year, OpenAI has added priority access to the company's AI video platform, Sora, its research agent, Deep Research, as well as its web browsing agent, Operator, as benefits for subscribers. These offerings could entice more users to sign up for a ChatGPT subscription, and, in the case of Codex specifically, convince existing subscribers to pay OpenAI more money for increased rate limits.
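To put the per-token prices above in perspective, here is a rough cost estimate in Python. It is a minimal sketch, not an official calculator: the function name `estimate_cost` and its parameters are illustrative, and the optional cached-input handling assumes that the 75% caching discount reported by VentureBeat elsewhere in this roundup applies only to cached input tokens.

```python
from decimal import Decimal

# Prices quoted in the article for the Codex CLI model, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("1.50")
OUTPUT_PRICE_PER_M = Decimal("6.00")

# Assumption: the reported 75% caching discount applies to cached input tokens only.
CACHE_DISCOUNT = Decimal("0.75")

ONE_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> Decimal:
    """Return an estimated cost in USD for one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")

    # Cached input tokens are billed at (1 - discount) of the normal input price.
    fresh_tokens = Decimal(input_tokens - cached_input_tokens)
    discounted_tokens = Decimal(cached_input_tokens) * (1 - CACHE_DISCOUNT)
    input_cost = (fresh_tokens + discounted_tokens) * INPUT_PRICE_PER_M / ONE_MILLION

    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M / ONE_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # 200,000 input tokens and 50,000 output tokens, nothing cached.
    print(estimate_cost(200_000, 50_000))
```

For example, `estimate_cost(200_000, 50_000)` comes to $0.60 at the quoted rates: $0.30 for input and $0.30 for output.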
[3]
OpenAI Launches an Agentic, Web-Based Vibe-Coding Tool
With vibe coding all the rage, OpenAI says Codex can take on more development chores in a safe and explainable way. OpenAI is launching a cloud-based software engineering agent called Codex in an attempt to ride a wave of hype surrounding vibe coding, or building software using AI. It says this tool will let developers automate more of their work in a way that should be both safer and less opaque than existing tools. OpenAI's Codex is available through the web for ChatGPT Pro users from today. It can generate lines of code but also move through directories and run commands inside a virtual computer, automating more of the work that developers go through when writing code. "We're about to undergo a pretty seismic shift in terms of how developers can be most accelerated by agents," says Alexander Embiricos, a member of the product team at OpenAI working on agents. The latest models from rivals Anthropic and Google are already both highly skilled at coding. This OpenAI launch has pre-empted Google's expected release of a more capable coding tool at its I/O event next week, according to a report in The Information. According to numerous reports, OpenAI is in talks to acquire Windsurf (formerly Codeium), a startup that makes a popular AI coding tool, for $3 billion. A key challenge with vibe-coding is that delegating to AI can result in software that is opaque and more difficult for a person to understand and fix when bugs creep in. OpenAI says the model behind Codex has been trained to explain what it is doing more clearly and help developers fix what they are building, and that the use of a virtual computer makes the system safer by design. It is already possible to write and analyze code using ChatGPT and similar chatbots. OpenAI already offers a Codex command-line tool that can generate code. The new web-based Codex, which OpenAI calls a "research preview," runs its own mini computer within a browser.
This allows it to run commands, explore folders and files, and test the code it has written autonomously. "That's really the way that we think most development is going to happen in the future," Embiricos says. "The agent will work on its own computer and will delegate to it." OpenAI says that Codex is being used by outside companies including Cisco, Temporal, Superhuman, and Kodiak. Vibe-coding has become a phenomenon thanks to a generation of AI models that are remarkably good at writing and fixing code. The same models allow more skilled developers to speed up their work, too. OpenAI has launched two other agentic AI tools over the past year: Operator, which controls a web browser and can automate online chores, and Deep Research, which carries out detailed web search and analysis in order to compile reports. Josh Tobin, who leads the agents research team at OpenAI, says Codex reflects a bigger vision for ChatGPT to evolve from a chatbot into a teammate. "We think that ChatGPT will become almost like a virtual coworker," Tobin says. "Where you can go to it not just for answers to quick questions, [but also to] collaborate with it on larger chunks of work across a wide range of different tasks."
[4]
OpenAI Takes on Google, Anthropic With New AI Agent for Coders
OpenAI is rolling out a new artificial intelligence agent for ChatGPT users that's designed to help streamline software development as the company pushes into a crowded market of startups and large tech firms offering AI tools for coders. The agent, called Codex, will be able to write software features, fix bugs and run tests, the company said in a blog post Friday. Codex, which is still in the early stages and has limited functionality, is geared towards workers with some technical knowledge and will first be released as a "research preview" to paid ChatGPT Pro, Enterprise and Team users.
[5]
ChatGPT rolls out Codex, an AI tool for software programming
OpenAI is rolling out 'Codex' for ChatGPT, which is an AI agent that automates and delegates programming tasks for software engineers. OpenAI isn't explicitly claiming that Codex will eventually replace junior software engineers. Instead, the company states Codex could help developers achieve more by delegating their tasks to different agents. The idea is to move faster with development and become more productive with AI, but how does Codex work? According to OpenAI, Codex is based on codex-1, a version of the existing o3 model that has been optimized for coding, which results in increased accuracy. Codex pulls your codebase from GitHub and closely mirrors the existing PR style. It can write new code, propose pull requests, and run each task in its own sandbox. "Task completion typically takes between 1 and 30 minutes, depending on complexity, and you can monitor Codex's progress in real time," OpenAI noted in a blog post. "Once Codex completes a task, it commits its changes in its environment. Codex provides verifiable evidence of its actions through citations of terminal logs and test outputs, allowing you to trace each step taken during task completion." Codex is rolling out, but only if you have a Pro subscription, which costs $200 per month.
[6]
OpenAI launches research preview of Codex AI software engineering agent for developers -- with parallel tasking
Surprise! Just days after reports emerged suggesting OpenAI was buying white-hot coding startup Windsurf, the former company appears to be launching its own competitor service as a research preview under its brand name Codex, going head-to-head against Windsurf, Cursor, and the growing list of AI coding tools offered by startups and large tech companies including Microsoft and Amazon. Unlike OpenAI's previous Codex code completion AI model, the new version is a full cloud-based AI software engineering (SWE) agent that can execute multiple development tasks in parallel. Starting today it will be available for ChatGPT Pro, Enterprise, and Team users, with support for Plus and Edu users expected soon. Codex's evolution: from model to autonomous AI coding agent This release marks a significant step forward in Codex's development. The original Codex debuted in 2021 as a model for translating natural language into code available through OpenAI's nascent application programming interface. It was the engine behind GitHub Copilot, the popular autocomplete-style coding assistant designed to work within IDEs like Visual Studio Code. That initial iteration focused on code generation and completion, trained on billions of lines of public source code. However, the early version came with limitations. It was prone to syntactic errors, insecure code suggestions, and biases embedded in its training data. Codex occasionally proposed superficially correct code that failed functionally, and in some cases, made problematic associations based on prompts. Despite those flaws, it showed enough promise to establish AI coding tools as a rapidly growing product category. That original model has since been deprecated and turned into the name of a new suite of products, according to an OpenAI spokesperson.
GitHub Copilot officially transitioned off OpenAI's Codex model in March 2023, adopting GPT-4 as part of its Copilot X upgrade to enable deeper IDE integration, chat capabilities, and more context-aware code suggestions. Agentic visions The new Codex goes far beyond its predecessor. Now built to act autonomously over longer durations, Codex can write features, fix bugs, answer codebase-specific questions, run tests, and propose pull requests -- each task running in a secure, isolated cloud sandbox. The design reflects OpenAI's broader ambition to move beyond quick answers and into collaborative work. Josh Tobin, who leads the Agents Research Team at OpenAI, said during a recent briefing: "We think of agents as AI systems that can operate on your behalf for a longer period of time to accomplish big chunks of work by interacting with the real world." Codex fits squarely into this definition. "Our vision is that ChatGPT will become almost like a virtual coworker -- not just answering quick questions, but collaborating on substantial work across a range of tasks," he added. New capabilities, new interface, new workflows Codex tasks are initiated through a sidebar interface in ChatGPT, allowing users to prompt the agent with tasks or questions. The agent processes each request in an air-gapped environment loaded with the user's repository and configured to mirror the development setup. It logs its actions, cites test outputs, and summarizes changes -- making its work traceable and reviewable. Alexander Embiricos, head of OpenAI's Desktop & Agents team (and the former CEO and co-founder of screenshare collaboration startup Multi that OpenAI acquired for an undisclosed sum last year) said in a briefing with journalists that "the Codex agent is a cloud-based software engineering agent that can work on many tasks in parallel, with its own computer to run safely and independently." 
Internally, he said, engineers already use it "like a morning to-do list -- fire off tasks to Codex and return to a batch of draft solutions ready to review or merge." Codex also supports configuration through AGENTS.md files -- project-level guides that teach the agent how to navigate a codebase, run specific tests, and follow house coding styles. "We trained our model to read code and infer style -- like whether or not to use an Oxford comma -- because code style matters as much as correctness," Embiricos said. Security and practical use Codex executes tasks without internet access, drawing only on user-provided code and dependencies. This design ensures secure operation and minimizes potential misuse. "This is more than just a model API," said Embiricos. "Because it runs in an air-gapped environment with human review, we can give the model a lot more freedom safely." OpenAI also reports early external use cases. Cisco is evaluating Codex for accelerating engineering work across its product lines. Temporal uses it to run background tasks like debugging and test writing. Superhuman leverages Codex to improve test coverage and enable non-engineers to suggest lightweight code changes. Kodiak, an autonomous vehicle firm, applies it to improve code reliability and gain insights into unfamiliar stack components. OpenAI is also rolling out updates to Codex CLI, its lightweight terminal agent for local development. The CLI now uses a smaller model -- codex-mini-latest -- optimized for low-latency editing and Q&A. The pricing is set at $1.50 per million input tokens and $6 per million output tokens, with a 75% caching discount. Codex is currently free to use during the rollout period, with rate limits and on-demand pricing options planned. Does this mean OpenAI IS NOT buying Windsurf? 
The release of Codex comes amid increased competition in the AI coding tools space -- and signals that OpenAI is intent on building, rather than buying, its next phase of products. According to recent data from SimilarWeb, traffic to developer-focused AI tools has surged by 75% over the past 12 weeks, underscoring the growing demand for coding assistants as essential infrastructure rather than experimental add-ons. Reports from TechCrunch and Bloomberg suggest OpenAI held acquisition talks with fast-growing AI dev tool startups Cursor and Windsurf. Cursor allegedly walked away from the table; Windsurf reportedly agreed in principle to be acquired by OpenAI for a price of $3 billion, though no deal has been officially confirmed by either OpenAI or Windsurf. Just yesterday, in fact, Windsurf debuted its own family of coding-focused foundation models, SWE-1, purpose-built to support the full software engineering lifecycle, from debugging to long-running project maintenance. SWE-1 models were reportedly custom-made, trained entirely in-house using a new sequential data model tailored to real-world development workflows. Many things may be happening behind the scenes between the two companies, but to me, the timing of Windsurf launching its own coding foundation model -- instead of its strategy to date of using Llama variants and giving users the option to slot in OpenAI and Anthropic models -- followed one day later by OpenAI releasing its own Windsurf competitor, seems to suggest the two are not aligning soon. But on the other hand, the fact that this new Codex AI SWE agent is in "research preview" to start may be a form of OpenAI pressuring Windsurf or Cursor or anyone else to come to the bargaining table and strike a deal. Asked about the potential for a Windsurf acquisition and the reports of one, an OpenAI spokesperson told VentureBeat they had nothing to share on that front.
In either case, Embiricos frames Codex as far more than a mere code tool or assistant. "We're about to undergo a seismic shift in how developers work with agents -- not just pairing with them in real time, but fully delegating tasks," he said. "The first experiments were just reasoning models with terminal access. The experience was magical -- they started doing things for us." Built for dev teams, not merely solo devs Codex is designed with professional developers in mind, but Embiricos noted that even product managers have found it helpful for suggesting or validating changes before pulling in human SWEs. This versatility reflects OpenAI's strategy of building tools that augment productivity across technical teams. Trini, an engineering lead on the project, summarized the broader ambition behind Codex: "This is a transformative change in how software engineers interface with AI and computers in general. It amplifies each person's potential." OpenAI envisions Codex as the centerpiece of a new development workflow where engineers assign high-level tasks to agents and collaborate with them asynchronously. The company is building toward deeper integrations across GitHub, ChatGPT Desktop, issue trackers, and CI systems. The long-term goal is to blend real-time pairing and long-horizon task delegation into a seamless development experience. As Josh Tobin put it, "Coding underpins so many useful things across the economy. Accelerating coding is a particularly high-leverage way to distribute the benefits of AI to humanity, including ourselves." Whether or not OpenAI closes deals for competitors, the message is clear: Codex is here, and OpenAI is betting on its own agents to lead the next chapter in developer productivity.
[7]
OpenAI's New Codex Agents Get Closer to Downsizing Your Dev Team - Decrypt
OpenAI just unveiled the latest iteration of Codex -- a system of cloud-powered AI agents that can tackle multiple programming tasks simultaneously without tying up your laptop's resources. The announcement comes just one month after OpenAI released its free, open-source version, "Codex CLI." Unlike the normal code-completion or code-generation tools we've become accustomed to during vibe coding sessions -- in which we iterate with an AI chatbot several times until the results are satisfactory -- Codex operates as a semi-autonomous agent with its own computing resources. It's capable of handling everything from bug hunting to complex refactoring. "Software engineering is changing, and by the end of 2025 it's going to look fundamentally different," OpenAI President Greg Brockman said in an official presentation. "Today, we're going to take a step towards where we think software engineering is going, and we are releasing a new system, which is a remote software agent that can run many tasks in parallel." The demo showed how OpenAI developers deployed multiple AI agents working in parallel on different portions of a codebase, each in its own isolated environment, doing different tasks in parallel with minimal human guidance. "This change probably would have taken me at least 30 minutes or even hours to debug," Hansen, one team member, noted during the presentation, while the AI completed the task in the background during their conversation. "We find Codex to be as trustworthy, if not more trustworthy, than our own co-workers," Katie, another OpenAI researcher, claimed. Unlike the previous CLI version, this new Codex runs on OpenAI's infrastructure instead of locally. Its agentic architecture also means the AI can review and improve itself. In other words, developers can fire off multiple coding tasks simultaneously and return later to review the results, because an agent is able to review its own outputs -- and fix its own bugs. 
The system is powered by a new model called Codex One, which OpenAI described as its "best coding model to date." "We've taken o3 and we've optimized it for not just the benchmarks, but really for the kind of code that people actually want to merge into their codebase," Brockman said. OpenAI emphasized that Codex is still a "research preview," and said there's more development ahead. But you'll have to pay for OpenAI's premium tier if you want to test it. Initially, Codex is available for ChatGPT Pro, Enterprise, and Team users, with plans to expand to Plus and Edu users later. OpenAI said it is starting with "very generous rate limits" and no additional pricing, though this will change as it gathers usage data and feedback. Looking ahead, OpenAI plans to integrate Codex with issue trackers and CI systems, potentially automating even more of the development lifecycle. The company also continues to develop Codex CLI, the open-source, local agent that runs on developers' own machines, and envisions a future where local and remote versions work together seamlessly. "What you really want is a remote co-worker with its own computer, but who can also look over your shoulder," Brockman explained. "You're there typing away, working on some change, and you're like, 'Ah, I want to go to lunch. Codex, can you finish this?' It just takes it over seamlessly and runs it in the cloud."
[8]
OpenAI updates ChatGPT with coding-optimized Codex AI agent - SiliconANGLE
OpenAI today debuted a new artificial intelligence agent, Codex, that can help developers write code and fix bugs. The tool is available through a sidebar in ChatGPT's interface. One button in the sidebar configures Codex to generate new code based on user instructions, while another allows it to answer questions about existing code. Prompt responses take between one and 30 minutes to generate based on the complexity of the request. Codex is powered by a new AI model called codex-1. It's a version of o3, OpenAI's most capable reasoning model, that has been optimized for programming tasks. The ChatGPT developer fine-tuned Codex by training it on a set of real-world coding tasks. Those tasks involved a range of software environments. A piece of software that runs well in one environment, such as a cloud platform, may not run as efficiently on a Linux server or a developer's desktop, if at all. As a result, an AI model's training dataset must include technical information about every environment that it will be expected to use. OpenAI used reinforcement learning to train codex-1. It's a way of developing AI models that relies on trial and error to boost output quality. When a neural network completes a task correctly, it's given a virtual reward, while incorrect answers lead to penalties that encourage the algorithm to come up with a better approach. In a series of coding tests carried out by OpenAI, Codex achieved an accuracy rate of 75%. That's 5% better than the most capable, hardware-intensive version of o3. OpenAI's first-generation reasoning model, o1, scored 11%. Codex carries out coding tasks in isolated software containers that don't have web access. According to OpenAI, the agent launches a separate container for each task. Developers can customize those development environments by uploading a text file called AGENTS.md.
The file may describe what programs Codex should install, how AI-generated code should be tested for bugs and related details. Using AGENTS.md, developers can ensure that the container in which Codex generates code is configured the same way as the production system on which the code will run. That reduces the need to modify the code before releasing it to production. Developers can monitor Codex while it's generating code. After the tool completes a task, it provides technical data that can be used to review each step of the workflow. It's possible to request revisions if the code doesn't meet project requirements. OpenAI started rolling out Codex to ChatGPT today as a research preview. It will initially provide "generous access at no additional cost." In a few weeks, OpenAI will switch Codex to lower rate limits with "flexible pricing options that let you purchase additional usage on-demand." The ChatGPT developer also plans to expand Codex's feature set. One upcoming capability will allow users to provide the agent with instructions while it's in the middle of a task. Additionally, OpenAI plans to integrate Codex with more developer tools. One of the upcoming integrations will be for Codex CLI, an open-source application that OpenAI released last month. It's an AI coding assistant that developers can install on their desktops and access from the command line. OpenAI debuted a new version of Codex CLI in conjunction with the release of Codex today.
[9]
OpenAI launches Codex, an AI agent for coding
OpenAI launched a research preview on Friday of what it's calling its most capable AI coding agent yet. Codex, a cloud-based software engineering agent, can write features, answer questions about a codebase, fix bugs, and propose pull requests for review. Several tasks can run simultaneously, and users retain full access to their computers while the agent takes anywhere from one to 30 minutes to complete a task. Since it's still in research preview, the tool remains in early development. The company said in a blog post that it "currently lacks features like image inputs for front-end work, and the ability to course-correct the agent while it's working. Additionally, delegating to a remote agent takes longer than interactive editing, which can take some getting used to." Over time, however, the company said using the service will feel more like asynchronous collaboration with colleagues.
[10]
ChatGPT's New Coding Agent Is Huge, Even if You Aren't a Programmer
ChatGPT does a pretty good job of generating code from text prompts and breaking it down. Now, OpenAI has added a new coding agent to ChatGPT, and it's not just programmers who should be excited. ChatGPT's Codex Takes AI Programming to the Next Level OpenAI is launching a research preview of Codex, a "cloud-based software engineering agent." The feature is powered by codex-1, a version of the OpenAI o3 model optimized for coding and software engineering tasks. Codex-1 is also trained to align its output closely with "human coding preferences and standards." You can find the feature in the ChatGPT sidebar if you're a ChatGPT Pro, Enterprise, or Team user, with Plus and Edu users getting it soon. Once open, you can either assign it a coding task by typing a prompt and clicking the Code button or ask questions about your codebase using the Ask button. You'll find information on Codex's task list and progress below the prompt bar. This new agent can perform multiple tasks on an existing codebase, like adding new features, fixing bugs, and answering any questions you might have. Each task runs in a separate isolated environment, preloaded with your codebase or repository. Codex can read and edit files as well. OpenAI's announcement claims that the agent will take anywhere between one and 30 minutes to complete an assigned task, depending on the complexity of the task. You can monitor its progress in real-time or even run multiple tasks simultaneously, all while using your browser and computer as usual. While ChatGPT can help you generate code and even provide entire projects that you can download and test, it doesn't work well with software repositories and codebases. Codex's ability to work within typical software engineering infrastructures means it's a lot more useful compared to vanilla ChatGPT to both companies and individuals who maintain multiple projects in repositories.
Codex produces cleaner code than ChatGPT, code that is ready for human review and integration into workflows or codebases. It also runs tests until it passes all the given test cases and conditions. Once a task is completed, Codex will commit the changes to its environment and provide "verifiable evidence of its actions through citations of terminal logs and test outputs."
Why Is Codex a Big Deal?
Codex is a big deal for professionals in any industry. You can write Excel macros, automate reports, batch edit files, and do just about everything that would've required expertise in some programming or scripting language. Sure, ChatGPT can generate code and scripts for you, but in my experience, it's not reliable. You need to have relatively good knowledge of the programming language you're working with and a general idea of code debugging. Codex, however, automatically checks its code and runs tests to ensure it works the way you want. ChatGPT's reliability might improve with its new GPT-4.1 model, but that's not a perfect solution. Knowing when to use which ChatGPT model can largely affect the output, so a model custom-made for coding will perform better than a more general-purpose model. Of course, if you're a programmer, Codex is massively helpful as it can integrate with your GitHub repositories and take care of repetitive tasks and test cases. This lets you develop and ship your app faster without getting caught up in maintenance, testing, and other tasks that are usually part of the software development process.
[11]
OpenAI takes on Google Gemini, Anthropic with AI coding agent for ChatGPT
OpenAI has launched Codex, an AI coding agent powered by codex-1, designed to assist software engineers with tasks like writing features, fixing bugs, and proposing pull requests. Available on ChatGPT Pro, Enterprise, and Team, Codex aims to improve coding workflows while incorporating safeguards against malicious use. Google DeepMind also enhanced Gemini 2.5 Pro with improved coding capabilities. OpenAI launched a research preview of Codex, a cloud-based software engineering agent, on Friday. The AI coding agent is powered by codex-1, a version of OpenAI o3 optimized for software engineering, the AI platform said. Codex can write features, answer questions about codebases, fix bugs, and propose pull requests for review. Each task will run in its own cloud sandbox, preloaded with the user's repository. OpenAI said Codex will be available to ChatGPT Pro, Enterprise, and Team users today, with support for Plus and Edu coming soon. It can be accessed through the ChatGPT sidebar, and assigned new tasks by typing a prompt and clicking 'Code'. Users can ask questions about a codebase by clicking 'Ask'. Codex's actions can be seen through citations of terminal logs and test outputs, helping trace each step taken. Users can then review the results, request further revisions, open a GitHub pull request, or directly integrate the changes in their workspaces. OpenAI said Codex was trained to identify and refuse requests aimed at the development of malicious software, addressing concerns that malicious actors could misuse this sophisticated coding agent for cyber attacks and other harmful uses. Apart from OpenAI, Microsoft-owned GitHub, Google and Anthropic, along with startups including Anysphere and Windsurf, offer AI tools to aid programmers. Earlier this month, Google DeepMind added vastly improved coding capabilities to Gemini 2.5 Pro (Preview).
In the run-up to its recently concluded Google I/O 2025 event, the search major released the AI agent, now branded the I/O Edition. Internally labelled gemini-2.5-pro-preview-05-06, the model can now deliver significant improvements in code transformation, code editing, and even in developing complex agentic workflows, making it far more capable for software developers and engineers, according to Google.
[12]
OpenAI Launches Codex, a Software Engineering AI Agent
The Codex agent is rolling out to ChatGPT Pro, Team, and Enterprise users, starting today. Today, OpenAI introduced a cloud-based software engineering AI agent, powered by the company's most powerful coding model, called 'codex-1'. OpenAI says ChatGPT Plus and Edu users will get access to Codex in the future. The software engineering agent can perform multiple tasks in parallel in the cloud. It can add new features, answer questions about your codebase, fix bugs, and propose pull requests for review. Developers can connect their GitHub repositories and run the AI agent to perform a variety of tasks. You can access Codex in ChatGPT from the left sidebar. OpenAI says codex-1 is built on the o3 model by training it "using reinforcement learning on real-world coding tasks in a variety of environments to generate code that closely mirrors human style and PR preferences." Codex follows instructions carefully and runs tests until it receives a passing result in the cloud environment. You can define an AGENTS.md file in your repo to guide the agent so that it can navigate the codebase and perform actions precisely the way you want. On OpenAI's internal SWE tasks, the codex-1 model achieves 75% accuracy, ahead of o3-high, which gets 70%. On SWE-Bench Verified, codex-1 does slightly better than o3-high after a number of attempts. Finally, OpenAI says the Codex agent operates in a secure, isolated container in the cloud, and internet access is disabled during task execution. The agent can't access external websites, APIs, or other services.
[13]
New from OpenAI Codex in ChatGPT : Enhancing Coding Efficiency and Collaboration
OpenAI has introduced Codex, an advanced AI-powered coding assistant designed to enhance the efficiency and productivity of software development. Built on the robust codex-1 model, this tool aims to streamline workflows, automate repetitive tasks, and foster collaboration among developers. By using advanced technologies such as reinforcement learning and secure cloud-based environments, Codex offers a new approach to tackling coding challenges, making it a valuable resource for developers and teams. Codex provides a comprehensive suite of features tailored to meet the diverse needs of software developers, and its capabilities extend across a range of tasks. Operating within secure, isolated cloud environments, Codex ensures that workflows remain efficient and protected. By preloading user repositories, it enables developers to work seamlessly without compromising security. Outputs generated by Codex, such as terminal logs, test results, and citations, are verifiable, ensuring transparency and reliability. The tool's ability to automate repetitive tasks and provide actionable insights significantly reduces the time spent on manual processes. For instance, Codex can identify bugs, suggest fixes, and generate comprehensive documentation, allowing developers to focus on more complex and creative aspects of their work. Its seamless integration with existing development tools ensures that teams can adopt it without disrupting their established workflows. At the core of Codex lies the codex-1 model, which has been specifically optimized for software engineering tasks. OpenAI trained this model using reinforcement learning techniques, focusing on real-world coding scenarios to align its outputs with human coding preferences. This training approach ensures that Codex not only understands the technical nuances of programming but also adheres to industry standards and best practices.
The model's training emphasizes adaptability and precision, allowing it to handle tasks of varying complexity. Whether developers need assistance with a simple script or a sophisticated system, Codex demonstrates efficiency, often completing tasks within 1 to 30 minutes. This capability is particularly advantageous for teams working under tight deadlines or managing large-scale projects, where time and accuracy are critical. Codex is designed to support both real-time and asynchronous collaboration, making it an effective tool for distributed teams. Developers can delegate routine or time-intensive tasks to Codex, freeing up time to focus on higher-level problem-solving and innovation. This feature is especially beneficial for teams operating across different time zones, where asynchronous workflows are essential for maintaining productivity. Real-time collaboration is facilitated through shared coding environments and synchronized updates, ensuring that all team members remain aligned. Additionally, Codex provides context-aware suggestions and guidance, helping teams maintain a cohesive and efficient development process. By allowing developers to offload repetitive tasks, Codex fosters a more streamlined and collaborative approach to software development. OpenAI has prioritized security and ethical considerations in the design of Codex. The tool operates within secure, isolated containers, preventing unauthorized access to sensitive data. During task execution, Codex is restricted from accessing the internet, further minimizing potential security risks. These measures make Codex a reliable choice for handling sensitive coding projects. To promote ethical use, Codex has been trained to reject malicious requests, such as generating harmful or unethical code. Developers are encouraged to manually review all AI-generated outputs before integration, ensuring accountability and maintaining quality control.
This approach strikes a balance between the benefits of automation and the need for human oversight, fostering responsible use of the tool. Codex is designed to integrate effortlessly with existing development tools and workflows. It supports AGENTS.md files, allowing developers to configure environments and provide repository-specific guidance. This level of customization ensures that Codex can adapt to the unique requirements of different teams and projects. The tool also integrates seamlessly with platforms like GitHub and continuous integration (CI) systems, further streamlining workflows. By automating routine tasks and offering actionable insights, Codex reduces cognitive load and enhances overall productivity. Its ability to adapt to various development environments makes it a versatile tool for teams of all sizes. Codex has already been adopted by OpenAI engineers and external testers, including companies such as Cisco, Temporal, Superhuman, and Kodiak. These early adopters have reported notable productivity gains and smoother workflows, demonstrating the tool's potential to address common challenges in software development. These real-world applications highlight Codex's versatility and effectiveness in addressing a wide range of development needs. Codex is currently available to ChatGPT Pro, Team, and Enterprise users, with plans to extend access to Plus and Edu users in the future. OpenAI employs a token-based pricing model, charging $1.50 per 1 million input tokens and $6 per 1 million output tokens. This pricing structure provides flexibility, allowing developers to explore Codex's capabilities without significant upfront costs. Despite its robust features, Codex has some limitations. It currently lacks support for image inputs and may be slower at task delegation compared to interactive editing. These constraints represent areas for potential improvement as OpenAI continues to refine the tool.
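To make the token-based pricing above concrete, here is a minimal Python sketch of the arithmetic. It uses only the two per-million-token rates quoted in this article; the function name `estimate_cost_usd` and the example token counts are hypothetical, and actual billing may differ from this simple linear estimate.

```python
from decimal import Decimal

# Rates as quoted in the article, in USD per 1 million tokens.
INPUT_USD_PER_MILLION = Decimal("1.50")
OUTPUT_USD_PER_MILLION = Decimal("6.00")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one task from its token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * INPUT_USD_PER_MILLION
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical task: 200,000 input tokens and 50,000 output tokens.
    print(estimate_cost_usd(200_000, 50_000))
```

With these assumed counts, the input side costs $0.30 and the output side costs $0.30, so the task comes to about $0.60. Output tokens are four times as expensive as input tokens at the quoted rates, so long generated diffs tend to dominate the bill.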
OpenAI envisions a future where asynchronous, multi-agent workflows become a standard practice in software engineering. By allowing developers to delegate tasks to AI agents like Codex, teams can achieve greater scalability and efficiency. This vision aligns with OpenAI's broader goal of empowering small teams to accomplish significant outcomes through enhanced productivity. Future updates to Codex are expected to focus on expanding its interactive capabilities, deepening integration with development tools, and providing proactive updates. These advancements aim to position Codex as an indispensable tool for developers, driving innovation and efficiency across the software engineering landscape.
OpenAI introduces Codex, an advanced AI coding agent integrated into ChatGPT, designed to automate and streamline software development tasks for experienced programmers.
OpenAI has introduced Codex, a groundbreaking AI agent designed to revolutionize software development. Integrated into the ChatGPT web application, Codex aims to automate and streamline coding tasks for experienced developers 1.
Codex is built on codex-1, a fine-tuned version of OpenAI's o3 reasoning model, specifically optimized for software engineering tasks. The AI agent can generate production-ready code, fix bugs, answer questions about codebases, and run tests 2. It operates in a sandboxed, virtual computer in the cloud, allowing it to handle multiple tasks simultaneously without limiting users' access to their own computers.
Accessible through the ChatGPT sidebar, users can interact with Codex by entering prompts and clicking either "code" to generate code or "ask" for advice. The agent works in a distinct container preloaded with the user's codebase, accurately reflecting their development environment 1.
OpenAI has addressed common concerns about AI coding tools by implementing several safety measures. Codex operates in an air-gapped environment with no access to the broader internet or external APIs, limiting potential security risks 2. The agent also shows its thinking and work throughout the process, enhancing transparency and allowing developers to review and validate the generated code 1.
The launch of Codex positions OpenAI to compete with other major players in the AI coding tool market, including Google, Anthropic, and various startups 4. With reports of OpenAI acquiring Windsurf (formerly Codeium) for $3 billion, the company is clearly investing heavily in this space 2.
Codex is currently available as a research preview for ChatGPT Pro, Enterprise, and Team users. OpenAI plans to implement rate limits and a new pricing scheme in the coming weeks, with users having the option to purchase additional credits 2. The company also updated its Codex CLI tool with a version of the o4-mini model optimized for software engineering, available through the OpenAI API 2.
OpenAI envisions Codex evolving into a "virtual teammate" capable of autonomously completing tasks that currently take human engineers hours or days 3. This aligns with the broader trend of AI-assisted coding, with major tech companies reporting that up to 30% of their code is now written by AI 2.
OpenAI has launched Codex CLI, an open-source coding agent designed to run locally from terminal software, integrating AI models with local code and computing tasks to enhance the programming process.
2 Sources
OpenAI has updated the ChatGPT app for macOS with a new feature that allows direct code editing in popular development environments like Xcode and VS Code, streamlining the coding process for developers.
7 Sources
GitHub has unveiled a suite of new AI-powered features for its Copilot tool, including Agent Mode, Copilot Edits, and Project Padawan, aimed at enhancing developer productivity and automating complex coding tasks.
7 Sources
Tech leaders predict AI will soon dominate coding tasks, potentially transforming the role of software developers and making programming more accessible.
7 Sources
OpenAI has upgraded its Canvas feature with o1 model integration and expanded rendering capabilities, making it a more powerful tool for coding and creative tasks.
2 Sources
© 2025 TheOutpost.AI All rights reserved