OpenAI Unveils New Tools for Building Advanced AI Agents

16 Sources

[1]

Ars Technica

OpenAI pushes AI agent capabilities with new developer API

The AI industry is doing its best to will "agents" -- pieces of AI-driven software that can perform multistep actions on your behalf -- into reality. Several tech companies, including Google, have emphasized agentic features recently, and in January, OpenAI CEO Sam Altman wrote that 2025 would be the year AI agents "join the workforce." OpenAI is working to make that promise into a reality. On Tuesday, OpenAI unveiled a new "Responses API" designed to help software developers create AI agents that can perform tasks independently using the company's AI models. The Responses API will eventually replace the current Assistants API, which OpenAI plans to retire in the first half of 2026. With the new offering, users can develop custom AI agents that scan company files with a file search utility that rapidly checks company databases (with OpenAI promising not to train its models on these files) and navigate websites -- similar to functions available through OpenAI's Operator agent, whose underlying Computer-Using Agent (CUA) model developers can also access to enable automation of tasks like data entry and other operations. However, OpenAI acknowledges that its CUA model is not yet reliable for automating tasks on operating systems and can make unintended mistakes. The company describes the new API as an early iteration that it will continue to improve over time. Developers using the Responses API can access the same models that power ChatGPT Search: GPT-4o search and GPT-4o mini search. These models can browse the web to answer questions and cite sources in their responses. That's notable because OpenAI says the added web search ability dramatically improves the factual accuracy of its AI models. On OpenAI's SimpleQA benchmark, which aims to measure confabulation rate, GPT-4o search scored 90 percent, while GPT-4o mini search achieved 88 percent -- both substantially outperforming the larger GPT-4.5 model without search, which scored 63 percent. 
Despite these improvements, the technology still has significant limitations. Aside from issues with CUA properly navigating websites, the improved search capability doesn't completely solve the problem of AI confabulations, with GPT-4o search still making factual mistakes 10 percent of the time. Alongside the Responses API, OpenAI released the open source Agents SDK, providing developers free tools to integrate models with internal systems, implement safeguards, and monitor agent activities. This toolkit follows OpenAI's earlier release of Swarm, a framework for orchestrating multiple agents. These are still early days in the AI agent field, and things will likely improve rapidly. However, at the moment, the AI agent movement remains vulnerable to unrealistic claims, as demonstrated earlier this week when users discovered that Chinese startup Butterfly Effect's Manus AI agent platform failed to deliver on many of its promises, highlighting the persistent gap between promotional claims and practical functionality in this emerging technology category.

[2]

TechCrunch

OpenAI launches new tools to help businesses build AI agents | TechCrunch

On Tuesday, OpenAI released new tools designed to help developers and enterprises build AI agents - automated systems that can independently accomplish tasks - using the company's own AI models and frameworks. The tools are part of OpenAI's new Responses API, which lets businesses develop custom AI agents that can perform web searches, scan through company files, and navigate websites, much like OpenAI's Operator product. The Responses API effectively replaces OpenAI's Assistants API, which the company plans to sunset in the first half of 2026. The hype around AI agents has grown dramatically in recent years despite the fact that the tech industry has struggled to show people, or even define, what "AI agents" really are. In the most recent example of agent hype running ahead of utility, Chinese startup Butterfly Effect earlier this week went viral for a new AI agent platform called Manus that users quickly discovered didn't deliver on many of the company's promises. In other words, the stakes are high for OpenAI to get agents right. "It's pretty easy to demo your agent," Olivier Godement, OpenAI's API product head, told TechCrunch in an interview. "To scale an agent is pretty hard, and to get people to use it often is very hard." Earlier this year, OpenAI introduced two AI agents in ChatGPT: Operator, which navigates websites on your behalf, and deep research, which compiles research reports for you. Both tools offered a glimpse at what agentic technology can achieve, but left quite a bit to be desired in the "autonomy" department. Now with the Responses API, OpenAI wants to sell access to the components that power AI agents, allowing developers to build their own Operator- and deep research-style agentic applications. OpenAI hopes that developers can create some applications with its agent technology that feel more autonomous than what's available today. 
Using the Responses API, developers can tap the same AI models (in preview) under the hood of OpenAI's ChatGPT Search web search tool: GPT-4o search and GPT-4o mini search. The models can browse the web for answers to questions, citing sources as they generate replies. OpenAI claims that GPT-4o search and GPT-4o mini search are highly factually accurate. On the company's SimpleQA benchmark, which measures the ability of models to answer short, fact-seeking questions, GPT-4o search scores 90% while GPT-4o mini search scores 88% (higher is better). For comparison, GPT-4.5 - OpenAI's much larger, recently released model - scores just 63%. The fact that AI-powered search tools are more accurate than traditional AI models is not necessarily surprising - in theory, GPT-4o search can just look up the right answer. However, web search does not render hallucinations a solved problem. Beyond their factual accuracy, AI search tools also tend to struggle with short, navigational queries (such as "Lakers score today"), and recent reports suggest that ChatGPT's citations aren't always reliable. The Responses API also includes a file search utility that can quickly scan across files in a company's databases to retrieve information. (OpenAI claims that it won't train models on these files.) In addition, developers using the Responses API can tap OpenAI's Computer-Using Agent (CUA) model, which powers Operator. The model generates mouse and keyboard actions, allowing developers to automate computer use tasks like data entry and app workflows. Enterprises can optionally run the CUA model, which is releasing in research preview, locally on their own systems, OpenAI said. The consumer version of the CUA available in Operator can only take actions on the web. To be clear, the Responses API won't solve all the technical problems plaguing AI agents today. 
GPT-4o search still gets 10% of factual questions wrong. In a blog post provided to TechCrunch, OpenAI said that the CUA model is "not yet highly reliable for automating tasks on operating systems," and that it's susceptible to making "inadvertent" mistakes. However, OpenAI said these are early iterations of its agent tools, and it's constantly working to improve them. Alongside the Responses API, OpenAI is releasing an open-source toolkit called the Agents SDK, which offers developers free tools to integrate models with their internal systems, put in place safeguards, and monitor AI agent activities for debugging and optimization purposes. The Agents SDK is a follow-up of sorts to OpenAI's Swarm, a framework for multi-agent orchestration that the company released late last year. Godement said he hopes OpenAI can bridge the gap between AI agent demos and products this year, and that, in his opinion, "agents are the most impactful application of AI that will happen." That echoes a proclamation OpenAI CEO Sam Altman made in January: that 2025 is the year AI agents enter the workforce. Whether or not 2025 truly becomes the "year of the AI agent," OpenAI's latest releases show the company wants to shift from flashy agent demos to impactful tools.
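
For readers curious what a web search request through the Responses API might look like, here is a minimal sketch that only builds the request body and never touches the network. The `/v1/responses` path, the `web_search_preview` tool type, and the `gpt-4o` model name are assumptions drawn from OpenAI's launch-time documentation rather than from the reporting above, and `build_web_search_request` is a hypothetical helper, so check the current API reference before relying on any of them.

```python
import json


def build_web_search_request(question: str, model: str = "gpt-4o") -> dict:
    """Build a JSON body for a POST to /v1/responses (path is an assumption)."""
    if not question.strip():
        raise ValueError("question must not be empty")
    return {
        "model": model,                             # assumed model name
        "tools": [{"type": "web_search_preview"}],  # assumed built-in tool type
        "input": question,
    }


body = build_web_search_request("What did OpenAI announce for agent developers?")
print(json.dumps(body, indent=2))
```

Keeping the payload as a plain dictionary makes it easy to inspect or log before it is handed to whatever HTTP client or SDK actually sends it.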

[3]

ZDNet

Why OpenAI's new AI agent tools could change how you code

If you've been a production developer for any time, you know application programming interfaces (APIs) are subject to change at the provider's whim. You might go to bed one night knowing your app is solid and has a bright future. Then you wake up in the morning to discover that one of your API providers has announced a change that requires you to spend the next six months recoding your app to work with the revised API. Ask me how I know. AI-based apps are not immune to this immutable law of API life. OpenAI has announced the introduction of the Responses API, which will trigger the sunsetting of its Assistants API sometime in 2026. As with most API changes, the new API does more, works better, and has fewer calories. However, despite these capabilities, the new API still adds big to-do items to early adopters of previous APIs. OpenAI has announced a series of new programmer-level features for building AI agents, including the API. Before we dig into what's new, let's put this change into context. You've probably noticed that since generative AI (Gen AI) burst onto the scene a few years ago, nearly every product and service seems to have suddenly added an AI upsell feature to their base offering. It's a great line extension to many products and services, but it's not like each vendor had AI research teams building large language models. Instead, these companies have licensed APIs that provide AI services at a fairly predictable fee per transaction. Different models cost different amounts, but at a rough estimate, a developer could make 67 GPT-4.5 typical chat queries, 1,176 GPT-4o queries, or 20,000 GPT-4o mini queries for a single dollar. 
With OpenAI's APIs (and other AI companies' APIs), AI functionality can be added to code with simple function calls and a credit card, cutting short years of development and millions of dollars of infrastructure and research costs. Until now, OpenAI offered an API, the Chat Completions API, for sending chat prompt queries to ChatGPT and getting a response. OpenAI also offered an API, the Assistants API, that enables agents to perform multi-step reasoning and access files. The new Responses API combines the features of both previous APIs. While the Chat Completions API is expected to remain, the company has announced plans to sunset the Assistants API as soon as next year in favor of the more capable Responses API. Rates for using Responses API are located on OpenAI's pricing page and vary based on model and usage. The Responses API has three built-in tools programmers can call: web search, file search, and computer usage. If you ever used ChatGPT to search the web and aggregate an answer, you know what the web search tool does. In the case of the Responses API, it allows programmers to initiate such queries from within their programs. When used, the API provides text as a response, and links to sources and inline citations. Many of these responses should be correct. The company cites a chart showing that tests of web searches with GPT-4o were only 38% right (keep that in mind next time you use ChatGPT to search), but searches using the GPT-4o search preview and the GPT-4o mini search preview were correct nine out of ten times. Mostly good is pretty good, right? Next is the file search tool, which will search a library of your corporate documents. And yeah, I know what you're thinking. OpenAI says it doesn't train its AIs on business data. 
So, you should be fairly confident that uploading all your data will be secure and won't show up in someone else's response to a prompt. There are issues, though. For one thing, at $0.10 per gig, per month, file storage isn't cheap. My little company's corporate documents database is 57TB. If that database were uploaded to OpenAI for context-specific AI search, the storage would cost almost $6,000 per month -- and my company has just two employees. The benefit of having an AI search tool is the ability to break down silos. But if the silos are what makes storing the data more affordable by only uploading select data, this pricing might limit the usefulness of this feature. Finally, OpenAI includes its computer usage tool, the screen-viewing and clicking tool at the heart of its Operator agent offering. This thing is supposed to be able to read screens and, based on screen content, either perform processing or click to cause actions. From the demos, especially in the preview of the current announcement, it looks like the tool has a long way to go before it's universally useful. Stay tuned on this feature. An API is a set of endpoints or function calls for interacting with an external system. An SDK (software development kit) is a broader collection, often including software tools, some APIs, libraries, and utilities, that all work together. OpenAI announced Agents SDK, a set of tools for building, debugging, and monitoring agents in action. The kit includes the Responses API and adds workflow orchestration, allowing coders to perform multi-step reasoning and task execution. The kit also allows coders to create teams of agents, each tasked with specific responsibilities. A "hand-off" functionality allows one agent to do some work, then hand off processing to another agent, and so on. 
Additionally, and I think this is one of the most important features, a monitoring dashboard allows developers to see each interaction with the AI, identify what agents were used for what, and how they were tasked. With the prevalence of AI hallucinations, it's important to be able to keep track of what all those independent tasks are up to. Are you planning to integrate the Responses API into your projects, or are you concerned about OpenAI's sunsetting of the Assistants API? How do you feel about the built-in tools, such as web search, file search, and computer usage? Do they solve real problems or introduce new challenges? And what about the Agents SDK? Do you see it as a game-changer for AI agent development, or are some features still lacking?
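
The rough cost figures quoted in this piece can be checked with a few lines of arithmetic. The sketch below uses only the numbers given above (queries per dollar, and $0.10 per GB per month for storage); the one assumption is that a terabyte is counted as 1,000 GB, and the function names are purely illustrative.

```python
# Figures quoted in the article above; treat them as rough estimates.
QUERIES_PER_DOLLAR = {"GPT-4.5": 67, "GPT-4o": 1_176, "GPT-4o mini": 20_000}
STORAGE_RATE_PER_GB_MONTH = 0.10  # dollars per GB per month, as quoted
GB_PER_TB = 1_000                 # assumption: decimal terabytes


def cost_per_query_cents(queries_per_dollar: int) -> float:
    """Convert 'queries per dollar' into cents per query."""
    return 100 / queries_per_dollar


def monthly_storage_cost(terabytes: float) -> float:
    """Monthly storage bill in dollars for a given number of terabytes."""
    return terabytes * GB_PER_TB * STORAGE_RATE_PER_GB_MONTH


for model, per_dollar in QUERIES_PER_DOLLAR.items():
    print(f"{model}: about {cost_per_query_cents(per_dollar):.3f} cents per query")
print(f"57 TB: ${monthly_storage_cost(57):,.0f} per month")
```

Under these assumptions the 57TB example comes to $5,700 per month, in line with the "almost $6,000" estimate. Note that another report in this roundup quotes the same $0.10 rate per GB per day rather than per month; if that reading is the right one, the monthly bill would be roughly thirty times higher.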

[4]

TechRadar

OpenAI wants to help your business build its next generation of AI agents

Several new tools also launched to make its agents smarter and more effective

OpenAI has launched its bid to lead the AI agent race with the release of several new tools, including an agent builder platform, aimed at bringing the technology to more businesses than ever before. The company has revealed a host of new tools which it says will help its platform evolve and "help developers and enterprises build useful and reliable agents," providing "the first building blocks for developers and enterprises." The new releases include the Responses API, several new built-in tools including web and file search functions, and a new Agents SDK (software development kit) intended to help streamline the building of agents. "We believe agents will soon become integral to the workforce, significantly enhancing productivity across industries," an OpenAI blog post announcing the news stated. "As companies increasingly seek to leverage AI for complex tasks, we're committed to providing the building blocks that enable developers and enterprises to effectively create autonomous systems that deliver real-world impact." "Our goal is to give developers a seamless platform experience for building agents that can help with a variety of tasks across any industry." The new Responses API builds on several of the company's existing services, with OpenAI saying it "combines the simplicity of Chat Completions with the tool-use capabilities of the Assistants API." The aim, unsurprisingly, is to make building agents much quicker and more straightforward for developers, allowing them to solve complex tasks with multiple models, without the complexity of integrating multiple APIs or external providers. These will be supported by a number of new built-in tools which should make the agents themselves smarter and more effective. 
These include web search, which can quickly provide up-to-date answers from web-based sources, file search, which does the same, but with the file library on your device or within your business, and computer use, which enables agents to complete tasks typically done on a computer, such as automatically diagnosing and fixing technical issues, or everyday tasks like scheduling meetings or handling emails. For those developers or businesses which want that extra level of customization and personalization, the new Agents SDK allows them to build single and multi-agent workflows to their exact requirements. OpenAI says the open source platform can offer greater power and flexibility than its previous Swarm tool, making it ideal for tasks such as customer support automation, multi-step research, content generation, code review, and sales prospecting. Developers will be able to integrate the Agents SDK into their Python codebases, and it will work with models from other providers that offer a Chat Completions-style API endpoint. All the new services are available now to OpenAI subscribers at no extra cost, with the company promising additional tools and capabilities soon.

[5]

VentureBeat

OpenAI unveils Responses API, open source Agents SDK, letting developers build their own Deep Research and Operator

OpenAI is rolling out a new suite of APIs and tools designed to help developers and enterprises build AI-powered agents more efficiently atop some of the very same technology powering its own first-party AI agents: Deep Research (which scours the internet independently to develop richly researched, well-organized and cited reports) and Operator (its tool for controlling a web browser cursor autonomously based on a user's text instructions and performing actions like finding sports tickets or making reservations). Now, with access to the building blocks behind these powerful first-party OpenAI agents, developers can build their own third-party rivals or more domain-specialized products and services specific to their use case and audience. OpenAI's recent advancements in reasoning, multimodal processing, and safety mechanisms have laid the groundwork for these capabilities, especially its "o" family of reasoning models (o1 and o3). "It's hard to overstate how critical reasoning models are for enabling AI agents," said Olivier Godement, head of product for OpenAI's platform, in a video call with VentureBeat. "One of the biggest limitations before was handling long-horizon tasks like planning." But the company says developers haven't yet had the tools necessary to easily build them into production-ready applications for enterprises and their customers -- until now. To address these hurdles, OpenAI is introducing several new offerings: the Responses API, built-in tools for web and file search, a computer use tool, and the open-source Agents SDK. These tools aim to streamline the development of AI agents by reducing the need for extensive prompt engineering and custom orchestration logic. 
They should also make it an attractive proposition to continue building atop OpenAI's platform even in the face of rising competition from Chinese players like Manus, Alibaba's Qwen, DeepSeek, and homegrown rivals such as Anthropic and Google. While those other players do offer developer tools or products, the continual evolution of OpenAI's developer platform makes it a hard proposition to beat as a "one stop shop" for those looking to leverage the latest AI advances in a clean, easy-to-use, fast and affordable way.

Agents SDK provides open source tools for building agents atop even non-OpenAI models

In a move certain to send the AI blogosphere and social media space aflutter, OpenAI is getting back into open source in a big way with the release of its Agents SDK, a toolkit designed to help developers manage, coordinate, and optimize agent workflows -- even building agents powered by other, non-OpenAI models such as those by competitors Anthropic and Google, or open source models from DeepSeek, Qwen, Mistral, and Meta's Llama family. "The Agent SDK is open-source, allowing enterprises to mix and match different models," Godement said. "We don't want to force anyone to use only OpenAI models." The SDK offers key features such as:

* Configurable Agents - AI models with predefined instructions and tool access.
* Intelligent Handoffs - Mechanisms to transfer tasks between agents based on context.
* Built-in Guardrails - Safety measures for input validation and content moderation.
* Tracing and Observability - Tools for debugging and optimizing agent performance.

"With the Agents SDK, developers can track exactly what an agent is doing -- what tasks it spawns, what data it gathers, and how it generates responses," said Handa.

What the new Responses API offers

At the center of this update is the Responses API, which combines features of OpenAI's Chat Completions API with the tool-use functionality of the Assistants API. 
This integration allows developers to leverage multiple built-in tools within a single API call, making it easier to build applications that require complex, multi-step interactions. The Responses API initially supports three built-in tools:

* Web Search - Provides real-time, cited answers by fetching information from the web.
* File Search - Retrieves relevant information from large document repositories using metadata filtering and optimized query processing.
* Computer Use Tool - Enables AI agents to perform actions on a computer, such as browsing, data entry, and navigating software interfaces.

"With the Responses API, developers get more visibility into what the model is doing -- what tools it's calling, why it's calling them, and what decisions it's making before and after those calls," said Nikunj Handa, PM on OpenAI's API team, in the same video call with VentureBeat. With these capabilities, OpenAI envisions the Responses API serving as a foundation for agentic applications, eliminating the need for multiple external integrations. The API is available to all developers starting today, with usage billed at OpenAI's standard token and tool rates. Additionally, OpenAI notes that while the Chat Completions API will continue to receive updates, the Responses API is considered its superset. Developers who need built-in tools or multi-step model interactions should use the Responses API for new integrations. OpenAI is also making its web search, file search, and computer use tools available directly through the Responses API. These tools enable AI agents to access real-world information, retrieve context from documents, and interact with digital environments more effectively.

Web Search offers developers real-time information with citations

The web search tool allows developers to integrate real-time search capabilities into their applications, making it useful for research assistants, shopping guides, and content aggregation tools. 
It provides sources for its responses, ensuring users can verify the accuracy of the information. "The first thing we're launching is built-in tools, like web search, which allows models to access real-time information," said Handa. "It's the same tool that powers ChatGPT's Search, and now we're bringing it to the API." OpenAI also confirmed that web search results in the API will include clear citations, allowing users to click through to original sources. Developers can implement web search as part of a broader retrieval system that includes proprietary data sources.

File search: intelligent document retrieval on private clouds

With the file search tool, AI agents can quickly retrieve relevant information from large document collections. This tool supports multiple file formats and includes features like query optimization, metadata filtering, and custom ranking for more precise results. "The third tool we're launching is file search, which makes it easy for developers to take all their data, store it in our system, and extract the right information with high accuracy," Handa explained. The file search tool is priced at $2.50 per thousand queries, with storage fees of $0.10 per GB per day (the first GB is free).

Developers may now also access Computer Use, the tech powering OpenAI's Operator

The computer use tool extends agent capabilities beyond simple text-based tasks by allowing AI to interact with computer interfaces. Powered by OpenAI's computer-using-agent (CUA) model, this tool translates AI-generated actions into executable commands, enabling automation of tasks like data entry and web navigation. "We're also launching a computer-use tool, allowing models to interact with graphical user interfaces when there's no existing API for a task," Handa noted. The computer use tool is currently available as a research preview for select developers in usage tiers 3-5. Pricing is set at $3 per million input tokens and $12 per million output tokens. 

What it means for enterprise leaders

For IT team leaders, CTOs, and mid-level managers looking to optimize workflows, OpenAI's new tools provide a clear path toward automating and scaling AI-driven processes without requiring extensive custom development. The built-in web search and file search capabilities allow enterprises to quickly integrate AI-powered information retrieval into their existing systems, while the computer use tool enables automated interactions with legacy applications that lack API access. The open-source Agents SDK further empowers organizations to coordinate AI-driven workflows across teams, making it easier to deploy agents that improve efficiency in areas such as customer support, document processing, and market research. With enterprise security and observability built into these tools, decision-makers can adopt AI solutions with greater transparency and control, ensuring compliance and performance monitoring at scale.

What's next?

OpenAI sees these new releases as the first step in building a comprehensive platform for AI agents. The company plans to roll out additional tools and integrations in the coming months to help developers deploy, evaluate, and scale agentic applications more effectively. "We think the coming months are going to be critical for deploying more and more agents at scale," said Godement. "We've already done this with first-party agents like Deep Research, but OpenAI isn't going to build every agent -- that's why we have a developer platform." OpenAI also stated that it will continue improving safety features for agentic applications, including safeguards for prompt injections and unauthorized data access. Developers interested in building with the new tools can explore OpenAI's documentation and API playground to get started today.
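
To make the SDK concepts described in this article more concrete (configurable agents, handoffs, guardrails, and tracing), here is a toy sketch in plain Python. It deliberately does not use the Agents SDK: `ToyAgent`, `run`, and the routing rule are hypothetical names invented for illustration, and the real SDK's classes and behavior may differ.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToyAgent:
    """Hypothetical stand-in for a configurable agent (not the real SDK class)."""
    name: str
    instructions: str
    handle: Callable[[str], str]
    # Optional rule that picks another agent to hand the task to.
    route: Callable[[str], Optional["ToyAgent"]] = lambda task: None


def run(agent: ToyAgent, task: str, max_hops: int = 5) -> tuple[str, list[str]]:
    """Run a task, following handoffs and recording a simple trace."""
    # Guardrail: validate the input before any agent acts on it.
    if not task.strip():
        raise ValueError("empty task rejected by guardrail")
    trace: list[str] = []
    for _ in range(max_hops):
        target = agent.route(task)
        if target is None:
            trace.append(f"{agent.name}: handled")
            return agent.handle(task), trace
        trace.append(f"{agent.name}: handoff -> {target.name}")
        agent = target
    raise RuntimeError("too many handoffs")


billing = ToyAgent("billing", "Answer billing questions.", lambda t: f"[billing] {t}")
support = ToyAgent("support", "Answer support questions.", lambda t: f"[support] {t}")
triage = ToyAgent(
    "triage",
    "Route each request to a specialist.",
    lambda t: f"[triage] {t}",
    route=lambda t: billing if "invoice" in t.lower() else support,
)

answer, trace = run(triage, "Where is my invoice?")
print(answer)
print(trace)
```

The trace list plays the role of the observability tooling mentioned above: each hop records which agent acted and where it sent the task, which is the kind of record a developer would want when debugging a multi-agent workflow.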

[6]

SiliconANGLE

OpenAI's newest developer AI brings search capabilities to AI agents - SiliconANGLE

The future of artificial intelligence will be dominated by AI agents, and OpenAI is now trying to accelerate that reality by letting developers build their own. Earlier today, it announced the availability of a new "Responses API", which simplifies the process of creating and deploying AI agents that can perform tasks for their users independently. The Responses API lets developers create AI agents that are powered by OpenAI's large language models, and is set to eventually replace the existing Assistants API, which will be retired in about one year, the company said. OpenAI says the new offering will facilitate the creation of AI agents that are able to utilize a file search tool to scan a company's internal datasets and search the wider internet. Such capabilities are similar to OpenAI's recently announced Operator agent, which relies on a Computer-Using-Agent or CUA model to help automate tasks such as data entry. It's worth pointing out that OpenAI has previously acknowledged that the CUA model is somewhat unreliable when trying to automate tasks on operating systems, and it has been known to make mistakes. As such, OpenAI warns developers that the Responses API should still be considered an "early iteration", and says it will become more reliable over time. When using Responses API to create an AI agent, developers can choose from two models - GPT-4o search and GPT-4o mini search. According to the company, both are capable of browsing the web autonomously to try and find answers to questions, and they also cite the sources that inform their responses. It's an important capability because OpenAI said the ability to search the web and scour a company's private datasets can significantly improve the accuracy of its models, and therefore, the agents based on them. 
The company demonstrated just how superior its search-capable models are on its own SimpleQA benchmark, which is designed to measure the confabulation rate of AI systems. According to OpenAI, GPT-4o search achieved a 90% score, while GPT-4o mini search scored 88%. In contrast, the new GPT-4.5 model, which has many more parameters and is therefore much more powerful, only scored 63% on the same benchmark, because it lacks the ability to search for additional information. Even so, developers would do well to remember that although these models bring improvements, the search functionality doesn't completely fix all AI confabulations or hallucinations. The benchmark scores suggest GPT-4o search still makes factual mistakes in around 10% of its responses. Such an error rate may be intolerably high for many agentic AI workloads. Still, OpenAI wants to encourage developers to get started, at least. In addition to the Responses API, it also released an open-source Agents SDK that provides tools for integrating AI models and agents with internal systems. It also provides tools for implementing safeguards and monitoring the activities of AI agents. It follows the release of another tool called Swarm, which provides a framework for developers to manage and orchestrate multiple AI agents. No doubt some developers will be eager to see what kinds of AI agents they can create, but it's important to remember that these technologies are still nascent and not always as effective as some users might claim. Earlier this week, a Chinese startup took the internet by storm with the debut of an AI agent called Manus that wowed some early adopters, only to be quickly found wanting once it became more widely available.

[7]

SiliconANGLE

OpenAI's newest developer API brings search capabilities to AI agents - SiliconANGLE

The future of artificial intelligence will be dominated by AI agents, and OpenAI is now trying to accelerate that reality by letting developers build their own. Today the AI company announced the availability of a new "Responses API," which simplifies the process of creating and deploying AI agents that can perform tasks for their users independently. The Responses API lets developers create AI agents that are powered by OpenAI's large language models, and is set eventually to replace the existing Assistants API, which will be retired in about one year, the company said. OpenAI says the new offering will facilitate the creation of AI agents that are able to employ a file search tool to scan a company's internal datasets and search the wider internet. Such capabilities are similar to OpenAI's recently announced Operator agent, which relies on a Computer-Using-Agent or CUA model to help automate tasks such as data entry. It's worth pointing out that OpenAI has previously acknowledged that the CUA model is somewhat unreliable when trying to automate tasks on operating systems, and it has been known to make mistakes. As such, OpenAI warns developers that the Responses API should still be considered an "early iteration," and says it will become more reliable over time. When using Responses API to create an AI agent, developers can choose from two models: GPT-4o search and GPT-4o mini search. According to the company, both are capable of browsing the web autonomously to try and find answers to questions, and they also cite the sources that inform their responses. It's an important capability because OpenAI said the ability to search the web and scour a company's private datasets can significantly improve the accuracy of its models and therefore the agents based on them. 
The company demonstrated just how superior its search-capable models are on its own SimpleQA benchmark, which is designed to measure the confabulation rate of AI systems. According to OpenAI, GPT-4o search achieved a 90% score, while GPT-4o mini search scored 88%. In contrast, the new GPT-4.5 model, which has many more parameters and is therefore much more powerful, scored only 63% on the same benchmark, because it lacks the ability to search for additional information. Even so, developers would do well to remember that although these models bring improvements, the search functionality doesn't completely fix all AI confabulations or hallucinations. The benchmark scores suggest GPT-4o search still makes factual mistakes in around 10% of its responses. Such an error rate may be intolerably high for many agentic AI workloads. Still, OpenAI wants to encourage developers to get started, at least. In addition to the Responses API, it released an open-source Agents SDK that provides tools for integrating AI models and agents with internal systems. It also provides tools for implementing safeguards and monitoring the activities of AI agents. It follows the release of another tool called Swarm, which provides a framework for developers to manage and orchestrate multiple AI agents. The new tools are all about increasing the market share of OpenAI's LLMs. According to Damian Rollison, director of market insights at the agentic AI startup SOCi Inc., the company has used a similar strategy already when it embedded ChatGPT with Apple Inc.'s Siri in the new Apple Intelligence suite, exposing it to a new audience of users. "The new Responses API opens up the possibility for even broader exposure and acclimation of the general public to the concept of AI agents, perhaps embedded in a range of tools they already use," Rollison said. 
No doubt some developers will be eager to see what kinds of AI agents they can create, but it's important to remember that these technologies are still nascent and not always as effective as some users might claim. Earlier this week, a Chinese startup took the internet by storm with the debut of an AI agent called Manus that wowed some early adopters, only to be quickly found wanting once it became more widely available.
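To make the concern about a roughly 10% error rate concrete, here is a minimal back-of-the-envelope sketch. It rests on two assumptions that are not in the article: that the 90% SimpleQA score can stand in for per-step accuracy in an agent workflow, and that each step fails independently of the others. Neither is guaranteed to hold in practice, and the function name `workflow_success` is purely illustrative.

```python
def workflow_success(per_step_accuracy: float, steps: int) -> float:
    """Chance that every step succeeds, assuming independent steps."""
    return per_step_accuracy ** steps


# Assumption: 0.90 is borrowed from the reported GPT-4o search score.
for steps in (1, 5, 10, 20):
    print(steps, round(workflow_success(0.90, steps), 3))
```

Under those assumptions, a ten-step task would complete without a single error only about 35% of the time, which is one way to read the warning that this error rate may be too high for many agentic workloads.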

[8]

AIM

OpenAI Unveils New APIs and Tools for Developers to Build Their Own Manus

"It's the year that ChatGPT and our developer tools go from just answering questions to actually doing things for you out in the real world." OpenAI has launched a set of APIs and tools to simplify the development of AI agents and to help developers and enterprises build reliable and useful autonomous systems. The company defines agents as systems that independently perform tasks on behalf of users. "2025 is going to be the year of the agent," said Kevin Weil, chief product officer, OpenAI. "It's the year that ChatGPT and our developer tools go from just answering questions to actually doing things for you out in the real world." OpenAI announced in a live stream the launch of the Responses API, built-in tools for web search, file search, and computer use, an Agents SDK, and observability tools for workflow execution. These tools streamline agent logic, orchestration, and interactions. The newly introduced tools are available to developers starting today. OpenAI stated that pricing for the Responses API follows standard token and tool usage rates. The Responses API allows developers to leverage OpenAI's built-in tools for building agents. It integrates features from the Chat Completions API and the Assistants API, offering a unified solution for handling complex tasks. "With a single Responses API call, developers will be able to solve increasingly complex tasks using multiple tools and model turns," OpenAI said. Developers using the Responses API will have access to built-in tools, including web search, which retrieves up-to-date information with citations, file search for querying large document volumes, and computer use for automating browser-based tasks. "During early testing, we've seen developers build with web search for a variety of use cases including shopping assistants, research agents, and travel booking agents," the company said. OpenAI also introduced the Agents SDK, an evolution of Swarm for agent orchestration. 
Ilan Bigio, an engineer on OpenAI's developer experience team, said the SDK simplifies complex agent interactions and includes features like monitoring, tracing, and built-in guardrails. "The idea with the Agents SDK is to keep simple ideas simple to implement while allowing you to build more complex and robust ideas," Bigio said. "The Agents SDK has enabled companies like Coinbase to quickly prototype and deploy AI agents," OpenAI said. OpenAI's announcement comes after Manus, based in Shenzhen, China, built what it calls a 'general-purpose AI agent.' The agent can plan, execute, and deliver complete results autonomously while browsing websites in real time, processing and generating multiple data types. It also utilises multiple tools to deliver results. Manus also evaluated the agent using the GAIA benchmark, which tests general AI assistants on solving real-world problems. According to the results, Manus outperformed OpenAI's Deep Research. The company has also addressed changes to its existing APIs. Chat Completions will continue to be supported, while the Responses API will serve as its superset. OpenAI plans to deprecate the Assistants API in mid-2026, providing a migration path to the Responses API. The company plans to roll out additional capabilities in the coming months.

[9]

AIM

OpenAI Unveils New APIs and Tools for Agent Development

"It's the year that ChatGPT and our developer tools go from just answering questions to actually doing things for you out in the real world." OpenAI has launched a set of APIs and tools to simplify the development of AI agents and to help developers and enterprises build reliable and useful autonomous systems. The company defines agents as systems that independently perform tasks on behalf of users. "2025 is going to be the year of the agent," said Kevin Weil, chief product officer, OpenAI. "It's the year that ChatGPT and our developer tools go from just answering questions to actually doing things for you out in the real world." OpenAI announced in a live stream the launch of the Responses API, built-in tools for web search, file search, and computer use, an Agents SDK, and observability tools for workflow execution. These tools streamline agent logic, orchestration, and interactions. The newly introduced tools are available to developers starting today. OpenAI stated that pricing for the Responses API follows standard token and tool usage rates. The Responses API allows developers to leverage OpenAI's built-in tools for building agents. It integrates features from the Chat Completions API and the Assistants API, offering a unified solution for handling complex tasks. "With a single Responses API call, developers will be able to solve increasingly complex tasks using multiple tools and model turns," OpenAI said. Developers using the Responses API will have access to built-in tools, including web search, which retrieves up-to-date information with citations, file search for querying large document volumes, and computer use for automating browser-based tasks. "During early testing, we've seen developers build with web search for a variety of use cases including shopping assistants, research agents, and travel booking agents," the company said. OpenAI also introduced the Agents SDK, an evolution of Swarm for agent orchestration. 
Ilan Bigio, an engineer on OpenAI's developer experience team, said the SDK simplifies complex agent interactions and includes features like monitoring, tracing, and built-in guardrails. "The idea with the Agents SDK is to keep simple ideas simple to implement while allowing you to build more complex and robust ideas," Bigio said. "The Agents SDK has enabled companies like Coinbase to quickly prototype and deploy AI agents," OpenAI said. The company has also addressed changes to its existing APIs. Chat Completions will continue to be supported, while the Responses API will serve as its superset. OpenAI plans to deprecate the Assistants API in mid-2026, providing a migration path to the Responses API. The company plans to roll out additional capabilities in the coming months.

[10]

Inc.

OpenAI's New Developer Tools Make Building AI Agents Easy

In a livestreamed video on Tuesday, March 11, a group of OpenAI employees introduced the Agents software development kit (SDK) as well as the Responses application programming interface (API). The SDK gives developers a framework that enables applications built with OpenAI's models to access additional tools and capabilities, like searching across files, parsing the internet, running code, or controlling a computer. The Responses API connects the company's models to applications that require those new agentic capabilities. In essence, think of the SDK as a spellbook and the API as a magic wand. The SDK gives developers the language needed to call upon these new powers, and the API channels those powers from their source (OpenAI) to an application. Developers give each agent a name, a set of instructions to follow (like "you are a customer support specialist..."), and a defined set of tools that enable them to use specific capabilities or functions. In the livestream, the employees created an AI stylist agent as an example of how the new developer tools can be leveraged to create more useful AI applications. They gave the stylist agent access to two tools: The ability to search the internet, and access to a private database containing information about the OpenAI employees' personal style.

[11]

Gadgets 360

OpenAI's New AI Tools Will Help Developers Build AI Agents

Agents SDK is an open-source kit to set up and manage multi-agent workflows. OpenAI made several developer-focused announcements in a live stream on Tuesday. The San Francisco-based AI giant is introducing a new application programming interface (API) and an open-source software development kit (SDK) that will allow developers to build artificial intelligence (AI) agents and manage multi-agent workflows. Alongside these, the company previewed three tools that will be part of the new Responses API and allow developers to build software and apps with agentic capabilities. While the Responses API will act as the baseline to build AI agents, the Agents SDK will offer a suite to manage and create workflows using multiple agents. In a post on X (formerly known as Twitter), the official handle of OpenAI Developers posted a video from the live stream where the company made the announcements. Apart from the API and SDK, the AI firm unveiled three new tools -- web search, file search, and computer use. All three tools can be accessed via the new Responses API. The web search tool allows developers to give their AI agents a web lookup capability so that they can find and provide real-time information. It uses the same model OpenAI uses for its web search function. It is currently available in preview. File search is a retrieval function that can pull specific information from large documents. It comes with query optimisation and custom reranking capabilities. Finally, the computer use tool is powered by the same model behind the AI firm's Operator agent. It can be used to build agents that can access and complete tasks on a computer and across graphical user interfaces (GUIs). OpenAI says it can be used for QA testing and data entry-like tasks. Notably, this tool is available to developers on usage tiers three to five as a research preview. 
The Responses API is the company's latest API, and is aimed at replacing the existing Chat Completions and Assistants APIs. Put simply, the Responses API comes with all the functionalities of the existing programming interfaces, with the additional tool use capability. OpenAI says this tool integration capability will also allow developers to build AI agents. OpenAI says that since the Responses API is a superset of Chat Completions and Assistants, code migration will be a straightforward process, and the AI firm will ensure that developers do not suffer any glitches or data loss. Eventually, the company plans to retire the Assistants API, while Chat Completions will continue to exist for those developers who do not want agentic functions. Finally, the Agents SDK is designed to let developers create and manage multi-agent workflows. With this, they can pair multiple agents together to complete more complex tasks. It is an upgrade to the previous agentic system Swarm. This SDK is equipped with several sets of primitives including Agents (AI models equipped with instructions and tools), Handoffs (agents can delegate tasks to other agents), and Guardrails (checks that validate the input).
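The three primitives described above (Agents, Handoffs, Guardrails) can be illustrated with a small, self-contained toy model. This is not the Agents SDK and does not use its API: the `Agent` dataclass, `input_guardrail`, `route`, and `run` are hypothetical names invented for this sketch, and the keyword-based routing stands in for a decision that a real agent would leave to a model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    """Toy stand-in for an agent: a name, instructions, tools, and handoffs."""
    name: str
    instructions: str
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    handoffs: List["Agent"] = field(default_factory=list)


def input_guardrail(text: str) -> bool:
    """Hypothetical guardrail: accept only non-empty input of modest length."""
    return 0 < len(text.strip()) <= 500


def route(agent: Agent, text: str) -> Agent:
    """Hand off to the first listed agent whose name appears in the request."""
    for candidate in agent.handoffs:
        if candidate.name in text.lower():
            return candidate
    return agent


def run(agent: Agent, text: str) -> str:
    """Validate the input, pick an agent, then call its first tool if it has one."""
    if not input_guardrail(text):
        return "rejected by guardrail"
    target = route(agent, text)
    if target.tools:
        tool_name, tool = next(iter(target.tools.items()))
        return f"{target.name}:{tool_name}:{tool(text)}"
    return f"{target.name}:no tool available"


# Illustrative agents; the tool is a stub that always returns the same string.
billing = Agent("billing", "Answer billing questions.",
                tools={"lookup_invoice": lambda q: "invoice found"})
triage = Agent("triage", "Route requests to a specialist.", handoffs=[billing])

print(run(triage, "I have a billing question"))
print(run(triage, "   "))
```

The first call passes the guardrail and is handed off from the triage agent to the billing agent; the second is rejected before any agent runs, which is the role the article assigns to guardrails.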

[12]

OpenAI launches new developer tools as Chinese AI startups gain ground

OpenAI launched new tools for developers on Tuesday that will help them build advanced AI agents, using a few application programming interfaces (APIs), amid growing competition from Chinese AI startups. AI agents are designed to independently execute complex real-world tasks without direct human intervention, while an API is a set of code that enables standardized communication, data exchange and functionality between software components. The new tool, called the Responses API, is available to all developers at no additional cost. It replaces OpenAI's Assistants API, which is set to be phased out by the second half of 2026. The development comes close on the heels of the release of the latest AI models by Chinese startups, which claim to be on par with or better than industry-leading models in the United States at a fraction of the cost. Chinese startup Monica has grabbed attention over the past few days after it launched its autonomous AI agent Manus, weeks after DeepSeek was showered with praise by Silicon Valley executives and U.S. tech company engineers. Monica, which claims that its Manus AI outperforms OpenAI's DeepResearch agent, said on Tuesday that it was partnering with the team behind Alibaba's Qwen AI models.

[13]

Geeky Gadgets

OpenAI Launches New Tools for Building AI Agents with Advanced APIs

OpenAI has unveiled a comprehensive suite of tools and APIs designed to transform the way developers create AI agents. These updates aim to simplify the development of systems capable of handling complex, multi-step tasks autonomously. By incorporating advanced reasoning, multimodal capabilities, and streamlined workflows, OpenAI provides a unified platform that enables developers to build more efficient and capable AI-driven applications. These tools are poised to enhance productivity and innovation across a wide range of industries. The suite of new tools and APIs have been designed to empower developers to create AI agents capable of tackling complex, multi-step tasks. From retrieving real-time information on the web to interacting with legacy systems, these tools promise to bridge gaps and unlock new possibilities. But what does this mean for you? Whether you're a tech enthusiast, a business owner, or just someone curious about the future of AI, these updates hint at a future where technology works harder for us, not the other way around. AI agents are autonomous systems designed to perform tasks on behalf of users. Unlike traditional AI models that primarily focus on answering questions, AI agents are capable of executing actions, making decisions, and interacting with real-world systems. These agents can assist with a variety of tasks, such as conducting research, managing schedules, automating workflows, or even interfacing with outdated systems. For instance, an AI agent might help organize your calendar, retrieve specific information, or handle repetitive tasks, allowing you to focus on more strategic and creative work. By using these capabilities, AI agents are becoming indispensable tools for both personal and professional use. OpenAI's latest tools address several critical challenges in AI development, including retrieving information, automating tasks, and integrating with diverse systems. 
These tools are designed to enhance the functionality and versatility of AI agents, making them more effective in real-world applications. Some of the standout additions include: These tools collectively enable developers to create AI agents that are not only smarter but also more adaptable to diverse environments and use cases. Uncover more insights about AI Agents in previous articles we have written. The newly introduced Responses API is a significant addition to OpenAI's toolkit, offering enhanced capabilities for managing intricate workflows. This API supports multimodal inputs, multiple tool calls, and multi-turn interactions, making it ideal for applications that require dynamic and interactive responses. Fully backward-compatible with the existing Chat Completions API, the Responses API ensures seamless integration with existing systems while introducing advanced features for handling complex tasks. This API is particularly effective for use cases such as virtual assistants, customer support systems, and other applications that demand real-time, context-aware interactions. By allowing AI agents to process and respond to diverse inputs, the Responses API enhances their ability to perform tasks autonomously and efficiently. OpenAI has also made significant upgrades to its Agents SDK, transforming it from the experimental "Swarm" SDK into a robust, production-ready framework. This open source toolkit simplifies the orchestration of multiple AI agents and introduces several advanced features that enhance its usability and functionality: Currently available for Python, the Agents SDK will soon expand to include JavaScript support, broadening its accessibility to a wider range of developers and allowing the creation of more versatile AI-driven applications. The tools and APIs introduced by OpenAI unlock a wide array of possibilities for AI agent applications, making them valuable across various industries. 
By combining advanced reasoning, retrieval capabilities, and seamless integration with existing systems, these AI agents are capable of handling increasingly complex tasks with minimal human intervention. This not only enhances productivity but also enables businesses to focus on innovation and strategic growth. OpenAI has outlined an ambitious roadmap for the future of its AI tools and APIs. While the Chat Completions API will continue to be supported, the newly introduced Responses API is expected to gradually replace the Assistants API, which is slated for retirement by 2026. OpenAI also plans to introduce more advanced models and tools, further expanding the capabilities of AI agents. Looking ahead, OpenAI envisions 2025 as a pivotal year for the development and adoption of AI agents. The company aims to shift the focus of AI from merely answering questions to performing real-world tasks, empowering developers to create systems that are not only intelligent but also action-oriented. This vision underscores OpenAI's commitment to driving innovation and enabling the next generation of AI-driven applications.

[14]

Geeky Gadgets

New OpenAI API Tools : Web Search, File Search and More Explained

OpenAI has unveiled a series of updates and tools designed to enhance its API ecosystem, empowering developers to build more advanced and efficient AI agents. These updates include the introduction of the new Responses API, along with integrated functionalities such as web search, file search, and observability tools. By consolidating features and introducing advanced capabilities, OpenAI is not only simplifying workflows but also reinforcing its position as a leader in developer-focused AI infrastructure. These changes aim to address the evolving needs of developers while providing a robust foundation for creating innovative AI solutions. At the heart of these updates is a promise to simplify and empower. OpenAI's Responses API, features built-in tools for tasks like web and file search, and observability features are designed to streamline workflows and expand possibilities. Whether you're a seasoned developer or just starting out, these changes aim to make your life easier while opening doors to more advanced and efficient AI solutions. But what exactly do these updates offer, and how can they transform the way you work? Sam Witteveen explores what this means for you and your projects. The Responses API serves as the cornerstone of OpenAI's latest updates, merging the functionalities of previous APIs like the Completions and Assistant APIs into a single, unified endpoint. This consolidation is designed to streamline integration processes, offering developers a versatile tool capable of handling a wide range of tasks. These tasks include text and image input, web and file search, function calling, and reasoning, all within a single framework. For developers, this unified approach significantly reduces complexity and ensures seamless compatibility with the existing Chat Completions API. OpenAI has also announced plans to phase out the Assistant API by mid-2026, signaling a strategic move toward a more cohesive and standardized ecosystem. 
This transition underscores OpenAI's commitment to simplifying development workflows while maintaining robust functionality. OpenAI's built-in tools are designed to enhance the functionality of its API, providing developers with powerful options for real-time data retrieval, file management, and task automation. These tools address common challenges in AI development while offering flexibility and scalability to meet diverse project requirements. These tools collectively expand the range of possibilities for developers, allowing them to build more sophisticated and responsive AI systems. By integrating these capabilities directly into the API, OpenAI is reducing the need for external dependencies and creating a more streamlined development experience. To support developers in managing complex workflows, OpenAI has introduced observability tools that provide detailed insights into agent activities. These tools enable tracing and monitoring of large language model (LLM) calls, tool usage, and overall performance. By offering a clear view of how AI agents operate, these tools are invaluable for debugging, optimizing workflows, and creating datasets for fine-tuning models. The observability tools are particularly useful for identifying inefficiencies and making sure that AI agents perform as intended. However, their use raises important considerations regarding data privacy and intellectual property. Developers must carefully evaluate the trade-offs between enhanced observability and the potential risks associated with sharing agent structures and data with OpenAI. This balance is crucial for maintaining both functionality and security in AI development. The latest updates to OpenAI's API ecosystem reflect a strategic focus on addressing the challenges developers face when building AI systems. By prioritizing standardization, advanced functionality, and observability, OpenAI is equipping developers with the tools needed to create sophisticated and reliable AI agents. 
These updates also demonstrate OpenAI's commitment to supporting developers with practical solutions that enhance productivity and innovation. Additionally, OpenAI's emphasis on privacy safeguards highlights its awareness of the concerns surrounding data security and intellectual property. By incorporating these considerations into its offerings, OpenAI is fostering trust among developers and enterprises alike. This strategic shift underscores OpenAI's ambition to build a comprehensive ecosystem for agent development, catering to both individual developers and large-scale enterprises. By expanding its focus beyond consumer-facing applications, OpenAI is positioning itself as a key player in the AI development landscape. These updates not only enhance the capabilities of its API but also solidify its role as a leader in the rapidly evolving field of artificial intelligence.

[15]

Geeky Gadgets

OpenAI Agent SDK : A New AI Developer's Toolkit for Simplifying AI Workflows

OpenAI has unveiled a robust suite of tools and updates designed to simplify AI development while expanding its potential applications. At the heart of this release is the Agent SDK, a lightweight yet powerful framework that streamlines the creation and management of AI workflows. Complementing the SDK are tools such as the Response API, Web Search, File Search, and Computer Usage, which collectively enhance functionality, flexibility, and safety. These advancements underscore OpenAI's commitment to equipping developers with innovative resources to tackle complex challenges. OpenAI is now handing developers a streamlined, flexible toolkit that includes the new Agent SDK and complementary tools. From orchestrating multi-agent workflows to integrating real-time web search and automating desktop tasks, these updates are designed to simplify work and unlock new possibilities for AI development. The Agent SDK serves as the foundation of OpenAI's new toolkit, providing a comprehensive framework for orchestrating AI workflows with precision and scalability. Built on the Swarm framework, it introduces several key features that simplify the development of complex, multi-agent systems: These features make the Agent SDK an indispensable tool for developers aiming to build scalable, reliable, and ethically sound AI systems. By using its capabilities, you can streamline the management of multi-agent workflows, making sure efficiency and accuracy in your projects. The Response API replaces the soon-to-be-deprecated Assistant API, offering a more streamlined and versatile approach to integrating tools into AI workflows. This API combines chat completions with tool usage, allowing seamless interaction with default tools such as Web Search, File Search, and Computer Usage. By adopting the Response API, you can enhance your AI agents' ability to perform dynamic, real-world tasks. 
This integration not only simplifies development but also makes your applications more responsive and adaptable to diverse use cases. With its unified approach, the Response API enables developers to create AI systems that are both functional and versatile, bridging the gap between conversational AI and practical tool usage. Find more information on AI Agents by browsing our extensive range of articles, guides and tutorials. The Web Search Tool is a powerful addition to OpenAI's toolkit, designed to provide real-time information retrieval with remarkable accuracy and efficiency. Using GPT-4 models fine-tuned for this purpose, it ensures reliable results for a wide range of applications. Whether you're verifying facts or gathering the latest data, this tool is particularly suited for: This tool ensures that your AI agents remain informed and relevant, even in fast-changing environments. Its ability to deliver precise, real-time data makes it an invaluable resource for developers working on applications that demand accuracy and timeliness. For workflows involving extensive document management, the File Search Tool offers a sophisticated solution. It automates processes such as document chunking, embedding, and ranking, allowing efficient retrieval-augmented generation (RAG) workflows. By interacting with OpenAI's vector store, this tool allows for precise document-based queries, making it ideal for tasks such as: This tool is particularly valuable for developers working with data-intensive applications, offering a reliable method for managing and querying large volumes of information. The Computer Usage Tool introduces automation to desktop environments by capturing and executing mouse and keyboard actions. Powered by a multimodal GPT-4 model, it simplifies repetitive tasks such as data entry, file organization, and basic desktop operations. 
While its current capabilities are best suited for straightforward automation, it lays the groundwork for more advanced integrations in the future. This tool is a step toward integrating AI into everyday workflows, offering developers a practical solution for automating routine tasks. Its potential for growth highlights OpenAI's focus on expanding the role of AI in enhancing productivity. OpenAI has placed a strong emphasis on safety and observability in its latest offerings, making sure that AI systems operate responsibly and within ethical boundaries. Key tools in this area include: These features not only enhance the reliability of AI applications but also build trust in their deployment. By prioritizing safety and observability, OpenAI ensures that its tools meet the highest standards of accountability and transparency. OpenAI's pricing model is designed to accommodate a wide range of project needs, offering flexibility for developers. The costs are as follows: This pricing structure allows you to select tools that align with your budget and project requirements, making sure accessibility for developers at various levels. The introduction of these tools marks a significant evolution in OpenAI's approach, emphasizing practical, product-focused innovation. By adopting the Response API and integrating tools like Web Search and File Search, you can create applications that are both powerful and adaptable to changing needs. The inclusion of safety and observability features ensures that your solutions remain effective and responsible, even as they scale. These advancements provide developers with a comprehensive toolkit for tackling diverse challenges in AI development. Whether you're automating tasks, retrieving real-time information, or managing document-heavy workflows, OpenAI's latest offerings equip you with the resources needed to innovate and succeed in the rapidly evolving AI landscape.
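The chunk, embed and rank steps that the File Search description mentions can be sketched in plain Python. This is a toy illustration only: it does not call OpenAI's vector store or any real embedding model. The "embedding" here is a simple word count, and the names `chunk`, `embed`, `cosine`, and `rank` are hypothetical helpers written for this example.

```python
import math
from collections import Counter
from typing import List, Tuple


def chunk(text: str, size: int = 8) -> List[str]:
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (real systems use a learned model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank(query: str, chunks: List[str], top_k: int = 1) -> List[Tuple[float, str]]:
    """Score every chunk against the query and return the best `top_k`."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


# Made-up document text, used only to exercise the pipeline.
document = ("Refunds are issued within thirty days of purchase. "
            "Shipping takes five business days to most regions.")
pieces = chunk(document)
best_score, best_chunk = rank("how long does shipping take", pieces)[0]
print(best_chunk)
```

In a retrieval-augmented generation workflow, the top-ranked chunks would then be passed to the model as context for answering the query; that final step is omitted here.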

[16]

PYMNTS

OpenAI Launches New Tools for Building AI Agents | PYMNTS.com

OpenAI launched a new set of APIs and tools designed to help developers and enterprises build, deploy and scale artificial intelligence agents. While the company already offered AI model capabilities that can support AI agents, it added these new tools in response to customers' calls for an easier way to turn these capabilities into AI agents, OpenAI said in a Tuesday (March 11) blog post. The new offerings launched Tuesday include a new Responses API that helps build agents; built-in tools like web search, file search and computer use; a new Agents SDK that orchestrates single-agent and multi-agent workflows; and integrated observability tools that help users trace and inspect agent workflow development, according to the post. "These new tools streamline core agent logic, orchestration and interactions, making it significantly easier for developers to get started with building agents," OpenAI said in the post. With the introduction of these tools, OpenAI will continue to support its Chat Completions API, which is used by developers who don't need built-in tools, but it plans to deprecate its Assistants API after incorporating all its features into the new Responses API by mid-2026, per the post. OpenAI said in the post that it plans to add more tools and capabilities that simplify and accelerate the building of agentic applications on its platform over the next weeks and months. "We believe agents will soon become integral to the workforce, significantly enhancing productivity across industries," OpenAI said in the post. "As companies increasingly seek to leverage AI for complex tasks, we're committed to providing the building blocks that enable developers and enterprises to effectively create autonomous systems that deliver real-world impact." AI agents that can handle complex jobs, think independently and learn from their environment are reshaping industries and automating tasks in ways that didn't used to be possible, PYMNTS reported in October. 
Meta's head of business AI, Clara Shih, said Thursday (March 6) that Meta's AI is being used by more than 700 million customers and that the company hopes to offer it to "hundreds of millions" of businesses. "We're quickly coming to a place where every business, from the very large to the very small, they're going to have a business agent representing it and acting on its behalf, in its voice -- the way that businesses today have websites and email addresses," Shih said.

OpenAI introduces the Responses API and Agents SDK, enabling developers to create more capable AI agents with improved search, file handling, and computer interaction capabilities.

OpenAI Pushes Boundaries with New AI Agent Tools

OpenAI has unveiled a suite of new tools designed to empower developers and enterprises in building more advanced AI agents. The announcement comes as the AI industry continues to push for the realization of autonomous software capable of performing complex, multi-step actions on behalf of users [1].

Introducing the Responses API

At the heart of OpenAI's new offerings is the Responses API, which combines features from the company's existing Chat Completions API and Assistants API. This new API is set to replace the Assistants API, which OpenAI plans to retire in the first half of 2026 [1].

The Responses API provides developers with access to:

Web Search: Allows AI agents to browse the internet for real-time information and cite sources [1][2][4].
File Search: Enables rapid scanning of company files and databases; OpenAI says it will not train its models on these files [1][2][4].
Computer-Using Agent (CUA) model: Facilitates automation of tasks like data entry and other computer operations [1][2][4].

Improved Accuracy and Limitations

OpenAI claims significant improvements in factual accuracy with its new models. On the SimpleQA benchmark:

GPT-4o search scored 90%
GPT-4o mini search achieved 88%
GPT-4.5 without search scored 63% [1][2]

Despite these advancements, the technology still faces challenges. OpenAI acknowledges that the CUA model is not yet reliable for automating tasks on operating systems and can make unintended mistakes. Even with search enabled, GPT-4o search still answers roughly 10% of SimpleQA questions incorrectly [1].

Open-Source Agents SDK

Alongside the Responses API, OpenAI has released the open-source Agents SDK. This toolkit provides developers with free resources to:

Integrate models with internal systems
Implement safeguards
Monitor agent activities [1][2][3]

The SDK builds upon OpenAI's earlier release of Swarm, a framework for orchestrating multiple agents [1].

Implications for Developers and Businesses

These new tools aim to streamline the development of AI agents, reducing the need for extensive prompt engineering and custom orchestration logic [5]. The Agents SDK's open-source nature allows enterprises to mix and match different models, not limiting them to OpenAI's offerings [5].

OpenAI envisions these tools as foundational for agentic applications, potentially eliminating the need for multiple external integrations [5]. The company believes that AI agents will soon become integral to the workforce, enhancing productivity across industries [4].

Challenges and Future Outlook

While these advancements are promising, the AI agent field remains in its early stages. Recent incidents, such as the Butterfly Effect's Manus AI agent platform failing to deliver on many of its promises, highlight the persistent gap between promotional claims and practical functionality in this emerging technology category [1].

As the industry continues to evolve, OpenAI's latest releases demonstrate a shift from flashy agent demos to more impactful tools. However, the true test will be in how effectively these new capabilities can be integrated into real-world applications and whether they can deliver on the promise of truly autonomous AI agents [2].

References

Summarized by Navi

[1] Ars Technica, "OpenAI pushes AI agent capabilities with new developer API"

[2] TechCrunch, "OpenAI launches new tools to help businesses build AI agents"

[3] ZDNet, "Why OpenAI's new AI agent tools could change how you code"

[4] TechRadar, "OpenAI wants to help your business build its next generation of AI agents"

[5] VentureBeat, "OpenAI unveils Responses API, open source Agents SDK, letting developers build their own Deep Research and Operator"
