8 Sources
[1]
Google launched its deepest AI research agent yet -- on the same day OpenAI dropped GPT-5.2 | TechCrunch
Google released on Thursday a "reimagined" version of its research agent Gemini Deep Research based on its much-ballyhooed state-of-the-art foundation model, Gemini 3 Pro. This new agent isn't just designed to produce research reports - although it can still do that. It now allows developers to embed Google's SOTA-model research capabilities into their own apps. That capability is made possible through Google's new Interactions API, which is designed to give devs more control in the coming agentic AI era. The new Gemini Deep Research tool is an agent equipped to synthesize mountains of information and handle a large context dump in the prompt. Google says it's used by customers for tasks ranging from due diligence to drug toxicity safety research. Google also says it will soon be integrating this new deep research agent into services, including Google Search, Google Finance, its Gemini App and its popular NotebookLM. This is another step towards preparing for a world where humans don't Google anything anymore, their AI agents do. The tech giant says that Deep Research benefits from Gemini 3 Pro's status as its "most factual" model that is trained to minimize hallucinations during complex tasks. AI hallucinations - where the LLM just makes stuff up - are an especially crucial issue for long-running, deep reasoning agentic tasks, in which many autonomous decisions are made over minutes, hours, or longer. The more choices an LLM has to make, the greater the chance that even one hallucinated choice will invalidate the entire output. To prove its progress claims, Google has also created yet another benchmark (as if the AI world needs another one). The new benchmark is unimaginatively named DeepSearchQA, and is intended to test agents on complex, multi-step information-seeking tasks. Google has open sourced this benchmark.
It also tested Deep Research on Humanity's Last Exam, a much more interestingly named, independent benchmark of general knowledge filled with impossibly niche tasks; and BrowseComp, a benchmark for browser-based agentic tasks. As you might expect, Google's new agent bested the competition on its own benchmark, and Humanity's. However, OpenAI's ChatGPT 5 Pro was a surprisingly close second all the way around and slightly bested Google on BrowseComp. But those benchmark comparisons were obsolete almost the moment Google published them. Because on the same day, OpenAI launched its highly anticipated GPT 5.2 -- codenamed Garlic. OpenAI says its newest model bests its rivals -- especially Google -- on a suite of the typical benchmarks, including OpenAI's homegrown one. Perhaps one of the most interesting parts of this announcement was the timing. Knowing that the world was awaiting the release of Garlic, Google dropped some AI news of its own.
[2]
Upgraded Deep Research coming to Gemini app, agent now available for devs
Google today announced a "significantly more powerful Gemini Deep Research agent" that will soon be available in consumer apps and is now available for developers. Today's announcement is focused on developers. Google has a new Interactions API that serves as its "unified interface for interacting" with models (like Gemini 3 Pro) and agents. Google's new API reflects the latest model capabilities like "thinking" and advanced tool use that go beyond text generation. We will expand built-in agents and introduce the ability to build and bring your own agents. This will enable you to connect Gemini models, Google's built-in agents, and your custom agents using one API. The first built-in agent is Gemini Deep Research (Preview). Third-party developers can now add "advanced autonomous research capabilities" into their applications. "Optimized for long-running context gathering and synthesis tasks," the Gemini Deep Research agent uses Gemini 3 Pro. Google touts how it is "specifically trained to reduce hallucinations and maximize report quality during complex tasks." In response to your prompt, it "formulates queries, reads results, identifies knowledge gaps, and searches again." There's also "vastly improved web search, allowing it to navigate deep into sites for specific data." By scaling multi-step reinforcement learning for search, the agent autonomously navigates complex information landscapes with high accuracy. On benchmarks, Google notes state-of-the-art results on Humanity's Last Exam (reasoning and knowledge), DeepSearchQA (comprehensive web research), and BrowseComp (locating hard to find facts) that surpass Gemini 3 Pro. 
Gemini Deep Research achieves 46.4% (versus Gemini 3 Pro's 43.2%) on the full HLE set, 66.1% (versus 56.6%) on DeepSearchQA and a high 59.2% (versus 49.4%) on BrowseComp: All these improvements that developers can start previewing today (Google AI Studio) will "soon" be available for Google's consumer apps, including Gemini, Google Search, and NotebookLM.
[3]
Google's New Deep Research Agent Scores SoTA Results on Benchmarks | AIM
On the Humanity's Last Exam benchmark, Deep Research Agent scored 46.4%, outperforming OpenAI's GPT-5 Pro (38.9%). Google has announced the Gemini Deep Research agent, which is based on the Gemini 3 Pro model via the interactions API. This helps developers integrate autonomous research capabilities within their applications. "By scaling multi-step reinforcement learning for search, the agent autonomously navigates complex information landscapes with high accuracy," the company said. The tool plans its research iteratively, starting with formulating queries, reading sources, identifying gaps and searching again to fill them. "This release features vastly improved web search, allowing it to navigate deep into sites for specific data," Google added. On the Humanity's Last Exam benchmark, which tests AI models on expert-level reasoning and problem-solving across a broad range of academic subjects, the Deep Research Agent scored 46.4%, outperforming OpenAI's GPT-5 Pro (38.9%). Even on the BrowseComp benchmark, which evaluates LLM on locating 'hard-to-find' facts, Google's DeepResearch agent scored 59.2%, only slightly below GPT-5 Pro (59.5%). Alongside the announcement, Google also announced a new benchmark called DeepSearchQA, designed to test agent comprehensiveness on web research tasks. Gemini Deep Research agent scored 66.1%, outperforming GPT-5 Pro (65.2%). DeepSearchQA features 900 hand-crafted tasks across 17 fields, where each step depends on prior analysis. "Unlike traditional fact-based tests, DeepSearchQA measures comprehensiveness, requiring agents to generate exhaustive answer sets. This assesses both research precision and retrieval recall," Google said. Google said the agent will be soon available on Google Search, NotebookLM, Google Finance and the Gemini app. 
The API pricing matches the Gemini 3 Pro model: it costs $2 per million input tokens, while output tokens are priced at $12 per million for prompts up to 200,000 tokens and $18 per million for prompts that exceed that length.
[4]
Google upgrades Gemini Deep Research's search and problem-solving capabilities - SiliconANGLE
Google upgrades Gemini Deep Research's search and problem-solving capabilities Google LLC today released a new version of Gemini Deep Research, an artificial intelligence agent designed to automate complex tasks such as crafting financial reports. The company first introduced the tool last December. The initial version used Gemini 1.5 Pro, which at the time was Google's flagship large language model. The new version that debuted today is based on Gemini 3 Pro, a significantly more capable model released last month. One of the areas where Gemini 3 Pro performs better than its predecessors is visual reasoning. According to Google, it can perform tasks such as planning the travel paths of a warehouse robot. When the LLM is applied to document processing use cases, it can extract information from handwritten text, charts and mathematical notation. The new release of Gemini Deep Research uses Gemini 3 Pro's visual reasoning features to automate data retrieval tasks. Users can upload documents and have the agent scan them to find a specific piece of information. Alternatively, Gemini Deep Research can be instructed to condense the documents into a report or enrich them with information from the web. Google says that the agent's new release introduces significantly improved web search capabilities. "Deep Research iteratively plans its investigation - it formulates queries, reads results, identifies knowledge gaps, and searches again," Google DeepMind product manager Lukas Haas and group product manager Shrestha Basu Mallick explained in a blog post. Gemini Deep Research is available through a new application programming interface called Interactions API that debuted in conjunction. It enables developers to access the agent and the Gemini model series through a single access point. In the future, Google will add more pre-packaged agents to interactions API along with support for custom agent development. 
Besides providing centralized access to multiple AI offerings, the API also eases certain programming tasks. It automates some of the work involved in managing the data that users upload to an AI application for processing. Additionally, there's an MCP tool for connecting AI models to third-party systems. Google evaluated Gemini Deep Research's capabilities using two benchmarks called HLE and DeepSearchQA. According to the company, it achieved record performance on both tests. HLE is a particularly difficult AI benchmark that comprises over 2,500 questions. More than half of the questions relate to math, physics and programming. Google says that Gemini Deep Research solved 46.4% of the problems in HLE correctly. The other benchmark that Google used in the evaluation, DeepSearchQA, is an internally-developed dataset it open-sourced today. It comprises 900 multi-step tasks in which each step "depends on prior analysis." Google says that the benchmark measures the precision and comprehensiveness of AI models' research.
[5]
Google releases reimagined Gemini Deep Research on Gemini 3 Pro
Google released a reimagined version of its research agent, Gemini Deep Research, on Thursday, coinciding with OpenAI's announcement of GPT-5.2. The update builds on the Gemini 3 Pro foundation model to enhance factual accuracy and enable developer integration for advanced AI applications. The new Gemini Deep Research agent retains its ability to generate research reports while introducing expanded functionalities. Developers can now embed Google's SOTA-model research capabilities directly into their own applications. This integration occurs through the newly launched Interactions API, which provides developers with increased control over AI operations as agentic systems become more prevalent in software development. At its core, the agent processes and synthesizes vast amounts of information efficiently. It manages large context dumps within prompts, allowing it to handle complex data sets without losing coherence. Customers already employ the tool for practical applications, including due diligence processes in business and drug-toxicity safety research in pharmaceuticals, demonstrating its utility in real-world scenarios requiring precise information handling. Google intends to incorporate the deep-research agent into several of its existing services to broaden accessibility. These include Google Search for improved query resolution, Google Finance for detailed market analysis, the Gemini App for user interactions, and NotebookLM for note-taking and knowledge organization. Such integrations aim to leverage the agent's strengths across Google's ecosystem. The agent's performance relies heavily on the Gemini 3 Pro model's design as Google's most factual foundation model. This model undergoes training specifically to minimize hallucinations, instances where large language models generate inaccurate information.
Hallucinations pose a significant risk in long-running, deep-reasoning tasks, where agents make numerous autonomous decisions over extended periods such as minutes or hours. A single hallucinated choice in these sequences can compromise the validity of the entire output, making reduced hallucination rates essential for reliable operation. To substantiate its advancements, Google developed a new evaluation benchmark named DeepSearchQA. This benchmark assesses AI agents on complex, multi-step information-seeking tasks that mimic real research challenges. Google has made DeepSearchQA available as an open-source resource, enabling the broader AI community to test and compare agent capabilities using standardized metrics.
[6]
Google Unveils Gemini Deep Research The Same Day As OpenAI's GPT-5.2 Launch, Intensifying AI Face-Off - Alphabet (NASDAQ:GOOG), Alphabet (NASDAQ:GOOGL)
Alphabet's Google (NASDAQ:GOOGL) (NASDAQ:GOOG) has launched an enhanced version of its research agent, Gemini Deep Research, which is designed to revolutionize the way AI is used for research and development. New Gemini Agent Powers AI Research On Thursday, Google unveiled a new and improved version of its research agent, Gemini Deep Research, which is based on the advanced Gemini 3 Pro model. Besides producing research reports, it also allows developers to integrate Google's SOTA-model research capabilities into their own applications. The Gemini Deep Research tool is equipped to synthesize vast amounts of information and manage large context dumps in the prompt. Google states that its customers use this tool for various tasks, from due diligence to drug toxicity safety research. Google will embed its new deep research agent across products like Search, Finance, the Gemini app, and NotebookLM, marking a shift toward AI-driven information retrieval. The tool is powered by Gemini 3 Pro, which Google describes as its most accurate model, designed to reduce hallucinations during complex research tasks. OpenAI, Google Intensify AI Competition The launch of the new Gemini Deep Research tool came the same day as OpenAI introduced its most advanced AI model, GPT-5.2, which it claimed to be the best offering yet for everyday professional use. The company also stated that the update improves spreadsheet creation, presentation building, image understanding, coding, and long-context comprehension. After Anthropic and Google rolled out new models last month, OpenAI reportedly declared a "code red," shifting resources toward upgrading ChatGPT and putting other projects on hold. This move by Google is likely to intensify the competition between the two tech giants in the AI space.
Earlier in December, CNBC commentator Jim Cramer predicted that OpenAI could fall behind due to the recent advancements in AI technology, particularly the introduction of Google's Gemini 3 AI model. Cramer suggested that this could lead to a surge of tens of millions of users to the Gemini 3 platform. Disclaimer: This content was partially produced with the help of AI tools and was reviewed and published by Benzinga editors.
[7]
Google Takes the Battle for AI Supremacy Back to OpenAI with New Gemini Deep Research
The latest effort would likely help Google integrate its AI apps more tightly with their workplace offerings Barely hours after OpenAI came out with its latest frontier model GPT-5.2 to take on Google's Gemini-3, the search giant pushed right back with a "reimagined" version of its research agent - Gemini Deep Research based on their Gemini 3 Pro foundation model. The new agent is not only designed to generate research reports, it also allows developers to embed Google's SOTA model research capabilities into their own apps. This has been made possible through the use of Google's Interactions API that gives developers more control while creating new agentic era solutions. According to a blog post, the Interactions API introduces a native interface specifically designed to handle complex context management when building agentic applications with interleaved messages, thoughts, tool calls and their state. "Alongside our suite of Gemini models, the Interactions API provides access to our first built-in agent: Gemini Deep Research (Preview), a state-of-the-art agent capable of executing long-horizon research tasks and synthesizing findings into comprehensive reports," Google says. The new Gemini Deep Research tool comes equipped with the capability to synthesise massive amounts of data, the ability to handle an equally voluminous context in a prompt, making it perfect for users in tasks ranging from due diligence testing to safety research around drugs. Google also plans to integrate the research agent into its services such as Google Search, Google Finance, the Gemini App and NotebookLM. "Gemini Deep Research is an agent optimized for long-running context gathering and synthesis tasks. The agent's reasoning core uses Gemini 3 Pro, our most factual model yet, and is specifically trained to reduce hallucinations and maximize report quality during complex tasks.
By scaling multi-step reinforcement learning for search, the agent autonomously navigates complex information landscapes with high accuracy," Google says in another blog post. This puts Google once again a step ahead in the integration race as the company wants AI agents to do all the work that humans do while using Google's work suite products. They claim that Deep Research is proving to be the most factual model that is trained to minimise hallucinations while attempting complex tasks. In order to prove its claims, the company says it has created another benchmark that is named DeepSearchQA to test agents on complex, multi-step information-heavy tasks. Google has open-sourced this benchmark, though most of us watching their numbers grow would wonder what another one can do in this already crowded arena. Deep Research iteratively plans its investigation - it formulates queries, reads results, identifies knowledge gaps, and searches again. This release features vastly improved web search, allowing it to navigate deep into sites for specific data, Google says. DeepSearchQA features 900 hand-crafted "causal chain" tasks across 17 fields, where each step depends on prior analysis. Unlike traditional fact-based tests, DeepSearchQA measures comprehensiveness, requiring agents to generate exhaustive answer sets. This assesses both research precision and retrieval recall, Google claims. The company has provided developer documentation and wants users to take a stab at building while also promising that future updates would deliver richer outputs such as native chart generation for visual analytical reports, and better connectivity via Model Context Protocol (MCP) support that facilitates easier access to custom data sources.
[8]
Google launches upgraded Gemini Deep Research agent: Here's what it can do
At the core of the upgraded agent is Gemini 3 Pro, Google's most factual model yet. Google has rolled out a major upgrade to its Gemini Deep Research agent, giving developers access to a far more powerful system for long-form research, analysis and information gathering. Interestingly, Google's announcement arrived on the same day OpenAI launched GPT-5.2. Instead of returning quick answers, the agent plans its work carefully: it creates search queries, reads through results, identifies what it still doesn't know, and searches again. This process helps it gather deeper, more accurate information from across the web. At the core of the upgraded agent is Gemini 3 Pro, Google's most factual model yet. It has been trained specifically to reduce hallucinations and to produce clearer, more reliable reports. According to the tech giant, the upgraded Gemini Deep Research agent achieves "state-of-the-art results on Humanity's Last Exam (HLE) and DeepSearchQA, and is our best on BrowseComp." "Deep Research is now more useful and intelligent than ever, and will soon be available in Google Search, NotebookLM, Google Finance and upgraded in the Gemini App." Developers can use the Deep Research agent to analyse uploaded documents, combine them with web findings, and generate structured reports. It also supports custom formatting, which means you can control the output via prompting. Google says future updates will bring built-in chart generation, better connections to custom data sources through MCP, and availability through Vertex AI for enterprise use. The tech giant is also open-sourcing DeepSearchQA, a benchmark built to test how well research agents handle long, multi-step tasks. The benchmark includes 900 carefully designed tasks across 17 fields.
Unlike simple fact-checking datasets, DeepSearchQA "measures comprehensiveness, requiring agents to generate exhaustive answer sets. This assesses both research precision and retrieval recall," Google explains.
Google released a reimagined version of its Gemini Deep Research AI agent powered by the Gemini 3 Pro model, introducing developer integration through the new Interactions API. The timing looked strategic: Google made the announcement on the same day OpenAI launched GPT-5.2. The updated agent scored 46.4% on Humanity's Last Exam, outperforming OpenAI's GPT-5 Pro at 38.9%.
Google released a significantly upgraded version of its Gemini Deep Research AI agent on Thursday, powered by the Gemini 3 Pro model and designed to handle complex research tasks with enhanced factual accuracy [1]. The new AI research agent retains its core ability to produce research reports while adding a new capability: developers can now embed Google's research capabilities directly into their own applications through the newly launched Interactions API [2]. This API serves as a unified interface for interacting with models and agents, reflecting the latest capabilities like thinking and advanced tool use that extend beyond simple text generation.
The timing of this release carried strategic significance. Google made its announcement on the same day OpenAI unveiled GPT-5.2, codenamed Garlic, in what appears to be a calculated move to maintain visibility amid intense competition in the AI space [1]. The API pricing matches the Gemini 3 Pro model at $2 per million input tokens, while output tokens cost $12 per million for prompts up to 200,000 tokens and $18 per million for longer prompts [3].
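To make the quoted rates concrete, here is a minimal cost estimate in Python. It uses only the figures reported above; the function name and the choice to apply the $2 input rate at every prompt length are assumptions for illustration, since the article quotes a single input rate. Actual billing may differ.

```python
from decimal import Decimal

# Rates as reported in the article (USD per million tokens).
INPUT_USD_PER_MILLION = Decimal("2")
OUTPUT_USD_PER_MILLION_SHORT = Decimal("12")  # prompts up to 200,000 tokens
OUTPUT_USD_PER_MILLION_LONG = Decimal("18")   # prompts above 200,000 tokens
LONG_PROMPT_THRESHOLD = 200_000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request (hypothetical helper, not a Google API).

    Assumption: the input rate stays at $2 per million tokens regardless of
    prompt length, because the article reports only one input rate.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # The output rate depends on the size of the prompt, not of the output.
    if input_tokens <= LONG_PROMPT_THRESHOLD:
        output_rate = OUTPUT_USD_PER_MILLION_SHORT
    else:
        output_rate = OUTPUT_USD_PER_MILLION_LONG
    total = (Decimal(input_tokens) * INPUT_USD_PER_MILLION
             + Decimal(output_tokens) * output_rate)
    return total / Decimal(1_000_000)


if __name__ == "__main__":
    print(estimate_cost_usd(150_000, 20_000))  # short-prompt tier
    print(estimate_cost_usd(300_000, 20_000))  # long-prompt tier
```

Under these assumptions, a 150,000-token prompt with 20,000 output tokens comes to $0.54, and a 300,000-token prompt with the same output comes to $0.96.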
The reimagined Gemini Deep Research demonstrates improved web search capabilities that allow it to navigate deep into websites for specific data [2]. The AI agent operates by iteratively planning its investigation: it formulates queries, reads results, identifies knowledge gaps, and searches again to fill them [4]. By scaling multi-step reinforcement learning for search, the agent autonomously navigates complex information landscapes with high accuracy [2].

Optimized for long-running context gathering and information synthesis tasks, the agent can process and synthesize vast amounts of information efficiently while managing large context dumps within prompts [5]. Customers already employ the tool for practical applications ranging from due diligence processes in business to drug toxicity safety research in pharmaceuticals [1]. The Gemini 3 Pro model's visual reasoning features enable users to upload documents and have the agent scan them to find specific information, condense them into reports, or enrich them with data from the web [4].
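The iterative loop described above (formulate queries, read results, identify knowledge gaps, search again) can be sketched in a few lines of Python. This is a toy illustration only, not Google's implementation: `TOY_CORPUS`, `SearchResult`, `search`, and `run_research` are hypothetical names, and an in-memory dictionary stands in for the web.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SearchResult:
    finding: str                                          # what the source says
    follow_ups: List[str] = field(default_factory=list)  # gaps it reveals


# Hypothetical stand-in for the web: maps a query to what a search would return.
TOY_CORPUS: Dict[str, SearchResult] = {
    "What is DeepSearchQA?": SearchResult(
        "A benchmark of 900 multi-step research tasks.",
        ["How many fields does DeepSearchQA cover?"],
    ),
    "How many fields does DeepSearchQA cover?": SearchResult("17 fields."),
}


def search(query: str) -> Optional[SearchResult]:
    """Look the query up in the toy corpus; None means nothing was found."""
    return TOY_CORPUS.get(query)


def run_research(topic: str, max_rounds: int = 5) -> Dict[str, str]:
    """Repeat search rounds until no open questions remain or rounds run out."""
    findings: Dict[str, str] = {}
    asked = {topic}
    gaps = [topic]                      # open questions for the current round
    for _ in range(max_rounds):
        if not gaps:
            break
        next_gaps: List[str] = []
        for query in gaps:              # "formulate queries" (used verbatim here)
            result = search(query)      # "read results"
            if result is None:
                continue
            findings[query] = result.finding
            for question in result.follow_ups:   # "identify knowledge gaps"
                if question not in asked:
                    asked.add(question)
                    next_gaps.append(question)
        gaps = next_gaps                # "search again" in the next round
    return findings


if __name__ == "__main__":
    for question, answer in run_research("What is DeepSearchQA?").items():
        print(f"{question} -> {answer}")
```

The `max_rounds` cap is a design choice of this sketch: a loop that generates its own follow-up questions needs a stopping rule beyond "no gaps left".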
Google emphasizes that Gemini Deep Research benefits from Gemini 3 Pro's status as its most factual model, specifically trained to minimize hallucinations during complex tasks [1]. AI hallucinations, where the LLM generates inaccurate information, pose a significant risk in long-running, deep reasoning agentic tasks where numerous autonomous decisions occur over minutes or hours [5]. A single hallucinated choice in these sequences can compromise the validity of the entire output, making reduced hallucinations essential for reliable operation [5].

On benchmarks, Google reports state-of-the-art results on Humanity's Last Exam and DeepSearchQA. On Humanity's Last Exam, which tests AI models on expert-level reasoning and problem-solving across academic subjects, Gemini Deep Research scored 46.4%, surpassing both Gemini 3 Pro at 43.2% and OpenAI's GPT-5 Pro at 38.9% [2][3]. Google also introduced DeepSearchQA, a new benchmark featuring 900 hand-crafted tasks across 17 fields designed to test agent comprehensiveness on web research tasks [3]. The agent scored 66.1% on DeepSearchQA compared to GPT-5 Pro's 65.2%, while on BrowseComp, which evaluates locating hard-to-find facts, it scored 59.2%, just below GPT-5 Pro's 59.5% [3].
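According to the Google statements quoted in the sources, DeepSearchQA asks agents for exhaustive answer sets and assesses both precision and recall. The sources do not describe the scoring procedure, so the sketch below shows only the standard set-based definitions of the two terms; the function name and the example answers are hypothetical.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[str], expected: Set[str]) -> Tuple[float, float]:
    """Standard set-based precision and recall (not Google's published scorer).

    precision = correct answers returned / all answers returned
    recall    = correct answers returned / all answers expected
    """
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


if __name__ == "__main__":
    expected = {"A", "B", "C", "D"}   # the exhaustive answer set for a task
    predicted = {"A", "B", "E"}       # what a hypothetical agent returned
    p, r = precision_recall(predicted, expected)
    print(f"precision={p:.2f} recall={r:.2f}")
```

In this example the agent returns two of the four expected answers plus one wrong one, so precision is 2/3 and recall is 1/2: an agent that answers correctly but incompletely is penalized on recall.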
Google plans to integrate the deep research agent into several existing services to broaden accessibility, including Google Search for improved query resolution, Google Finance for detailed market analysis, the Gemini App for user interactions, and NotebookLM for note-taking and knowledge organization [2]. This represents another step toward preparing for a world where AI agents conduct searches on behalf of humans [1].

The agent is now available for developers through Google AI Studio, with the consumer app integrations coming soon [2]. Google will expand built-in agents and introduce the ability to build and bring custom agents, enabling developers to connect Gemini models, Google's built-in agents, and their custom agents using one API [2]. The Interactions API also automates some of the work involved in managing data that users upload to an AI application for processing, and includes an MCP tool for connecting AI models to third-party systems [4].

Summarized by Navi