Teams of AI agents accelerate drug discovery, identifying candidates in hours instead of months


Two new AI systems published in Nature use teams of AI agents to develop hypotheses and analyze data for scientific research. Google's Co-Scientist and FutureHouse's Robin identified promising drug candidates for acute myeloid leukemia and macular degeneration in hours, completing tasks that typically take human researchers months. While human supervision remains essential, these AI-based science assistants demonstrate how teams of AI agents can accelerate scientific research by synthesizing vast amounts of literature.

AI Agents Transform Scientific Research Timelines

Two groundbreaking systems described in Nature are reshaping how scientific research unfolds, using teams of AI agents to develop hypotheses, propose experiments, and analyze data at unprecedented speeds. Google's Co-Scientist and FutureHouse's Robin both demonstrated the ability to identify promising drug candidates in mere hours, a task that would typically consume months of human effort [1]. These AI-based science assistants represent a shift in laboratory workflows, though human supervision remains central to their operation.

Vivek Natarajan, a researcher at Google DeepMind who helped develop Co-Scientist, describes the system as "an agentic, in silico implementation of the thought process in a scientist's head." The goal, he explains, is to "give scientists superpowers" [1]. Both systems tackle a pressing challenge: the explosion of scientific literature has made it nearly impossible for researchers to stay current even within their own fields, let alone identify relevant connections across disciplines.

Source: Nature


Google's Co-Scientist Tackles Drug Retargeting

Built on Google's Gemini model, Co-Scientist operates as what researchers call "scientist in the loop," keeping human researchers engaged at critical decision points [2]. The system interprets research goals provided by scientists and launches literature searches to generate hypotheses. These hypotheses then compete in a "tournament" evaluated by a Reflection agent, while an Evolution agent refines surviving ideas through iterative cycles.
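
The tournament-and-refinement cycle described above can be illustrated with a toy sketch. This is not Co-Scientist's code: the names `Hypothesis`, `run_tournament`, and `refine`, the round-robin pairing, and the win-count ranking are all assumptions made for illustration, and the `judge` and `evolve` callbacks merely stand in for the Reflection and Evolution agents.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, List


@dataclass
class Hypothesis:
    text: str
    wins: int = 0


# Hypothetical callback types; in a real system these would be LLM-backed agents.
Judge = Callable[[Hypothesis, Hypothesis], Hypothesis]  # returns the preferred one
Evolver = Callable[[Hypothesis], Hypothesis]            # returns a refined variant


def run_tournament(pool: List[Hypothesis], judge: Judge, keep: int) -> List[Hypothesis]:
    """Round-robin tournament: every pair is compared once; the top `keep` survive."""
    for h in pool:
        h.wins = 0
    for a, b in combinations(pool, 2):
        judge(a, b).wins += 1
    # sorted() is stable, so tied hypotheses keep their original pool order.
    return sorted(pool, key=lambda h: h.wins, reverse=True)[:keep]


def refine(pool: List[Hypothesis], judge: Judge, evolve: Evolver,
           keep: int = 2, cycles: int = 3) -> List[Hypothesis]:
    """Alternate tournament selection with refinement of the survivors."""
    pool = list(pool)
    for _ in range(cycles):
        survivors = run_tournament(pool, judge, keep)
        # Survivors stay in the pool and compete against their refined variants.
        pool = survivors + [evolve(h) for h in survivors]
    return run_tournament(pool, judge, keep)
```

In this sketch a human reviewer would sit outside `refine`, supplying the initial research goal and choosing which of the returned hypotheses to take into the lab.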

In drug discovery experiments targeting acute myeloid leukemia, Co-Scientist identified a list of drug candidates from which human researchers selected five for further study. Three showed promise in preliminary tests on lab-grown cells [1]. The system evaluates suggestions based on plausibility, novelty, testability, and safety throughout the process. Access to scientific literature proved crucial: it "prevented the hallucination of seemingly novel but implausible hypotheses," according to the research team [2].

In another experiment, Co-Scientist developed a hypothesis explaining why certain antimicrobial resistance genes appear across multiple bacterial species. The system took days to reach the same conclusion that a research group had spent considerably longer working toward, in results the group had not yet published [1]. About 100 scientists outside Google DeepMind now have access to test its capabilities across various research settings.

FutureHouse's Robin Advances Autonomous Analysis

Developed by FutureHouse, a non-profit AI research lab in San Francisco, Robin takes the agentic approach further by incorporating specialized analysis capabilities. The system was instructed to find treatments for dry age-related macular degeneration, beginning with AI agents trained to conduct literature reviews [1]. Robin used these reports to select lab experiments testing various drug candidates, with humans conducting the physical experiments and feeding data back to the system.

An AI agent specialized in analyzing data then processed the experimental results. Through this procedure, Robin identified ripasudil, a drug approved for treating glaucoma, as a candidate treatment for macular degeneration. The system suggested assays to confirm ripasudil's activity and proposed follow-up experiments [1]. FutureHouse researchers emphasize that Robin targets "low-hanging fruit" that human experts might overlook due to knowledge compartmentalization, focusing on "combinatorial synthesis" to identify non-obvious connections between disparate fields [2].
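
The hand-offs in Robin's workflow (literature review, human-run experiments, then automated data analysis) can be sketched as a single round of a loop. This is a minimal illustration, not FutureHouse's implementation: `discovery_round`, `RoundReport`, and the three callbacks are hypothetical names, and the single numeric readout per candidate is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RoundReport:
    goal: str
    tested: Dict[str, float]  # candidate name -> assay readout reported by humans
    top_candidate: str


def discovery_round(
    goal: str,
    review_literature: Callable[[str], List[str]],  # stand-in for the literature agents
    run_in_lab: Callable[[str], float],             # stand-in for human-run experiments
    analyze: Callable[[Dict[str, float]], str],     # stand-in for the data-analysis agent
) -> RoundReport:
    # 1. Literature-review agents propose candidates worth testing.
    candidates = review_literature(goal)
    # 2. Humans perform the physical experiments and feed the data back.
    tested = {name: run_in_lab(name) for name in candidates}
    # 3. The data-analysis agent interprets the results and names a lead.
    return RoundReport(goal, tested, analyze(tested))
```

Keeping `run_in_lab` as a callback makes the human step explicit: the software proposes and interprets, but the measurement itself comes from outside the system.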

The Reality Check: Limitations and Human Oversight

While these systems accelerate scientific research dramatically, significant caveats remain. None of the drug candidates identified have been fully evaluated, and many compounds that pass initial assays in lab-grown cells fail more stringent testing [1]. Both systems rely on large language models prone to AI hallucinations, false but plausible-sounding answers that could lead researchers down costly dead ends.

Ola Spjuth, who studies AI for drug discovery at Uppsala University, notes that hallucinations will likely remain a concern with this form of AI. However, cutting-edge models hallucinate less than predecessors, and researchers can audit decision-making processes to understand the reasoning behind suggestions [1]. Both Robin and Co-Scientist include steps where AI agents debate hypotheses or compare results among themselves, potentially filtering out faulty reasoning.

"We cannot just delegate important decisions right now to LLMs and AI agents," Spjuth emphasizes. "We need to supervise these methods" [1]. Karandeep Singh, who oversees AI initiatives for University of California San Diego Health, adds that real-world performance across diverse contexts remains to be seen: "You don't know how it works in reality until it's been made available to a broad set of people" [1].

What This Means for Research's Future

The question isn't whether AI can perform certain tasks better than humans, but whether humans would realistically conduct these exhaustive literature searches at all. By chewing through massive amounts of information in the background, these systems augment scientists' capabilities rather than replace them. The role of human researchers is shifting in other ways too: companies are advancing sophisticated robots for lab work, while Google researchers reported another agentic AI system called Empirical Research Assistance that writes high-quality software for fields from cosmology to neuroscience [1].

Source: Ars Technica


Samuel Rodriques, chief executive and co-founder of FutureHouse, suggests that AI's ability to handle hypothesis generation and data interpretation may vary by research type. For drug discovery specifically, "there's a huge way to go" before AI can design entirely new therapeutic applications [1]. Google's system is model-agnostic, allowing it to switch to better-performing models as AI evolves, though it "inherits the intrinsic limitations of its underlying models, including imperfect factuality and the potential for hallucinations" [2].

As these AI-based science assistants move from proof-of-concept to broader deployment, researchers will be watching closely to see whether the speed gains translate across different scientific domains and whether human oversight can effectively catch AI errors before they derail expensive research programs.

© 2026 TheOutpost.AI All rights reserved