Curated by THEOUTPOST
On Fri, 20 Dec, 12:06 AM UTC
3 Sources
[1]
Need a research hypothesis? Ask AI
Crafting a unique and promising research hypothesis is a fundamental skill for any scientist. It can also be time-consuming: New PhD candidates might spend the first year of their program trying to decide exactly what to explore in their experiments. What if artificial intelligence could help?

MIT researchers have created a way to autonomously generate and evaluate promising research hypotheses across fields, through human-AI collaboration. In a new paper, they describe how they used this framework to create evidence-driven hypotheses that align with unmet research needs in the field of biologically inspired materials.

Published today in Advanced Materials, the study was co-authored by Alireza Ghafarollahi, a postdoc in the Laboratory for Atomistic and Molecular Mechanics (LAMM), and Markus Buehler, the Jerry McAfee Professor in Engineering in MIT's departments of Civil and Environmental Engineering and of Mechanical Engineering and director of LAMM.

The framework, which the researchers call SciAgents, consists of multiple AI agents, each with specific capabilities and access to data, that leverage "graph reasoning" methods, where AI models utilize a knowledge graph that organizes and defines relationships between diverse scientific concepts.

The multi-agent approach mimics the way biological systems organize themselves as groups of elementary building blocks. Buehler notes that this "divide and conquer" principle is a prominent paradigm in biology at many levels, from materials to swarms of insects to civilizations -- all examples where the total intelligence is much greater than the sum of individuals' abilities.

"By using multiple AI agents, we're trying to simulate the process by which communities of scientists make discoveries," says Buehler. "At MIT, we do that by having a bunch of people with different backgrounds working together and bumping into each other at coffee shops or in MIT's Infinite Corridor. But that's very coincidental and slow. Our quest is to simulate the process of discovery by exploring whether AI systems can be creative and make discoveries."

Automating good ideas

As recent developments have demonstrated, large language models (LLMs) have shown an impressive ability to answer questions, summarize information, and execute simple tasks. But they are quite limited when it comes to generating new ideas from scratch. The MIT researchers wanted to design a system that enabled AI models to perform a more sophisticated, multistep process that goes beyond recalling information learned during training, to extrapolate and create new knowledge.

The foundation of their approach is an ontological knowledge graph, which organizes and makes connections between diverse scientific concepts. To make the graphs, the researchers feed a set of scientific papers into a generative AI model. In previous work, Buehler used a field of math known as category theory to help the AI model develop abstractions of scientific concepts as graphs, rooted in defining relationships between components, in a way that could be analyzed by other models through a process called graph reasoning. This focuses AI models on developing a more principled way to understand concepts; it also allows them to generalize better across domains.

"This is really important for us to create science-focused AI models, as scientific theories are typically rooted in generalizable principles rather than just knowledge recall," Buehler says. "By focusing AI models on 'thinking' in such a manner, we can leapfrog beyond conventional methods and explore more creative uses of AI."
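The paper builds these graphs with generative models rather than hand-written rules; as a rough illustration of the idea only, here is a minimal sketch of assembling an ontological knowledge graph with the networkx library, where extract_triples() is a hypothetical stand-in for the LLM step that would mine (concept, relation, concept) triples from each paper. This is not the SciAgents implementation.

```python
# Minimal sketch of ontological knowledge-graph construction (illustrative,
# not the SciAgents code).
import networkx as nx

def extract_triples(text: str) -> list[tuple[str, str, str]]:
    # Stand-in stub for the generative-AI extraction step, which would mine
    # (concept, relation, concept) triples from a paper's full text.
    return [("silk", "exhibits", "high tensile strength"),
            ("silk processing", "is", "energy intensive")]

def build_knowledge_graph(papers: list[str]) -> nx.DiGraph:
    """Nodes are scientific concepts; labeled edges define their relationships."""
    graph = nx.DiGraph()
    for text in papers:
        for subj, rel, obj in extract_triples(text):
            graph.add_edge(subj, obj, relation=rel)
    return graph

kg = build_knowledge_graph(["paper 1 full text", "paper 2 full text"])
print(f"{kg.number_of_nodes()} concepts, {kg.number_of_edges()} relations")
```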
"By focusing AI models on 'thinking' in such a manner, we can leapfrog beyond conventional methods and explore more creative uses of AI." For the most recent paper, the researchers used about 1,000 scientific studies on biological materials, but Buehler says the knowledge graphs could be generated using far more or fewer research papers from any field. With the graph established, the researchers developed an AI system for scientific discovery, with multiple models specialized to play specific roles in the system. Most of the components were built off of OpenAI's ChatGPT-4 series models and made use of a technique known as in-context learning, in which prompts provide contextual information about the model's role in the system while allowing it to learn from data provided. The individual agents in the framework interact with each other to collectively solve a complex problem that none of them would be able to do alone. The first task they are given is to generate the research hypothesis. The LLM interactions start after a subgraph has been defined from the knowledge graph, which can happen randomly or by manually entering a pair of keywords discussed in the papers. In the framework, a language model the researchers named the "Ontologist" is tasked with defining scientific terms in the papers and examining the connections between them, fleshing out the knowledge graph. A model named "Scientist 1" then crafts a research proposal based on factors like its ability to uncover unexpected properties and novelty. The proposal includes a discussion of potential findings, the impact of the research, and a guess at the underlying mechanisms of action. A "Scientist 2" model expands on the idea, suggesting specific experimental and simulation approaches and making other improvements. Finally, a "Critic" model highlights its strengths and weaknesses and suggests further improvements. "It's about building a team of experts that are not all thinking the same way," Buehler says. "They have to think differently and have different capabilities. The Critic agent is deliberately programmed to critique the others, so you don't have everybody agreeing and saying it's a great idea. You have an agent saying, 'There's a weakness here, can you explain it better?' That makes the output much different from single models." Other agents in the system are able to search existing literature, which provides the system with a way to not only assess feasibility but also create and assess the novelty of each idea. Making the system stronger To validate their approach, Buehler and Ghafarollahi built a knowledge graph based on the words "silk" and "energy intensive." Using the framework, the "Scientist 1" model proposed integrating silk with dandelion-based pigments to create biomaterials with enhanced optical and mechanical properties. The model predicted the material would be significantly stronger than traditional silk materials and require less energy to process. Scientist 2 then made suggestions, such as using specific molecular dynamic simulation tools to explore how the proposed materials would interact, adding that a good application for the material would be a bioinspired adhesive. The Critic model then highlighted several strengths of the proposed material and areas for improvement, such as its scalability, long-term stability, and the environmental impacts of solvent use. 
The researchers also conducted other experiments with randomly chosen keywords, which produced various original hypotheses about more efficient biomimetic microfluidic chips, enhancing the mechanical properties of collagen-based scaffolds, and the interaction between graphene and amyloid fibrils to create bioelectronic devices.

"The system was able to come up with these new, rigorous ideas based on the path from the knowledge graph," Ghafarollahi says. "In terms of novelty and applicability, the materials seemed robust and novel. In future work, we're going to generate thousands, or tens of thousands, of new research ideas, and then we can categorize them, try to understand better how these materials are generated and how they could be improved further."

Going forward, the researchers hope to incorporate new tools for retrieving information and running simulations into their frameworks. They can also easily swap out the foundation models in their frameworks for more advanced models, allowing the system to adapt to the latest innovations in AI, as sketched below.

"Because of the way these agents interact, an improvement in one model, even if it's slight, has a huge impact on the overall behaviors and output of the system," Buehler says.

Since releasing a preprint with open-source details of their approach, the researchers have been contacted by hundreds of people interested in using the frameworks in diverse scientific fields and even areas like finance and cybersecurity.

"There's a lot of stuff you can do without having to go to the lab," Buehler says. "You basically want to go to the lab at the very end of the process. The lab is expensive and takes a long time, so you want a system that can drill very deep into the best ideas, formulating the best hypotheses and accurately predicting emergent behaviors. Our vision is to make this easy to use, so you can use an app to bring in other ideas or drag in datasets to really challenge the model to make new discoveries."
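The model-swapping Buehler describes might look something like the following configuration sketch; the per-agent registry and model names here are illustrative placeholders, not the paper's actual setup.

```python
# Hypothetical per-agent model registry (placeholder names, not the paper's
# actual configuration). Swapping one agent's foundation model leaves the
# rest of the pipeline untouched.
AGENT_MODELS = {
    "Ontologist": "gpt-4",
    "Scientist 1": "gpt-4",
    "Scientist 2": "gpt-4",
    "Critic": "gpt-4",
}

def upgrade_agent(agent: str, model: str) -> None:
    """Point a single agent at a newer foundation model."""
    AGENT_MODELS[agent] = model

# Even a slight upgrade to one agent can shift the whole system's output.
upgrade_agent("Critic", "gpt-4-turbo")
```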
[2]
Need a research hypothesis? Ask AI.
Caption: A language model the researchers named the "Ontologist" is tasked with defining scientific terms in the papers and examining the connections between them, fleshing out the knowledge graph.
[3]
AI agents mimic scientific collaboration to generate evidence-driven hypotheses
MIT scientists have created an AI system called SciAgents that can autonomously generate and evaluate research hypotheses across various fields, potentially revolutionizing the scientific discovery process.
Researchers at the Massachusetts Institute of Technology (MIT) have created an innovative artificial intelligence (AI) framework called SciAgents, designed to autonomously generate and evaluate promising research hypotheses across various scientific fields. This groundbreaking development, published in Advanced Materials, could potentially revolutionize the scientific discovery process [1].
SciAgents consists of multiple AI agents, each with specific capabilities and access to data. The system leverages "graph reasoning" methods, utilizing a knowledge graph that organizes and defines relationships between diverse scientific concepts. This multi-agent approach mimics the way biological systems organize themselves, following a "divide and conquer" principle observed in nature [2].
Ontological Knowledge Graph: The foundation of the approach, created by feeding scientific papers into a generative AI model.
Specialized AI Agents: The "Ontologist," "Scientist 1," "Scientist 2," and "Critic" models, each playing a distinct role, alongside agents that search the existing literature.
The system employs OpenAI's ChatGPT-4 series models and uses in-context learning, where prompts provide contextual information about each model's role. The researchers utilized category theory to develop abstractions of scientific concepts as graphs, enabling better generalization across domains [3].
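For readers unfamiliar with in-context learning, a role-conditioned call might look like the following, using the openai Python client (v1+); the model name and prompt wording are assumptions for illustration, not the study's exact settings.

```python
# Illustrative role-conditioning via in-context learning with the openai
# Python client. The prompts and model used in SciAgents may differ.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

reply = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": ("You are the Critic agent in a team of scientific AI agents. "
                     "Identify weaknesses in proposals and suggest improvements; "
                     "do not simply agree.")},
        {"role": "user",
         "content": ("Critique this proposal: silk combined with dandelion-based "
                     "pigments for stronger, lower-energy biomaterials.")},
    ],
)
print(reply.choices[0].message.content)
```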
This AI-driven approach could significantly accelerate the research hypothesis generation process, which traditionally can take months or even years for new researchers. By simulating the collaborative nature of scientific communities, SciAgents aims to explore whether AI systems can be creative and make discoveries in a more efficient manner.
While the current study focused on biologically inspired materials using about 1,000 scientific papers, the researchers suggest that this method could be applied to various scientific fields. This versatility opens up possibilities for accelerating discoveries across multiple disciplines, potentially leading to breakthroughs in areas such as medicine, materials science, and beyond.
As with any AI-driven system in scientific research, there may be concerns about the reliability and originality of the generated hypotheses. Future work may need to address issues of bias in the training data and the integration of human oversight in the hypothesis generation process.
Kevin Yager from Brookhaven National Laboratory proposes the 'exocortex', an AI-powered extension of the human brain to enhance scientific research and creativity.
2 Sources
AI's ability to generate 'hallucinations' is being harnessed by scientists to accelerate research and innovation across various fields, from medicine to chemistry, challenging the negative perception of AI-generated content.
2 Sources
MIT professor Markus J. Buehler has created an advanced AI method that uses graph-based representation and category theory to find unexpected connections between diverse fields, potentially accelerating scientific discovery and innovation.
2 Sources
Sakana AI, a Tokyo-based startup, has developed an AI scientist capable of automating scientific research and discovery. This breakthrough could potentially accelerate scientific progress and lead to groundbreaking discoveries in various fields.
3 Sources
A significant portion of research papers may already be co-authored by AI, raising questions about authorship, ethics, and the future of scientific publishing.
2 Sources
The Outpost is a comprehensive collection of curated artificial intelligence software tools that cater to the needs of small business owners, bloggers, artists, musicians, entrepreneurs, marketers, writers, and researchers.
© 2025 TheOutpost.AI All rights reserved