7 Sources
[1]
Nvidia-backed AI startup SandboxAQ creates new data to speed up drug discovery
SAN FRANCISCO, June 18 (Reuters) - SandboxAQ, an artificial intelligence startup spun out of Alphabet's Google and backed by Nvidia (NVDA.O), on Wednesday released a trove of data it hopes will speed up the discovery of new medical treatments by helping scientists understand how drugs stick to proteins. The goal is to help scientists predict whether a drug will bind to its target in the human body. But while the data is backed up by real-world scientific experiments, it did not come from a lab. Instead, SandboxAQ, which has raised nearly $1 billion in venture capital, generated the data using Nvidia's chips and will feed it back into AI models that it hopes scientists can use to rapidly predict whether a small-molecule pharmaceutical will bind to the protein that researchers are targeting, a key question that must be answered before a drug candidate can move forward. For example, if a drug is meant to inhibit a biological process like the progression of a disease, scientists can use the tool to predict whether the drug molecule is likely to bind to the proteins involved in that process. The approach is an emerging field that combines traditional scientific computing techniques with advancements in AI. In many fields, scientists have long had equations that can precisely predict how atoms combine into molecules. But even for relatively small three-dimensional pharmaceutical molecules, the potential combinations become far too vast to calculate manually, even with today's fastest computers. So SandboxAQ's approach was to use existing experimental data to calculate about 5.2 million new, "synthetic" three-dimensional molecules - molecules that haven't been observed in the real world, but were calculated with equations based on real-world data.
That synthetic data, which SandboxAQ is releasing publicly, can be used to train AI models that can predict whether a new drug molecule is likely to stick to the protein researchers are targeting in a fraction of the time it would take to calculate it manually, while retaining accuracy. SandboxAQ will charge money for its own AI models developed with the data, which it hopes will get results that rival running lab experiments, but virtually. "This is a long-standing problem in biology that we've all, as an industry, been trying to solve for," Nadia Harhen, general manager of AI simulation at SandboxAQ, told Reuters on Tuesday. "All of these computationally generated structures are tagged to a ground-truth experimental data, and so when you pick this data set and you train models, you can actually use the synthetic data in a way that's never been done before." Reporting by Stephen Nellis; Editing by Leslie Adler
[2]
5 million synthetic drug models could revolutionize pharma pipelines
"This is a long-standing problem in biology that we've all, as an industry, been trying to solve for," Nadia Harhen, general manager of AI simulation at SandboxAQ, told Reuters. "All of these computationally generated structures are tagged to a ground-truth experimental data, and so when you pick this data set and you train models, you can actually use the synthetic data in a way that's never been done before." To bypass the data bottleneck, SandboxAQ used NVIDIA chips to generate synthetic structures. These are not observed in labs but calculated from real experimental data using the Boltz-1x co-folding model. For each protein-drug pair from public datasets like ChEMBL and BindingDB, the team created five different 3D poses. They then cross-referenced these predictions with computational potency values to retain only the most accurate ones. The final SAIR dataset includes those high-confidence entries. Boosting AI model training with open data AI models like AlphaFold2 and newer systems such as AlphaFold3 and Boltz-2 have made major progress in predicting 3D structures and binding poses. But they still struggle when dealing with unfamiliar proteins or molecules outside their training data.
[3]
Nvidia-backed AI startup SandboxAQ creates new data to speed up drug discovery
SandboxAQ, an AI startup originating from Alphabet's Google and backed by Nvidia, has released a substantial dataset to accelerate drug discovery. Generated using Nvidia's chips, the 5.2 million synthetic three-dimensional molecules aim to predict drug-protein binding. This data, validated by real-world experiments, will train AI models, potentially revolutionising pharmaceutical research by virtually replicating lab results and saving time.
SandboxAQ, an artificial intelligence startup spun out of Alphabet's Google and backed by Nvidia, on Wednesday released a trove of data it hopes will speed up the discovery of new medical treatments by helping scientists understand how drugs stick to proteins. The goal is to help scientists predict whether a drug will bind to its target in the human body. But while the data is backed up by real-world scientific experiments, it did not come from a lab. Instead, SandboxAQ, which has raised nearly $1 billion in venture capital, generated the data using Nvidia's chips and will feed it back into AI models that it hopes scientists can use to rapidly predict whether a small-molecule pharmaceutical will bind to the protein that researchers are targeting, a key question that must be answered before a drug candidate can move forward. For example, if a drug is meant to inhibit a biological process like the progression of a disease, scientists can use the tool to predict whether the drug molecule is likely to bind to the proteins involved in that process. The approach is an emerging field that combines traditional scientific computing techniques with advancements in AI. In many fields, scientists have long had equations that can precisely predict how atoms combine into molecules. But even for relatively small three-dimensional pharmaceutical molecules, the potential combinations become far too vast to calculate manually, even with today's fastest computers.
So SandboxAQ's approach was to use existing experimental data to calculate about 5.2 million new, "synthetic" three-dimensional molecules - molecules that haven't been observed in the real world, but were calculated with equations based on real-world data. That synthetic data, which SandboxAQ is releasing publicly, can be used to train AI models that can predict whether a new drug molecule is likely to stick to the protein researchers are targeting in a fraction of the time it would take to calculate it manually, while retaining accuracy. SandboxAQ will charge money for its own AI models developed with the data, which it hopes will get results that rival running lab experiments, but virtually. "This is a long-standing problem in biology that we've all, as an industry, been trying to solve for," Nadia Harhen, general manager of AI simulation at SandboxAQ, told Reuters on Tuesday. "All of these computationally generated structures are tagged to a ground-truth experimental data, and so when you pick this data set and you train models, you can actually use the synthetic data in a way that's never been done before."
[4]
SandboxAQ Launches Dataset for Training AI Models in Drug Discovery | PYMNTS.com
The SAIR (Structurally Augmented IC50 Repository) is a detailed dataset of protein-ligand pairs with annotated experimental potency data designed to enhance the speed and accuracy of binding affinity predictions, the company said in a Wednesday (June 18) press release. SAIR includes 5.2 million synthetic 3D molecular structures across 1 million protein-ligand systems, according to the release. The SAIR dataset was generated with the use of SandboxAQ's AI large quantitative model (LQM) capabilities and Nvidia's development platform for AI training and fine-tuning, DGX Cloud, the release said. With this dataset, resources can train AI models to accurately predict protein-ligand binding affinities at least 1,000 times faster than traditional physics-based methods, per the release. "This achievement marks a pivotal moment in drug discovery, demonstrating our capacity to fundamentally transform the traditional trial-and-error process into a rapid, data-driven approach," Nadia Harhen, general manager of AI simulation at SandboxAQ, said in the release. "By putting five-plus million, affinity-labeled protein-ligand structures into the public domain, we're handing every scientist the raw fuel to train breakthrough models overnight, setting a new pace for drug discovery." SandboxAQ said in April that it raised over $450 million in a Series E round to support its development of large quantitative models that help enterprises leverage AI to solve scientific and quantitative challenges. The merger of AI with quantum computing could have significant implications for many verticals, Chris Hume, senior director of business operations for SandboxAQ, told PYMNTS in an interview posted in February 2024. "The physical world is defined by quantum mechanics," Hume said.
"The more effectively we can understand those interactions and then model those interactions, the more efficiently and effectively you can build predictive models." PYMNTS reported in October that the wave of AI breakthroughs in the medical field was reflected in financial markets, where HealthTech stocks rose 12% in 2024 and AI healthcare companies commanded valuations up to five times higher than their non-AI counterparts.
[5]
Nvidia-backed AI startup SandboxAQ creates new data to speed up drug discovery
SAN FRANCISCO -- SandboxAQ, an artificial intelligence startup spun out of Alphabet's Google and backed by Nvidia, on Wednesday released a trove of data it hopes will speed up the discovery of new medical treatments by helping scientists understand how drugs stick to proteins. The goal is to help scientists predict whether a drug will bind to its target in the human body. But while the data is backed up by real-world scientific experiments, it did not come from a lab. Instead, SandboxAQ, which has raised nearly US$1 billion in venture capital, generated the data using Nvidia's chips and will feed it back into AI models that it hopes scientists can use to rapidly predict whether a small-molecule pharmaceutical will bind to the protein that researchers are targeting, a key question that must be answered before a drug candidate can move forward. For example, if a drug is meant to inhibit a biological process like the progression of a disease, scientists can use the tool to predict whether the drug molecule is likely to bind to the proteins involved in that process. The approach is an emerging field that combines traditional scientific computing techniques with advancements in AI. In many fields, scientists have long had equations that can precisely predict how atoms combine into molecules. But even for relatively small three-dimensional pharmaceutical molecules, the potential combinations become far too vast to calculate manually, even with today's fastest computers. So SandboxAQ's approach was to use existing experimental data to calculate about 5.2 million new, "synthetic" three-dimensional molecules - molecules that haven't been observed in the real world, but were calculated with equations based on real-world data. 
That synthetic data, which SandboxAQ is releasing publicly, can be used to train AI models that can predict whether a new drug molecule is likely to stick to the protein researchers are targeting in a fraction of the time it would take to calculate it manually, while retaining accuracy. SandboxAQ will charge money for its own AI models developed with the data, which it hopes will get results that rival running lab experiments, but virtually. "This is a long-standing problem in biology that we've all, as an industry, been trying to solve for," Nadia Harhen, general manager of AI simulation at SandboxAQ, told Reuters on Tuesday. "All of these computationally generated structures are tagged to a ground-truth experimental data, and so when you pick this data set and you train models, you can actually use the synthetic data in a way that's never been done before."
[6]
SandboxAQ releases AI data to speed up drug discovery By Investing.com
Investing.com -- Artificial intelligence startup SandboxAQ released a large dataset on Wednesday aimed at accelerating drug discovery by helping scientists predict how drugs bind to proteins in the human body. The company, which spun out of Alphabet (NASDAQ:GOOGL)'s Google and has raised nearly $1 billion in venture capital, generated the data using Nvidia (NASDAQ:NVDA)'s chips rather than traditional laboratory experiments. This synthetic data will be incorporated into AI models designed to quickly determine if pharmaceutical molecules will bind to target proteins - a critical step in drug development. SandboxAQ's approach combines traditional scientific computing with AI advancements to address a fundamental challenge in drug discovery. While scientists have equations to predict atomic combinations, the calculations become too complex for even modern computers when dealing with three-dimensional pharmaceutical molecules. The company calculated approximately 5.2 million new "synthetic" three-dimensional molecules based on existing experimental data. These calculated molecules haven't been observed in real-world settings but were derived using equations grounded in real-world data. This publicly released dataset can train AI models to predict drug-protein binding much faster than manual calculations while maintaining accuracy. SandboxAQ plans to monetize its own AI models developed with this data, aiming to achieve results comparable to laboratory experiments but in virtual environments. "This is a long-standing problem in biology that we've all, as an industry, been trying to solve for," said Nadia Harhen, general manager of AI simulation at SandboxAQ, on Tuesday. "All of these computationally generated structures are tagged to a ground-truth experimental data, and so when you pick this data set and you train models, you can actually use the synthetic data in a way that's never been done before."
[7]
Nvidia-backed AI startup SandboxAQ creates new data to speed up drug discovery
SAN FRANCISCO (Reuters) - SandboxAQ, an artificial intelligence startup spun out of Alphabet's Google and backed by Nvidia, on Wednesday released a trove of data it hopes will speed up the discovery of new medical treatments by helping scientists understand how drugs stick to proteins. The goal is to help scientists predict whether a drug will bind to its target in the human body. But while the data is backed up by real-world scientific experiments, it did not come from a lab. Instead, SandboxAQ, which has raised nearly $1 billion in venture capital, generated the data using Nvidia's chips and will feed it back into AI models that it hopes scientists can use to rapidly predict whether a small-molecule pharmaceutical will bind to the protein that researchers are targeting, a key question that must be answered before a drug candidate can move forward. For example, if a drug is meant to inhibit a biological process like the progression of a disease, scientists can use the tool to predict whether the drug molecule is likely to bind to the proteins involved in that process. The approach is an emerging field that combines traditional scientific computing techniques with advancements in AI. In many fields, scientists have long had equations that can precisely predict how atoms combine into molecules. But even for relatively small three-dimensional pharmaceutical molecules, the potential combinations become far too vast to calculate manually, even with today's fastest computers. So SandboxAQ's approach was to use existing experimental data to calculate about 5.2 million new, "synthetic" three-dimensional molecules - molecules that haven't been observed in the real world, but were calculated with equations based on real-world data.
That synthetic data, which SandboxAQ is releasing publicly, can be used to train AI models that can predict whether a new drug molecule is likely to stick to the protein researchers are targeting in a fraction of the time it would take to calculate it manually, while retaining accuracy. SandboxAQ will charge money for its own AI models developed with the data, which it hopes will get results that rival running lab experiments, but virtually. "This is a long-standing problem in biology that we've all, as an industry, been trying to solve for," Nadia Harhen, general manager of AI simulation at SandboxAQ, told Reuters on Tuesday. "All of these computationally generated structures are tagged to a ground-truth experimental data, and so when you pick this data set and you train models, you can actually use the synthetic data in a way that's never been done before." (Reporting by Stephen Nellis; Editing by Leslie Adler)
SandboxAQ, an AI startup backed by Nvidia, has released a massive dataset of 5.2 million synthetic 3D molecules to accelerate drug discovery by predicting drug-protein binding more efficiently.
SandboxAQ, an artificial intelligence startup spun out of Alphabet's Google and backed by Nvidia, has unveiled a groundbreaking dataset aimed at revolutionizing drug discovery [1]. The company has generated 5.2 million synthetic three-dimensional molecules using Nvidia's chips, with the goal of helping scientists predict drug-protein binding more efficiently [2].
The newly released Structurally Augmented IC50 Repository (SAIR) dataset includes 5.2 million synthetic 3D molecular structures across 1 million protein-ligand systems [4]. Although the structures were computed rather than measured in a lab, each one is tied to real experimental data. The synthetic molecules were calculated using existing experimental data and advanced AI techniques [3].
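To make the shape of such a dataset concrete, here is a minimal Python sketch of how several computed poses might be grouped under one protein-ligand system that carries a single experimental potency label. The `Pose` fields, the identifiers, and the "keep the highest-confidence pose" rule are all assumptions made for illustration; they are not SAIR's actual schema or SandboxAQ's filtering procedure, and the toy records are invented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    system_id: str     # hypothetical ID for one protein-ligand pair
    pose_index: int    # hypothetical index of the predicted 3D pose
    confidence: float  # hypothetical model confidence score in [0, 1]
    ic50_nm: float     # experimental potency label, shared by every pose of the pair


def group_by_system(poses):
    """Collect poses under the protein-ligand system they belong to."""
    grouped = defaultdict(list)
    for pose in poses:
        grouped[pose.system_id].append(pose)
    return dict(grouped)


def best_pose_per_system(poses):
    """Pick the highest-confidence pose for each system (an illustrative rule)."""
    return {
        system_id: max(group, key=lambda p: p.confidence)
        for system_id, group in group_by_system(poses).items()
    }


# Figures reported for SAIR: 5.2 million structures across 1 million systems.
avg_structures_per_system = 5_200_000 / 1_000_000

# Toy records, invented for illustration only.
toy_poses = [
    Pose("sys-A", 0, 0.62, 35.0),
    Pose("sys-A", 1, 0.91, 35.0),
    Pose("sys-B", 0, 0.48, 1200.0),
    Pose("sys-B", 1, 0.33, 1200.0),
]
best = best_pose_per_system(toy_poses)
print(round(avg_structures_per_system, 1))
print({sid: p.pose_index for sid, p in best.items()})
```

The reported totals work out to roughly five structures per protein-ligand system on average, which is why the sketch models a one-to-many relationship between a system and its poses.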
The primary objective of this dataset is to enable scientists to predict whether a drug will bind to its target in the human body more rapidly and accurately. According to the company, AI models trained on this synthetic data can predict protein-ligand binding affinities at least 1,000 times faster than traditional physics-based methods [4].
SandboxAQ's approach represents an emerging field that merges traditional scientific computing techniques with recent advancements in AI. While scientists have long had equations to predict atomic combinations, the vast number of potential combinations in three-dimensional pharmaceutical molecules makes manual calculations impractical, even with today's fastest computers [5].
This innovation has the potential to transform the traditional trial-and-error process of drug discovery into a rapid, data-driven approach. By making the SAIR dataset publicly available, SandboxAQ is providing researchers worldwide with the raw material to train breakthrough models, potentially setting a new pace for drug discovery [4].
While the synthetic data is being released publicly, SandboxAQ plans to monetize its own AI models developed with this data. The company aims to provide results that rival running lab experiments, but in a virtual environment [1]. With nearly $1 billion raised in venture capital, including a recent Series E round of over $450 million, SandboxAQ is well-positioned to further develop its AI and quantum computing capabilities [4].
Summarized by
Navi