2 Sources
[1]
Argonne flexes spare supercompute to build private AI inference service
Boffins at the Department of Energy's (DoE) Argonne National Laboratory near Chicago on Tuesday unveiled a new AI inference service cobbled together from spare supercomputing capacity. The hope is that the service can help researchers across the US, including DoE labs and those working on the Genesis Mission, advance scientific discovery across a range of fields. Argonne is home to some of the world's largest supercomputing clusters, including the No. 3-ranked Aurora supercomputer. But its compute capacity also includes several smaller, AI-optimized systems. As of writing, the lab's inference service is running atop two clusters: The first is the Sophia system, comprising 192 Nvidia A100 GPUs, most with 40 GB of memory. The second, dubbed Metis, is arguably the more interesting. That system features 32 of SambaNova's SN40L AI accelerators. Moving forward, Argonne says that the inference service will also be extended to the Nvidia GH200-based Tara and B200-based Minerva systems. The inference service provides researchers with access to a range of large language models (LLMs) through a chatbot-like portal. Models include OpenAI's GPT-OSS, Google's Gemma family, Meta's Llama herd, and a variety of domain-specific and custom models, like AuroraGPT. And at least for some of its services, Argonne appears to be using Open WebUI, a popular self-hosted chatbot service we've explored on numerous occasions. Argonne envisions researchers harnessing these models to securely analyze large datasets and experiment with integrating generative AI into their workflows. "By making AI inference available as a shared resource, we are enabling researchers to apply AI at scale to their data, their simulations and their experiments without having to build and maintain their own infrastructure," ALCF director Michael Papka said in a statement. 
Critically, the service enables DoE researchers to experiment with chatbots in a secure manner that doesn't expose data to public services like ChatGPT. According to Argonne, researchers are already using the service to analyze experimental data in real time to predict things like plasma disruptions in fusion energy research. Boffins are also using the tech to sift through large quantities of data generated by particle accelerators and telescopes to narrow the search radius of the most likely candidates. By doing so, researchers can make better use of available supercomputing capacity, rather than wasting cycles brute forcing the problem. While LLMs and other generative AI models still struggle with hallucinations and other erroneous behavior, there's a growing corpus of research to suggest that the technology can be used to automate research or supplement traditional climate or physics models. For example, before it was air-gapped, the eggheads at Lawrence Livermore National Laboratory tasked El Capitan, the world's most powerful publicly known supe, to develop a new tsunami forecasting model. Meanwhile, Nvidia has demonstrated that AI climate models can identify storm cells faster and more accurately than existing models. ®
[2]
Argonne launches first large-scale AI inference service for open science
Newswise -- A key enabler for DOE's Genesis Mission, the service helps researchers accelerate the path from data to insight to discovery. The U.S. Department of Energy's (DOE) Argonne National Laboratory has launched a first-of-its-kind artificial intelligence (AI) inference service to help researchers across the nation accelerate discovery and innovation. The service offers cloud-like access to a range of large language models (LLMs) and science foundation models running on Argonne's high performance computing (HPC) systems. This gives researchers a powerful and secure resource for analyzing large datasets and testing new ideas. "Our inference service helps close the gap between developing AI models and putting them to work in scientific research," said Michael Papka, director of the Argonne Leadership Computing Facility (ALCF). "By offering AI inference as a shared resource, we enable researchers to apply AI at scale to their data, simulations and experiments, without the burden of building and maintaining their own infrastructure." The ALCF is a DOE Office of Science user facility. Inference is the process of using trained AI models to analyze data, identify patterns and make predictions. AI chatbots like ChatGPT use inference to answer questions in real time. In research, the same capability can help scientists guide experiments, make sense of complex data and perform other analytical tasks more efficiently. "Inference services allow researchers to spend less time managing models and more time testing hypotheses," said Venkat Vishwanath, AI and machine learning lead at the ALCF.
"Instead of taking days or weeks to analyze data, scientists can rapidly interpret results, refine experiments and explore complex systems in ways that weren't practical before." The service grew out of a 2025 paper that outlined a framework for providing secure and scalable AI inference on HPC systems. The goal was to give researchers the ability to run multiple AI tasks in parallel on different models without relying on commercial cloud services. This work led to the creation of the ALCF Inference Service. It provides access to a range of LLMs, including Google's Gemma series, Meta's LLaMA models, and OpenAI's GPT-OSS family, as well as domain-specific foundation models, computer vision models and in-house models developed at Argonne, such as AuroraGPT. The service is being used by a growing and diverse set of researchers. In addition to a substantial base of Argonne and ALCF users, it is actively supporting users across the DOE national laboratory ecosystem, enabling seamless access for researchers from several labs using their home institution credentials. These include Brookhaven National Laboratory, Fermi National Accelerator Laboratory, Los Alamos National Laboratory, Lawrence Berkeley National Laboratory, Lawrence Livermore National Laboratory, Oak Ridge National Laboratory, Sandia National Laboratories and Thomas Jefferson National Accelerator Facility. This expanding cross-lab adoption underscores the service's role in enabling integrated, multi-institutional research workflows. Among its users, the ALCF Inference Service is supporting teams working on DOE's Genesis Mission, a national AI initiative to build the world's most powerful scientific platform to accelerate discovery science, strengthen national security and drive energy innovation. It will also be a key tool for the American Science Cloud (AmSC), the Genesis Mission's integrated environment connecting DOE supercomputers, experimental facilities and data resources. 
Beyond the Genesis Mission, the service enables scientists to tackle complex challenges in many fields. In fusion energy research, for example, AI models can analyze streams of experimental data in real time and predict plasma disruptions before they occur. This capability enables safer and more efficient control of fusion reactions. In high energy physics and astronomy, inference helps scientists sift through massive volumes of collider and telescope data. This can speed up the process of identifying rare events and new phenomena. The service is also being used to advance work in chemistry and materials science. A recent example is ChemGraph, an AI framework that helps researchers simplify and automate molecular simulation workflows and other chemistry tasks. By using the ALCF Inference Service for LLM-driven tasks, ChemGraph lets researchers manage complex, multi-step simulation workflows interactively. "This allows scientists to explore more candidate molecules, iterate on designs faster and manage large-scale calculations as an integrated process rather than a series of disconnected jobs," said Murat Keçeli, an Argonne computational scientist who helped develop ChemGraph. ChemGraph relies on repeated interactions between the AI and simulation tools, known as "tool calling," to break complex workflows into smaller steps and manage large amounts of data effectively. Each interaction requires the AI model to process information in units referred to as tokens. These can add up quickly and increase computational costs. "Since tool calls can be frequent and might involve large inputs and outputs, token consumption can be much higher than conversation style usage and cost a lot for frontier AI models," Keçeli said. "Therefore, the inference service is very useful for agentic AI applications to minimize the cost." To support these scientific workflows, the service runs on dedicated ALCF systems, including Sophia and Metis. 
In the near future, the service will also run on the facility's new NVIDIA-based systems, Tara and Minerva. The service is part of the ALCF's broader Service-Enabled Science program, which brings together HPC and AI resources, integrated workflow tools, AI model training capabilities and large-scale data sharing and analysis. This approach provides scientists with a complete suite of tools and services supporting every stage of their research projects. "ALCF's shift from providing raw computing power to delivering integrated services gives researchers a connected environment that supports data generation, simulation and AI inference, accelerating scientific discovery," Papka said. For additional details, see the ALCF user guide or watch the webinar, "Deploying Inference Services at ALCF." Jim Collins is a science and technology writer for Argonne specializing in coverage of computational science, supercomputing and AI-driven research. He has been writing about science for over 15 years. The Argonne Leadership Computing Facility provides supercomputing capabilities to the scientific and engineering community to advance fundamental discovery and understanding in a broad range of disciplines. Supported by the U.S. Department of Energy's (DOE's) Office of Science, Advanced Scientific Computing Research (ASCR) program, the ALCF is one of two DOE Leadership Computing Facilities in the nation dedicated to open science. Argonne National Laboratory seeks solutions to pressing national problems in science and technology by conducting leading-edge basic and applied research in virtually every scientific discipline. Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy's Office of Science. The U.S. Department of Energy's Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. 
For more information, visit https://energy.gov/science.
The Department of Energy's Argonne National Laboratory has unveiled a new AI inference service built from spare supercomputing capacity to accelerate scientific discovery. The service provides researchers across DOE labs with secure, cloud-like access to large language models without relying on public platforms like ChatGPT, supporting critical work including the Genesis Mission.
The Department of Energy's Argonne National Laboratory near Chicago has launched an AI inference service built from spare supercomputing capacity. Announced on Tuesday, the service gives researchers across the United States, including those at DOE labs and teams working on the Genesis Mission, secure access to large language models for scientific discovery [1].
Argonne houses some of the world's largest supercomputing clusters, including the No. 3-ranked Aurora supercomputer. The new service instead taps spare capacity on the lab's smaller, AI-optimized systems. Currently, the platform runs on two clusters: Sophia, with 192 Nvidia A100 GPUs (most with 40 GB of memory), and Metis, with 32 of SambaNova's SN40L AI accelerators. Argonne plans to extend the service to the Nvidia GH200-based Tara and B200-based Minerva systems.
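As a rough illustration of what the Sophia figures imply, the sketch below estimates the cluster's aggregate GPU memory and how many 40 GB GPUs a model's weights alone would occupy. The 40 GB figure is treated as a per-GPU floor, since the source says only that most of the GPUs have 40 GB. The parameter counts and the 16-bit (2 bytes per parameter) precision are assumptions chosen for illustration, not details of Argonne's deployment, and the estimate ignores KV cache and activation memory.

```python
import math

SOPHIA_GPUS = 192   # GPU count reported for Sophia
GB_PER_GPU = 40     # "most with 40 GB"; used here as a per-GPU floor (assumption)


def aggregate_memory_gb(num_gpus: int = SOPHIA_GPUS, gb_per_gpu: int = GB_PER_GPU) -> int:
    """Total GPU memory if every GPU had exactly gb_per_gpu of memory."""
    return num_gpus * gb_per_gpu


def weights_gb(params_billions: float, bytes_per_param: float = 2.0) -> float:
    """Memory for model weights only, in decimal GB (1e9 bytes).

    params_billions * 1e9 parameters * bytes_per_param / 1e9 bytes per GB.
    """
    return params_billions * bytes_per_param


def min_gpus_for_weights(
    params_billions: float,
    bytes_per_param: float = 2.0,
    gb_per_gpu: int = GB_PER_GPU,
) -> int:
    """Smallest number of GPUs whose combined memory can hold the weights."""
    return math.ceil(weights_gb(params_billions, bytes_per_param) / gb_per_gpu)


if __name__ == "__main__":
    print(f"Aggregate memory (floor): {aggregate_memory_gb()} GB")
    # Hypothetical model sizes, in billions of parameters.
    for size in (8, 70):
        print(f"{size}B params: {weights_gb(size):.0f} GB of weights, "
              f"at least {min_gpus_for_weights(size)} GPU(s)")
```

Under these assumptions the floor on Sophia's aggregate GPU memory is 7,680 GB, and a hypothetical 70-billion-parameter model at 16-bit precision would need at least four 40 GB GPUs for its weights alone.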
The AI inference service offers cloud-like access to large language models through a chatbot-like portal, and Argonne appears to use Open WebUI, a popular self-hosted chatbot interface, for at least some of its services. Researchers can access OpenAI's GPT-OSS, Google's Gemma family, Meta's Llama models, and domain-specific models like AuroraGPT [1]. The service also includes computer vision models and in-house models developed at Argonne [2].
"By making AI inference available as a shared resource, we are enabling researchers to apply AI at scale to their data, their simulations and their experiments without having to build and maintain their own infrastructure," said Michael Papka, director of the Argonne Leadership Computing Facility (ALCF) [1]. The ALCF is a DOE Office of Science user facility.
The service has gained traction across the DOE national laboratory ecosystem, actively supporting researchers from Brookhaven National Laboratory, Fermi National Accelerator Laboratory, Los Alamos National Laboratory, Lawrence Berkeley National Laboratory, Lawrence Livermore National Laboratory, Oak Ridge National Laboratory, Sandia National Laboratories, and Thomas Jefferson National Accelerator Facility. Users can access the platform using their home institution credentials [2].
This cross-lab adoption positions the service as a key enabler for DOE's Genesis Mission, a national AI initiative to build the world's most powerful scientific platform. The service will also support the American Science Cloud (AmSC), the Genesis Mission's integrated environment connecting DOE supercomputers, experimental facilities, and data resources [2].

Researchers are already using these AI models on high-performance computing infrastructure for practical applications. In fusion energy research, scientists analyze experimental data in real time to predict plasma disruptions before they occur, enabling safer reactor control. Teams working with particle accelerators and telescopes use the service to sift through massive data volumes, narrowing the search to the most likely candidates for rare events and new phenomena rather than spending compute cycles on brute-force approaches [1].
In materials science and chemistry, researchers are using ChemGraph, an AI framework that relies on the ALCF Inference Service for LLM-driven tasks. This allows scientists to simplify molecular simulation workflows, explore more candidate molecules, and manage large-scale calculations as an integrated process [2].
A critical advantage of the service is that it lets DOE researchers experiment with generative AI and LLMs in a secure environment that doesn't expose sensitive data to public services like ChatGPT. "Inference services allow researchers to spend less time managing models and more time testing hypotheses," said Venkat Vishwanath, AI and machine learning lead at the ALCF. "Instead of taking days or weeks to analyze data, scientists can rapidly interpret results, refine experiments and explore complex systems in ways that weren't practical before" [2].
The service grew out of a 2025 paper outlining a framework for providing secure and scalable AI inference on high-performance computing systems. The goal was to enable researchers to run multiple AI tasks in parallel on different models without relying on commercial cloud services [2]. While concerns about hallucinations and erroneous behavior in generative AI persist, a growing body of research suggests these technologies can automate research tasks and supplement traditional climate or physics models. Lawrence Livermore National Laboratory, for instance, has used El Capitan, the world's most powerful publicly known supercomputer, to develop a tsunami forecasting model, while Nvidia has demonstrated that AI climate models can identify storm cells faster and more accurately than existing approaches [1].
Summarized by Navi