US National Labs Test Supercomputing Chips Beyond GPUs

Supercomputing Chips Face New Demands as AI Dominates

The race to build better AI chips is reshaping the supercomputing landscape in ways few anticipated. Nine of the world's 10 most powerful supercomputers rely on GPUs, but US national labs are now exploring alternatives as chipmakers like Nvidia prioritize AI workloads over the ultra-precise calculations needed for scientific computing [1]. This shift has opened the door for new chip architectures from smaller players to challenge the dominance of established semiconductor giants.

At Sandia National Laboratories, located on Kirtland Air Force Base in New Mexico, liquid-cooled supercomputers handle some of the most complex and secretive work the US government undertakes: simulating nuclear weapons as they move through the atmosphere and modeling what happens when nuclear warheads detonate near each other [2]. For over a decade, mainstream semiconductor firms supplied the chips for this demanding work, but supply shortages and shifting design priorities are forcing labs to rethink their procurement strategies.

NextSilicon's Maverick-2 Passes Critical Milestone

Sandia recently validated NextSilicon's Maverick-2, a dataflow processor designed explicitly for the 64-bit floating-point mathematics that dominates high-performance computing (HPC) applications at the Department of Energy. The Spectra supercomputer, built in collaboration with Penguin Solutions and NextSilicon, met all system acceptance requirements this week, opening the door for deployment in larger systems [1]. While Spectra is modest in scale, with just 64 nodes and 128 of NextSilicon's runtime-configurable accelerators, it serves as a crucial test bed for proving the technology works.

The Israeli startup's chips passed validation using three key workloads: the High-Performance Conjugate Gradient (HPCG) benchmark, the LAMMPS molecular dynamics test suite, and the SPARTA Monte Carlo simulation suite [1]. This technical milestone puts NextSilicon in the running for government systems and sets up a decision this fall on whether to test the chips with more demanding computing problems that closely resemble actual nuclear security work [2].
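For readers unfamiliar with the first of those workloads, the conjugate gradient method is an iterative solver for large systems of linear equations, the kind that arise when physical problems are discretized onto a grid. The sketch below is a minimal pure-Python illustration of the method itself, not the benchmark code run at Sandia; the function names and the small one-dimensional test problem are assumptions chosen for illustration.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A, given only x -> A x."""
    x = [0.0] * len(b)
    r = list(b)                 # residual b - A x, with the initial guess x = 0
    p = list(r)                 # current search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break               # residual is small enough, stop iterating
        Ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def laplacian_1d(v):
    """Apply the tridiagonal matrix with 2 on the diagonal and -1 beside it."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * v[i] - left - right)
    return out


# Hypothetical toy problem: 8 unknowns, right-hand side of all ones.
b = [1.0] * 8
x = conjugate_gradient(laplacian_1d, b)
residual = max(abs(bi - ai) for bi, ai in zip(b, laplacian_1d(x)))
print(residual)
```

The matrix is never stored; the solver only needs a routine that multiplies it by a vector. That pattern of sparse, memory-bound arithmetic is a large part of why conjugate gradient solvers are used to stress scientific hardware.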

High-Precision Computing Takes a Back Seat to AI

The concern driving this search for new options centers on double-precision floating-point computation (FP64), which allows computers to handle both very large and very small numbers while losing far less accuracy to rounding errors than lower-precision formats. This capability is essential for simulating nuclear weapons and other physics problems [2]. But AI work doesn't benefit from high-precision computing the same way scientific applications do, leading chipmakers to deprioritize it.
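The difference is easy to see on a small scale. The sketch below, using only Python's standard library, compares FP64 (the format of an ordinary Python float) with FP32, emulated here by rounding values through a 4-byte encoding. The helper name `to_fp32` is an assumption made for this illustration.

```python
import struct


def to_fp32(value):
    """Round a Python float (stored as FP64) to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


big = 16_777_216.0                  # 2**24, the edge of FP32 integer precision
fp64_sum = big + 1.0                # FP64 keeps 53 significand bits
fp32_sum = to_fp32(big + 1.0)       # FP32 keeps 24, so the +1 is rounded away

tiny = 1e-10
fp64_step = (1.0 + tiny) - 1.0          # small but nonzero in FP64
fp32_step = to_fp32(1.0 + tiny) - 1.0   # exactly zero in FP32

print(fp64_sum - big, fp32_sum - big)   # 1.0 0.0
print(fp64_step > 0.0, fp32_step)       # True 0.0
```

In a simulation that takes millions of small time steps, updates that vanish like this can accumulate into results that drift away from the physics being modeled.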

Nvidia's forthcoming Rubin GPUs illustrate this tension. The chips promise up to 50 petaFLOPS of FP4 compute for AI inference and training, but their FP64 hardware performance tops out at just 33 teraFLOPS, slower than the nearly four-year-old H100 [1]. While Nvidia claims Rubin can deliver up to 200 teraFLOPS of FP64 matrix performance using emulated approaches based on the Ozaki scheme, this method has limitations in vector-heavy workloads like computational fluid dynamics.
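To put those numbers side by side, the short calculation below uses only the peak figures quoted above. These are vendor peak rates rather than measured application performance, and the variable names are chosen for this illustration.

```python
PETA = 1e15
TERA = 1e12

rubin_fp4 = 50 * PETA               # peak FP4 rate quoted above
rubin_fp64_native = 33 * TERA       # peak FP64 rate in hardware
rubin_fp64_emulated = 200 * TERA    # claimed peak for emulated FP64 matrix math

fp4_to_fp64 = rubin_fp4 / rubin_fp64_native
emulation_gain = rubin_fp64_emulated / rubin_fp64_native

print(f"FP4 peak is about {fp4_to_fp64:,.0f}x the native FP64 peak")
print(f"Emulated FP64 matrix peak is about {emulation_gain:.1f}x native FP64")
```

On these figures, the chip's low-precision peak is roughly 1,500 times its native FP64 peak, and the emulated matrix path is about six times the native rate. That gain applies only to workloads that can be expressed as matrix operations.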

"The pressure we're feeling right now is on the computing front and also from the supply chain," said Steve Monk, manager of Sandia's high-performance computing team. "Looking to the future, it's a bit stressful in terms of our ability to deliver to the mission" [2].

Dataflow Architecture Offers Different Approach

Unlike the von Neumann architecture that underpins most CPUs and GPUs, the Maverick-2 employs a reconfigurable dataflow processor design. The chip's two compute dies comprise a grid of arithmetic logic units interconnected in a graph, with each unit configured at runtime to perform specific operations [1]. The real advantage comes from overlapping data flow and compute: as soon as data reaches the next unit in the pipeline, it is processed immediately without waiting for load-store operations to shuffle data around.
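The idea can be illustrated with a toy software model. The sketch below is not a description of how Maverick-2 works internally; it is a minimal Python illustration of dataflow execution in general, in which each hypothetical `Node` stands in for a compute unit that fires the moment all of its operands have arrived, with no central instruction stream deciding the order.

```python
import operator
from collections import deque


class Node:
    """One hypothetical compute unit: fires once all of its operands arrive."""

    def __init__(self, name, op, n_inputs):
        self.name = name
        self.op = op
        self.slots = [None] * n_inputs
        self.filled = 0
        self.consumers = []     # (downstream node, input slot) pairs

    def connect(self, target, slot):
        self.consumers.append((target, slot))


def run(sources, sink):
    """Push tokens through the graph; nodes fire as data reaches them."""
    ready = deque(sources)      # (node, slot, value) tokens in flight
    fired = []                  # order in which units executed
    result = None
    while ready:
        node, slot, value = ready.popleft()
        node.slots[slot] = value
        node.filled += 1
        if node.filled < len(node.slots):
            continue            # still waiting on another operand
        out = node.op(*node.slots)
        fired.append(node.name)
        if node is sink:
            result = out
        for target, target_slot in node.consumers:
            ready.append((target, target_slot, out))
    return result, fired


# Graph for (a + b) * (c - d): the add and subtract units are independent,
# and the multiply fires as soon as both of their results arrive.
add = Node("add", operator.add, 2)
sub = Node("sub", operator.sub, 2)
mul = Node("mul", operator.mul, 2)
add.connect(mul, 0)
sub.connect(mul, 1)

tokens = [(add, 0, 3.0), (add, 1, 4.0), (sub, 0, 10.0), (sub, 1, 4.0)]
result, fired = run(tokens, mul)
print(result, fired)    # 42.0 ['add', 'sub', 'mul']
```

Intermediate results pass directly from producer to consumer rather than being written to memory and read back, which is the property the dataflow approach relies on.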

This approach can improve performance and efficiency in real-world workloads and saves electricity by spending less time and energy moving data to and from memory [1]. While dataflow architectures aren't new (companies like Groq, Cerebras, and SambaNova have built similar designs), NextSilicon is one of the few targeting HPC rather than AI workloads.

To address the notorious difficulty of programming dataflow systems, NextSilicon built a compiler that runs existing C, Python, Fortran, or CUDA codebases on its chips by capturing the compute graph and optimizing it for maximum performance [1].
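What "capturing the compute graph" means can be shown with a generic tracing trick. The sketch below is an assumption-laden illustration in Python, not NextSilicon's compiler or its API: a hypothetical `Traced` placeholder records each arithmetic operation instead of performing it, so ordinary-looking code yields a list of graph nodes that a later stage could map onto hardware.

```python
class Traced:
    """Hypothetical placeholder that records operations instead of running them."""

    def __init__(self, graph, name):
        self.graph = graph
        self.name = name

    def _record(self, op, other):
        rhs = other.name if isinstance(other, Traced) else repr(other)
        out = Traced(self.graph, f"t{len(self.graph)}")
        self.graph.append((out.name, op, self.name, rhs))
        return out

    def __add__(self, other):
        return self._record("add", other)

    def __mul__(self, other):
        return self._record("mul", other)


def capture(fn, *arg_names):
    """Run fn once on placeholders and return the recorded compute graph."""
    graph = []
    fn(*(Traced(graph, name) for name in arg_names))
    return graph


def axpy(a, x, y):
    # Ordinary-looking numeric code, unchanged by the tracing.
    return a * x + y


graph = capture(axpy, "a", "x", "y")
for node in graph:
    print(node)
```

Each recorded tuple names an output, an operation, and its two inputs. A real compiler would go much further, handling control flow and memory access and then placing the operations onto physical units.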

Labs Seek Insurance Against Supply Constraints

Sandia's work with NextSilicon reflects a broader strategy to ensure mission-critical capabilities remain available even as major chip firms shift focus. James Laros, a senior scientist at Sandia who oversees testing of new computing architectures, emphasized this imperative: "We have to keep available options to complete our mission, because the mission is not optional" [2].

The lab has a history of incubating technologies that later become widespread. Liquid cooling systems for chips were exotic when Sandia started urging Intel, AMD, and Nvidia to develop the technology over a decade ago; now they're common [2]. This pattern suggests that today's experimental dataflow processors could influence tomorrow's mainstream scientific computing infrastructure, particularly if the chip industry chases AI at the continued expense of scientific computing needs.

References

[1] Uncle Sam's next big supercomputer might use something more exotic than GPUs

[2] As chip industry chases AI, U.S. national labs look to newcomers for supercomputers
