Astera Labs unveils Scorpio X switch to tackle AI compute cluster networking bottlenecks


Astera Labs introduced Scorpio X, a vendor-agnostic AI fabric switch featuring 320 lanes of PCIe 6.0 connectivity and 5.12 TB/s of bidirectional bandwidth. The chip aims to solve networking bottlenecks in massive AI compute clusters by letting GPUs access distributed resources as if they were local memory. Production is expected in the second half of 2026.

Astera Labs Challenges Nvidia with Vendor-Agnostic AI Fabric Switch

Astera Labs unveiled Scorpio X, positioning it as an alternative to Nvidia's NVSwitch for building rack-scale AI systems, one that works with nearly any AI accelerator [1]. The new AI fabric switch crams 320 lanes of PCIe 6.0 connectivity into a single ASIC with 5.12 TB/s of bidirectional bandwidth, addressing the critical networking bottlenecks that plague modern data center operations [1]. While NVSwitch 6 offers nearly three times the bandwidth at 14.4 TB/s, Scorpio X doesn't need to compete directly with Nvidia; instead, it targets customers seeking flexibility across different hardware platforms [1].
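The quoted figure can be sanity-checked from the per-lane rate. PCIe 6.0 signals at 64 GT/s per lane, which comes out to roughly 8 GB/s of usable bandwidth per lane per direction (the exact figure depends on FLIT encoding overhead, so this is an approximation):

```python
# Rough sanity check of Scorpio X's quoted bandwidth, assuming
# ~8 GB/s usable per PCIe 6.0 lane per direction (64 GT/s, PAM4).
lanes = 320
gb_per_lane_per_dir = 8                          # GB/s, approximate

per_direction = lanes * gb_per_lane_per_dir      # GB/s one way
bidirectional = 2 * per_direction                # GB/s both ways

print(per_direction)    # 2560 GB/s = 2.56 TB/s each direction
print(bidirectional)    # 5120 GB/s = 5.12 TB/s, matching the spec
```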

Source: SiliconANGLE

Memory-Semantic Architecture Transforms GPU Communication

The Scorpio X-series employs a novel memory-semantic architecture that fundamentally changes how GPUs and other processors interact across the fabric [2]. This design enables AI accelerators to access resources spread across the fabric using simple load/store operations, much as they access data in their own local memory [2]. The entire fabric becomes a unified memory pool, eliminating the overhead of translating data packets and minimizing latency [2]. For AI compute clusters rented at thousands of dollars per hour, reducing the idle time GPUs spend waiting for data has direct economic implications [2].
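The idea behind memory-semantic access can be illustrated with a toy model: every device's memory is mapped into one global address space, so reaching "remote" data is just a load or store at an offset, with no packet encode/decode step in the data path. This is purely an illustrative sketch, not Astera Labs' actual design:

```python
# Toy model of a memory-semantic fabric: device memories stitched into
# one flat address space, accessed via plain load/store operations.
# Illustrative only -- not Astera Labs' implementation.

class FabricPool:
    def __init__(self, devices, region_size):
        self.region_size = region_size
        # One backing store per device, forming one unified pool.
        self.regions = {i: bytearray(region_size) for i in range(devices)}

    def _locate(self, addr):
        # Global address -> (owning device, local offset)
        return divmod(addr, self.region_size)

    def store(self, addr, data):
        dev, off = self._locate(addr)
        self.regions[dev][off:off + len(data)] = data

    def load(self, addr, length):
        dev, off = self._locate(addr)
        return bytes(self.regions[dev][off:off + length])

pool = FabricPool(devices=4, region_size=1024)
# A GPU writes at a global address that physically lives on device 2:
pool.store(2 * 1024 + 16, b"weights")
print(pool.load(2 * 1024 + 16, 7))   # b'weights'
```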

In-Network Compute Accelerates Generative AI Inference

Scorpio X is equipped with In-Network Compute capabilities similar to those found in Nvidia's NVSwitch, which help accelerate collective communications [1]. These communications have become especially important for generative AI inference as large language models grow increasingly chatty from a network standpoint, particularly with mixture-of-experts (MoE) architectures gaining traction [1]. MoE models are composed of multiple sub-models called experts, and each generated token may invoke a different selection of experts, potentially running on different GPUs [1]. By moving collective communications to the switch, GPUs spend less time waiting on the network and more time generating tokens [1].
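A minimal sketch shows why offloading a collective helps: with an all-reduce done in the switch, each GPU sends its partial result upstream once and receives one aggregated answer back, instead of every GPU exchanging data with every peer. This is a conceptual illustration, not how the switch hardware actually operates on the wire:

```python
# Sketch of an in-network all-reduce: the switch sums the partial
# values from N GPUs and multicasts one result back, replacing the
# N*(N-1) peer-to-peer exchanges of a naive all-to-all reduction.
# Conceptual only; real In-Network Compute works at the link level.

def allreduce_in_switch(partials):
    """Switch-side reduction: one upstream message per GPU,
    one identical downstream result for all of them."""
    total = [sum(vals) for vals in zip(*partials)]
    return [total for _ in partials]

gpu_grads = [[1, 2], [3, 4], [5, 6]]      # partial gradients on 3 GPUs
results = allreduce_in_switch(gpu_grads)
print(results[0])   # [9, 12] -- identical on every GPU
```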

Hypercast Technology Optimizes MoE Workloads

Astera Labs developed Hypercast, a proprietary multicast operation optimized specifically for MoE inference [1]. According to Ahmad Danesh, AVP of product management at Astera Labs, "One of the limitations of the standard multicast is the number of groups you can actually support, as well as the dynamic nature of needing to change those groups on the fly for mixture-of-experts models" [1]. The switches can themselves perform data processing tasks such as aggregation and distribution, processing collective operations twice as fast, with a direct impact on the token economics of AI workloads [2].
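The constraint Danesh describes can be sketched in a few lines: a switch holds a finite table of multicast groups, while MoE routing wants the destination set to change token by token as different experts, on different GPUs, are selected. The group names, table size, and port numbers below are hypothetical illustrations of the problem, not Hypercast's mechanism:

```python
# Sketch of the standard-multicast limitation Hypercast targets: a
# bounded group table that MoE inference needs to remap on the fly.
# All identifiers and sizes here are hypothetical.

class MulticastTable:
    def __init__(self, max_groups):
        self.max_groups = max_groups
        self.groups = {}   # group id -> set of destination GPU ports

    def update_group(self, gid, ports):
        # "Changing groups on the fly": remap an existing group id
        # instead of allocating a new entry once the table is full.
        if gid not in self.groups and len(self.groups) >= self.max_groups:
            raise RuntimeError("multicast group table exhausted")
        self.groups[gid] = set(ports)

    def multicast(self, gid, payload):
        # Fan the payload out to every port in the group in one step.
        return {port: payload for port in self.groups[gid]}

table = MulticastTable(max_groups=2)
table.update_group("expert_set", [0, 3, 5])   # experts for this token
table.update_group("expert_set", [1, 2])      # next token: remapped
print(sorted(table.multicast("expert_set", "activations")))  # [1, 2]
```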

PCIe Strategy Enables Broader Hardware Compatibility

While technologies like NVLink Fusion and the emerging UALink protocol are gaining traction, chips need to be designed around them from the ground up [1]. PCIe, by contrast, works with just about anything because it is already used to move data in and out of accelerators [1]. For example, stitching together 32 or more Nvidia RTX Pro 6000 Server cards would require a PCIe switch, since those GPUs don't support NVLink at all [1]. PCIe also makes it easier to mix and match chips for disaggregated inference architectures, like the pairings seen with Nvidia and Groq, AWS and Cerebras, or Intel and SambaNova [1].

Scalability Concerns Drive Industry Innovation

As the world's leading large language models evolve into multitrillion-parameter behemoths, it is no longer possible to squeeze them into a single server rack [2]. The only way to run these models is to link hundreds or even thousands of GPUs into enormous clusters that work together as a single giant computer [2]. Chief Executive Jitendra Mohan emphasized that "the frontier models driving today's most demanding AI applications require connectivity infrastructure that keeps pace with the accelerators powering them" [2].

Source: The Register

Expanded Product Line and Management Tools

Alongside Scorpio X, Astera Labs is expanding its Scorpio P-series switches with models ranging from 32 to 320 lanes of PCIe connectivity [1]. All switches work with the COSMOS management suite, a hardware monitoring platform designed to help track down and resolve issues across the network fabric [1]. The refreshed Scorpio switches are currently sampling, with production expected to ramp in the second half of 2026 [1].
