2 Sources
[1]
Astera speaks softly and carries a big switch
Astera Labs unveiled an alternative to Nvidia's NVSwitch for building rack-scale AI systems on Tuesday, claiming it will work with nearly any accelerator. The AI fabric switch, codenamed Scorpio X, crams 320 lanes of PCIe 6.0 connectivity into a single ASIC with 5.12 TB/s of bidirectional bandwidth.

Historically, PCIe switches have been used in a variety of applications, including scale-out compute fabrics. CPUs alone didn't offer enough lanes, or fast enough ones, for all the GPUs, NICs, and storage required. So, rather than hanging everything off the CPU, a PCIe switch, often built into the NIC, was used to connect everything together. Astera contends that with a big enough switch, PCIe is a viable alternative to interconnects like NVLink in the scale-up fabrics used to make dozens or more GPUs behave more like a single large one, without needing to redesign the accelerators themselves.

However, Astera hasn't just built a bigger PCIe switch. Scorpio is equipped with many of the same in-network compute capabilities as Nvidia's NVSwitch, which help to accelerate collective communications. These communications are especially important for generative AI inference: large language models have become rather chatty from a network standpoint as mixture-of-experts (MoE) architectures have caught on. MoE models are composed of multiple sub-models called experts, and for each token generated, a different selection of experts, potentially running on different GPUs, may be used. By moving collective communications to the switch, the GPUs spend less time waiting for the network to catch up and more time churning out tokens. Astera has gone so far as to develop a multicast operation optimized for MoE inference, which it calls Hypercast.
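The headline bandwidth figure is easy to sanity-check from nominal PCIe 6.0 rates. A back-of-the-envelope sketch (ignoring FLIT and encoding overhead, so real-world throughput lands a little lower):

```python
# Rough check of the Scorpio X figure: 320 lanes of PCIe 6.0.
# Uses nominal signaling rates only; FLIT/encoding overhead is ignored.

GT_PER_LANE = 64   # PCIe 6.0 raw signaling rate, GT/s per lane
LANES = 320

# One bit per transfer, so 64 Gb/s per lane = 8 GB/s per lane, one direction
gbytes_per_lane = GT_PER_LANE / 8

per_direction_tb = LANES * gbytes_per_lane / 1000   # TB/s, one direction
bidirectional_tb = per_direction_tb * 2

print(per_direction_tb)   # 2.56
print(bidirectional_tb)   # 5.12
```

So 320 lanes work out to 2.56 TB/s each way, or 5.12 TB/s bidirectional, matching the quoted spec.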
"One of the limitations of the standard multicast is the number of groups you can actually support, as well as the dynamic nature of needing to change those groups on the fly for mixture-of-experts models," Ahmad Danesh, AVP of product management at Astera, told El Reg.

While there are clear benefits to using PCIe as a chip-to-chip interconnect, Scorpio isn't exactly a replacement for Nvidia's NVSwitch chips. NVSwitch 6, announced at CES in January, offers nearly 3x the bandwidth at 14.4 TB/s. However, Astera doesn't need to compete with NVSwitch directly. In fact, last spring Astera announced plans to extend support for NVLink Fusion, Nvidia's attempt to open its high-speed interconnect to the broader ecosystem.

Instead, Scorpio is being positioned as a vendor-agnostic alternative. Technologies like NVLink Fusion or the emerging UALink protocol are gaining traction, but chips need to be designed around them. PCIe works with just about anything because it's already used to get data in and out of the accelerators. For example, if you wanted to stitch together 32 or more Nvidia RTX Pro 6000 Server cards, you'd need a PCIe switch, since those GPUs don't support NVLink at all.

PCIe also makes it easier to mix and match chips for disaggregated inference architectures, like we've seen with Nvidia and Groq, AWS and Cerebras, or Intel and SambaNova. These architectures use one accelerator for compute-heavy prefill operations and another for bandwidth-intensive decode operations. For this to work, the chips have to be connected to one another. Many AI chip builders are doing this over Ethernet, but PCIe would be more direct.

Alongside its Scorpio X family of chips, Astera is also expanding its Scorpio P-series switches with models ranging from 32 to 320 lanes of PCIe connectivity. All of these switches work with its COSMOS management suite, a hardware monitoring platform designed to help track down and resolve issues across the network fabric.
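The dynamic-group problem Danesh describes can be sketched in a toy model. This is an illustration of why MoE routing stresses multicast, not Astera's implementation; the expert placement and router here are made-up stand-ins:

```python
# Toy sketch: in MoE inference, the set of GPUs that must receive a token's
# activations changes token by token, because the router picks a different
# top-k subset of experts each time. Static multicast groups can't keep up.
import random

NUM_EXPERTS = 64
TOP_K = 8
GPUS = 16
# Assumed (hypothetical) placement: experts striped round-robin across GPUs.
expert_to_gpu = {e: e % GPUS for e in range(NUM_EXPERTS)}

def multicast_group(token_id: int) -> set:
    """GPUs that must receive this token's activations."""
    rng = random.Random(token_id)                 # stand-in for a learned router
    experts = rng.sample(range(NUM_EXPERTS), TOP_K)
    return {expert_to_gpu[e] for e in experts}

# Each generated token can require a different destination set, which is the
# "dynamic nature" quoted above and the pattern Hypercast is aimed at.
groups = [multicast_group(t) for t in range(4)]
```

Because the destination set is recomputed per token, a switch supporting only a small number of fixed multicast groups would force constant reprogramming on the critical path.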
Astera's refreshed Scorpio switches are currently sampling with production expected to ramp in the second half of 2026. ®
[2]
Astera Labs debuts new Scorpio smart fabric data center switch to scale up AI compute clusters - SiliconANGLE
Astera Labs debuts new Scorpio smart fabric data center switch to scale up AI compute clusters

Networking chip company Astera Labs Inc. is trying to ease the traffic congestion in artificial intelligence data centers with the launch of the latest generation of its Scorpio smart data fabric switch family, the X-series. According to the company, it's the largest open and "memory-semantic" fabric switch in the industry, designed to help hyperscale data center operators scale up their compute capacity while avoiding latency issues.

The new switch is being launched alongside an expanded Scorpio P-Series PCIe Fabric Switch family, which now comes in configurations ranging from 32 lanes to 320 lanes. The company says this gives data center architects more flexibility as they struggle to shuffle massive streams of data between clusters of AI processors.

Astera Labs says the new switches will play a vital role in scaling AI systems. As the world's leading large language models evolve into multitrillion-parameter behemoths, it's no longer possible to squeeze them into a single server rack. The only way to run these kinds of models is to link hundreds or even thousands of graphics processing units into enormous clusters so they can work together as a single, giant computer. This has become the standard way for AI data centers to operate, but it also causes immense congestion as data shuffles back and forth between those chips. It's extremely inefficient, because when GPUs are waiting for data to arrive from another part of the cluster, they're sitting idle. For AI clusters that are rented out at the cost of thousands of dollars per hour, that idle time is undesirable, to say the least.

Astera Labs' Scorpio X-series 320-Lane Smart Fabric Switch is designed to solve this challenge by rethinking the way switches interact with the chips they're linked to.
Its novel memory-semantic architecture makes it possible for GPUs and other AI accelerators to access resources spread across the fabric using simple load/store operations, similar to how they might access data within their own local memory. The entire fabric becomes a single, unified memory pool, eliminating the overhead of translating data packets and minimizing latency.

The X-series switches can do this thanks to the company's proprietary Hypercast and In-Network Compute technologies, which enable the switches to perform some data processing tasks themselves. More specifically, they're designed to handle "collective operations" such as data aggregation or distribution tasks. Because these no longer have to be done on the AI chip, they can be processed twice as fast, with a direct impact on the "token economics" of AI workloads.

Another key element of the X-series switches is their "high radix" design. Each chip boasts 320 lanes of PCIe 6 connectivity, which means they can be swapped in for multiple legacy data center switches, reducing network complexity and the overall distance data has to travel.

Astera Labs says the older Scorpio P-series chips will now play a complementary role to the X-series family, with the new configurations designed to support "front-end network and AI compute system deployments." With its support for open standards and platform-specific protocols such as NVLink Fusion and UALink, Astera Labs' switches can provide the networking fabric for almost any kind of AI processor, it said.

Chief Executive Jitendra Mohan said chip connectivity bottlenecks must be avoided for the AI industry to continue scaling. "The frontier models driving today's most demanding AI applications require connectivity infrastructure that keeps pace with the accelerators powering them," he stressed.
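The contrast between memory-semantic access and packet-based messaging can be sketched abstractly. This is a conceptual illustration under assumed semantics, not Astera's API: the point is that every accelerator addresses one flat pool with plain loads and stores, rather than serializing data into packets for send/receive:

```python
# Conceptual sketch of "memory-semantic" fabric access: one flat address
# space shared by all accelerators, reached with plain loads and stores.
# FabricPool and its methods are hypothetical stand-ins for illustration.
class FabricPool:
    """Stand-in for a fabric-wide unified memory pool."""

    def __init__(self, size: int):
        self.mem = bytearray(size)

    def store(self, addr: int, data: bytes) -> None:
        """A remote write looks like an ordinary store to an address."""
        self.mem[addr:addr + len(data)] = data

    def load(self, addr: int, length: int) -> bytes:
        """A remote read looks like an ordinary load from an address."""
        return bytes(self.mem[addr:addr + length])

pool = FabricPool(1 << 20)
pool.store(0x1000, b"activations")            # accelerator A writes
data = pool.load(0x1000, 11)                  # accelerator B reads the same address
```

No packet framing, no serialization step on either side: the fabric presents the pool as addressable memory, which is where the latency savings come from.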
Astera Labs introduced Scorpio X, a vendor-agnostic AI fabric switch featuring 320 lanes of PCIe 6.0 connectivity and 5.12 TB/s bandwidth. The chip aims to solve networking bottlenecks in massive AI compute clusters by enabling GPUs to access distributed resources as if they were local memory. Production is expected in the second half of 2026.
Astera Labs unveiled Scorpio X, positioning it as an alternative to Nvidia's NVSwitch for building rack-scale AI systems that works with nearly any AI accelerator [1]. The new AI fabric switch crams 320 lanes of PCIe 6.0 connectivity into a single ASIC with 5.12 TB/s of bidirectional bandwidth, addressing the critical networking bottlenecks that plague modern data center operations [1]. While NVSwitch 6 offers nearly three times the bandwidth at 14.4 TB/s, Scorpio X doesn't need to compete directly with Nvidia; instead, it targets customers seeking flexibility across different hardware platforms [1].
Source: SiliconANGLE
The Scorpio X-series employs a novel memory-semantic architecture that fundamentally changes how GPUs and other processors interact across the fabric [2]. This design enables AI accelerators to access resources spread across the fabric using simple load/store operations, similar to accessing data within their own local memory [2]. The entire fabric becomes a unified memory pool, eliminating the overheads that arise from translating data packets and minimizing latency [2]. For AI compute clusters rented at thousands of dollars per hour, reducing idle time when GPUs wait for data has direct economic implications [2].
Scorpio X is equipped with In-Network Compute capabilities similar to those found in Nvidia's NVSwitch, which help accelerate collective communications [1]. These communications have become especially important for generative AI inference as large language models grow increasingly chatty from a network standpoint, particularly with mixture-of-experts architectures gaining traction [1]. MoE models are composed of multiple sub-models called experts, and for each token generated, a different selection of experts, potentially running on different GPUs, may be used [1]. By moving collective communications to the switch, GPUs spend less time waiting for the network and more time generating tokens [1].
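A minimal sketch of the general idea behind offloading a collective to the switch (an illustration of the technique, not Scorpio's actual design): each GPU submits its partial result once, the switch performs the reduction, and every GPU receives the finished answer instead of exchanging data pairwise.

```python
# Toy model of in-network compute for a collective operation:
# the "switch" performs the element-wise reduction itself, so each GPU
# sends its tensor once and receives the completed sum back.
def switch_allreduce(partials):
    """Element-wise sum across per-GPU partial results, done 'on the switch'."""
    return [sum(vals) for vals in zip(*partials)]

gpu_partials = [
    [1.0, 2.0],   # GPU 0's partial result
    [3.0, 4.0],   # GPU 1's partial result
    [5.0, 6.0],   # GPU 2's partial result
]
result = switch_allreduce(gpu_partials)   # every GPU receives [9.0, 12.0]
```

With the reduction in the fabric, each GPU's network cost is one send and one receive regardless of cluster size, which is why offloading collectives frees the GPUs to keep generating tokens.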
Astera Labs developed Hypercast, a proprietary multicast operation optimized specifically for MoE inference [1]. According to Ahmad Danesh, AVP of product management at Astera Labs, "One of the limitations of the standard multicast is the number of groups you can actually support, as well as the dynamic nature of needing to change those groups on the fly for mixture-of-experts models" [1]. The switches can perform data processing tasks like data aggregation or distribution themselves, processing collective operations twice as fast with a direct impact on the token economics of AI workloads [2].
While technologies like NVLink Fusion and the emerging UALink protocol are gaining traction, chips need to be designed around them from the ground up [1]. PCIe works with just about anything because it's already used to get data in and out of accelerators [1]. For example, stitching together 32 or more Nvidia RTX Pro 6000 Server cards would require a PCIe switch, since those GPUs don't support NVLink at all [1]. PCIe also makes it easier to mix and match chips for disaggregated inference architectures, like those seen with Nvidia and Groq, AWS and Cerebras, or Intel and SambaNova [1].
As the world's leading large language models evolve into multitrillion-parameter behemoths, it's no longer possible to squeeze them into a single server rack [2]. The only way to run these models is to link hundreds or even thousands of GPUs into enormous clusters so they can work together as a single giant computer [2]. Chief Executive Jitendra Mohan emphasized that "the frontier models driving today's most demanding AI applications require connectivity infrastructure that keeps pace with the accelerators powering them" [2].
Source: The Register
Alongside Scorpio X, Astera Labs is expanding its Scorpio P-series switches with models ranging from 32 to 320 lanes of PCIe connectivity [1]. All switches work with the COSMOS management suite, a hardware monitoring platform designed to help track down and resolve issues across the network fabric [1]. The refreshed Scorpio switches are currently sampling, with production expected to ramp in the second half of 2026 [1].

Summarized by Navi