2 Sources
[1]
Intel and SambaNova team up on heterogeneous AI inference platform -- different hardware performs different workloads
Inference platform can take advantage of Intel Xeon 6 CPUs, SambaNova SN50 RDUs, and Nvidia GPUs

Intel and SambaNova on Wednesday announced their joint production-ready heterogeneous inference architecture, which relies on AI accelerators or GPUs for prefill, SambaNova SN50 reconfigurable dataflow units (RDUs) for decode, and Xeon 6 processors for agentic tools and system orchestration. The platform is designed to address as broad a set of workloads as possible and to siphon some market share away from Nvidia and other emerging players.

The heterogeneous inference platform separates inference into distinct stages handled by different silicon: AI GPUs or accelerators for ingesting long prompts and building key-value caches; SambaNova's SN50 RDU for decoding and generating tokens; and Xeon 6 processors for running agent-related operations (e.g., compiling and executing code and validating outputs) as well as for coordinating and distributing workloads across the hardware.

Splitting the prefill, decode, and token-generation stages is similar to Nvidia's approach with its Rubin platform, which is based on the Rubin CPX and the heavy-duty Rubin GPU with HBM4 memory -- with the obvious difference that the Rubin CPX is not coming to market. More importantly for Intel, the new platform will rely on its Xeon 6 processors -- not on competing offerings.

The solution is scheduled to be available in the second half of 2026 to enterprises, cloud operators, and sovereign AI programs seeking to run scalable inference platforms in general, and coding agents and other agentic workloads in particular, completely in-house.

According to SambaNova's internal data, Xeon 6 achieves over 50% faster LLVM compilation than Arm-based server CPUs and delivers up to 70% higher performance in vector database workloads than competing x86 processors -- namely, AMD EPYC. These gains are intended to shorten end-to-end development cycles for coding agents and similar applications, the two companies claim.

Perhaps the biggest advantage of the joint production-ready heterogeneous inference architecture is that SambaNova SN50 and Xeon-based servers are drop-in compatible with data centers that can handle 30kW -- which is the vast majority of enterprise data centers.

"The data center software ecosystem is built on x86, and it runs on Xeon -- providing a mature, proven foundation that developers, enterprises, and cloud providers rely on at scale," said Kevork Kechichian, Executive Vice President and General Manager of the Data Center Group (DCG) at Intel Corporation. "Workloads of the future will require a heterogeneous mix of computing, and this collaboration with SambaNova delivers a cost-efficient, high-performance inference architecture designed to meet customer needs at scale -- powered by Xeon 6."
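To make the division of labor concrete, below is a minimal, hypothetical sketch of how an orchestrator could route a single agent step across the three stages. Neither company has published an API for the platform, so every name here (KVCache, PrefillGPU, DecodeRDU, XeonOrchestrator) is an illustrative assumption rather than real Intel or SambaNova code.

```python
# Hypothetical sketch of the three-stage split described above. All
# class and method names are illustrative assumptions; neither Intel
# nor SambaNova has published a public API for this platform.

from dataclasses import dataclass, field


@dataclass
class KVCache:
    """Key-value cache built during prefill, reused during decode."""
    prompt_tokens: list
    layers: dict = field(default_factory=dict)  # per-layer K/V tensors


class PrefillGPU:
    """Stands in for the GPU/accelerator that ingests long prompts."""

    def prefill(self, prompt_tokens):
        # Real hardware would run the whole prompt through the model in
        # parallel here and materialize the key-value cache.
        return KVCache(prompt_tokens=prompt_tokens)


class DecodeRDU:
    """Stands in for the SambaNova SN50 RDU generating tokens."""

    EOS = 0  # assumed end-of-sequence token id

    def decode(self, cache, max_new_tokens=256):
        out = []
        for _ in range(max_new_tokens):
            # Autoregressive loop: one token per step, reusing the cache.
            token = self._next_token(cache, out)
            if token == self.EOS:
                break
            out.append(token)
        return out

    def _next_token(self, cache, generated):
        return self.EOS  # placeholder for actual model sampling


class XeonOrchestrator:
    """Stands in for the Xeon 6 host: routing, tool calls, validation."""

    def __init__(self, gpu, rdu):
        self.gpu, self.rdu = gpu, rdu

    def run_agent_step(self, prompt_tokens):
        cache = self.gpu.prefill(prompt_tokens)   # stage 1: prefill on GPU
        tokens = self.rdu.decode(cache)           # stage 2: decode on RDU
        self._run_agent_tools(tokens)             # stage 3: tools on CPU
        return tokens

    def _run_agent_tools(self, tokens):
        # The CPU side would compile/execute generated code and validate
        # outputs before the agent takes its next step.
        pass
```

The shape to notice is that the host CPU is the single point of coordination: each accelerator sees only its own stage, while the Xeon side owns the agent loop, tool execution, and validation.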
[2]
Intel-SambaNova Collaboration Is One Answer to NVIDIA's Groq Partnership, After It Became Clear GPUs Alone Can't Dominate Inference
Inference is the next area of focus for compute providers, and after the NVIDIA-Groq partnership, the AI industry has realized it needs far more than just GPUs. This has led to a new pairing: Intel and SambaNova.

At this year's GTC, NVIDIA talked about disaggregated inference and how important it has become for the company to shift away from its 'GPU-only' mentality and bring relatively newer forms of compute units into the infrastructure race. With the Groq licensing agreement, we saw SRAM-based LPUs debut in Rubin's LPX racks, and now Intel and SambaNova have decided to experiment with something similar, unveiling a new "inference architecture" featuring SambaNova's RDUs alongside Intel's Xeon 6 CPUs.

SambaNova today announced the next phase of its collaboration with Intel: a heterogeneous hardware solution that combines GPUs for prefill, Intel® Xeon® 6 processors as both host and "action" CPUs, and SambaNova RDUs for decode to deliver premium inference for the most demanding Agentic AI applications. - SambaNova

This arrangement targets RDUs at decode workloads, with GPUs handling prefill and Xeon 6 CPUs handling tasks such as orchestration and general-purpose work. The Intel-SambaNova partnership doesn't lock in a specific hyperscaler for the GPU option, meaning ASICs could be integrated in this configuration as well, though SambaNova didn't go into much detail about GPU-specific performance. SambaNova will integrate its SN50 units, which we'll discuss in a bit, and the firm says it found Xeon 6 CPUs to be ideal for "end-to-end coding agent workflows" compared to Arm options.

Let's talk about the SN50 chip. The solution, revealed in early 2026, features the company's fifth-generation RDUs with a combination of DRAM, SRAM, and HBM onboard. The SN50 packs 2TB of DDR5 memory along with 64 GB of HBM3 and 520 MB of SRAM, and, as you may have guessed by now, the point of such a memory architecture is to provide minimal latency, high throughput, and sheer capacity. The SN50 is probably the only accelerator to feature such a memory layout, and according to the manufacturer, the DRAM + SRAM + HBM combo enables 'agentic caching'.

On a more general level, the primary difference between Intel's approach with SambaNova and NVIDIA's is that the former is the 'safer' bet, given that it doesn't require a hefty underlying infrastructure for disaggregated inference. For hyperscalers looking for a more modular rack-scale offering that targets the "prefill + decode" breakdown, the Intel-SambaNova option is a decent one. We were expecting Intel to go much deeper with RDU integration, but for now it seems limited to the Xeon CPU as the host option.

Intel CEO Lip-Bu Tan participated in SambaNova's latest funding round and is also an early investor in the company. There were reportedly plans to acquire SambaNova as well, but they were halted after a board disagreement, which is why Intel has settled on being a funding participant.
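SambaNova has not said how 'agentic caching' actually places or evicts data across the three memory pools, but one plausible reading of the SRAM/HBM/DDR5 layout is a tiered cache with LRU spill-down, sketched below. The capacities come from the figures above; the policy itself is our assumption, for illustration only.

```python
# Illustrative three-tier cache modeled on the SN50's reported
# 520 MB SRAM / 64 GB HBM3 / 2 TB DDR5 layout. SambaNova has not
# published its placement or eviction policy; the LRU spill-down
# scheme below is an assumption made purely for illustration.

from collections import OrderedDict

TIERS = [  # (name, capacity in bytes), fastest tier first
    ("sram", 520 * 1024**2),
    ("hbm3", 64 * 1024**3),
    ("ddr5", 2 * 1024**4),
]


class TieredCache:
    def __init__(self):
        self.tiers = {name: OrderedDict() for name, _ in TIERS}
        self.used = {name: 0 for name, _ in TIERS}

    def get(self, key):
        # Search the fastest tier first; on a hit, promote the entry
        # back to SRAM so hot agent state stays close to the compute.
        for name, _ in TIERS:
            if key in self.tiers[name]:
                size, value = self.tiers[name].pop(key)
                self.used[name] -= size
                self.put(key, value, size)
                return value
        return None

    def put(self, key, value, size):
        self._insert(0, key, value, size)

    def _insert(self, level, key, value, size):
        if level >= len(TIERS):
            return  # fell off the last tier: evicted entirely
        name, cap = TIERS[level]
        if size > cap:
            self._insert(level + 1, key, value, size)  # skip small tiers
            return
        tier = self.tiers[name]
        # Spill least-recently-used entries down to make room.
        while self.used[name] + size > cap:
            old_key, (old_size, old_value) = tier.popitem(last=False)
            self.used[name] -= old_size
            self._insert(level + 1, old_key, old_value, old_size)
        tier[key] = (size, value)
        self.used[name] += size
```

In a scheme like this, a hot session's KV blocks keep getting promoted back into SRAM on every access, while idle sessions gradually sink into HBM3 and then DDR5 -- which matches the low-latency-plus-capacity pitch SambaNova is making.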
Intel and SambaNova announced a production-ready heterogeneous AI inference architecture that distributes workloads across different chips. The platform uses GPUs for prefill, SambaNova SN50 RDUs for decode, and Intel Xeon 6 CPUs for orchestration, targeting enterprises and cloud operators seeking alternatives to GPU-centric solutions.
The Intel-SambaNova collaboration represents a strategic response to the shifting dynamics in AI infrastructure, where the industry has recognized that GPUs alone cannot dominate AI inference workloads. Intel and SambaNova unveiled a production-ready heterogeneous AI inference architecture that separates inference tasks across specialized silicon, challenging the GPU-centric solutions that have defined the market [1]. This announcement follows Nvidia's partnership with Groq, signaling a broader industry shift toward disaggregated inference approaches that leverage heterogeneous hardware to optimize performance and cost [2].
Source: Wccftech
The heterogeneous AI inference architecture distributes workloads across three distinct hardware types, each optimized for specific tasks. AI GPUs or accelerators handle prefill operations, ingesting long prompts and building key-value caches. SambaNova SN50 RDUs manage decode workloads and token generation, while Intel Xeon 6 CPUs run orchestration, coordinate workload distribution, and execute agentic tools such as compiling and validating code [1]. This division of labor mirrors Nvidia's approach with its Rubin platform, though Intel's solution centers on Xeon 6 processors rather than competing offerings, giving the company a critical foothold in the inference market [1].

The SN50 RDU features a distinctive memory architecture combining 2TB of DDR5 memory, 64GB HBM3, and 520MB SRAM, creating what SambaNova calls "agentic caching" [2]. This combination aims to deliver minimal latency, high throughput, and substantial memory capacity for agentic AI applications. The SN50 is reportedly the only accelerator to feature such a memory layout, positioning it as a specialized solution for decode tasks that require rapid token generation [2]. The platform's flexibility extends to GPU selection, as it doesn't lock users into a specific hyperscaler option, meaning ASICs could also integrate into this configuration [2].
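To put those three capacities in perspective, a rough back-of-the-envelope calculation is instructive; note that the ~4 GiB per-session KV-cache figure used below is our own assumption for a long-context coding agent, not a number from either company.

```python
# Back-of-the-envelope capacity check for the SN50's reported memory
# tiers. The ~4 GiB per-session KV-cache size is an assumed round
# number for a long-context coding agent, not a vendor figure.

tiers_gib = {"sram": 520 / 1024, "hbm3": 64.0, "ddr5": 2048.0}
kv_per_session_gib = 4.0  # assumption, not from Intel or SambaNova

for name, cap in tiers_gib.items():
    print(f"{name}: {cap:7.1f} GiB -> ~{cap / kv_per_session_gib:6.1f} sessions")

# Under this assumption, SRAM holds only a slice of one session's hot
# state, HBM3 holds roughly 16 full sessions, and DDR5 roughly 512;
# one way to read the "agentic caching" pitch is that idle sessions
# park in DDR5 while the active working set lives in SRAM/HBM.
```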
According to SambaNova's internal data, Xeon 6 achieves over 50% faster LLVM compilation than Arm-based server processors and delivers up to 70% higher performance in vector database workloads relative to AMD EPYC [1]. These gains are designed to shorten end-to-end development cycles for coding agents and similar AI inference workloads. A significant advantage lies in data center compatibility: SambaNova SN50 and Xeon-based servers are drop-in compatible with data centers that can handle 30kW, which encompasses the vast majority of enterprise data centers [1]. This positions the platform as an alternative to Nvidia for organizations seeking to avoid costly infrastructure overhauls.
Source: Tom's Hardware
Kevork Kechichian, Executive Vice President and General Manager of Intel's Data Center Group, emphasized the x86 ecosystem's maturity: "The data center software ecosystem is built on x86, and it runs on Xeon -- providing a mature, proven foundation that developers, enterprises, and cloud providers rely on at scale" [1]. The solution targets enterprises, cloud operators, and sovereign AI programs seeking scalable inference platforms, particularly for coding agents and other agentic workloads, with availability scheduled for the second half of 2026 [1]. Intel's CEO participated in SambaNova's latest funding round, and there were reportedly plans to acquire the company that were halted after a board disagreement, leading Intel to settle on being a funding participant [2]. This partnership represents Intel's bet on a modular, heterogeneous approach that could reshape how organizations deploy AI infrastructure beyond traditional GPU-centric solutions.

Summarized by Navi