BlueField-4 STX: Nvidia's AI Storage for Agentic AI

Nvidia Targets Storage Bottleneck Limiting Agentic AI Performance

Nvidia announced BlueField-4 STX at GTC 2026 on March 16, introducing a modular AI storage reference architecture designed to address the data access bottleneck that limits agentic AI inference. Built around a new storage-optimized BlueField-4 Data Processing Unit (DPU) and the ConnectX-9 SuperNIC, the platform targets the GPU underutilization that occurs when AI agents operating across extended sessions and expanding context windows exceed the throughput of conventional storage paths [1]. Nvidia claims BlueField-4 STX delivers up to 5x the token throughput, 4x better energy efficiency, and 2x the data ingestion speed compared with traditional CPU-based storage architectures [2].

Context Memory Layer Solves Key-Value Cache Management Crisis

The specific challenge that BlueField-4 STX addresses is key-value (KV) cache management during transformer inference [1]. Large language models (LLMs) compute KV pairs for every token in context, and those pairs must be stored and retrieved for each subsequent generation step. As context windows grow into the hundreds of thousands of tokens, the KV cache outgrows GPU HBM capacity, forcing offloads to host DRAM or NVMe storage. Those offloads pass through the CPU, adding latency that compounds with context length and stalls GPU execution [1].
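To make the capacity pressure concrete, the standard KV cache sizing formula (two tensors, keys and values, per layer per token) can be worked through in a few lines. This is a rough sketch: the `ModelShape` values below are hypothetical and are not taken from Nvidia or from any specific model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelShape:
    """Hypothetical transformer dimensions, chosen only for illustration."""
    num_layers: int
    num_kv_heads: int
    head_dim: int
    bytes_per_value: int  # 2 for FP16/BF16


def kv_cache_bytes(shape: ModelShape, context_tokens: int, batch_size: int = 1) -> int:
    """Bytes needed to hold keys and values for every token in context."""
    # Factor of 2: one key tensor and one value tensor per layer.
    per_token = (2 * shape.num_layers * shape.num_kv_heads
                 * shape.head_dim * shape.bytes_per_value)
    return per_token * context_tokens * batch_size


if __name__ == "__main__":
    shape = ModelShape(num_layers=80, num_kv_heads=8, head_dim=128, bytes_per_value=2)
    for tokens in (8_000, 128_000, 500_000):
        gib = kv_cache_bytes(shape, tokens) / 2**30
        print(f"{tokens:>7} tokens -> {gib:6.1f} GiB per sequence")
```

Under these assumed dimensions each token costs 320 KiB, so a single 128,000-token sequence needs roughly 39 GiB of KV cache before any batching, which illustrates how long contexts can exceed the HBM left over after model weights.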

The context memory layer that STX introduces between GPUs and traditional storage is designed to maintain coherent working memory across sessions, tool calls, and reasoning steps without forcing round trips through general-purpose storage [2]. "Traditional data centers provide high-capacity, general-purpose storage, but generally lack the responsiveness required for interaction with AI agents that need to work across many steps, tools and different sessions," Ian Buck, Nvidia's vice president of hyperscale and high-performance computing, said in a briefing [2].
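The idea of a middle tier sitting between GPU memory and general-purpose storage can be illustrated with a toy tiered store. This is only a conceptual sketch, not Nvidia's design: the class names, capacities, latency figures, and the LRU demotion policy are all assumptions made for illustration.

```python
from collections import OrderedDict
from typing import Optional


class Tier:
    """One storage tier with LRU eviction; name and latency are illustrative only."""

    def __init__(self, name: str, capacity: int, latency_us: float):
        self.name = name
        self.capacity = capacity
        self.latency_us = latency_us
        self.blocks: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        if key in self.blocks:
            self.blocks.move_to_end(key)  # mark as most recently used
            return self.blocks[key]
        return None

    def put(self, key: str, value: bytes) -> Optional[tuple]:
        """Insert a block and return the evicted (key, value) pair, if any."""
        self.blocks[key] = value
        self.blocks.move_to_end(key)
        if len(self.blocks) > self.capacity:
            return self.blocks.popitem(last=False)  # least recently used
        return None


class TieredKVStore:
    """Hypothetical hierarchy: tiers are ordered fastest first."""

    def __init__(self, tiers):
        self.tiers = tiers

    def put(self, key: str, value: bytes) -> None:
        """Write to the fastest tier, demoting evicted blocks one tier down."""
        item = (key, value)
        for tier in self.tiers:
            item = tier.put(*item)
            if item is None:
                return
        # A block evicted from the last tier is dropped and must be recomputed.

    def get(self, key: str):
        """Return (value, tier_name, cumulative_latency_us), or None on a miss."""
        elapsed = 0.0
        for tier in self.tiers:
            elapsed += tier.latency_us
            value = tier.get(key)
            if value is not None:
                return value, tier.name, elapsed
        return None


if __name__ == "__main__":
    store = TieredKVStore([
        Tier("gpu_hbm", capacity=2, latency_us=1.0),             # assumed figures
        Tier("context_memory", capacity=4, latency_us=50.0),     # assumed figures
        Tier("general_storage", capacity=100, latency_us=1000.0),
    ])
    for i in range(5):
        store.put(f"session-{i}", b"kv-blocks")
    print(store.get("session-4"))  # still resident in the fastest tier
    print(store.get("session-0"))  # demoted to the middle tier
```

In this toy model, older sessions fall out of the fastest tier but remain reachable from the middle tier without touching the slowest one, which is the behaviour the article attributes to a context memory layer.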

AI-Native Infrastructure Built on Extreme Co-Design

BlueField-4 STX bypasses the host CPU by routing data through a dedicated accelerated storage layer via RDMA over Spectrum-X Ethernet [1]. The BlueField-4 processor manages NVMe SSDs directly and handles data integrity and encryption for the KV cache, keeping context accessible at the storage processor rather than transiting the host [1]. The full stack runs on the Vera Rubin platform and integrates the Vera CPU alongside ConnectX-9, Spectrum-X Ethernet, DOCA software, and Nvidia AI Enterprise software [1].

The architecture employs what Nvidia founder and CEO Jensen Huang calls extreme co-design, a multidisciplinary engineering approach that treats the entire data center as a single integrated unit to eliminate traditional silos between networking and storage [3]. "AI systems that reason across massive context and continuously learn require a new class of storage," Huang said at GTC [1].
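A back-of-envelope model shows why shortening the KV fetch path matters for GPU utilization. The sketch below assumes, for simplicity, that fetch stalls do not overlap with compute. The millisecond figures are invented for illustration and are not measurements of BlueField-4 STX or of any CPU-based path.

```python
def gpu_utilization(compute_ms: float, stall_ms: float) -> float:
    """Fraction of each decode step the GPU spends computing rather than waiting."""
    if compute_ms <= 0 or stall_ms < 0:
        raise ValueError("compute_ms must be positive and stall_ms non-negative")
    return compute_ms / (compute_ms + stall_ms)


def tokens_per_second(compute_ms: float, stall_ms: float) -> float:
    """Decode rate for one sequence when every step pays compute plus stall time."""
    if compute_ms <= 0 or stall_ms < 0:
        raise ValueError("compute_ms must be positive and stall_ms non-negative")
    return 1000.0 / (compute_ms + stall_ms)


if __name__ == "__main__":
    compute_ms = 20.0  # assumed GPU work per generated token
    # Assumed per-step stalls waiting on offloaded KV cache for two hypothetical paths.
    for label, stall_ms in (("slow fetch path", 30.0), ("fast fetch path", 5.0)):
        util = gpu_utilization(compute_ms, stall_ms)
        rate = tokens_per_second(compute_ms, stall_ms)
        print(f"{label}: {util:.0%} utilization, {rate:.0f} tokens/s")
```

With these made-up numbers, cutting the stall from 30 ms to 5 ms doubles both utilization and token rate without changing the GPU at all, which is the general effect a faster storage path is meant to achieve.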

CMX Platform and Partner Ecosystem Signal Industry Shift

The first rack-scale implementation built on BlueField-4 STX is the Nvidia CMX context memory storage platform, which extends GPU memory with a high-performance context layer designed specifically for storing and retrieving the KV cache data that LLMs generate during inference [2]. Storage vendors co-designing systems based on STX include DDN, Dell Technologies, HPE, IBM, NetApp, and VAST Data, alongside manufacturing partners AIC, Supermicro, and Quanta Cloud Technology [1].

Eight cloud providers, including CoreWeave, Lambda, Mistral AI, and Oracle Cloud Infrastructure, committed to early adoption for context memory storage [1]. Buck confirmed that STX ships with a software reference platform alongside the hardware architecture, with Nvidia expanding DOCA to include a new component called DOCA Memo [2]. STX-based platforms are expected from partners in the second half of 2026 [1].

The combination of enterprise storage incumbents and AI-native cloud providers signals that Nvidia is positioning STX as the reference standard for anyone building storage infrastructure for agentic AI workloads, a category that within the next two to three years is likely to include most enterprise AI deployments running multi-step inference at scale [2]. For the long-context reasoning tasks that define agentic AI, improved throughput means the difference between scaling AI factories and watching expensive GPUs sit idle waiting for data [3].

References

[1] Nvidia launches BlueField-4 STX storage architecture for agentic AI at GTC 2026

[2] Nvidia BlueField-4 STX adds a context memory layer to storage to close the agentic AI throughput gap

[3] The convergence of context: Why Nvidia's BlueField-4 STX marries the network and storage admin - SiliconANGLE

[4] Nvidia introduces BlueField-4 STX reference architecture for AI storage systems - SiliconANGLE
