2 Sources
[1]
Micron Ships Out the "World's First" 256GB SOCAMM2 Modules Targeted Toward the Agentic AI Frenzy
Micron's latest breakthrough in the memory industry is the debut of its more capable SOCAMM2 memory modules, featuring leading capacity and power efficiency. As the 'applications' layer of AI scales up, the memory bottleneck keeps growing, which is why DRAM manufacturers have paid special attention to advancements in HBM and other AI-specific memory products. In its latest announcement, Micron has set a "new benchmark" with SOCAMM2 memory modules, ramping per-module capacity up to 256 GB, a massive leap from the previous 192 GB threshold. According to the company, SOCAMM2 will be integrated into modern AI infrastructure equipment and help address memory constraints.

"Micron's achievements in delivering massive memory capacity and bandwidth using less power than traditional server memory with 256GB SOCAMM2 is enabling the next generation of AI CPUs." - Ian Finder, Head of Product, Data Center CPUs at NVIDIA

With the latest SOCAMM2 iteration, Micron has increased the capacity of a single monolithic LPDRAM die to 32 Gb. The 256 GB model provides 2 TB of LPDRAM per 8-channel CPU, enabling AI servers to process long context windows with ease. Micron also says that the new SOCAMM2 solution improves TTFT (time to first token) by 2.3 times for long-context inference, significantly helping agentic workloads, where standalone CPU applications are a key focus.

SOCAMM2 was developed in cooperation with NVIDIA, and in a previous post we discussed how Vera Rubin will be one of the first AI infrastructure offerings to utilize the new memory standard. In the AI world, memory has become a significant asset for workloads where latency and context are important, but at the same time, SOCAMM2 is one of the products likely to take up a decent portion of the DRAM supply, eating into allocations for general-purpose products like GDDR7. Micron says that 256GB SOCAMM2 samples have already been shipped to customers and that the solution will be showcased at this year's GTC 2026.
[2]
Micron Technology, Inc. Sets New Benchmark with the World's First High-Capacity 256GB LPDRAM SOCAMM2 for Data Center Infrastructure
Micron Technology, Inc. extended its leadership in low-power server memory by shipping customer samples of the industry's highest-capacity LPDRAM module, 256GB SOCAMM2. Enabled by the industry's first monolithic 32Gb LPDDR5X design, this milestone represents a transformational step forward for AI data centers, delivering low-power memory capacity that can unlock new system architectures.

The convergence of AI training, inference, agentic AI and general-purpose compute is driving more demanding memory requirements and reshaping data center system architectures. Modern AI workloads drive large model parameters, expansive context windows and persistent key value (KV) caches, while core compute continues to scale in data intensity, concurrency and memory footprint. Across these workloads, memory capacity, bandwidth efficiency, latency and power efficiency have become primary system-level constraints, directly influencing performance, scalability and total cost of ownership. LPDRAM's unique combination of these attributes positions it as a cornerstone solution for both AI and core compute servers in increasingly power- and thermally constrained data center environments.

Micron is collaborating with NVIDIA to co-design sophisticated memory for the needs of advanced AI infrastructure. Micron's 256GB SOCAMM2 delivers higher memory capacity, substantially lower power consumption and faster performance for a variety of AI and general-purpose computing workloads. With one-third more capacity than the prior highest-capacity 192GB SOCAMM2, 256GB SOCAMM2 provides 2TB of LPDRAM per 8-channel CPU for larger context windows and complex inference workloads. SOCAMM2 consumes one-third of the power of equivalent RDIMMs while using only one-third of the footprint, improving rack density and reducing total cost of ownership. In unified memory architectures, 256GB SOCAMM2 improves time to first token by more than 2.3 times for long-context, real-time LLM inference when used for KV cache offload, compared to currently available solutions. In standalone CPU applications, LPDRAM delivers more than 3 times better performance per watt than mainstream memory modules for high-performance computing workloads.

The modular SOCAMM2 design improves serviceability, supports liquid-cooled server architectures and enables future capacity expansion as AI and core compute memory requirements continue to grow. Micron continues to play a leading role in the JEDEC SOCAMM2 specification definition and maintains deep technical collaborations with system designers to drive industry-wide improvements in power efficiency and performance for next-generation data center platforms. Micron is now shipping customer samples of its 256GB SOCAMM2 and offers the industry's broadest data center LPDRAM portfolio, spanning 8GB to 64GB components and 48GB to 256GB SOCAMM2 modules.

Notes: One-third power consumption is calculated based on watts used by one 128GB, 128-bit bus width SOCAMM2 module compared to two 64GB, 64-bit bus width DDR5 RDIMMs. The one-third footprint calculation compares SOCAMM2 area (14x90mm) versus a standard server RDIMM. TTFT results are based on Micron internal testing of real-time inference with the Llama3 70B model (FP16 quantization) at 500K context length and 16 concurrent users; the projected TTFT latency improvement is based on a latency of 0.12s for 2TB of LPDRAM per CPU vs. 0.28s for 1.5TB of LPDRAM per CPU. Performance-per-watt results are from Micron internal testing measuring Pot3D solar physics HPC code performance on identical capacities of LPDDR5X and DDR5.
Micron has begun shipping customer samples of the industry's first 256GB SOCAMM2 modules, enabled by a monolithic 32Gb LPDDR5X design. The high-capacity memory modules deliver 2TB of LPDRAM per 8-channel CPU while consuming one-third the power of traditional server memory, addressing critical constraints in AI data centers as agentic AI workloads and large language model inference demands continue to scale.
Micron has begun shipping customer samples of the world's first 256GB SOCAMM2 modules, marking a significant leap in memory capacity for artificial intelligence (AI) infrastructure [1]. The breakthrough is enabled by the industry's first monolithic 32Gb LPDDR5X design, representing a 33% capacity increase over the previous 192GB threshold [2]. This development addresses growing memory constraints in AI data centers as workloads become increasingly complex and data-intensive.
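The figures above mix units: die capacity is quoted in gigabits (Gb), module capacity in gigabytes (GB). A minimal Python sketch reconciles them; note that the implied die count is our own inference from the quoted numbers, since neither source describes the actual stacking arrangement.

# Reconciling the die and module capacities quoted in [1] and [2].
# ASSUMPTION: die count = module capacity / die capacity; the sources
# do not say how the dies are physically stacked or placed.

DIE_GBIT = 32          # monolithic LPDDR5X die capacity, gigabits [2]
MODULE_GB = 256        # SOCAMM2 module capacity, gigabytes [2]
PREVIOUS_GB = 192      # prior highest-capacity module [1]

die_gb = DIE_GBIT / 8                 # 32 Gb = 4 GB per die
dies_per_module = MODULE_GB / die_gb  # 64 dies (inferred, not stated)
capacity_gain = MODULE_GB / PREVIOUS_GB

print(f"{die_gb:.0f} GB per die, ~{dies_per_module:.0f} dies per module")
print(f"capacity gain over 192GB: {capacity_gain:.2f}x (about one-third more)")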
The new high-capacity memory modules were developed in collaboration with NVIDIA, which will integrate them into next-generation AI CPUs. According to Ian Finder, Head of Product for Data Center CPUs at NVIDIA, "Micron's achievements in delivering massive memory capacity and bandwidth using less power than traditional server memory with 256GB SOCAMM2 is enabling the next generation of AI CPUs" [1]. The solution will be showcased at GTC 2026, signaling its importance to future AI infrastructure deployments.

The 256GB SOCAMM2 modules deliver 2TB of LPDRAM per 8-channel CPU, enabling AI servers to process larger context windows and complex inference workloads [2]. This expanded memory capacity directly addresses the convergence of AI training, inference, and agentic AI workloads, which drive increasingly demanding requirements for model parameters, expansive context windows, and persistent key-value (KV) caches.

Power efficiency stands out as a critical advantage. The modules consume one-third of the power of equivalent RDIMMs while using only one-third of the footprint, improving rack density and reducing total cost of ownership [2]. This low-power memory solution becomes increasingly vital as data centers face growing power and thermal constraints; the sketch below works through the capacity and footprint arithmetic behind these claims.
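In the following Python sketch, the SOCAMM2 dimensions (14x90mm) come from [2]; the DDR5 RDIMM dimensions are a nominal assumption (the standard DIMM form factor is 133.35mm long, and typical server RDIMMs are roughly 31mm tall), so the area ratio is approximate.

# Per-CPU capacity and module footprint, from the figures in [2].
# ASSUMPTION: nominal RDIMM dimensions; only the SOCAMM2 size (14x90mm)
# is stated in the source.

CHANNELS_PER_CPU = 8
MODULE_GB = 256

SOCAMM2_MM = (14, 90)        # stated in [2]
RDIMM_MM = (133.35, 31.25)   # assumed nominal DDR5 RDIMM footprint

per_cpu_tb = CHANNELS_PER_CPU * MODULE_GB / 1024        # 2.0 TB
area_ratio = (SOCAMM2_MM[0] * SOCAMM2_MM[1]) / (RDIMM_MM[0] * RDIMM_MM[1])

print(f"per-CPU capacity: {per_cpu_tb:.1f} TB")
print(f"SOCAMM2 / RDIMM area: {area_ratio:.2f}")        # ~0.30, one-third

The roughly 0.30 area ratio lines up with the one-third footprint claim; the one-third power figure is a measured comparison in [2] (one 128GB SOCAMM2 versus two 64GB DDR5 RDIMMs), so it cannot be reproduced from geometry alone.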
For LLM inference applications, the performance improvements are substantial. In unified memory architectures, 256GB SOCAMM2 improves time to first token (TTFT) by more than 2.3 times for long-context, real-time inference when used for KV cache offload, compared to currently available solutions [2]. Micron's internal testing using the Llama3 70B model with FP16 quantization, 500K context length, and 16 concurrent users showed TTFT latency dropping from 0.28 seconds with 1.5TB of LPDRAM per CPU to just 0.12 seconds with 2TB.
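Two quick calculations put those test numbers in context: the speedup follows directly from the quoted latencies, and a rough KV-cache estimate shows why the test needs terabytes of memory at all. The Llama 3 70B architecture figures used below (80 layers, 8 KV heads under grouped-query attention, head dimension 128) come from the publicly released model, not from Micron's announcement, and production serving stacks often quantize or page the cache, so treat the totals as order-of-magnitude estimates.

# 1) TTFT speedup implied by the latencies quoted in [2].
speedup = 0.28 / 0.12   # ~2.33x, consistent with "more than 2.3 times"

# 2) Rough FP16 KV-cache footprint for the stated test scenario.
# ASSUMPTION: published Llama 3 70B architecture, not stated in [2].
LAYERS, KV_HEADS, HEAD_DIM = 80, 8, 128
BYTES_FP16 = 2
CONTEXT_TOKENS = 500_000
USERS = 16

bytes_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_FP16  # K and V
per_user_gb = bytes_per_token * CONTEXT_TOKENS / 1e9             # ~164 GB
total_tb = per_user_gb * USERS / 1e3                             # ~2.6 TB

print(f"TTFT speedup: {speedup:.2f}x")
print(f"KV cache: ~{per_user_gb:.0f} GB/user, ~{total_tb:.1f} TB total")

An unquantized cache of that size would exceed even 2TB per CPU, which is consistent with the cache being offloaded and paged rather than held fully resident - exactly the scenario where extra CPU-attached memory pays off.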
These performance gains prove particularly valuable for agentic AI workloads, where standalone CPU applications require rapid response times and efficient memory utilization [1]. In high-performance computing workloads, LPDRAM delivers more than 3 times better performance per watt than mainstream server memory modules, demonstrating its versatility beyond AI-specific applications [2].
Micron continues to play a leading role in the JEDEC SOCAMM2 specification definition, working with system designers to drive industry-wide improvements in power efficiency and performance for next-generation data center platforms [2]. The company now offers the industry's broadest data center LPDRAM portfolio, spanning 8GB to 64GB components and 48GB to 256GB SOCAMM2 modules.

The modular SOCAMM2 design improves serviceability, supports liquid-cooled server architectures, and enables future capacity expansion as AI and core compute memory requirements continue to grow [2]. However, the solution's success may impact DRAM supply allocation, potentially affecting general-purpose products like GDDR7 as manufacturers balance production between AI-specific memory solutions like SOCAMM2 and HBM versus traditional memory products [1].
As AI workloads continue scaling in complexity and data intensity, memory capacity, bandwidth, latency, and power efficiency have become primary system-level constraints directly influencing performance, scalability, and total cost of ownership [2]. LPDRAM's unique combination of these attributes positions it as a cornerstone solution for both AI and core compute servers in increasingly constrained data center environments.

Summarized by Navi