Meta repurposes DDR4 memory in DDR5 servers with custom chip to cut AI infrastructure costs

Meta revealed it's reusing DDR4 memory from retired servers in new DDR5-only systems using a custom CXL ASIC called Vistara. The approach cuts AI inference server count by up to 25% and reduces job-restart overhead by 33%, helping the company navigate memory supply shortages while lowering infrastructure costs.

Meta Tackles Memory Supply Shortages with Innovative Hardware Solution

Meta is using DDR4 in DDR5-only servers to address memory supply shortages and reduce hardware costs, the company revealed at the International Symposium on Computer Architecture (ISCA) this week [1][2]. The tech giant has developed a custom CXL ASIC named Vistara to bridge the compatibility gap between legacy DDR4 memory and modern DDR5 servers, showing that even companies with substantial budgets are finding creative workarounds to supply chain constraints. This marks a shift in how hyperscalers approach AI infrastructure design, prioritizing resource optimization over pure performance.

Source: PC Magazine

MemServer Configuration Combines Old and New Memory Technologies

Meta's new MemServer design pairs 768GB of DDR5-6400 with 256GB of DDR4-2400 memory, creating a combined 1TB memory pool in servers powered by AMD EPYC Turin CPUs featuring 158 cores and 316 threads [2]. The Vistara chip enables this hybrid configuration by connecting DDR4 memory to host processors via a CXL 2.0/1.1-compliant PCIe Gen5 x16 interface. Each Vistara chip integrates two independent 72-bit DDR4 memory channels, supporting speeds up to 3,200 MT/s and capacities up to 256GB per chip using 64GB DIMMs. Driven by custom RISC-V processors, the ASIC presents the DDR4 memory as distinct, CPU-less NUMA nodes, separate from the local DDR5 DRAM nodes attached directly to the processor [2].
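On Linux, a CPU-less NUMA node of the kind described above typically shows up in sysfs as a node directory whose `cpulist` file is empty. The following is a minimal sketch of how one might list such nodes. It assumes the standard `/sys/devices/system/node` layout; the helper names `parse_cpulist` and `cpuless_nodes` are hypothetical, and nothing here is taken from Meta's tooling.

```python
from pathlib import Path


def parse_cpulist(text: str) -> list[int]:
    """Expand a sysfs cpulist string such as '0-3,8' into CPU ids."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty cpulist means the node has no CPUs
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def cpuless_nodes(root: str = "/sys/devices/system/node") -> list[int]:
    """Return the ids of NUMA nodes whose cpulist is empty.

    `root` is a parameter only so the function can be pointed at a
    test directory; the default is the usual Linux sysfs location.
    """
    result = []
    for node_dir in Path(root).glob("node[0-9]*"):
        cpulist = node_dir / "cpulist"
        if cpulist.is_file() and not parse_cpulist(cpulist.read_text()):
            result.append(int(node_dir.name[len("node"):]))
    return sorted(result)
```

Calling `cpuless_nodes()` on a machine with CXL-attached memory would be expected to return the node ids backing that memory, while a conventional server would return an empty list.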

Performance Trade-offs Justified by AI Inference Workloads

The performance gap between the two memory types is substantial. The 768GB of DDR5 delivers a local peak bandwidth of 614GB/s, while 256GB of DDR4 can deliver only 76GB/s, with idle latency nearly double that of DDR5 [1]. However, Meta's analysis of memory access patterns across its server workloads revealed that large portions of memory remain idle for extended periods. "A small fraction of memory is accessed at any given moment, the rest is cold," Meta explained, noting that using a slower CXL-memory tier will minimally impact overall application performance [1]. This approach allows the system to keep frequently accessed data in faster DDR5 memory while relegating cold data to the slower DDR4 pool, optimizing resources for AI inference workloads.
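As a rough check on the bandwidth figures and the hot/cold split, here is a minimal sketch. The channel counts are assumptions, not from the article: 12 DDR5 channels and 4 DDR4 channels (for example, two Vistara chips with two channels each) happen to reproduce the quoted numbers. `place_pages` is a hypothetical toy policy that ranks pages by access count; it is not Meta's placement mechanism.

```python
def peak_bandwidth_gbs(mt_per_s: int, channels: int, bytes_per_transfer: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s.

    A 72-bit ECC channel carries 64 data bits (8 bytes) per transfer,
    so MT/s x 8 bytes gives MB/s per channel.
    """
    return mt_per_s * bytes_per_transfer * channels / 1000


def place_pages(access_counts: dict[str, int], fast_slots: int) -> tuple[set[str], set[str]]:
    """Put the most-accessed pages in the fast tier, the rest in the slow tier."""
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    return set(ranked[:fast_slots]), set(ranked[fast_slots:])


# Channel counts below are assumptions chosen to match the article's figures.
ddr5_gbs = peak_bandwidth_gbs(6400, channels=12)  # 614.4
ddr4_gbs = peak_bandwidth_gbs(2400, channels=4)   # 76.8

# Hypothetical access counts: two hot pages, two cold ones.
counts = {"a": 900, "b": 850, "c": 3, "d": 1}
hot, cold = place_pages(counts, fast_slots=2)     # hot = {"a", "b"}
```

Under these assumptions the slow tier has one-eighth of the fast tier's peak bandwidth, which is tolerable only if the pages placed there are rarely touched.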

Measurable Cost Savings and Infrastructure Benefits

The design reportedly cuts AI inference server count by up to 25% and reduces job-restart and fragmentation overhead by 33% [2]. Beyond immediate cost savings, the added capacity helps prevent servers from running out of memory, improves retention rates of Meta hardware, and reduces wear on SSDs and DDR5 memory [1]. Meta also modified the Linux CXL driver to enable older DIMMs to work on platforms that don't officially support them, with all kernel code either already upstream or on track for integration [2]. As Meta looks to sell compute access to recoup AI costs, these hardware design advantages may become critical to its ongoing AI efforts.
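One way to see how a figure of up to 25% could arise is to assume a fleet whose size is set purely by memory capacity. The article does not say this is how Meta derived the number, so treat the sketch below as an illustration under that assumption. The demand figure and the `servers_needed` helper are hypothetical.

```python
import math


def servers_needed(total_memory_gb: float, per_server_gb: float) -> int:
    """Servers required if memory capacity is the only constraint."""
    return math.ceil(total_memory_gb / per_server_gb)


demand_gb = 768_000  # hypothetical fleet-wide memory demand

before = servers_needed(demand_gb, 768)    # DDR5 only
after = servers_needed(demand_gb, 1024)    # DDR5 plus 256GB of DDR4
reduction = 1 - after / before             # fraction of servers saved
```

With these inputs the fleet shrinks from 1,000 to 750 servers, a 25% reduction. Workloads limited by compute or bandwidth rather than capacity would see less benefit.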

Source: TechSpot

Industry Trend Emerges as Other Companies Follow Suit

Meta is not alone in exploring hybrid memory solutions. South Korean semiconductor firm Panmnesia presented similar CXL 2.0 research at ISCA 2026 on June 29, announcing a custom CXL controller and fabric switch with Port-Based Routing [2]. Panmnesia is currently sampling its PCIe 6.4/CXL 3.2 Fusion Switch with select customers and developing a PCIe 7.0/CXL 4.0 Combo IP controller with support for the latest CXL 4.0 features. This signals a broader industry shift toward creative memory management solutions as hyperscalers navigate ongoing supply constraints while scaling AI infrastructure. The question now is whether consumer-grade motherboard manufacturers will adopt similar technology to give end users access to hybrid memory configurations.

© 2026 TheOutpost.AI All rights reserved