2 Sources
[1]
SMI CEO claims Nvidia wants SSDs with 100M IOPS -- up to 33X performance uplift could eliminate AI GPU bottlenecks
Now that the AI industry has exceptionally high-performance GPUs with high-bandwidth memory (HBM), one of the bottlenecks that AI training and inference systems face is storage performance. To that end, Nvidia is working with partners to build SSDs that can hit random read performance of 100 million input/output operations per second (IOPS) in small-block workloads, according to Wallace C. Kuo, who spoke with Tom's Hardware in an exclusive interview. "Right now, they are aiming for 100 million IOPS -- which is huge," Kuo told Tom's Hardware. Modern AI accelerators, such as Nvidia's B200, feature HBM3E memory bandwidth of around 8 TB/s, which significantly exceeds the capabilities of modern storage subsystems in both overall throughput and latency. Modern PCIe 5.0 x4 SSDs top at around 14.5 GB/s and deliver 2 - 3 million IOPS for both 4K and 512B random reads. Although 4K blocks are better suited for bandwidth, AI models typically perform small, random fetches, which makes 512B blocks a better fit for their latency-sensitive patterns. However, increasing the number of I/O operations per second by 33 times is hard, given the limitations of both SSD controllers and NAND memory. In fact, Kioxia is already working on an 'AI SSD' based on its XL-Flash memory designed to surpass 10 million 512K IOPS. The company currently plans to release this drive during the second half of next year, possibly to coincide with the rollout of Nvidia's Vera Rubin platform. To get to 100 million IOPS, one might use multiple 'AI SSDs.' However, the head of SMI believes that achieving 100 million IOPS on a single drive featuring conventional NAND with decent cost and power consumption will be extremely hard, so a new type of memory might be needed. "I believe they are looking for a media change," said Kuo. "Optane was supposed to be the ideal solution, but it is gone now. Kioxia is trying to bring XL-NAND and improve its performance. 
SanDisk is trying to introduce High Bandwidth Flash (HBF), but honestly, I don't really believe in it. Right now, everyone is promoting their own technology, but the industry really needs something fundamentally new. Otherwise, it will be very hard to achieve 100 million IOPS and still be cost-effective." Currently, many companies, including Micron and SanDisk, are developing new types of non-volatile memory. However, when these new types of memory will be commercially viable is something that even the head of Silicon Motion is not sure about.
[2]
Kioxia introduces high-IOPS SSDs and long-term flash strategies for emerging AI and enterprise storage demands
New drive uses XL-Flash, a type of SLC NAND, and a new in-house controller Kioxia has unveiled plans for a new SSD it says could hit an impressive 10 million IOPS, a level of performance aimed squarely at the demands of AI-driven systems. The SSD will use XL-Flash, a type of single-level cell (SLC) NAND, combined with a new in-house controller. A Kioxia spokesperson told TechPowerUp, "We're taking our ultra-fast XL-Flash memory chips, which use single-level cells, and pairing them with a completely new controller... We're targeting over 10 million IOPS, and we plan to have samples ready by the second half of 2026." IOPS, or input/output operations per second, measures how quickly a storage device can handle small, random requests, particularly important in AI and server applications where fast access to small files is key. This is different from GBps, which refers to the actual data transfer speed and is used to measure how fast large files can be read or written. A drive with high GBps might excel in video editing or large file transfers, but for machine learning tasks where thousands of small data packets are read or written constantly, high IOPS matters more. Kioxia's approach to next-gen storage includes not just one-off projects but a wider effort to meet varied use cases. Its CM9 series, which is sampling to customers now, focuses on speed and reliability to match high-end GPUs used in AI, while the LC9 series delivers massive 122TB capacities for large databases. Behind these products is the 8th generation BiCS FLASH, which introduces CBA tech to boost performance and efficiency. Kioxia is also preparing future flash memory generations using two methods. The first will add more layers for capacity, while the second blends new CMOS designs with older cell structures to keep investment costs in check.
Nvidia is working with partners to develop SSDs capable of 100 million IOPS, aiming to address storage performance bottlenecks in AI systems. Meanwhile, Kioxia is targeting the second half of 2026 for an 'AI SSD' rated at more than 10 million IOPS.
Nvidia is collaborating with partners to develop solid-state drives (SSDs) capable of 100 million input/output operations per second (IOPS) in small-block random-read workloads. The initiative aims to address the storage performance bottlenecks faced by AI training and inference systems [1].
Wallace C. Kuo, CEO of Silicon Motion Inc. (SMI), revealed the goal in an exclusive interview with Tom's Hardware. The target of 100 million IOPS is roughly 33 times what current PCIe 5.0 x4 SSDs deliver, which top out at around 2-3 million IOPS for both 4K and 512B random reads [1].
Source: Tom's Hardware
Modern AI accelerators, such as Nvidia's B200, feature HBM3E high-bandwidth memory with bandwidth of around 8 TB/s. This significantly exceeds the capabilities of current storage subsystems in both overall throughput and latency. AI models typically perform small, random fetches, which makes 512-byte blocks a better fit than 4K blocks for their latency-sensitive access patterns [1].
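The figures above can be put into rough numbers with a little arithmetic. The following is a minimal sketch, not a benchmark: it assumes the upper end of the 2-3 million IOPS range cited in [1], treats bandwidth as simply IOPS multiplied by block size, and uses decimal gigabytes. The function and constant names are illustrative only.

```python
# Back-of-the-envelope arithmetic for the IOPS figures cited in [1].
# Assumption: sustained data rate = IOPS * block size (no protocol overhead).

def implied_bandwidth_gbps(iops: float, block_bytes: int) -> float:
    """Data rate in decimal GB/s implied by an IOPS figure at a block size."""
    return iops * block_bytes / 1e9


def uplift(target_iops: float, current_iops: float) -> float:
    """How many times larger the target is than the current figure."""
    return target_iops / current_iops


TARGET_IOPS = 100e6   # Nvidia's reported goal
CURRENT_IOPS = 3e6    # upper end of the 2-3 million range for PCIe 5.0 x4 SSDs
BLOCK_512B = 512      # small-block size discussed in the article

if __name__ == "__main__":
    print(round(uplift(TARGET_IOPS, CURRENT_IOPS), 1))           # 33.3
    print(implied_bandwidth_gbps(CURRENT_IOPS, BLOCK_512B))      # GB/s today
    print(implied_bandwidth_gbps(TARGET_IOPS, BLOCK_512B))       # GB/s at target
```

Under these assumptions, 3 million 512-byte reads per second move only about 1.5 GB/s, while 100 million would move about 51 GB/s, which is well above the roughly 14.5 GB/s that [1] cites as the ceiling for today's PCIe 5.0 x4 drives.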
In response to these emerging demands, Kioxia is developing an 'AI SSD' based on its XL-Flash memory. The drive is designed to surpass 10 million IOPS in 512B random reads, several times what current SSDs deliver. Kioxia plans to release it during the second half of 2026, potentially aligning with the rollout of Nvidia's Vera Rubin platform [1][2].
A Kioxia spokesperson told TechPowerUp, "We're taking our ultra-fast XL-Flash memory chips, which use single-level cells, and pairing them with a completely new controller... We're targeting over 10 million IOPS, and we plan to have samples ready by the second half of 2026" [2].
Source: TechRadar
Achieving 100 million IOPS on a single drive built on conventional NAND, at reasonable cost and power consumption, will be extremely hard. SMI's CEO believes that a new type of memory might be necessary to reach this goal [1].
Several companies, including Micron and SanDisk, are developing new types of non-volatile memory. However, it remains unclear when these technologies will become commercially viable [1].
Beyond high-IOPS SSDs, Kioxia is developing a range of products to meet diverse storage needs. Its CM9 series, currently sampling to customers, aims to match the speed and reliability requirements of high-end GPUs used in AI. The LC9 series offers capacities of up to 122TB for large databases [2].
The company is also preparing future flash memory generations using two methods: adding more layers for increased capacity, and blending new CMOS designs with older cell structures to manage investment costs [2].