- Nvidia is pushing SSD makers toward 100 million IOPS storage target
- But Silicon Motion CEO says industry lacks memory tech to meet AI demands
- New memory may be needed to unlock ultra-fast AI storage
As GPUs grow faster and memory bandwidth scales into the terabytes per second, storage has become the next major bottleneck in AI computing.
Nvidia is looking to push storage to match the demands of AI models by hitting an ambitious target for small-block random reads.
“Right now, they are aiming for 100 million IOPS – which is huge,” Wallace C. Kuo, CEO of Silicon Motion, told Tom’s Hardware.
Looking for a media change
Today’s fastest PCIe 5.0 SSDs top out at around 14.5GB/s and 2 to 3 million IOPS in workloads involving 4K and 512-byte reads.
While larger blocks favor bandwidth, AI inference typically pulls small, scattered pieces of data. That makes 512B random reads the more relevant metric, and a much harder one to accelerate.
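As a rough illustration of why small blocks are the hard case, the sketch below converts an IOPS rate into the throughput it implies at a given block size. The function and constant names are hypothetical, the figures are the ones quoted in this article (with 3 million IOPS taken as the upper end of the quoted range), and GB is assumed to mean 10^9 bytes.

```python
def implied_bandwidth_gb_s(iops: float, block_bytes: int) -> float:
    """Throughput in GB/s (1 GB = 1e9 bytes) implied by an IOPS rate."""
    return iops * block_bytes / 1e9


TARGET_IOPS = 100e6   # target attributed to Nvidia in the article
CURRENT_IOPS = 3e6    # assumed: upper end of the 2 to 3 million range quoted above

for block_bytes in (512, 4096):
    current = implied_bandwidth_gb_s(CURRENT_IOPS, block_bytes)
    target = implied_bandwidth_gb_s(TARGET_IOPS, block_bytes)
    print(f"{block_bytes}B reads: {current:.3f} GB/s today, {target:.1f} GB/s at target")
```

Under these assumptions, 3 million 4K reads per second already amounts to about 12.3GB/s, close to the 14.5GB/s ceiling quoted above, while the same rate at 512B moves only about 1.5GB/s. That suggests the limit for small reads is per-operation overhead rather than raw bandwidth, although even 512B reads at 100 million IOPS would imply roughly 51GB/s.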
Kioxia is already preparing an “AI SSD” based on XL-Flash that is expected to exceed 10 million IOPS. It could launch alongside Nvidia’s upcoming Vera Rubin platform next year. But scaling beyond that could require more than just faster controllers or NAND tweaks.
“I believe they are looking for a media change,” Kuo said. “Optane was supposed to be the ideal solution, but it is gone now. Kioxia is trying to bring XL-NAND and improve its performance. SanDisk is trying to introduce High Bandwidth Flash, but honestly, I don’t really believe in it.”
Power, cost, and latency all pose challenges. “The industry really needs something fundamentally new,” Kuo added. “Otherwise, it will be very hard to achieve 100 million IOPS and still be cost-effective.”
Micron, SanDisk, and others are racing to invent new forms of non-volatile memory.
Whether any of them will arrive in time for Nvidia’s next wave of hardware is the big unknown.