The Micron 9400 NVMe SSD is the top PCIe Gen4 SSD for AI storage

Wes Vaske | September 2023

According to its website, MLCommons was started in 2018 “…to accelerate machine learning innovation and increase its positive impact on society...” Today, MLCommons maintains and develops six different benchmark suites and is developing open datasets to support future state-of-the-art model development. The MLPerf Storage Benchmark Suite is the latest addition to the benchmark collection.

As a member of the MLCommons Storage Working Group, I’ve helped develop benchmark rules and processes to help ensure that benchmark results are meaningful to researchers, customers, and vendors alike. We’ve just published the first round of submissions, which includes the Micron 9400 SSD.

But why do we need a new benchmark utility that’s specific to AI workloads?

Characterizing the storage workload for AI training systems faces two unique challenges that the MLPerf Storage Benchmark Suite aims to address: the cost of AI accelerators and the small size of available datasets.

The first is obvious: AI accelerators can be expensive, complex compute systems, and most storage vendors won’t have enough AI systems available just to analyze how their products scale in storage solutions.

The second issue is that the openly available datasets are small compared to what is commonly used in the AI industry. Whereas the datasets available to MLCommons and its participants may get as large as 150 gigabytes, datasets used in production are frequently tens to hundreds of terabytes. Modern servers can easily have 1 to 2 terabytes of DRAM, which has the effect of caching the small benchmark datasets in system memory after the first training epoch and then executing subsequent epochs from that in-DRAM data. Production datasets would not see the same behavior because they are far too large to fit in memory.
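To make the caching effect concrete, here is a minimal sketch of the size comparison described above. The function name `fits_in_page_cache` and the `usable_fraction` parameter are illustrative assumptions, not part of MLPerf Storage; the 50 TB figure is simply one point inside the "tens to hundreds of terabytes" range.

```python
GB = 10**9   # decimal gigabyte
TB = 10**12  # decimal terabyte


def fits_in_page_cache(dataset_bytes: int, dram_bytes: int,
                       usable_fraction: float = 0.8) -> bool:
    """Return True if the whole dataset could plausibly stay cached in DRAM.

    usable_fraction is an assumed share of DRAM left over for file caching
    after the OS and the training process take their part.
    """
    return dataset_bytes <= dram_bytes * usable_fraction


benchmark_dataset = 150 * GB    # largest openly available dataset size cited above
production_dataset = 50 * TB    # hypothetical production dataset
server_dram = 1 * TB            # low end of the 1 to 2 TB range cited above

print("benchmark dataset cached:", fits_in_page_cache(benchmark_dataset, server_dram))
print("production dataset cached:", fits_in_page_cache(production_dataset, server_dram))
```

When the first check is true, every epoch after the first is served from memory and the storage device is barely exercised, which is why small datasets understate the storage load.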

MLPerf Storage addresses the first issue by emulating the accelerators in standard CPU-based servers. At a low level, MLPerf Storage uses the same AI frameworks as the commonly used workloads (PyTorch, TensorFlow, etc.), but it replaces the compute portion of the platform with a “sleep time” that is found experimentally by running the real workload on systems with the actual AI accelerators.
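The idea of replacing compute with a sleep can be sketched in a few lines. This is not the benchmark's actual code: `emulated_training_loop`, its parameters, and the utilization ratio it returns are hypothetical names chosen for illustration, and the sleep value in the demo is made up rather than measured on a real accelerator.

```python
import time
from typing import Callable, Iterable, Tuple


def emulated_training_loop(
    batches: Iterable[object],
    sleep_time_s: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[int, float]:
    """Consume batches with real I/O, then sleep in place of accelerator compute.

    Returns (steps, utilization), where utilization is the emulated compute
    time divided by total wall time. Slow storage stretches the wall time
    and lowers the ratio.
    """
    steps = 0
    compute_s = 0.0
    start = clock()
    for _batch in batches:       # fetching the next batch is where storage I/O happens
        sleep(sleep_time_s)      # stand-in for the forward/backward pass
        compute_s += sleep_time_s
        steps += 1
    total_s = clock() - start
    utilization = compute_s / total_s if total_s > 0 else 0.0
    return steps, utilization


if __name__ == "__main__":
    # Tiny demo: in-memory "batches" and an arbitrary 1 ms sleep per step.
    fake_batches = (bytes(1024) for _ in range(5))
    steps, util = emulated_training_loop(fake_batches, sleep_time_s=0.001)
    print(f"steps={steps} utilization={util:.2f}")
```

Because the sleep costs no accelerator hardware, many such loops can run side by side on ordinary servers to see how many emulated accelerators a storage device can keep fed.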

Comparisons of the emulated accelerators and real accelerators show that the workloads are extremely similar.

MLPerf Storage addresses the second issue by creating datasets that are similar to actual production datasets but replicated to be much larger. The benchmark supports various data storage technologies like filesystems and object storage, as well as multiple data types like serialized NumPy arrays, TFRecord files, HDF5 files, and more.

In addition to solving these problems, in a previous blog post with John Mazzie, we showed that the AI training workload is more complex than many expect: the workload is both bursty and latency sensitive.

The MLPerf Storage Benchmark Suite is a great way to exercise storage systems in a way that represents real AI training workloads without requiring expensive AI accelerators, while also supporting dataset sizes representative of real-world datasets.

Now we’re proud to announce that the Micron 9400 NVMe SSD supports 17 accelerators in the 3D Medical Imaging benchmark (Unet3D). This translates to 41 samples per second, or 6.1 GB/s of I/O throughput.
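The headline numbers can be broken down with simple arithmetic. The sketch below only divides the three published figures; it assumes decimal units (1 GB = 1000 MB) and that load is spread evenly across accelerators, neither of which is stated in the post.

```python
accelerators = 17        # emulated accelerators supported
samples_per_s = 41       # aggregate Unet3D samples per second
throughput_gb_s = 6.1    # aggregate I/O throughput in GB/s

# Average data read per sample, assuming decimal units.
mb_per_sample = throughput_gb_s * 1000 / samples_per_s

# Per-accelerator share, assuming an even split across accelerators.
samples_per_s_per_accel = samples_per_s / accelerators
gb_s_per_accel = throughput_gb_s / accelerators

print(f"average MB per sample:      {mb_per_sample:.1f}")
print(f"samples/s per accelerator:  {samples_per_s_per_accel:.2f}")
print(f"GB/s per accelerator:       {gb_s_per_accel:.2f}")
```

Under those assumptions, each sample averages roughly 149 MB and each emulated accelerator needs about 0.36 GB/s to stay busy.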

Armed with this benchmark, which is easy to run and representative of real AI training environments, the Micron Data Center Workloads Engineering team will be presenting data across storage devices and solutions so that we can all better understand how to tune and design storage to increase accelerator utilization.

SMTS Systems Performance Engineer

Wes Vaske

Wes Vaske is a principal storage solution engineer with Micron.