MLPerf Storage V1.0 Results Show Critical Role Of Next Gen Storage In AI Model Training
Blocks & Files, Thursday, September 26th, 2024
MLCommons has just released the results of the MLPerf Storage Benchmark V1.0, which comprises three workloads: 3D-UNet, ResNet-50, and CosmoFlow.
Compared with V0.5, V1.0 drops the BERT workload, adds ResNet-50 and CosmoFlow, and adds NVIDIA H100 and A100 as accelerator types.
Huawei participated in the 3D-UNet workload test with an 8U dual-node OceanStor A800, which met the data throughput requirement of 255 simulated NVIDIA H100s during training, delivering a stable bandwidth of 679 GB/s while maintaining over 90 percent accelerator utilization.
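
To put those figures in perspective, the short Python sketch below derives two ratios from the reported numbers: average bandwidth per simulated accelerator and bandwidth per rack unit. These are back-of-the-envelope values rather than published results; the function names are illustrative, and the calculation assumes the 679 GB/s is shared evenly across all 255 simulated H100s.

```python
def bandwidth_per_accelerator(total_gb_per_s: float, accelerators: int) -> float:
    """Average bandwidth per accelerator, assuming an even split (an assumption)."""
    if accelerators <= 0:
        raise ValueError("accelerators must be positive")
    return total_gb_per_s / accelerators


def bandwidth_per_rack_unit(total_gb_per_s: float, rack_units: int) -> float:
    """Bandwidth density: aggregate bandwidth divided by chassis height in U."""
    if rack_units <= 0:
        raise ValueError("rack_units must be positive")
    return total_gb_per_s / rack_units


# Figures reported for the Huawei 3D-UNet submission.
TOTAL_BANDWIDTH_GB_S = 679.0
SIMULATED_H100S = 255
CHASSIS_HEIGHT_U = 8

if __name__ == "__main__":
    per_gpu = bandwidth_per_accelerator(TOTAL_BANDWIDTH_GB_S, SIMULATED_H100S)
    per_u = bandwidth_per_rack_unit(TOTAL_BANDWIDTH_GB_S, CHASSIS_HEIGHT_U)
    print(f"Per simulated H100: {per_gpu:.2f} GB/s")
    print(f"Per rack unit:      {per_u:.2f} GB/s")
```

On these assumptions, each simulated H100 draws roughly 2.66 GB/s on average, and the system delivers about 84.9 GB/s per rack unit.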