NVIDIA Pushes Hopper HBM Memory, And That Lifts GPU Performance
The Next Platform, Monday, November 13,2023
For very sound technical and economic reasons, processors of all kinds have been overprovisioned on compute and underprovisioned on memory bandwidth - and sometimes memory capacity depending on the device and depending on the workload - for decades.
Web infrastructure workloads and some relatively simple analytics and database workloads can do fine on modern CPUs with a dozen or so DDR memory channels, but for HPC simulation and modeling and AI training and inference, even the most advanced GPUs are literally starving for memory bandwidth and memory capacity to actually drive up the utilization on the vector and matrix engines that are already there on the silicon. These GPUs spend a lot of time waiting for data, scratching themselves.