MLPerf v1.0 Training Benchmarks: Insights into a Record-Setting NVIDIA Performance

Originally published at:

MLPerf v1.0 showcases the continuous innovation happening in the AI domain. In the two and a half years since the first MLPerf training benchmark launched, NVIDIA performance has increased by nearly 7x. In this post, we describe some of the major optimizations that enabled these improvements.

Hi, I am trying to reproduce the RNNT distributed training. Do you have any specific throughput or step-time numbers that I can use as a reference? Thanks!