Using Nsight Compute or Nvprof to Show Mixed Precision Use in Deep Learning Models

Originally published at: Using Nsight Compute or Nvprof to Show Mixed Precision Use in Deep Learning Models | NVIDIA Technical Blog

Mixed precision combines different numerical precisions in a computational method. The Volta and Turing generations of GPUs introduced Tensor Cores, which provide significant throughput speedups over single-precision math pipelines. Deep learning networks can be trained at lower precision to achieve high throughput, because half-precision storage halves the storage requirements and memory traffic of gradient and activation tensors. The…