Tensor Core Programming Using CUDA Fortran

Originally published at: Tensor Core Programming Using CUDA Fortran | NVIDIA Technical Blog

The CUDA Fortran compiler from PGI now supports programming Tensor Cores on NVIDIA’s Volta V100 and Turing GPUs. This lets scientific programmers working in Fortran take advantage of the FP16 matrix operations that Tensor Cores accelerate. Let’s take a look at how Fortran supports Tensor Cores.

Tensor Cores

Tensor Cores offer substantial performance gains over typical CUDA…