cuTENSOR 2.0: A Comprehensive Guide for Accelerating Tensor Computations

Originally published at: https://developer.nvidia.com/blog/cutensor-2-0-a-comprehensive-guide-for-accelerating-tensor-computations/

NVIDIA cuTENSOR is a CUDA math library that provides optimized implementations of tensor operations, where tensors are dense, multi-dimensional arrays or array slices. The release of cuTENSOR 2.0 is a major update over its predecessor in both functionality and performance. This version reimagines its APIs to be more expressive, including advanced just-in-time compilation capabilities all with the…