Hi,

I would like to run convolutions on an A100 without using the Tensor Cores.

When I call cudnnFindConvolutionForwardAlgorithm, cudnnFindConvolutionBackwardFilterAlgorithm and cudnnFindConvolutionBackwardDataAlgorithm with a convolution descriptor whose math type is set to CUDNN_FMA_MATH, the results contain only entries measured with CUDNN_FMA_MATH, even though the documentation says that

“[the function] will attempt both the provided convDesc mathType and CUDNN_DEFAULT_MATH (assuming the two differ)”.

Can I rely on this behavior? I.e., will passing to these functions a CUDNN_FMA_MATH descriptor produce a list containing only CUDNN_FMA_MATH attempts?
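In case this behavior is not guaranteed, a defensive filter over the returned list could look like the sketch below. The helper name keepMathType is hypothetical (it is not a cuDNN function); it only assumes that the perf-result structs expose a mathType field, as cudnnConvolutionFwdAlgoPerf_t does. The cuDNN calls are shown as comments, with the handle and descriptors assumed to be created elsewhere.

```cpp
#include <vector>

// Hypothetical helper (not part of cuDNN): keep only the perf results whose
// mathType equals `wanted`. Perf is any struct with a `mathType` member, e.g.
// cudnnConvolutionFwdAlgoPerf_t; it is templated so the same code can serve
// the backward-filter and backward-data result types.
template <typename Perf, typename MathType>
std::vector<Perf> keepMathType(const Perf* results, int count, MathType wanted) {
    std::vector<Perf> kept;
    for (int i = 0; i < count; ++i) {
        if (results[i].mathType == wanted) {
            kept.push_back(results[i]);  // preserves the order returned by cuDNN
        }
    }
    return kept;
}

// Intended use with cuDNN (not compiled here):
//   cudnnSetConvolutionMathType(convDesc, CUDNN_FMA_MATH);
//   cudnnConvolutionFwdAlgoPerf_t perf[16];
//   int returned = 0;
//   cudnnFindConvolutionForwardAlgorithm(handle, xDesc, wDesc, convDesc, yDesc,
//                                        16, &returned, perf);
//   auto fmaOnly = keepMathType(perf, returned, CUDNN_FMA_MATH);
```

With such a filter, any CUDNN_DEFAULT_MATH entries would simply be dropped if they ever appeared, so the choice of algorithm would not depend on the answer to the question above.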

Many thanks,

Paolo

PS from the docs: “With NVIDIA Ampere Architecture and CUDA toolkit 11, CUDNN_DEFAULT_MATH permits TF32 Tensor Core operation and CUDNN_FMA_MATH does not.”