cudnnFindConvolution*Algorithm excluding Tensor Cores

Hi,

I would like to perform convolutions on an A100 without using the Tensor Cores.

When I invoke the cudnnFindConvolutionForwardAlgorithm, cudnnFindConvolutionBackwardFilterAlgorithm, and cudnnFindConvolutionBackwardDataAlgorithm functions with a convolution descriptor whose math type is set to CUDNN_FMA_MATH, the results contain only timings obtained with CUDNN_FMA_MATH, even though the documentation says that
“[the function] will attempt both the provided convDesc mathType and CUDNN_DEFAULT_MATH (assuming the two differ)”.

Can I rely on this behavior? That is, will passing these functions a descriptor set to CUDNN_FMA_MATH always produce a list containing only CUDNN_FMA_MATH attempts?
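
For reference, this is roughly how I set things up (a minimal sketch: the tensor shapes are arbitrary placeholders and error checking is omitted for brevity):

```c
#include <cudnn.h>
#include <stdio.h>

int main(void) {
    cudnnHandle_t handle;
    cudnnCreate(&handle);

    cudnnTensorDescriptor_t xDesc, yDesc;
    cudnnFilterDescriptor_t wDesc;
    cudnnConvolutionDescriptor_t convDesc;
    cudnnCreateTensorDescriptor(&xDesc);
    cudnnCreateTensorDescriptor(&yDesc);
    cudnnCreateFilterDescriptor(&wDesc);
    cudnnCreateConvolutionDescriptor(&convDesc);

    /* Placeholder fp32 NCHW shapes: 1x64x56x56 in, 64 3x3 filters,
     * pad 1, stride 1 -> 1x64x56x56 out. */
    cudnnSetTensor4dDescriptor(xDesc, CUDNN_TENSOR_NCHW, CUDNN_DATA_FLOAT,
                               1, 64, 56, 56);
    cudnnSetFilter4dDescriptor(wDesc, CUDNN_DATA_FLOAT, CUDNN_TENSOR_NCHW,
                               64, 64, 3, 3);
    cudnnSetConvolution2dDescriptor(convDesc, 1, 1, 1, 1, 1, 1,
                                    CUDNN_CROSS_CORRELATION, CUDNN_DATA_FLOAT);
    cudnnSetTensor4dDescriptor(yDesc, CUDNN_TENSOR_NCHW, CUDNN_DATA_FLOAT,
                               1, 64, 56, 56);

    /* The point of the question: restrict math to FMA (no Tensor Cores). */
    cudnnSetConvolutionMathType(convDesc, CUDNN_FMA_MATH);

    const int requested = 8;
    int returned = 0;
    cudnnConvolutionFwdAlgoPerf_t perf[8];
    cudnnFindConvolutionForwardAlgorithm(handle, xDesc, wDesc, convDesc, yDesc,
                                         requested, &returned, perf);

    /* Print the math type each result was actually measured with;
     * in my runs every entry reports CUDNN_FMA_MATH. */
    for (int i = 0; i < returned; ++i) {
        printf("algo=%d status=%d time=%.3f ms mathType=%d\n",
               (int)perf[i].algo, (int)perf[i].status, perf[i].time,
               (int)perf[i].mathType);
    }

    cudnnDestroyConvolutionDescriptor(convDesc);
    cudnnDestroyFilterDescriptor(wDesc);
    cudnnDestroyTensorDescriptor(yDesc);
    cudnnDestroyTensorDescriptor(xDesc);
    cudnnDestroy(handle);
    return 0;
}
```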

Many thanks,
Paolo

PS from the docs: “With NVIDIA Ampere Architecture and CUDA toolkit 11, CUDNN_DEFAULT_MATH permits TF32 Tensor Core operation and CUDNN_FMA_MATH does not.”

Hi,

Yes. With the NVIDIA Ampere architecture and CUDA Toolkit 11, CUDNN_DEFAULT_MATH permits TF32 Tensor Core operation and CUDNN_FMA_MATH does not.
https://docs.nvidia.com/deeplearning/cudnn/api/index.html#cudnnMathType_t
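
For quick reference, these are the math types declared in cudnn.h (as of cuDNN 8.x):

```c
typedef enum {
    CUDNN_DEFAULT_MATH                    = 0, /* Tensor Cores may be used (incl. TF32 on Ampere) */
    CUDNN_TENSOR_OP_MATH                  = 1, /* Tensor Cores permitted, no down conversion */
    CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION = 2, /* Tensor Cores with active datatype down conversion */
    CUDNN_FMA_MATH                        = 3, /* FMA instructions only, no Tensor Cores */
} cudnnMathType_t;
```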

Thank you.

Thanks for your answer, spolisetty.

I suppose that CUDNN_TENSOR_OP_MATH also allows TF32 Tensor Core operations (i.e., converting fp32 inputs to tf32) on CUDA 11 and the Ampere architecture. In other words, the documentation doesn’t consider converting fp32 to tf32 an active datatype down conversion (the behavior reserved for CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION).

Did I understand correctly?

Thanks again,
Paolo