How to convert filter descriptors to CUDNN_TENSOR_NCHW_VECT_C format with cudnnTransformTensor()?


I want to run INT8 convolutions on GPUs that support the DP4A dot product, for 4x faster inference.
I checked the cuDNN user guide and found the “INT8x4_EXT_CONFIG” configuration, which takes xdesc and wdesc as CUDNN_DATA_INT8x4 (4-byte packed signed integers), convdesc as CUDNN_DATA_INT32, and gives the output as CUDNN_DATA_FLOAT.
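
For context, a minimal CPU sketch of what a single DP4A step computes: four signed 8-bit products accumulated into a 32-bit integer. The function name dp4a_reference is hypothetical and exists only for illustration; it is not part of cuDNN or CUDA.

```cpp
#include <cstdint>

// CPU emulation of one DP4A step: four int8 x int8 products are summed
// into a 32-bit accumulator. Hypothetical helper, not a cuDNN/CUDA API.
inline int32_t dp4a_reference(const int8_t a[4], const int8_t b[4], int32_t acc) {
    for (int i = 0; i < 4; ++i) {
        // Widen to int32 before multiplying so the product cannot overflow int8.
        acc += static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
    }
    return acc;
}
```

Packing four channel values next to each other in memory is what lets one such instruction consume them together, which is why the INT8x4 configuration expects a channel-vectorized layout.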

According to the NVIDIA cuDNN guide, p. 63: “Tensors can be converted to/from CUDNN_TENSOR_NCHW_VECT_C with

As I understand this statement, if I read my input images in NCHW format, I can convert them to CUDNN_TENSOR_NCHW_VECT_C format using cudnnTransformTensor(). Is my understanding correct?

But how do I convert the filter descriptors to the same format? I can’t use the same cudnnTransformTensor() API, because its input and transformed-output arguments are of type cudnnTensorDescriptor_t, not cudnnFilterDescriptor_t.
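
To make the target layout concrete, here is a minimal CPU sketch of the repacking from plain NCHW order into the 4-channel vectorized order (logical shape N x C/4 x H x W x 4). It assumes int8 data and a channel count that is already a multiple of 4. The function name nchw_to_nchw_vect_c4 is hypothetical and is not a cuDNN call. The same index arithmetic would apply to filter weights stored in KCRS order, with K in place of N, R in place of H, and S in place of W; whether cuDNN offers a descriptor-based way to do this for filters is exactly the open question here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Repack an int8 buffer from NCHW order into NCHW_VECT_C order with
// 4 channels per vector. Hypothetical helper for illustration only.
// For KCRS filter weights, pass K as n, R as h and S as w.
std::vector<int8_t> nchw_to_nchw_vect_c4(const std::vector<int8_t>& src,
                                         std::size_t n, std::size_t c,
                                         std::size_t h, std::size_t w) {
    assert(c % 4 == 0);                   // assumption: channels padded to a multiple of 4
    assert(src.size() == n * c * h * w);  // buffer must match the stated shape
    std::vector<int8_t> dst(src.size());
    for (std::size_t ni = 0; ni < n; ++ni)
        for (std::size_t ci = 0; ci < c; ++ci)
            for (std::size_t hi = 0; hi < h; ++hi)
                for (std::size_t wi = 0; wi < w; ++wi) {
                    // Source: channel is the second-slowest dimension.
                    std::size_t s = ((ni * c + ci) * h + hi) * w + wi;
                    // Destination: groups of 4 channels become the fastest dimension.
                    std::size_t d =
                        (((ni * (c / 4) + ci / 4) * h + hi) * w + wi) * 4 + ci % 4;
                    dst[d] = src[s];
                }
    return dst;
}
```

For a 1 x 4 x 1 x 2 input whose channels hold {0,1}, {10,11}, {20,21}, {30,31}, the result interleaves the four channels per pixel: 0, 10, 20, 30, 1, 11, 21, 31. A host-side repack like this can serve as a reference to check any GPU conversion against.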

Please let me know!



I also have this question. Did you get the answer?

Thank you.