On page 20 of the PDF version of the documentation, there is:
CUDNN_DATA_UINT8x4 (new for 7.1)
The data is 32-bit elements each composed of 4 8-bit unsigned integers. This data type is only supported with tensor format CUDNN_TENSOR_NCHW_VECT_C.
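To make the difference between the two layouts concrete, here is a small Python sketch (plain arithmetic, not cuDNN code). It assumes that CUDNN_TENSOR_NCHW_VECT_C with a x4 data type is stored as [N, C/4, H, W, 4] and that the four bytes of a 32-bit element are packed in little-endian order; both are my assumptions, and the function names are hypothetical.

```python
def nhwc_offset(n, c, h, w, C, H, W):
    """Byte offset of element (n, c, h, w) in a plain NHWC uint8 tensor."""
    return ((n * H + h) * W + w) * C + c


def nchw_vect_c_offset(n, c, h, w, C, H, W, vect=4):
    """Byte offset in NCHW_VECT_C, ASSUMED to be laid out as [N, C/vect, H, W, vect]."""
    if C % vect != 0:
        raise ValueError("C must be a multiple of the vector width")
    c_outer, c_inner = divmod(c, vect)
    return (((n * (C // vect) + c_outer) * H + h) * W + w) * vect + c_inner


def pack_uint8x4(values):
    """Pack four 8-bit unsigned integers into one 32-bit element.

    Little-endian byte order is an assumption made for this illustration.
    """
    if len(values) != 4 or any(not 0 <= v <= 255 for v in values):
        raise ValueError("expected four values in the range 0..255")
    return int.from_bytes(bytes(values), "little")


# One 32-bit element holding channels 0..3 of a single pixel.
print(hex(pack_uint8x4([1, 2, 3, 4])))

# Same logical element, different byte offsets in the two layouts (C=8, H=2, W=2).
print(nhwc_offset(0, 5, 0, 0, C=8, H=2, W=2))
print(nchw_vect_c_offset(0, 5, 0, 0, C=8, H=2, W=2))
```

Under these assumptions the two layouts only coincide when C is exactly 4; for any larger channel count the bytes are ordered differently, so the choice of format matters.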
On page 60, there is a table of convolution data type configurations, which contains this row:
UINT8x4_EXT_CONFIG CUDNN_DATA_UINT8x4 CUDNN_DATA_INT32 CUDNN_DATA_FLOAT
This means that UINT8x4_EXT_CONFIG must use the CUDNN_DATA_UINT8x4 data type, so its tensor format should be CUDNN_TENSOR_NCHW_VECT_C. However, further down, on page 63, there is:
For the datatype configurations INT8_CONFIG, INT8_EXT_CONFIG, UINT8x4_CONFIG, and UINT8x4_EXT_CONFIG, the only algo supported is CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM with the following conditions:
‣ xDesc Format Support: CUDNN_TENSOR_NHWC
‣ yDesc Format Support: CUDNN_TENSOR_NHWC
‣ Input and output features maps must be multiple of 4
‣ wDesc Format Support: CUDNN_TENSOR_NHWC
‣ Dilation: 1 for all dimensions
‣ convDesc Group Count Support: Greater than 0.
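The conditions quoted above can be written down as a simple check. This is only a sketch of how I read the list, in plain Python rather than cuDNN calls: the function name and parameters are hypothetical, and I am assuming "features maps" refers to the input and output channel counts.

```python
def check_int8_forward_conditions(x_format, w_format, y_format,
                                  in_features, out_features,
                                  dilation, group_count):
    """Return a list of violated conditions from the page 63 list (empty if none)."""
    problems = []
    # Format requirements for xDesc, wDesc and yDesc as quoted on page 63.
    for name, fmt in (("xDesc", x_format), ("wDesc", w_format), ("yDesc", y_format)):
        if fmt != "CUDNN_TENSOR_NHWC":
            problems.append(f"{name} format is {fmt}, expected CUDNN_TENSOR_NHWC")
    # ASSUMPTION: "features maps" means the input/output channel counts.
    if in_features % 4 != 0 or out_features % 4 != 0:
        problems.append("input and output feature maps must be a multiple of 4")
    if any(d != 1 for d in dilation):
        problems.append("dilation must be 1 for all dimensions")
    if group_count <= 0:
        problems.append("group count must be greater than 0")
    return problems


# The format implied by page 20 fails the page 63 list for xDesc and wDesc.
print(check_int8_forward_conditions(
    "CUDNN_TENSOR_NCHW_VECT_C", "CUDNN_TENSOR_NCHW_VECT_C", "CUDNN_TENSOR_NHWC",
    in_features=8, out_features=16, dilation=(1, 1), group_count=1))
```

Passing CUDNN_TENSOR_NHWC for all three descriptors with the same sizes returns an empty list, which is exactly where the two pages disagree.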
This contradicts the earlier conclusion that the xDesc and wDesc formats should be CUDNN_TENSOR_NCHW_VECT_C.
Which format should I use?