Supported cudnn layers for int8 inference

jprabhas · October 6, 2017, 10:02pm

cuDNN 6 introduced INT8 inference for convolution layers.

But it is not clear to me whether the other cuDNN layers also support INT8.
For instance, I was able to call ReluForward with input and output tensor descriptors whose data type was set to INT8, and I passed in actual INT8 data as well. There were no compilation errors. Does that mean it supports INT8 inputs?
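
A clean compile only shows that the arguments match the C function signature. cuDNN functions report problems at run time through their cudnnStatus_t return value, so checking that value and then comparing the output against a host reference is one way to test whether the call really worked. Below is a minimal host-side sketch of such a reference check. It assumes symmetric quantization (zero point 0), so ReLU on the raw INT8 values is just max(x, 0). The helper names `relu_int8_reference` and `count_mismatches` are hypothetical and not part of cuDNN, and the cuDNN call itself is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a cuDNN API): host-side reference ReLU on raw
// int8 values. Assumes symmetric quantization, i.e. real 0.0 maps to int8 0.
std::vector<std::int8_t> relu_int8_reference(const std::vector<std::int8_t>& in) {
    std::vector<std::int8_t> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = in[i] > 0 ? in[i] : std::int8_t{0};
    }
    return out;
}

// Hypothetical helper: counts positions where the buffer copied back from
// the GPU differs from the reference. A size mismatch counts every element
// of the longer buffer as a mismatch.
std::size_t count_mismatches(const std::vector<std::int8_t>& got,
                             const std::vector<std::int8_t>& expected) {
    if (got.size() != expected.size()) {
        return got.size() > expected.size() ? got.size() : expected.size();
    }
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < got.size(); ++i) {
        if (got[i] != expected[i]) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

In use, the output buffer copied back from the device would be passed as `got`, and `relu_int8_reference(input)` as `expected`. A nonzero count, or a non-success status from the cuDNN call, would suggest that the INT8 path is not actually supported for that layer.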

I would appreciate a list of the cuDNN layers that support INT8 inference in v6. The documentation mentions only the convolution layer. What about layers such as fully connected and softmax?

andrei.stoian · October 18, 2017, 9:37pm

Bumping this since I’m also interested in the same information. Additionally, can we have the same information for FP16? I have a network that compiles, but in half2 FP16 mode it produces bogus results, although it works well in FP32. Some other networks compile and work without a problem in FP16 mode.

Basically, I’d like a list of supported operations, for example convolution (1x1? 3x3? 5x5? etc.) and activations (ReLU? ELU? LeakyReLU?). For comparison, Intel gives a detailed list of limitations for CNNs on their FPGA accelerator (page 32 of https://www.intel.com/content/dam/support/us/en/documents/server-products/server-accessories/Intel_DLIA_UserGuide_1.0.pdf).

Andrei Stoian,
Thales
