cuDNN 6 introduced INT8 inference for convolution layers, but it is not clear to me whether the other cuDNN layers also support INT8.
For instance, I was able to call ReluForward with the input and output tensor descriptors' data type set to INT8, and passed in actual INT8 data as well. There were no compilation errors. Does that mean it supports INT8 inputs?
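For context, compiling cleanly only proves the INT8 enums exist in the headers; cuDNN reports whether a configuration is actually supported through the runtime status code. A minimal probe along the lines of what I tried (an untested sketch, assuming cuDNN v6 headers; the descriptor shapes and NHWC layout are my own choices for illustration) would be:

```cpp
#include <cudnn.h>
#include <cstdio>

int main() {
    cudnnHandle_t handle;
    cudnnCreate(&handle);

    cudnnTensorDescriptor_t desc;
    cudnnCreateTensorDescriptor(&desc);
    // Try to build an INT8 tensor descriptor. The convolution path requires
    // NHWC for INT8; reusing such a descriptor on other layers probes whether
    // they accept the data type at all.
    cudnnStatus_t s = cudnnSetTensor4dDescriptor(
        desc, CUDNN_TENSOR_NHWC, CUDNN_DATA_INT8, /*n=*/1, /*c=*/32, /*h=*/8, /*w=*/8);
    printf("SetTensor4dDescriptor(INT8): %s\n", cudnnGetErrorString(s));

    cudnnActivationDescriptor_t act;
    cudnnCreateActivationDescriptor(&act);
    cudnnSetActivationDescriptor(act, CUDNN_ACTIVATION_RELU,
                                 CUDNN_NOT_PROPAGATE_NAN, 0.0);

    // With device buffers x and y allocated as int8, the decisive check is the
    // status returned by the forward call itself:
    //   s = cudnnActivationForward(handle, act, &alpha, desc, x, &beta, desc, y);
    // A supported path returns CUDNN_STATUS_SUCCESS; an unsupported data type
    // typically comes back as CUDNN_STATUS_NOT_SUPPORTED or
    // CUDNN_STATUS_BAD_PARAM, even though the code compiles fine.

    cudnnDestroyActivationDescriptor(act);
    cudnnDestroyTensorDescriptor(desc);
    cudnnDestroy(handle);
    return 0;
}
```

So a clean compile is not evidence of support; the question is what status the INT8 calls return at runtime.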
I would appreciate a listing of the cuDNN layers that support INT8 inference in v6. The documentation does not mention anything other than the convolution layer. What about layers such as fully connected and softmax?
Bumping this since I'm also interested in the same information. Additionally, can we have the same information for FP16? I have a network that compiles, but in half2 FP16 mode it produces bogus results, although it works well in FP32. Some other networks compile and work without a problem in FP16 mode.
Basically, I'd like a list of supported operations, for example Convolution: 1x1? 3x3? 5x5? etc.; Activations: ReLU? ELU? LeakyReLU? For comparison, Intel gives a detailed list of limitations for CNNs on their FPGA accelerator: page 32 of https://www.intel.com/content/dam/support/us/en/documents/server-products/server-accessories/Intel_DLIA_UserGuide_1.0.pdf.