Output scale for cudnnConvolutionBiasActivationForward

Hello!

I'm having some trouble with the cudnnConvolutionBiasActivationForward function for int8 (x4, x32). The well-known practice for integer arithmetic in CNNs is to approximate floats as integers times a common scale factor; to be precise: float ≈ integer * scale. However, Figure 2 from the cuDNN Developer Guide indicates that after the convolution the results are cast to int8 based on the min/max values. Hence the output is a single tensor, and there is no way to cast it back to floats and obtain the 'real' results. That is a problem: I can't work with relative values, I need a proper mapping. Am I missing something? Is there any workaround?
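For context, this is the kind of mapping I mean. A minimal sketch of symmetric per-tensor quantization (the function names and the choice of scale are my own, not anything from cuDNN):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Symmetric per-tensor quantization: real_value ~ q * scale, with q in [-128, 127].
float choose_scale(float abs_max) {
    return abs_max / 127.0f;                      // scale derived from the tensor's max magnitude
}

int8_t quantize(float x, float scale) {
    float q = std::round(x / scale);              // round to nearest integer
    q = std::min(std::max(q, -128.0f), 127.0f);   // clamp into the int8 range
    return static_cast<int8_t>(q);
}

float dequantize(int8_t q, float scale) {
    return static_cast<float>(q) * scale;         // map back to (approximate) real values
}
```

The whole point is that I keep the scale on the side, so I can always recover real-valued results from the integer tensor.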

I'm especially interested in configurations that support the INT8x32 data type, mainly because of speed.

References:
Figure 2: https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html#scaling-parameters__fig-conv-bias-activation-forward
cudnnConvolutionBiasActivationForward: https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html#cudnnConvolutionBiasActivationForward

By the way, it looks like the INT8_EXT configuration (from cudnnConvolutionForward) could be a solution to my problem. But I'm not sure how it handles overflow, and I wasn't able to run cudnnConvolutionForward with inputs/filters in INT8x32 and an output data type of FLOAT.
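For reference, this is roughly what I'm trying for the INT8_EXT-style configuration: int8 input and filter, int32 accumulation, FLOAT output. This is only a sketch (error checking, algorithm/workspace selection, and shapes are placeholders, and I'm showing the INT8x4 variant since I haven't gotten the x32 path to work):

```cpp
#include <cudnn.h>

// Sketch of an "INT8_EXT"-style cudnnConvolutionForward call:
// int8 input/filter, FLOAT output, int32 compute type.
void int8_ext_conv(cudnnHandle_t handle,
                   const void* x, const void* w, void* y,
                   void* workspace, size_t workspaceBytes) {
    cudnnTensorDescriptor_t xDesc, yDesc;
    cudnnFilterDescriptor_t wDesc;
    cudnnConvolutionDescriptor_t convDesc;
    cudnnCreateTensorDescriptor(&xDesc);
    cudnnCreateTensorDescriptor(&yDesc);
    cudnnCreateFilterDescriptor(&wDesc);
    cudnnCreateConvolutionDescriptor(&convDesc);

    // Placeholder shapes; the INT8 paths require NHWC layout.
    const int n = 1, c = 32, h = 28, width = 28, k = 64, r = 3, s = 3;
    cudnnSetTensor4dDescriptor(xDesc, CUDNN_TENSOR_NHWC, CUDNN_DATA_INT8x4, n, c, h, width);
    cudnnSetFilter4dDescriptor(wDesc, CUDNN_DATA_INT8x4, CUDNN_TENSOR_NHWC, k, c, r, s);
    cudnnSetConvolution2dDescriptor(convDesc, 1, 1, 1, 1, 1, 1,
                                    CUDNN_CROSS_CORRELATION, CUDNN_DATA_INT32);
    // Output descriptor in FLOAT, so the results can be rescaled back to real values.
    cudnnSetTensor4dDescriptor(yDesc, CUDNN_TENSOR_NHWC, CUDNN_DATA_FLOAT, n, k, h, width);

    const float alpha = 1.0f, beta = 0.0f;
    cudnnConvolutionForward(handle, &alpha, xDesc, x, wDesc, w, convDesc,
                            CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM,
                            workspace, workspaceBytes, &beta, yDesc, y);
}
```

Swapping the descriptors to CUDNN_DATA_INT8x32 with a FLOAT output is exactly the combination that fails for me.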

UPD: After some tests I found out that cudnnConvolutionForward with INT8x32 (input, output, and weights) does perform rounding followed by a saturating cast when converting from FLOAT to INT8x32. That is nice. I will continue experimenting with cudnnConvolutionBiasActivationForward.
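To be clear about what I mean by "saturating cast" (this is my assumption about the semantics based on the outputs I saw, not something stated explicitly in the docs): the float intermediate appears to be rounded to the nearest integer and then clamped into the int8 range rather than wrapping around, roughly like this:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Assumed behaviour when converting a FLOAT intermediate back to INT8:
// round to nearest, then saturate (clamp) to [-128, 127] instead of wrapping.
int8_t saturate_to_int8(float v) {
    float r = std::round(v);                         // round to nearest integer
    r = std::min(std::max(r, -128.0f), 127.0f);      // e.g. 300.0f -> 127, -412.0f -> -128
    return static_cast<int8_t>(r);
}
```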

How does it do the "saturation cast"? How does it determine the range? Could you provide more info?