int8 fails for group convolutions (depthwise) on Xavier

Hello!

I am trying to get MobileNet-v1 working entirely in INT8 on the Jetson Xavier.

I followed the documentation and called cudnnConvolutionForward with the recommended parameters:

  • algo => _IMPLICIT_PRECOMP_GEMM
  • config => INT8_CONFIG
  • layout => _NHWC

With this setup, normal convolution works well. However, grouped convolution does not (there is no problem when using FP16 or FP32). Isn't grouped INT8 convolution supported?

Surprisingly, if I set the following parameters instead, both normal and grouped convolution work:

  • algo => _IMPLICIT_GEMM
  • config => INT8_CONFIG
  • layout => _NCHW

With this second setup, normal convolution runs as fast as with the first (supposedly correct) one, and grouped convolutions also work, although very slowly. How is it possible that, with the "wrong" layout and algorithm, normal convolutions still work, and even that fast?

Also, I'd like to ask whether there is a function to convert F32 to S8, and whether there is a layout-conversion API as well.

Thanks a lot!
