What's the difference between CUDA core kernels (icudnn, hcudnn and scudnn) and Tensor Core kernels (h884 and i8816)?

Hi all,

I used trtexec to generate a TensorRT engine from an ONNX YOLOv3-Tiny model (batch size = 16) with this command line:

/usr/src/tensorrt/bin/trtexec --onnx=yolov3-tiny-416-bs16.onnx --best --workspace=2048 --saveEngine=yolov3-tiny-416-bs16.trt --calib=calib_yolov3-tiny-int8-416.bin --verbose

Here is part of what I got in the logs:

[06/01/2021-12:30:53] [V] [TRT] Fastest Tactic: 0 Time: 0.134384
[06/01/2021-12:30:53] [V] [TRT] *************** Autotuning format combination: Float(1,104,10816,346112) -> Float(1,104,10816,692224) ***************
[06/01/2021-12:30:53] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1
[06/01/2021-12:30:53] [V] [TRT] 005_convolutional (scudnn_winograd) Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[06/01/2021-12:30:53] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1
[06/01/2021-12:30:53] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1
[06/01/2021-12:30:53] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1
[06/01/2021-12:30:53] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1
[06/01/2021-12:30:53] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1
[06/01/2021-12:30:53] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1
[06/01/2021-12:30:53] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1
[06/01/2021-12:30:53] [V] [TRT] --------------- Timing Runner: 005_convolutional (FusedConvActConvolution)
[06/01/2021-12:30:53] [V] [TRT] FusedConvActConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:53] [V] [TRT] --------------- Timing Runner: 005_convolutional (CaskConvolution)
[06/01/2021-12:30:53] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1
[06/01/2021-12:30:53] [V] [TRT] Tactic: 1825138533642645384 time 9.57251
[06/01/2021-12:30:53] [V] [TRT] 005_convolutional (scudnn_winograd) Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[06/01/2021-12:30:53] [V] [TRT] Tactic: 2775507031594384867 time 3.58727
[06/01/2021-12:30:53] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1
[06/01/2021-12:30:53] [V] [TRT] Tactic: 2842488832350522458 time 4.8962
[06/01/2021-12:30:53] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1
[06/01/2021-12:30:53] [V] [TRT] Tactic: 3915320020053085238 time 9.49643
[06/01/2021-12:30:53] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1
[06/01/2021-12:30:53] [V] [TRT] Tactic: 6448355332020552203 time 10.5895
[06/01/2021-12:30:53] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1
[06/01/2021-12:30:54] [V] [TRT] Tactic: 6808617066150061604 time 4.89256
[06/01/2021-12:30:54] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1
[06/01/2021-12:30:54] [V] [TRT] Tactic: -8060443123034038864 time 4.93347
[06/01/2021-12:30:54] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1
[06/01/2021-12:30:54] [V] [TRT] Tactic: -4420849921117327522 time 5.7211
[06/01/2021-12:30:54] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1
[06/01/2021-12:30:54] [V] [TRT] Tactic: -3946921629105938337 time 5.29323
[06/01/2021-12:30:54] [V] [TRT] Fastest Tactic: 2775507031594384867 Time: 3.58727
[06/01/2021-12:30:54] [V] [TRT] --------------- Timing Runner: 005_convolutional (CudaConvolution)
[06/01/2021-12:30:54] [V] [TRT] Tactic: 0 time 8.21752
[06/01/2021-12:30:54] [V] [TRT] Tactic: 2 time 10.1619
[06/01/2021-12:30:54] [V] [TRT] Tactic: 5 time 6.925
[06/01/2021-12:30:54] [V] [TRT] Tactic: 6 time 4.47044
[06/01/2021-12:30:54] [V] [TRT] Tactic: 57 time 6.39634
[06/01/2021-12:30:54] [V] [TRT] Fastest Tactic: 6 Time: 4.47044
[06/01/2021-12:30:54] [V] [TRT] --------------- Timing Runner: 005_convolutional (CudaDepthwiseConvolution)
[06/01/2021-12:30:54] [V] [TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:54] [V] [TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2775507031594384867
[06/01/2021-12:30:54] [V] [TRT] 005_convolutional (scudnn_winograd) Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[06/01/2021-12:30:54] [V] [TRT] 
[06/01/2021-12:30:54] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1
[06/01/2021-12:30:54] [V] [TRT] 005_convolutional (scudnn_winograd) Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[06/01/2021-12:30:54] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1
[06/01/2021-12:30:54] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1
[06/01/2021-12:30:54] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1
[06/01/2021-12:30:54] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1
[06/01/2021-12:30:54] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1
[06/01/2021-12:30:54] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1
[06/01/2021-12:30:54] [V] [TRT] 005_convolutional (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1
[06/01/2021-12:30:54] [V] [TRT] 005_convolutional (scudnn_winograd) Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[06/01/2021-12:30:54] [V] [TRT] *************** Autotuning format combination: Half(1,104,10816,346112) -> Half(1,104,10816,692224) ***************
[06/01/2021-12:30:54] [V] [TRT] --------------- Timing Runner: 005_convolutional (FusedConvActConvolution)
[06/01/2021-12:30:54] [V] [TRT] FusedConvActConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:54] [V] [TRT] --------------- Timing Runner: 005_convolutional (CaskConvolution)
[06/01/2021-12:30:54] [V] [TRT] CaskConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:54] [V] [TRT] --------------- Timing Runner: 005_convolutional (CudaConvolution)
[06/01/2021-12:30:55] [V] [TRT] Tactic: 0 time 8.72234
[06/01/2021-12:30:55] [V] [TRT] Tactic: 1 time 6.55663
[06/01/2021-12:30:55] [V] [TRT] Tactic: 2 time 9.48303
[06/01/2021-12:30:56] [V] [TRT] Tactic: 4 time 42.3345
[06/01/2021-12:30:56] [V] [TRT] Tactic: 5 time 6.80992
[06/01/2021-12:30:56] [V] [TRT] Tactic: 6 time 5.71228
[06/01/2021-12:30:56] [V] [TRT] Fastest Tactic: 6 Time: 5.71228
[06/01/2021-12:30:56] [V] [TRT] --------------- Timing Runner: 005_convolutional (CudaDepthwiseConvolution)
[06/01/2021-12:30:56] [V] [TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:56] [V] [TRT] >>>>>>>>>>>>>>> Chose Runner Type: CudaConvolution Tactic: 6
[06/01/2021-12:30:56] [V] [TRT] 
[06/01/2021-12:30:56] [V] [TRT] *************** Autotuning format combination: Half(1,104,10816:2,173056) -> Half(1,104,10816:2,346112) ***************
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn_winograd) Set Tactic Name: volta_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1
[06/01/2021-12:30:56] [V] [TRT] --------------- Timing Runner: 005_convolutional (FusedConvActConvolution)
[06/01/2021-12:30:56] [V] [TRT] FusedConvActConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:56] [V] [TRT] --------------- Timing Runner: 005_convolutional (CaskConvolution)
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1
[06/01/2021-12:30:56] [V] [TRT] Tactic: 1145226902788474763 time 2.53245
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1
[06/01/2021-12:30:56] [V] [TRT] Tactic: 2418518597804310654 time 2.58319
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1
[06/01/2021-12:30:56] [V] [TRT] Tactic: 8292881859266835088 time 2.93809
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1
[06/01/2021-12:30:56] [V] [TRT] Tactic: 8401509141903434922 time 2.50344
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1
[06/01/2021-12:30:56] [V] [TRT] Tactic: -8654297089785671176 time 5.4783
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1
[06/01/2021-12:30:56] [V] [TRT] Tactic: -7448936905981214224 time 3.10704
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1
[06/01/2021-12:30:56] [V] [TRT] Tactic: -3689982367035295496 time 5.50244
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn_winograd) Set Tactic Name: volta_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[06/01/2021-12:30:56] [V] [TRT] Tactic: -3140347171730126532 time 2.11615
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1
[06/01/2021-12:30:56] [V] [TRT] Tactic: -2027588946874785071 time 2.7553
[06/01/2021-12:30:56] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1
[06/01/2021-12:30:57] [V] [TRT] Tactic: -245090590808296743 time 5.57664
[06/01/2021-12:30:57] [V] [TRT] Fastest Tactic: -3140347171730126532 Time: 2.11615
[06/01/2021-12:30:57] [V] [TRT] --------------- Timing Runner: 005_convolutional (CudaConvolution)
[06/01/2021-12:30:57] [V] [TRT] CudaConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:57] [V] [TRT] --------------- Timing Runner: 005_convolutional (CudaDepthwiseConvolution)
[06/01/2021-12:30:57] [V] [TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:57] [V] [TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -3140347171730126532
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (hcudnn_winograd) Set Tactic Name: volta_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[06/01/2021-12:30:57] [V] [TRT] 
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (hcudnn_winograd) Set Tactic Name: volta_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (hcudnn) Set Tactic Name: volta_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (hcudnn_winograd) Set Tactic Name: volta_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[06/01/2021-12:30:57] [V] [TRT] *************** Autotuning format combination: Half(4,416,1:8,43264) -> Float(1,104,10816,692224) ***************
[06/01/2021-12:30:57] [V] [TRT] --------------- Timing Runner: 005_convolutional (FusedConvActConvolution)
[06/01/2021-12:30:57] [V] [TRT] FusedConvActConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:57] [V] [TRT] --------------- Timing Runner: 005_convolutional (CaskConvolution)
[06/01/2021-12:30:57] [V] [TRT] CaskConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:57] [V] [TRT] --------------- Timing Runner: 005_convolutional (CudaConvolution)
[06/01/2021-12:30:57] [V] [TRT] CudaConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:57] [V] [TRT] --------------- Timing Runner: 005_convolutional (CudaDepthwiseConvolution)
[06/01/2021-12:30:57] [V] [TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:57] [V] [TRT] *************** Autotuning format combination: Half(4,416,1:8,43264) -> Half(8,832,1:8,86528) ***************
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_128x128_ldg8_relu_exp_small_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x128_ldg8_relu_exp_medium_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x64_sliced1x2_ldg8_relu_exp_medium_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x64_ldg8_relu_exp_medium_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x64_sliced1x2_ldg8_relu_exp_small_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_128x128_ldg8_relu_exp_medium_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x128_ldg8_relu_exp_small_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x64_ldg8_relu_exp_small_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] --------------- Timing Runner: 005_convolutional (FusedConvActConvolution)
[06/01/2021-12:30:57] [V] [TRT] FusedConvActConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:57] [V] [TRT] --------------- Timing Runner: 005_convolutional (CaskConvolution)
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_128x128_ldg8_relu_exp_small_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] Tactic: 3754069740140581927 time 1.62507
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x128_ldg8_relu_exp_medium_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] Tactic: 5925270497649423688 time 1.76214
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x64_sliced1x2_ldg8_relu_exp_medium_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] Tactic: 6680916730816870145 time 2.02223
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x64_ldg8_relu_exp_medium_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] Tactic: 7158029511300006471 time 1.03288
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x64_sliced1x2_ldg8_relu_exp_small_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] Tactic: 7859952145590271433 time 1.94858
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_128x128_ldg8_relu_exp_medium_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] Tactic: 8283847742354150423 time 1.67331
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x128_ldg8_relu_exp_small_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] Tactic: -4534876761957424274 time 1.71714
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x64_ldg8_relu_exp_small_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] Tactic: -3237051169894153788 time 1.02711
[06/01/2021-12:30:57] [V] [TRT] Fastest Tactic: -3237051169894153788 Time: 1.02711
[06/01/2021-12:30:57] [V] [TRT] --------------- Timing Runner: 005_convolutional (CudaConvolution)
[06/01/2021-12:30:57] [V] [TRT] Tactic: 0 time 11.5548
[06/01/2021-12:30:57] [V] [TRT] Tactic: 1 time 7.23211
[06/01/2021-12:30:57] [V] [TRT] Tactic: 2 time 11.6116
[06/01/2021-12:30:57] [V] [TRT] Tactic: 6 time 5.55104
[06/01/2021-12:30:57] [V] [TRT] Fastest Tactic: 6 Time: 5.55104
[06/01/2021-12:30:57] [V] [TRT] --------------- Timing Runner: 005_convolutional (CudaDepthwiseConvolution)
[06/01/2021-12:30:57] [V] [TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:57] [V] [TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -3237051169894153788
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x64_ldg8_relu_exp_small_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] 
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_128x128_ldg8_relu_exp_small_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x128_ldg8_relu_exp_medium_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x64_sliced1x2_ldg8_relu_exp_medium_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x64_ldg8_relu_exp_medium_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x64_sliced1x2_ldg8_relu_exp_small_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_128x128_ldg8_relu_exp_medium_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x128_ldg8_relu_exp_small_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x64_ldg8_relu_exp_small_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] 005_convolutional (h884cudnn) Set Tactic Name: volta_h884cudnn_256x64_ldg8_relu_exp_small_nhwc_tn_v1
[06/01/2021-12:30:57] [V] [TRT] *************** Autotuning format combination: Half(1,104,10816:32,10816) -> Half(1,104,10816:32,21632) ***************
[06/01/2021-12:30:57] [V] [TRT] --------------- Timing Runner: 005_convolutional (FusedConvActConvolution)
[06/01/2021-12:30:57] [V] [TRT] FusedConvActConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:57] [V] [TRT] --------------- Timing Runner: 005_convolutional (CaskConvolution)
[06/01/2021-12:30:57] [V] [TRT] CaskConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:57] [V] [TRT] --------------- Timing Runner: 005_convolutional (CudaConvolution)
[06/01/2021-12:30:57] [V] [TRT] CudaConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:57] [V] [TRT] --------------- Timing Runner: 005_convolutional (CudaDepthwiseConvolution)
[06/01/2021-12:30:57] [V] [TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[06/01/2021-12:30:58] [V] [TRT] *************** Autotuning format combination: Int8(1,104,10816:4,86528) -> Float(1,104,10816,692224) ***************
[06/01/2021-12:30:58] [V] [TRT] 005_convolutional (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_medium_nn_v1
[06/01/2021-12:30:58] [V] [TRT] 005_convolutional (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_xregs_medium_nn_v1
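A log like the one above is easier to compare once the timing lines are grouped by format combination. The following is a minimal sketch, assuming the line shapes shown in this excerpt ("Autotuning format combination", "Timing Runner", "Tactic: ... time ...", "Chose Runner Type"); the function name `summarize` and the regular expressions are illustrative and not part of TensorRT, and other TensorRT versions may print different lines.

```python
import re

# Patterns mirror the line shapes visible in the log excerpt (an assumption,
# not a documented TensorRT log format).
FORMAT_RE = re.compile(r"Autotuning format combination: (.+?) \*+")
RUNNER_RE = re.compile(r"Timing Runner: \S+ \((\w+)\)")
TACTIC_RE = re.compile(r"\[TRT\] Tactic: (-?\d+) time ([\d.]+)")
CHOSE_RE = re.compile(r"Chose Runner Type: (\w+) Tactic: (-?\d+)")


def summarize(log_text):
    """Return one dict per format combination with its timed tactics."""
    results = []
    current = None   # dict for the format combination being read
    runner = None    # name of the runner whose tactics are being timed
    for line in log_text.splitlines():
        m = FORMAT_RE.search(line)
        if m:
            current = {"format": m.group(1), "timings": [], "chosen": None}
            results.append(current)
            runner = None
            continue
        if current is None:
            continue  # lines before the first format combination
        m = RUNNER_RE.search(line)
        if m:
            runner = m.group(1)
            continue
        m = TACTIC_RE.search(line)
        if m and runner is not None:
            current["timings"].append((runner, int(m.group(1)), float(m.group(2))))
            continue
        m = CHOSE_RE.search(line)
        if m:
            current["chosen"] = (m.group(1), int(m.group(2)))
    return results


# A few lines copied from the log above.
SAMPLE = """\
[06/01/2021-12:30:53] [V] [TRT] *************** Autotuning format combination: Float(1,104,10816,346112) -> Float(1,104,10816,692224) ***************
[06/01/2021-12:30:53] [V] [TRT] --------------- Timing Runner: 005_convolutional (CaskConvolution)
[06/01/2021-12:30:53] [V] [TRT] Tactic: 1825138533642645384 time 9.57251
[06/01/2021-12:30:53] [V] [TRT] Tactic: 2775507031594384867 time 3.58727
[06/01/2021-12:30:54] [V] [TRT] Fastest Tactic: 2775507031594384867 Time: 3.58727
[06/01/2021-12:30:54] [V] [TRT] --------------- Timing Runner: 005_convolutional (CudaConvolution)
[06/01/2021-12:30:54] [V] [TRT] Tactic: 6 time 4.47044
[06/01/2021-12:30:54] [V] [TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2775507031594384867
"""

for entry in summarize(SAMPLE):
    fastest = min(entry["timings"], key=lambda t: t[2])
    print(entry["format"])
    print("  fastest timed:", fastest)
    print("  chosen:       ", entry["chosen"])
```

In this excerpt the chosen runner and tactic for each format combination match the fastest timed tactic. The sketch only reflects that per-format choice as printed in the log and says nothing about how the final engine is assembled.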

Environment

Jetpack Version: 4.5.1
Board : NVIDIA Jetson AGX Xavier
TensorRT Version: 7.1.3
GPU Type: Volta 512 CUDA Cores, 64 Tensor Cores
Nvidia Driver Version:
CUDA Version: 10.2
CUDNN Version: 8.0

If I understand correctly, TensorRT first chooses between CUDA cores and Tensor Cores, and then picks whichever CUDA core kernel or Tensor Core kernel has the lowest latency. My questions are:

  1. Are hcudnn (CUDA core) and h884cudnn (Tensor Core) both kernels that use HMMA (half-precision matrix multiply and accumulate) machine instructions?

  2. Are icudnn (CUDA core) and i8816cudnn (Tensor Core) both kernels that use IMMA (integer matrix multiply and accumulate) machine instructions?

  3. Does the scudnn string mean the kernel runs on CUDA cores or on Tensor Cores, and which type of machine instruction does it use, HMMA or IMMA?

  4. Is there any NVIDIA documentation about the different kernels? I couldn’t find the meaning of the parts of a string such as volta_scudnn_128x128_relu_medium_nn_v1 (for example nn) or volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 (for example nt).

Thanks

Hi @chakibdace,

We don’t have documentation for this. Could you let us know why you need this information?
h884 = HMMA = FP16 TensorCore
i8816 = IMMA = INT8 TensorCore
hcudnn = FP16 normal CUDA kernel (without TensorCore)
icudnn = INT8 normal CUDA kernel (without TensorCore)
scudnn = FP32 normal CUDA kernel (without TensorCore)
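The mapping above can be applied mechanically to the tactic names in the log. This is a minimal sketch that only encodes the five markers listed in this reply; the table `KERNEL_FAMILIES` and the function `classify` are hypothetical helpers, and names using other markers are returned as unknown (`None`) rather than guessed.

```python
# Markers and their meaning, taken from the list in this reply.
KERNEL_FAMILIES = [
    ("h884", ("FP16", "Tensor Core (HMMA)")),
    ("i8816", ("INT8", "Tensor Core (IMMA)")),
    ("hcudnn", ("FP16", "CUDA core")),
    ("icudnn", ("INT8", "CUDA core")),
    ("scudnn", ("FP32", "CUDA core")),
]


def classify(tactic_name):
    """Return (precision, execution unit) for a tactic name, or None if unknown."""
    # Tactic names in the log are underscore-separated, e.g.
    # volta_h884cudnn_256x64_..., so each token is checked against the markers.
    for token in tactic_name.split("_"):
        for marker, meaning in KERNEL_FAMILIES:
            if token.startswith(marker):
                return meaning
    return None


# Tactic names copied from the log earlier in the thread.
for name in [
    "volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1",
    "volta_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1",
    "volta_h884cudnn_256x64_ldg8_relu_exp_small_nhwc_tn_v1",
    "volta_fp32_icudnn_int8x4_128x128_relu_medium_nn_v1",
]:
    print(name, "->", classify(name))
```

Matching on whole tokens rather than substrings keeps fragments such as fp16x2 or int8x4 from being mistaken for a family marker.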

Thank you.

Hi @spolisetty,

Thanks for your answer. The purpose is to find out whether TensorRT uses cuDNN to build the TRT engine. The TensorRT documentation doesn’t mention that TensorRT uses the cuDNN library, but when I enabled logging while running inference with TensorRT, I saw a lot of kernels with cuDNN in their names, and that confused me.
