How are tactics generated and how does layer fusion work in TensorRT?

I have recently been trying to understand how layer fusion works and how tactics are generated and chosen in TensorRT. The verbose log shows many tactics together with their measured times:

[TensorRT] VERBOSE: Tactic: 0 time 0.009216
[TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.009216
[TensorRT] VERBOSE: *************** Autotuning format combination: Float(1,224,50176,150528) -> Float(1,112,12544,802816) ***************
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1
[TensorRT] VERBOSE: --------------- Timing Runner: Conv_0 + Relu_1 (FusedConvActConvolution)
[TensorRT] VERBOSE: Tactic: 131071 time 0.131328
[TensorRT] VERBOSE: Tactic: 3276799 time 0.110976
[TensorRT] VERBOSE: Tactic: 8454143 time 0.11232
[TensorRT] VERBOSE: Fastest Tactic: 3276799 Time: 0.110976
[TensorRT] VERBOSE: --------------- Timing Runner: Conv_0 + Relu_1 (CaskConvolution)
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1
[TensorRT] VERBOSE: Tactic: 1825138533642645384 time 0.137568
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1
[TensorRT] VERBOSE: Tactic: 2842488832350522458 time 0.078592
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1
[TensorRT] VERBOSE: Tactic: 6448355332020552203 time 0.152704
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1
[TensorRT] VERBOSE: Tactic: -8060443123034038864 time 0.079968
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1
[TensorRT] VERBOSE: Tactic: -4420849921117327522 time 0.084832
[TensorRT] VERBOSE: Fastest Tactic: 2842488832350522458 Time: 0.078592
[TensorRT] VERBOSE: --------------- Timing Runner: Conv_0 + Relu_1 (CudaConvolution)
[TensorRT] VERBOSE: Tactic: 0 time 0.176064
[TensorRT] VERBOSE: Tactic: 1 time 0.120128
[TensorRT] VERBOSE: Tactic: 2 time 0.253344
[TensorRT] VERBOSE: Tactic: 5 time 2.07347
[TensorRT] VERBOSE: Tactic: 56 time 0.176672
[TensorRT] VERBOSE: Tactic: 57 time 0.119968
[TensorRT] VERBOSE: Tactic: 58 time 0.25248
[TensorRT] VERBOSE: Tactic: 61 time 2.19757
[TensorRT] VERBOSE: Fastest Tactic: 57 Time: 0.119968
[TensorRT] VERBOSE: --------------- Timing Runner: Conv_0 + Relu_1 (CudaDepthwiseConvolution)
[TensorRT] VERBOSE: CudaDepthwiseConvolution has no valid tactics for this config, skipping
[TensorRT] VERBOSE: --------------- Timing Runner: Conv_0 + Relu_1 (CublasConvolution)
[TensorRT] VERBOSE: CublasConvolution has no valid tactics for this config, skipping
[TensorRT] VERBOSE: >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2842488832350522458
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1
[TensorRT] VERBOSE: 
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1
[TensorRT] VERBOSE: *************** Autotuning format combination: Float(3,672,1,150528) -> Float(64,7168,1,802816) ***************
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1
[TensorRT] VERBOSE: --------------- Timing Runner: Conv_0 + Relu_1 (FusedConvActConvolution)
[TensorRT] VERBOSE: FusedConvActConvolution has no valid tactics for this config, skipping
[TensorRT] VERBOSE: --------------- Timing Runner: Conv_0 + Relu_1 (CaskConvolution)
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1
[TensorRT] VERBOSE: Tactic: 861694390046228376 time 0.31616
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1
[TensorRT] VERBOSE: Tactic: -3853827649136781465 time 0.318176
[TensorRT] VERBOSE: Fastest Tactic: 861694390046228376 Time: 0.31616
[TensorRT] VERBOSE: --------------- Timing Runner: Conv_0 + Relu_1 (CudaConvolution)
[TensorRT] VERBOSE: CudaConvolution has no valid tactics for this config, skipping
[TensorRT] VERBOSE: --------------- Timing Runner: Conv_0 + Relu_1 (CudaDepthwiseConvolution)
[TensorRT] VERBOSE: CudaDepthwiseConvolution has no valid tactics for this config, skipping
[TensorRT] VERBOSE: --------------- Timing Runner: Conv_0 + Relu_1 (CublasConvolution)
[TensorRT] VERBOSE: CublasConvolution has no valid tactics for this config, skipping
[TensorRT] VERBOSE: >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 861694390046228376
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1
[TensorRT] VERBOSE: 
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1
[TensorRT] VERBOSE: --------------- Timing Runner: <reformat> (Reformat)
[TensorRT] VERBOSE: Tactic: 1002 time 0.0288
[TensorRT] VERBOSE: Tactic: 0 time 0.060928
[TensorRT] VERBOSE: Fastest Tactic: 1002 Time: 0.0288
[TensorRT] VERBOSE: *************** Autotuning format combination: Float(1,112,12544,802816) -> Float(1,56,3136,200704) ***************

The verbose log also reports the layer fusions:

[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_0 with Relu_1
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_3 with Relu_4
[TensorRT] VERBOSE: ConvEltwiseSumFusion: Fusing Conv_8 with Add_9
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_5 with Relu_6
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_8 + Add_9 with Relu_10
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_11 with Relu_12
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_13 with Relu_14
[TensorRT] VERBOSE: ConvEltwiseSumFusion: Fusing Conv_15 with Add_16
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_15 + Add_16 with Relu_17
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_18 with Relu_19
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_20 with Relu_21
[TensorRT] VERBOSE: ConvEltwiseSumFusion: Fusing Conv_22 with Add_23
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_22 + Add_23 with Relu_24
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_25 with Relu_26
[TensorRT] VERBOSE: ConvEltwiseSumFusion: Fusing Conv_30 with Add_31
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_27 with Relu_28
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_30 + Add_31 with Relu_32
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_33 with Relu_34
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_35 with Relu_36
[TensorRT] VERBOSE: ConvEltwiseSumFusion: Fusing Conv_37 with Add_38
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_37 + Add_38 with Relu_39
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_40 with Relu_41
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_42 with Relu_43
[TensorRT] VERBOSE: ConvEltwiseSumFusion: Fusing Conv_44 with Add_45
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_44 + Add_45 with Relu_46
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_47 with Relu_48
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_49 with Relu_50
[TensorRT] VERBOSE: ConvEltwiseSumFusion: Fusing Conv_51 with Add_52
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_51 + Add_52 with Relu_53
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_54 with Relu_55
[TensorRT] VERBOSE: ConvEltwiseSumFusion: Fusing Conv_59 with Add_60
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_56 with Relu_57
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_59 + Add_60 with Relu_61
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_62 with Relu_63
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_64 with Relu_65
[TensorRT] VERBOSE: ConvEltwiseSumFusion: Fusing Conv_66 with Add_67
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_66 + Add_67 with Relu_68
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_69 with Relu_70
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_71 with Relu_72
[TensorRT] VERBOSE: ConvEltwiseSumFusion: Fusing Conv_73 with Add_74
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_73 + Add_74 with Relu_75
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_76 with Relu_77
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_78 with Relu_79
[TensorRT] VERBOSE: ConvEltwiseSumFusion: Fusing Conv_80 with Add_81
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_80 + Add_81 with Relu_82
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_83 with Relu_84
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_85 with Relu_86
[TensorRT] VERBOSE: ConvEltwiseSumFusion: Fusing Conv_87 with Add_88
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_87 + Add_88 with Relu_89
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_90 with Relu_91
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_92 with Relu_93
[TensorRT] VERBOSE: ConvEltwiseSumFusion: Fusing Conv_94 with Add_95
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_94 + Add_95 with Relu_96
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_97 with Relu_98
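
To make the fusion messages above easier to read, here is a minimal sketch that parses them and replays the fusions in order. It assumes the log format shown above. The names `parse_fusions` and `final_fused_layers` and the regular expression are my own and are not part of TensorRT. The script only summarizes what the log reports and says nothing about the algorithm TensorRT uses internally.

```python
import re
from collections import Counter

# Matches lines such as:
# [TensorRT] VERBOSE: ConvEltwiseSumFusion: Fusing Conv_8 with Add_9
FUSION_RE = re.compile(r"VERBOSE: (\w+): Fusing (.+) with (\S+)\s*$")


def parse_fusions(log_text):
    """Return (pass_name, left, right) tuples in the order they appear."""
    fusions = []
    for line in log_text.splitlines():
        match = FUSION_RE.search(line)
        if match:
            fusions.append(match.groups())
    return fusions


def final_fused_layers(fusions):
    """Replay the fusions and return the names of the resulting layers.

    Assumption taken from the log: a fused layer is named
    "<left> + <right>", e.g. "Conv_8 + Add_9".
    """
    names = []
    for _, left, right in fusions:
        if left in names:
            # The left operand was itself produced by an earlier fusion.
            names.remove(left)
        names.append(f"{left} + {right}")
    return names


sample = """\
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_0 with Relu_1
[TensorRT] VERBOSE: ConvEltwiseSumFusion: Fusing Conv_8 with Add_9
[TensorRT] VERBOSE: BinaryFusion: Fusing Conv_8 + Add_9 with Relu_10
"""

fusions = parse_fusions(sample)
per_pass = Counter(name for name, _, _ in fusions)
fused = final_fused_layers(fusions)
print(per_pass)
print(fused)
```

On the three sample lines this gives two `BinaryFusion` entries and one `ConvEltwiseSumFusion` entry, and the replay ends with the layers `Conv_0 + Relu_1` and `Conv_8 + Add_9 + Relu_10`.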

So my questions are:

Is there any documentation or source code that shows:

  1. What algorithm is used to fuse the layers?
  2. How are tactics generated?

Hi @chilin.cs07 ,

This is TRT-internal information that involves a lot of TensorRT implementation details, which may change from time to time.

Thanks
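
Since these internals are not documented, one practical option is to summarize the verbose log itself. The sketch below assumes the log format pasted in the question. `parse_timings` and the regular expressions are illustrative names and are not part of TensorRT. It collects the timed tactics per runner and compares the fastest one with the "Chose Runner Type" line. In the log above the chosen tactic has the lowest measured time for each format combination, but that is only an observation from this log, not a description of the selection algorithm.

```python
import re

# Patterns derived from the log lines in the question (an assumption
# about the format, which may differ between TensorRT versions).
RUNNER_RE = re.compile(r"Timing Runner: (.+) \((\w+)\)\s*$")
TACTIC_RE = re.compile(r"VERBOSE: Tactic: (-?\d+) time ([\d.]+)")
CHOSEN_RE = re.compile(r"Chose Runner Type: (\w+) Tactic: (-?\d+)")


def parse_timings(log_text):
    """Return (rows, chosen).

    rows   : list of (layer, runner, tactic_id, time) for every timed tactic
    chosen : list of (runner, tactic_id) from "Chose Runner Type" lines
    """
    rows, chosen = [], []
    layer = runner = None
    for line in log_text.splitlines():
        match = RUNNER_RE.search(line)
        if match:
            layer, runner = match.groups()
            continue
        match = TACTIC_RE.search(line)
        if match and runner is not None:
            rows.append((layer, runner, int(match.group(1)), float(match.group(2))))
            continue
        match = CHOSEN_RE.search(line)
        if match:
            chosen.append((match.group(1), int(match.group(2))))
    return rows, chosen


sample = """\
[TensorRT] VERBOSE: --------------- Timing Runner: Conv_0 + Relu_1 (FusedConvActConvolution)
[TensorRT] VERBOSE: Tactic: 131071 time 0.131328
[TensorRT] VERBOSE: Tactic: 3276799 time 0.110976
[TensorRT] VERBOSE: Fastest Tactic: 3276799 Time: 0.110976
[TensorRT] VERBOSE: --------------- Timing Runner: Conv_0 + Relu_1 (CaskConvolution)
[TensorRT] VERBOSE: Conv_0 + Relu_1 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1
[TensorRT] VERBOSE: Tactic: 2842488832350522458 time 0.078592
[TensorRT] VERBOSE: >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2842488832350522458
"""

rows, chosen = parse_timings(sample)
fastest = min(rows, key=lambda row: row[3])  # row with the lowest time
print("fastest:", fastest)
print("chosen: ", chosen[0])
```

Running the same function over a full build log gives a per-layer table of runners and tactics that can be compared across TensorRT versions or GPUs.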