Getting the new optimized layers using TensorRT

Description

Hello all,
I have converted my model from Caffe to a TensorRT engine using the trtexec command, with the precision set to FP16, maxBatch set to 1, and verbose logging enabled. The full report is attached as a PDF file.
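For reference, the build command was roughly of this form (the file names are placeholders, the output blobs are taken from the log below, and exact flags can vary between TensorRT versions):

    trtexec --deploy=model.prototxt \
            --model=model.caffemodel \
            --output=detection_out,keep_count \
            --fp16 \
            --maxBatch=1 \
            --verbose \
            --saveEngine=model.engine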
I would like to understand what is happening in the lines below: have these layers been merged (fused) by TensorRT?
Thanks in advance

Layer(Scale): data_bn + data_scale, Tactic: 0, data[Float(-2,3,300,300)] -> data_bn[Float(-2,3,300,300)]
Layer(CaskConvolution): conv1_h + conv1_relu, Tactic: 1062367460111450758, data_bn[Float(-2,3,300,300)] -> conv1_h[Float(-2,32,150,150)]
Layer(TiledPooling): conv1_pool, Tactic: 6947073, conv1_h[Float(-2,32,150,150)] -> Reformatted Output Tensor 0 to conv1_pool[Float(-2,32,75,75)]
Layer(Reformat): Reformatting CopyNode for Output Tensor 0 to conv1_pool, Tactic: 0, Reformatted Output Tensor 0 to conv1_pool[Float(-2,32,75,75)] -> conv1_pool[Half(-2,32,75,75)]
Layer(CaskConvolution): layer_64_1_conv1_h + layer_64_1_relu2, Tactic: 4772821744921268633, conv1_pool[Half(-2,32,75,75)] -> layer_64_1_conv1_h[Half(-2,32,75,75)]
Layer(CaskConvolution): layer_64_1_conv2_h + layer_64_1_sum, Tactic: 4772821744921268633, layer_64_1_conv1_h[Half(-2,32,75,75)], conv1_pool[Half(-2,32,75,75)] -> layer_64_1_sum[Half(-2,32,75,75)]
Layer(Scale): layer_128_1_bn1_h + layer_128_1_scale1_h + layer_128_1_relu1, Tactic: 0, layer_64_1_sum[Half(-2,32,75,75)] -> layer_128_1_bn1_h[Half(-2,32,75,75)]
Layer(CaskConvolution): layer_128_1_conv1_h + layer_128_1_relu2, Tactic: -4212163711445252890, layer_128_1_bn1_h[Half(-2,32,75,75)] -> layer_128_1_conv1_h[Half(-2,128,38,38)]
Layer(CaskConvolution): layer_128_1_conv_expand_h, Tactic: -4212163711445252890, layer_128_1_bn1_h[Half(-2,32,75,75)] -> layer_128_1_conv_expand_h[Half(-2,128,38,38)]
Layer(CaskConvolution): layer_128_1_conv2 + layer_128_1_sum, Tactic: 4772821744921268633, layer_128_1_conv1_h[Half(-2,128,38,38)], layer_128_1_conv_expand_h[Half(-2,128,38,38)] -> layer_128_1_sum[Half(-2,128,38,38)]
Layer(Scale): layer_256_1_bn1 + layer_256_1_scale1 + layer_256_1_relu1, Tactic: 0, layer_128_1_sum[Half(-2,128,38,38)] -> layer_256_1_bn1[Half(-2,128,38,38)]
Layer(CaskConvolution): layer_256_1_conv1 + layer_256_1_relu2, Tactic: -4212163711445252890, layer_256_1_bn1[Half(-2,128,38,38)] -> layer_256_1_conv1[Half(-2,256,19,19)]
Layer(CaskConvolution): layer_256_1_conv_expand, Tactic: -1716393687483585322, layer_256_1_bn1[Half(-2,128,38,38)] -> layer_256_1_conv_expand[Half(-2,256,19,19)]
Layer(CaskConvolution): layer_256_1_conv2 + layer_256_1_sum, Tactic: -4212163711445252890, layer_256_1_conv1[Half(-2,256,19,19)], layer_256_1_conv_expand[Half(-2,256,19,19)] -> layer_256_1_sum[Half(-2,256,19,19)]
Layer(Scale): layer_512_1_bn1 + layer_512_1_scale1 + layer_512_1_relu1, Tactic: 0, layer_256_1_sum[Half(-2,256,19,19)] -> layer_512_1_bn1[Half(-2,256,19,19)]
Layer(CaskConvolution): layer_512_1_conv1_h + layer_512_1_relu2, Tactic: -4212163711445252890, layer_512_1_bn1[Half(-2,256,19,19)] -> layer_512_1_conv1_h[Half(-2,128,19,19)]
Layer(CaskConvolution): layer_512_1_conv_expand_h, Tactic: -1716393687483585322, layer_512_1_bn1[Half(-2,256,19,19)] -> layer_512_1_conv_expand_h[Half(-2,256,19,19)]
Layer(CaskConvolution): layer_512_1_conv2_h + layer_512_1_sum, Tactic: -4212163711445252890, layer_512_1_conv1_h[Half(-2,128,19,19)], layer_512_1_conv_expand_h[Half(-2,256,19,19)] -> (Unnamed Layer* 42) [ElementWise]_output[Half(-2,256,19,19)]
Layer(Scale): last_bn_h + last_scale_h + last_relu, Tactic: 0, (Unnamed Layer* 42) [ElementWise]_output[Half(-2,256,19,19)] -> fc7[Half(-2,256,19,19)]
Layer(CaskConvolution): conv6_1_h + conv6_1_relu, Tactic: -1716393687483585322, fc7[Half(-2,256,19,19)] -> conv6_1_h[Half(-2,128,19,19)]
Layer(CaskConvolution): conv6_2_h + conv6_2_relu, Tactic: -4212163711445252890, conv6_1_h[Half(-2,128,19,19)] -> conv6_2_h[Half(-2,256,10,10)]
Layer(CaskConvolution): conv7_1_h + conv7_1_relu, Tactic: 8163473458334948789, conv6_2_h[Half(-2,256,10,10)] -> conv7_1_h[Half(-2,64,10,10)]
Layer(CaskConvolution): conv7_2_h + conv7_2_relu, Tactic: -4212163711445252890, conv7_1_h[Half(-2,64,10,10)] -> conv7_2_h[Half(-2,128,5,5)]
Layer(FusedConvActConvolution): conv8_1_h + conv8_1_relu, Tactic: 8716287, conv7_2_h[Half(-2,128,5,5)] -> conv8_1_h[Half(-2,64,5,5)]
Layer(FusedConvActConvolution): conv8_2_h + conv8_2_relu, Tactic: 3145727, conv8_1_h[Half(-2,64,5,5)] -> conv8_2_h[Half(-2,128,5,5)]
Layer(FusedConvActConvolution): conv9_1_h + conv9_1_relu, Tactic: 8716287, conv8_2_h[Half(-2,128,5,5)] -> conv9_1_h[Half(-2,64,5,5)]
Layer(FusedConvActConvolution): conv9_2_h + conv9_2_relu, Tactic: 3145727, conv9_1_h[Half(-2,64,5,5)] -> conv9_2_h[Half(-2,128,5,5)]
Layer(Reformat): Reformatting CopyNode for Input Tensor 0 to conv4_3_norm, Tactic: 0, layer_256_1_bn1[Half(-2,128,38,38)] -> Reformatted Input Tensor 0 to conv4_3_norm[Float(-2,128,38,38)]
Layer(PluginV2): conv4_3_norm, Tactic: 0, Reformatted Input Tensor 0 to conv4_3_norm[Float(-2,128,38,38)] -> conv4_3_norm[Float(-2,128,38,38)]
Layer(Reformat): Reformatting CopyNode for Input Tensor 0 to conv4_3_norm_mbox_loc || conv4_3_norm_mbox_conf, Tactic: 0, conv4_3_norm[Float(-2,128,38,38)] -> Reformatted Input Tensor 0 to conv4_3_norm_mbox_loc || conv4_3_norm_mbox_conf[Half(-2,128,38,38)]
Layer(CaskConvolution): conv4_3_norm_mbox_loc || conv4_3_norm_mbox_conf, Tactic: 4772821744921268633, Reformatted Input Tensor 0 to conv4_3_norm_mbox_loc || conv4_3_norm_mbox_conf[Half(-2,128,38,38)] -> Reformatted Output Tensor 0 to conv4_3_norm_mbox_loc || conv4_3_norm_mbox_conf[Half(-2,24,38,38)]
Layer(Reformat): Reformatting CopyNode for Output Tensor 0 to conv4_3_norm_mbox_loc || conv4_3_norm_mbox_conf, Tactic: 0, Reformatted Output Tensor 0 to conv4_3_norm_mbox_loc || conv4_3_norm_mbox_conf[Half(-2,24,38,38)] -> conv4_3_norm_mbox_loc || conv4_3_norm_mbox_conf[Half(-2,24,38,38)]
Layer(Shuffle): conv4_3_norm_mbox_loc_perm + conv4_3_norm_mbox_loc_flat, Tactic: 0, conv4_3_norm_mbox_loc || conv4_3_norm_mbox_conf[Half(-2,16,38,38)] -> mbox_loc[Half(-2,23104,1,1)]
Layer(Shuffle): conv4_3_norm_mbox_conf_perm + conv4_3_norm_mbox_conf_flat, Tactic: 0, conv4_3_norm_mbox_loc || conv4_3_norm_mbox_conf[Half(-2,8,38,38)] -> mbox_conf[Half(-2,11552,1,1)]
Layer(PluginV2): conv4_3_norm_mbox_priorbox, Tactic: 0, conv4_3_norm[Float(-2,128,38,38)], data[Float(-2,3,300,300)] -> conv4_3_norm_mbox_priorbox[Float(-2,2,23104,1)]
Layer(CaskConvolution): fc7_mbox_loc || fc7_mbox_conf, Tactic: -2409163523992614473, fc7[Half(-2,256,19,19)] -> Reformatted Output Tensor 0 to fc7_mbox_loc || fc7_mbox_conf[Half(-2,36,19,19)]
Layer(Reformat): Reformatting CopyNode for Output Tensor 0 to fc7_mbox_loc || fc7_mbox_conf, Tactic: 0, Reformatted Output Tensor 0 to fc7_mbox_loc || fc7_mbox_conf[Half(-2,36,19,19)] -> fc7_mbox_loc || fc7_mbox_conf[Half(-2,36,19,19)]
Layer(Shuffle): fc7_mbox_loc_perm + fc7_mbox_loc_flat, Tactic: 0, fc7_mbox_loc || fc7_mbox_conf[Half(-2,24,19,19)] -> mbox_loc[Half(-2,8664,1,1)]
Layer(Shuffle): fc7_mbox_conf_perm + fc7_mbox_conf_flat, Tactic: 0, fc7_mbox_loc || fc7_mbox_conf[Half(-2,12,19,19)] -> mbox_conf[Half(-2,4332,1,1)]
Layer(Reformat): Reformatting CopyNode for Input Tensor 0 to fc7_mbox_priorbox, Tactic: 0, fc7[Half(-2,256,19,19)] -> Reformatted Input Tensor 0 to fc7_mbox_priorbox[Float(-2,256,19,19)]
Layer(PluginV2): fc7_mbox_priorbox, Tactic: 0, Reformatted Input Tensor 0 to fc7_mbox_priorbox[Float(-2,256,19,19)], data[Float(-2,3,300,300)] -> fc7_mbox_priorbox[Float(-2,2,8664,1)]
Layer(FusedConvActConvolution): conv6_2_mbox_loc || conv6_2_mbox_conf, Tactic: 720895, conv6_2_h[Half(-2,256,10,10)] -> Reformatted Output Tensor 0 to conv6_2_mbox_loc || conv6_2_mbox_conf[Half(-2,36,10,10)]
Layer(Reformat): Reformatting CopyNode for Output Tensor 0 to conv6_2_mbox_loc || conv6_2_mbox_conf, Tactic: 0, Reformatted Output Tensor 0 to conv6_2_mbox_loc || conv6_2_mbox_conf[Half(-2,36,10,10)] -> conv6_2_mbox_loc || conv6_2_mbox_conf[Half(-2,36,10,10)]
Layer(Shuffle): conv6_2_mbox_loc_perm + conv6_2_mbox_loc_flat, Tactic: 0, conv6_2_mbox_loc || conv6_2_mbox_conf[Half(-2,24,10,10)] -> mbox_loc[Half(-2,2400,1,1)]
Layer(Shuffle): conv6_2_mbox_conf_perm + conv6_2_mbox_conf_flat, Tactic: 0, conv6_2_mbox_loc || conv6_2_mbox_conf[Half(-2,12,10,10)] -> mbox_conf[Half(-2,1200,1,1)]
Layer(Reformat): Reformatting CopyNode for Input Tensor 0 to conv6_2_mbox_priorbox, Tactic: 0, conv6_2_h[Half(-2,256,10,10)] -> Reformatted Input Tensor 0 to conv6_2_mbox_priorbox[Float(-2,256,10,10)]
Layer(PluginV2): conv6_2_mbox_priorbox, Tactic: 0, Reformatted Input Tensor 0 to conv6_2_mbox_priorbox[Float(-2,256,10,10)], data[Float(-2,3,300,300)] -> conv6_2_mbox_priorbox[Float(-2,2,2400,1)]
Layer(FusedConvActConvolution): conv7_2_mbox_loc || conv7_2_mbox_conf, Tactic: 10485759, conv7_2_h[Half(-2,128,5,5)] -> Reformatted Output Tensor 0 to conv7_2_mbox_loc || conv7_2_mbox_conf[Half(-2,36,5,5)]
Layer(Reformat): Reformatting CopyNode for Output Tensor 0 to conv7_2_mbox_loc || conv7_2_mbox_conf, Tactic: 0, Reformatted Output Tensor 0 to conv7_2_mbox_loc || conv7_2_mbox_conf[Half(-2,36,5,5)] -> conv7_2_mbox_loc || conv7_2_mbox_conf[Half(-2,36,5,5)]
Layer(Shuffle): conv7_2_mbox_loc_perm + conv7_2_mbox_loc_flat, Tactic: 0, conv7_2_mbox_loc || conv7_2_mbox_conf[Half(-2,24,5,5)] -> mbox_loc[Half(-2,600,1,1)]
Layer(Shuffle): conv7_2_mbox_conf_perm + conv7_2_mbox_conf_flat, Tactic: 0, conv7_2_mbox_loc || conv7_2_mbox_conf[Half(-2,12,5,5)] -> mbox_conf[Half(-2,300,1,1)]
Layer(Reformat): Reformatting CopyNode for Input Tensor 0 to conv7_2_mbox_priorbox, Tactic: 0, conv7_2_h[Half(-2,128,5,5)] -> Reformatted Input Tensor 0 to conv7_2_mbox_priorbox[Float(-2,128,5,5)]
Layer(PluginV2): conv7_2_mbox_priorbox, Tactic: 0, Reformatted Input Tensor 0 to conv7_2_mbox_priorbox[Float(-2,128,5,5)], data[Float(-2,3,300,300)] -> conv7_2_mbox_priorbox[Float(-2,2,600,1)]
Layer(FusedConvActConvolution): conv8_2_mbox_loc || conv8_2_mbox_conf, Tactic: 10682367, conv8_2_h[Half(-2,128,5,5)] -> Reformatted Output Tensor 0 to conv8_2_mbox_loc || conv8_2_mbox_conf[Half(-2,24,5,5)]
Layer(Reformat): Reformatting CopyNode for Output Tensor 0 to conv8_2_mbox_loc || conv8_2_mbox_conf, Tactic: 0, Reformatted Output Tensor 0 to conv8_2_mbox_loc || conv8_2_mbox_conf[Half(-2,24,5,5)] -> conv8_2_mbox_loc || conv8_2_mbox_conf[Half(-2,24,5,5)]
Layer(Shuffle): conv8_2_mbox_loc_perm + conv8_2_mbox_loc_flat, Tactic: 0, conv8_2_mbox_loc || conv8_2_mbox_conf[Half(-2,16,5,5)] -> mbox_loc[Half(-2,400,1,1)]
Layer(Shuffle): conv8_2_mbox_conf_perm + conv8_2_mbox_conf_flat, Tactic: 0, conv8_2_mbox_loc || conv8_2_mbox_conf[Half(-2,8,5,5)] -> mbox_conf[Half(-2,200,1,1)]
Layer(Reformat): Reformatting CopyNode for Input Tensor 0 to conv8_2_mbox_priorbox, Tactic: 0, conv8_2_h[Half(-2,128,5,5)] -> Reformatted Input Tensor 0 to conv8_2_mbox_priorbox[Float(-2,128,5,5)]
Layer(PluginV2): conv8_2_mbox_priorbox, Tactic: 0, Reformatted Input Tensor 0 to conv8_2_mbox_priorbox[Float(-2,128,5,5)], data[Float(-2,3,300,300)] -> conv8_2_mbox_priorbox[Float(-2,2,400,1)]
Layer(FusedConvActConvolution): conv9_2_mbox_loc || conv9_2_mbox_conf, Tactic: 10682367, conv9_2_h[Half(-2,128,5,5)] -> Reformatted Output Tensor 0 to conv9_2_mbox_loc || conv9_2_mbox_conf[Half(-2,24,5,5)]
Layer(Reformat): Reformatting CopyNode for Output Tensor 0 to conv9_2_mbox_loc || conv9_2_mbox_conf, Tactic: 0, Reformatted Output Tensor 0 to conv9_2_mbox_loc || conv9_2_mbox_conf[Half(-2,24,5,5)] -> conv9_2_mbox_loc || conv9_2_mbox_conf[Half(-2,24,5,5)]
Layer(Shuffle): conv9_2_mbox_loc_perm + conv9_2_mbox_loc_flat, Tactic: 0, conv9_2_mbox_loc || conv9_2_mbox_conf[Half(-2,16,5,5)] -> mbox_loc[Half(-2,400,1,1)]
Layer(Shuffle): conv9_2_mbox_conf_perm + conv9_2_mbox_conf_flat, Tactic: 0, conv9_2_mbox_loc || conv9_2_mbox_conf[Half(-2,8,5,5)] -> mbox_conf[Half(-2,200,1,1)]
Layer(Reformat): Reformatting CopyNode for Input Tensor 0 to conv9_2_mbox_priorbox, Tactic: 0, conv9_2_h[Half(-2,128,5,5)] -> Reformatted Input Tensor 0 to conv9_2_mbox_priorbox[Float(-2,128,5,5)]
Layer(PluginV2): conv9_2_mbox_priorbox, Tactic: 0, Reformatted Input Tensor 0 to conv9_2_mbox_priorbox[Float(-2,128,5,5)], data[Float(-2,3,300,300)] -> conv9_2_mbox_priorbox[Float(-2,2,400,1)]
Layer(Reformat): conv4_3_norm_mbox_priorbox copy, Tactic: 1002, conv4_3_norm_mbox_priorbox[Float(-2,2,23104,1)] -> mbox_priorbox[Half(-2,2,23104,1)]
Layer(Reformat): fc7_mbox_priorbox copy, Tactic: 1002, fc7_mbox_priorbox[Float(-2,2,8664,1)] -> mbox_priorbox[Half(-2,2,8664,1)]
Layer(Reformat): conv6_2_mbox_priorbox copy, Tactic: 0, conv6_2_mbox_priorbox[Float(-2,2,2400,1)] -> mbox_priorbox[Half(-2,2,2400,1)]
Layer(Reformat): conv7_2_mbox_priorbox copy, Tactic: 0, conv7_2_mbox_priorbox[Float(-2,2,600,1)] -> mbox_priorbox[Half(-2,2,600,1)]
Layer(Reformat): conv8_2_mbox_priorbox copy, Tactic: 0, conv8_2_mbox_priorbox[Float(-2,2,400,1)] -> mbox_priorbox[Half(-2,2,400,1)]
Layer(Reformat): conv9_2_mbox_priorbox copy, Tactic: 0, conv9_2_mbox_priorbox[Float(-2,2,400,1)] -> mbox_priorbox[Half(-2,2,400,1)]
Layer(NoOp): mbox_conf_reshape, Tactic: 0, mbox_conf[Half(-2,17784,1,1)] -> mbox_conf_reshape[Half(-2,8892,2)]
Layer(CudaSoftMax): mbox_conf_softmax, Tactic: 1001, mbox_conf_reshape[Half(-2,8892,2)] -> mbox_conf_softmax[Half(-2,8892,2)]
Layer(NoOp): mbox_conf_flatten, Tactic: 0, mbox_conf_softmax[Half(-2,8892,2)] -> mbox_conf_flatten[Half(-2,17784,1,1)]
Layer(PluginV2): detection_out, Tactic: 0, mbox_loc[Half(-2,35568,1,1)], mbox_conf_flatten[Half(-2,17784,1,1)], mbox_priorbox[Half(-2,2,35568,1)] -> Reformatted Output Tensor 0 to detection_out[Half(-2,1,200,7)], keep_count[Float(-2,1,1,1)]
Layer(Reformat): Reformatting CopyNode for Output Tensor 0 to detection_out, Tactic: 0, Reformatted Output Tensor 0 to detection_out[Half(-2,1,200,7)] -> detection_out[Float(-2,1,200,7)]
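On a related note: would dumping the layer information from the saved engine with the engine inspector be the right way to list the final (fused) layers? Below is a rough, unverified sketch of what I mean, assuming the IEngineInspector Python API available from TensorRT 8.2 and an engine saved as model.engine (placeholder name):

    # Sketch: dump per-layer info from a saved engine via IEngineInspector (TensorRT >= 8.2).
    # "model.engine" is a placeholder for the engine produced with trtexec --saveEngine.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    runtime = trt.Runtime(logger)

    with open("model.engine", "rb") as f:
        engine = runtime.deserialize_cuda_engine(f.read())

    inspector = engine.create_engine_inspector()
    # ONELINE prints one line per engine layer (fused layers show combined names);
    # JSON gives more detail. Full details may require building the engine with
    # detailed profiling verbosity.
    print(inspector.get_engine_information(trt.LayerInformationFormat.ONELINE))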


Environment

TensorRT: 8.2.1
GPU: Tegra X1
Nvidia Jetpack Version: 4.6.2
CUDA Version: 10.2
Python Version: 3
Device: Jetson Nano Developer Kit 4GB (model B)

Relevant Files

See the attached verbose build report:
resultverbose.pdf (1.4 MB)