How to know what types of optimization have been applied to my model when using trtexec

Hello all,
I have converted my model from Caffe to TensorRT using the trtexec command, with the precision set to FP16 and the max batch size to 1.
Is there any way to know whether trtexec has applied layer fusion or model pruning to my model? See the report attached below. Also, how can I extract the memory performance figures from this report?
Thanks in advance

Environment

JetPack 4.6.2
TensorRT 8.2.1
CUDA 10.2
Python 3
Jetson Nano 4GB

Relevant Files

[02/11/2023-18:14:23] [I] === Model Options ===
[02/11/2023-18:14:23] [I] Format: Caffe
[02/11/2023-18:14:23] [I] Output: detection_out
[02/11/2023-18:14:23] [I] === Build Options ===
[02/11/2023-18:14:23] [I] Max batch: 1
[02/11/2023-18:14:23] [I] Workspace: 16 MiB
[02/11/2023-18:14:23] [I] minTiming: 1
[02/11/2023-18:14:23] [I] avgTiming: 8
[02/11/2023-18:14:23] [I] Precision: FP32+FP16
[02/11/2023-18:14:23] [I] Calibration: 
[02/11/2023-18:14:23] [I] Refit: Disabled
[02/11/2023-18:14:23] [I] Sparsity: Disabled
[02/11/2023-18:14:23] [I] Safe mode: Disabled
[02/11/2023-18:14:23] [I] DirectIO mode: Disabled
[02/11/2023-18:14:23] [I] Restricted mode: Disabled
[02/11/2023-18:14:23] [I] Save engine: /home/kamal/Desktop/FinalPaperCodes/caffe2TRT/facedetection16.trt
[02/11/2023-18:14:23] [I] Load engine: 
[02/11/2023-18:14:23] [I] Profiling verbosity: 0
[02/11/2023-18:14:23] [I] Tactic sources: Using default tactic sources
[02/11/2023-18:14:23] [I] timingCacheMode: local
[02/11/2023-18:14:23] [I] timingCacheFile: 
[02/11/2023-18:14:23] [I] Input(s)s format: fp32:CHW
[02/11/2023-18:14:23] [I] Output(s)s format: fp32:CHW
[02/11/2023-18:14:23] [I] Input build shapes: model
[02/11/2023-18:14:23] [I] Input calibration shapes: model
[02/11/2023-18:14:23] [I] === System Options ===
[02/11/2023-18:14:23] [I] Device: 0
[02/11/2023-18:14:23] [I] DLACore: 
[02/11/2023-18:14:23] [I] Plugins:
[02/11/2023-18:14:23] [I] === Inference Options ===
[02/11/2023-18:14:23] [I] Batch: 1
[02/11/2023-18:14:23] [I] Input inference shapes: model
[02/11/2023-18:14:23] [I] Iterations: 10
[02/11/2023-18:14:23] [I] Duration: 3s (+ 200ms warm up)
[02/11/2023-18:14:23] [I] Sleep time: 0ms
[02/11/2023-18:14:23] [I] Idle time: 0ms
[02/11/2023-18:14:23] [I] Streams: 1
[02/11/2023-18:14:23] [I] ExposeDMA: Disabled
[02/11/2023-18:14:23] [I] Data transfers: Enabled
[02/11/2023-18:14:23] [I] Spin-wait: Disabled
[02/11/2023-18:14:23] [I] Multithreading: Disabled
[02/11/2023-18:14:23] [I] CUDA Graph: Disabled
[02/11/2023-18:14:23] [I] Separate profiling: Disabled
[02/11/2023-18:14:23] [I] Time Deserialize: Disabled
[02/11/2023-18:14:23] [I] Time Refit: Disabled
[02/11/2023-18:14:23] [I] Skip inference: Disabled
[02/11/2023-18:14:23] [I] Inputs:
[02/11/2023-18:14:23] [I] === Reporting Options ===
[02/11/2023-18:14:23] [I] Verbose: Disabled
[02/11/2023-18:14:23] [I] Averages: 10 inferences
[02/11/2023-18:14:23] [I] Percentile: 99
[02/11/2023-18:14:23] [I] Dump refittable layers:Disabled
[02/11/2023-18:14:23] [I] Dump output: Disabled
[02/11/2023-18:14:23] [I] Profile: Disabled
[02/11/2023-18:14:23] [I] Export timing to JSON file: 
[02/11/2023-18:14:23] [I] Export output to JSON file: 
[02/11/2023-18:14:23] [I] Export profile to JSON file: 
[02/11/2023-18:14:23] [I] 
[02/11/2023-18:14:23] [I] === Device Information ===
[02/11/2023-18:14:23] [I] Selected Device: NVIDIA Tegra X1
[02/11/2023-18:14:23] [I] Compute Capability: 5.3
[02/11/2023-18:14:23] [I] SMs: 1
[02/11/2023-18:14:23] [I] Compute Clock Rate: 0.9216 GHz
[02/11/2023-18:14:23] [I] Device Global Memory: 3956 MiB
[02/11/2023-18:14:23] [I] Shared Memory per SM: 64 KiB
[02/11/2023-18:14:23] [I] Memory Bus Width: 64 bits (ECC disabled)
[02/11/2023-18:14:23] [I] Memory Clock Rate: 0.01275 GHz
[02/11/2023-18:14:23] [I] 
[02/11/2023-18:14:23] [I] TensorRT version: 8.2.1
[02/11/2023-18:14:25] [I] [TRT] [MemUsageChange] Init CUDA: CPU +229, GPU +0, now: CPU 248, GPU 3608 (MiB)
[02/11/2023-18:14:26] [I] [TRT] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 248 MiB, GPU 3608 MiB
[02/11/2023-18:14:26] [I] [TRT] [MemUsageSnapshot] End constructing builder kernel library: CPU 278 MiB, GPU 3635 MiB
[02/11/2023-18:14:26] [I] Start parsing network model
[02/11/2023-18:14:26] [I] Finish parsing network model
[02/11/2023-18:14:26] [I] [TRT] ---------- Layers Running on DLA ----------
[02/11/2023-18:14:26] [I] [TRT] ---------- Layers Running on GPU ----------
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] data_bn + data_scale
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv1_h + conv1_relu
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv1_pool
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] layer_64_1_conv1_h + layer_64_1_relu2
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] layer_64_1_conv2_h + layer_64_1_sum
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] layer_128_1_bn1_h + layer_128_1_scale1_h + layer_128_1_relu1
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] layer_128_1_conv1_h + layer_128_1_relu2
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] layer_128_1_conv_expand_h
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] layer_128_1_conv2 + layer_128_1_sum
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] layer_256_1_bn1 + layer_256_1_scale1 + layer_256_1_relu1
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] layer_256_1_conv1 + layer_256_1_relu2
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] layer_256_1_conv_expand
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] layer_256_1_conv2 + layer_256_1_sum
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] layer_512_1_bn1 + layer_512_1_scale1 + layer_512_1_relu1
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] layer_512_1_conv1_h + layer_512_1_relu2
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] layer_512_1_conv_expand_h
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] layer_512_1_conv2_h + layer_512_1_sum
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] last_bn_h + last_scale_h + last_relu
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv6_1_h + conv6_1_relu
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv6_2_h + conv6_2_relu
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv7_1_h + conv7_1_relu
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv7_2_h + conv7_2_relu
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv8_1_h + conv8_1_relu
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv8_2_h + conv8_2_relu
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv9_1_h + conv9_1_relu
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv9_2_h + conv9_2_relu
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv4_3_norm
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv4_3_norm_mbox_loc || conv4_3_norm_mbox_conf
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv4_3_norm_mbox_loc_perm + conv4_3_norm_mbox_loc_flat
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv4_3_norm_mbox_conf_perm + conv4_3_norm_mbox_conf_flat
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv4_3_norm_mbox_priorbox
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] fc7_mbox_loc || fc7_mbox_conf
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] fc7_mbox_loc_perm + fc7_mbox_loc_flat
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] fc7_mbox_conf_perm + fc7_mbox_conf_flat
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] fc7_mbox_priorbox
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv6_2_mbox_loc || conv6_2_mbox_conf
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv6_2_mbox_loc_perm + conv6_2_mbox_loc_flat
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv6_2_mbox_conf_perm + conv6_2_mbox_conf_flat
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv6_2_mbox_priorbox
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv7_2_mbox_loc || conv7_2_mbox_conf
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv7_2_mbox_loc_perm + conv7_2_mbox_loc_flat
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv7_2_mbox_conf_perm + conv7_2_mbox_conf_flat
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv7_2_mbox_priorbox
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv8_2_mbox_loc || conv8_2_mbox_conf
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv8_2_mbox_loc_perm + conv8_2_mbox_loc_flat
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv8_2_mbox_conf_perm + conv8_2_mbox_conf_flat
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv8_2_mbox_priorbox
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv9_2_mbox_loc || conv9_2_mbox_conf
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv9_2_mbox_loc_perm + conv9_2_mbox_loc_flat
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv9_2_mbox_conf_perm + conv9_2_mbox_conf_flat
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv9_2_mbox_priorbox
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv4_3_norm_mbox_priorbox copy
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] fc7_mbox_priorbox copy
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv6_2_mbox_priorbox copy
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv7_2_mbox_priorbox copy
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv8_2_mbox_priorbox copy
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] conv9_2_mbox_priorbox copy
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] mbox_conf_reshape
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] mbox_conf_softmax
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] mbox_conf_flatten
[02/11/2023-18:14:26] [I] [TRT] [GpuLayer] detection_out
[02/11/2023-18:14:29] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +158, GPU -26, now: CPU 468, GPU 3587 (MiB)
[02/11/2023-18:14:33] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +241, GPU -3, now: CPU 709, GPU 3584 (MiB)
[02/11/2023-18:14:33] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[02/11/2023-18:15:08] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[02/11/2023-18:16:13] [I] [TRT] Detected 1 inputs and 1 output network tensors.
[02/11/2023-18:16:13] [I] [TRT] Total Host Persistent Memory: 51328
[02/11/2023-18:16:13] [I] [TRT] Total Device Persistent Memory: 5534208
[02/11/2023-18:16:13] [I] [TRT] Total Scratch Memory: 434176
[02/11/2023-18:16:13] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 8 MiB, GPU 21 MiB
[02/11/2023-18:16:13] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 22.407ms to assign 10 blocks to 62 nodes requiring 4691456 bytes.
[02/11/2023-18:16:13] [I] [TRT] Total Activation Memory: 4691456
[02/11/2023-18:16:13] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +6, now: CPU 957, GPU 3800 (MiB)
[02/11/2023-18:16:13] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +1, GPU +2, now: CPU 958, GPU 3802 (MiB)
[02/11/2023-18:16:13] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +5, GPU +8, now: CPU 5, GPU 8 (MiB)
[02/11/2023-18:16:13] [I] [TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 954, GPU 3808 (MiB)
[02/11/2023-18:16:13] [I] [TRT] Loaded engine size: 6 MiB
[02/11/2023-18:16:13] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 954, GPU 3808 (MiB)
[02/11/2023-18:16:13] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 954, GPU 3808 (MiB)
[02/11/2023-18:16:13] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +5, now: CPU 0, GPU 5 (MiB)
[02/11/2023-18:16:13] [I] Engine built in 109.88 sec.
[02/11/2023-18:16:13] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 914, GPU 3816 (MiB)
[02/11/2023-18:16:13] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +1, now: CPU 914, GPU 3817 (MiB)
[02/11/2023-18:16:13] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +10, now: CPU 0, GPU 15 (MiB)
[02/11/2023-18:16:13] [I] Using random values for input data
[02/11/2023-18:16:13] [I] Created input binding for data with dimensions 3x300x300
[02/11/2023-18:16:13] [I] Using random values for output detection_out
[02/11/2023-18:16:13] [I] Created output binding for detection_out with dimensions 1x200x7
[02/11/2023-18:16:13] [I] Starting inference
[02/11/2023-18:16:16] [I] Warmup completed 15 queries over 200 ms
[02/11/2023-18:16:16] [I] Timing trace has 253 queries over 3.01328 s
[02/11/2023-18:16:16] [I] 
[02/11/2023-18:16:16] [I] === Trace details ===
[02/11/2023-18:16:16] [I] Trace averages of 10 runs:
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.9803 ms - Host latency: 12.0944 ms (end to end 12.1054 ms, enqueue 4.94264 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7885 ms - Host latency: 11.9033 ms (end to end 11.9142 ms, enqueue 4.07983 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7851 ms - Host latency: 11.8992 ms (end to end 11.9102 ms, enqueue 3.94636 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7735 ms - Host latency: 11.8871 ms (end to end 11.8979 ms, enqueue 3.95506 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.771 ms - Host latency: 11.8844 ms (end to end 11.8954 ms, enqueue 4.29905 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.853 ms - Host latency: 11.9667 ms (end to end 11.9777 ms, enqueue 3.99654 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.756 ms - Host latency: 11.8697 ms (end to end 11.8809 ms, enqueue 5.29893 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7458 ms - Host latency: 11.8598 ms (end to end 11.8708 ms, enqueue 6.24279 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7496 ms - Host latency: 11.8637 ms (end to end 11.8746 ms, enqueue 5.90843 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7694 ms - Host latency: 11.8825 ms (end to end 11.8935 ms, enqueue 4.7172 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.8486 ms - Host latency: 11.9634 ms (end to end 11.9744 ms, enqueue 5.13914 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7715 ms - Host latency: 11.8859 ms (end to end 11.897 ms, enqueue 4.30391 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7534 ms - Host latency: 11.8668 ms (end to end 11.878 ms, enqueue 4.66155 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7684 ms - Host latency: 11.8818 ms (end to end 11.8927 ms, enqueue 5.19998 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7729 ms - Host latency: 11.8866 ms (end to end 11.8975 ms, enqueue 3.7774 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.8268 ms - Host latency: 11.9401 ms (end to end 11.9526 ms, enqueue 5.23444 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7552 ms - Host latency: 11.8691 ms (end to end 11.8799 ms, enqueue 5.8259 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7518 ms - Host latency: 11.8654 ms (end to end 11.8763 ms, enqueue 5.24194 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7636 ms - Host latency: 11.8778 ms (end to end 11.8883 ms, enqueue 4.99146 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7434 ms - Host latency: 11.8583 ms (end to end 11.8693 ms, enqueue 5.32524 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.8449 ms - Host latency: 11.9585 ms (end to end 11.9697 ms, enqueue 5.48176 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7705 ms - Host latency: 11.8834 ms (end to end 11.8945 ms, enqueue 4.96077 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7657 ms - Host latency: 11.8789 ms (end to end 11.8898 ms, enqueue 4.58608 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7515 ms - Host latency: 11.8644 ms (end to end 11.8754 ms, enqueue 6.05349 ms)
[02/11/2023-18:16:16] [I] Average on 10 runs - GPU latency: 11.7641 ms - Host latency: 11.8779 ms (end to end 11.889 ms, enqueue 5.54666 ms)
[02/11/2023-18:16:16] [I] 
[02/11/2023-18:16:16] [I] === Performance summary ===
[02/11/2023-18:16:16] [I] Throughput: 83.9616 qps
[02/11/2023-18:16:16] [I] Latency: min = 11.7981 ms, max = 13.8956 ms, mean = 11.8986 ms, median = 11.8766 ms, percentile(99%) = 12.6138 ms
[02/11/2023-18:16:16] [I] End-to-End Host Latency: min = 11.8091 ms, max = 13.9058 ms, mean = 11.9096 ms, median = 11.8877 ms, percentile(99%) = 12.6245 ms
[02/11/2023-18:16:16] [I] Enqueue Time: min = 2.92126 ms, max = 8.55396 ms, mean = 4.94308 ms, median = 4.8728 ms, percentile(99%) = 7.71155 ms
[02/11/2023-18:16:16] [I] H2D Latency: min = 0.105225 ms, max = 0.117065 ms, mean = 0.109309 ms, median = 0.109192 ms, percentile(99%) = 0.11499 ms
[02/11/2023-18:16:16] [I] GPU Compute Time: min = 11.6844 ms, max = 13.7804 ms, mean = 11.7848 ms, median = 11.7629 ms, percentile(99%) = 12.4998 ms
[02/11/2023-18:16:16] [I] D2H Latency: min = 0.00317383 ms, max = 0.00537109 ms, mean = 0.00448361 ms, median = 0.0045166 ms, percentile(99%) = 0.00512695 ms
[02/11/2023-18:16:16] [I] Total Host Walltime: 3.01328 s
[02/11/2023-18:16:16] [I] Total GPU Compute Time: 2.98156 s
[02/11/2023-18:16:16] [I] Explanations of the performance metrics are printed in the verbose logs.
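The memory figures asked about are already printed by the builder in the log above: Total Host Persistent Memory, Total Device Persistent Memory, Total Scratch Memory, and Total Activation Memory (all in bytes), plus peak allocator usage in MiB. A minimal sketch for pulling them out of a saved copy of the log; the regexes assume the exact wording shown in this report, and the field names are my own:

```python
import re

# Memory lines as worded in the trtexec build log above;
# each value is a plain byte count.
PATTERNS = {
    "host_persistent_bytes": r"Total Host Persistent Memory: (\d+)",
    "device_persistent_bytes": r"Total Device Persistent Memory: (\d+)",
    "scratch_bytes": r"Total Scratch Memory: (\d+)",
    "activation_bytes": r"Total Activation Memory: (\d+)",
}

def extract_memory_stats(log_text):
    """Return the builder memory figures found in a trtexec log, in bytes."""
    stats = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, log_text)
        if match:
            stats[name] = int(match.group(1))
    return stats

# Example on the figures from the report above:
sample = (
    "[I] [TRT] Total Host Persistent Memory: 51328\n"
    "[I] [TRT] Total Device Persistent Memory: 5534208\n"
    "[I] [TRT] Total Scratch Memory: 434176\n"
    "[I] [TRT] Total Activation Memory: 4691456\n"
)
print(extract_memory_stats(sample))
```

For runtime (rather than build-time) memory behavior, the `[MemUsageChange]` lines give CPU/GPU deltas at each phase, and a `--dumpProfile`/`--exportProfile` run adds per-layer timing that can be parsed the same way.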

Hi @KamalLAGH,
To check: are you using the --verbose flag while generating this output?
Thanks
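Part of the answer is visible even without --verbose: every [GpuLayer] line in the build log that joins names with " + " is the result of vertical layer fusion, and " || " marks horizontally merged layers. A quick sketch for filtering them out of a saved log; the sample lines are copied from the report above, and the log file name is a placeholder:

```shell
# Sample [GpuLayer] lines copied from the report above.
cat > build.log <<'EOF'
[I] [TRT] [GpuLayer] conv1_h + conv1_relu
[I] [TRT] [GpuLayer] conv1_pool
[I] [TRT] [GpuLayer] fc7_mbox_loc || fc7_mbox_conf
EOF

# " + " marks vertically fused layers, " || " horizontally merged ones;
# conv1_pool (unfused) is filtered out.
grep '\[GpuLayer\]' build.log | grep -F -e ' + ' -e ' || '
```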

Hi @AakankshaS,
Thank you for your reply.
Here is the report. Could you please explain what it means:
Adding reformat layer: Reformatted Output Tensor
Layer(Scale)
Layer(CaskConvolution)
Layer(TiledPooling)
Layer(Reformat)
Layer(Shuffle)
Layer(PluginV2)
Layer(FusedConvActConvolution)
Are these the fused layers?
Thanks in advance for your help
resultverbose.pdf (1.4 MB)
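Those Layer(...) entries name the implementation TensorRT selected for each engine layer in the verbose log: FusedConvActConvolution, for example, is a convolution fused with its activation, and Reformat layers are inserted to convert tensor layouts between kernels (reading of the names is mine; the detailed semantics are in the TensorRT documentation). A hedged sketch for tallying these layer types from a saved verbose log:

```python
import re
from collections import Counter

def tally_layer_types(log_text):
    """Count Layer(<Type>) occurrences in a verbose trtexec log."""
    return Counter(re.findall(r"Layer\((\w+)\)", log_text))

# Sample entries matching those quoted above:
sample = (
    "Layer(CaskConvolution)\n"
    "Layer(CaskConvolution)\n"
    "Layer(FusedConvActConvolution)\n"
    "Layer(Reformat)\n"
)
print(tally_layer_types(sample))
# → Counter({'CaskConvolution': 2, 'FusedConvActConvolution': 1, 'Reformat': 1})
```

Comparing the tally of fused types (and the number of engine layers versus layers in the original Caffe model) gives a rough picture of how much fusion the builder performed.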