Unable to generate and Load Engine files on deepstream-7.1

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)
• DeepStream Version
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type( questions, new requirements, bugs)
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

Hardware Platform: GPU
DeepStream Version: 7.1
TensorRT Version: 10.3.0 (tensorrt @ file:///opt/nvidia/deepstream/deepstream-7.1/TensorRT-10.3.0.26/python/tensorrt-10.3.0-cp310-none-linux_x86_64.whl)
NVIDIA GPU Driver Version: 560.35.05 (NVIDIA-SMI 560.35.05)

I believe this is due to the TensorRT installation.
How do I fix this?

WARNING: …/nvdsinfer/nvdsinfer_model_builder.cpp:1152 Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-7.1/nvodin24/models/assets/person/person.engine open error
0:00:00.223085955 41698 0x58409e6fe640 WARN nvinfer gstnvinfer.cpp:681:gst_nvinfer_logger: NvDsInferContext[UID 5]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2080> [UID = 5]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-7.1/nvodin24/models/assets/person/person.engine failed
0:00:00.223099354 41698 0x58409e6fe640 WARN nvinfer gstnvinfer.cpp:681:gst_nvinfer_logger: NvDsInferContext[UID 5]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2185> [UID = 5]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-7.1/nvodin24/models/assets/person/person.engine failed, try rebuild
0:00:00.223104923 41698 0x58409e6fe640 INFO nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger: NvDsInferContext[UID 5]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2106> [UID = 5]: Trying to create engine from model files
Connected to MQTT broker
Connected to MQTT broker
0:00:32.927030453 41698 0x58409e6fe640 INFO nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger: NvDsInferContext[UID 5]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2138> [UID = 5]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-7.1/nvodin24/models/assets/person/resnet18_trafficcamnet_pruned.onnx_b1_gpu0_int8.engine successfully
Implicit layer support has been deprecated
INFO: …/nvdsinfer/nvdsinfer_model_builder.cpp:327 [Implicit Engine Info]: layers num: 0

0:00:33.170395686 41698 0x58409e6fe640 INFO nvinfer gstnvinfer_impl.cpp:343:notifyL

From the log, the app is using both person.engine and the resnet18_trafficcamnet engine. Which engine cannot be generated and loaded? Could you share the whole log and the nvinfer config? Thanks!

person nvinfer config

################################################################################
# SPDX-FileCopyrightText: Copyright (c) 2019-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################

# Following properties are mandatory when engine files are not specified:
#   int8-calib-file(Only in INT8)
#   Caffemodel mandatory properties: model-file, proto-file, output-blob-names
#   UFF: uff-file, input-dims, uff-input-blob-name, output-blob-names
#   ONNX: onnx-file
#
# Mandatory properties for detectors:
#   num-detected-classes
#
# Optional properties for detectors:
#   cluster-mode(Default=Group Rectangles), interval(Primary mode only, Default=0)
#   custom-lib-path,
#   parse-bbox-func-name
#
# Mandatory properties for classifiers:
#   classifier-threshold, is-classifier
#
# Optional properties for classifiers:
#   classifier-async-mode(Secondary mode only, Default=false)
#
# Optional properties in secondary mode:
#   operate-on-gie-id(Default=0), operate-on-class-ids(Defaults to all classes),
#   input-object-min-width, input-object-min-height, input-object-max-width,
#   input-object-max-height
#
# Following properties are always recommended:
#   batch-size(Default=1)
#
# Other optional properties:
#   net-scale-factor(Default=1), network-mode(Default=0 i.e FP32),
#   model-color-format(Default=0 i.e. RGB) model-engine-file, labelfile-path,
#   mean-file, gie-unique-id(Default=0), offsets, process-mode (Default=1 i.e. primary),
#   custom-lib-path, network-mode(Default=0 i.e FP32)
#
# The values in the config file are overridden by values set through GObject
# properties.

[property]
gpu-id=0
net-scale-factor=0.00392156862745098
#tlt-model-key=tlt_encode
#tlt-encoded-model=/opt/nvidia/deepstream/deepstream/nvodin24/models/assets/person/person.etlt
onnx-file = /opt/nvidia/deepstream/deepstream/nvodin24/models/assets/person/resnet18_trafficcamnet_pruned.onnx
model-engine-file=/opt/nvidia/deepstream/deepstream/nvodin24/models/assets/person/person.engine
labelfile-path=/opt/nvidia/deepstream/deepstream/nvodin24/models/assets/person/labels.txt
int8-calib-file=/opt/nvidia/deepstream/deepstream/nvodin24/models/assets/person/cal_trt.bin
force-implicit-batch-dim=0
batch-size=8
process-mode=1
model-color-format=0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=4
interval=0
gie-unique-id=5
#uff-input-order=0
#uff-input-blob-name=input_1
#output-blob-names=output_cov/Sigmoid;output_bbox/BiasAdd
#scaling-filter=0
#scaling-compute-hw=0
cluster-mode=2
infer-dims=3;544;960

#[class-attrs-all]
#pre-cluster-threshold=0.1
#eps=0.1
#group-threshold=1



[class-attrs-0]
pre-cluster-threshold=0.9
eps=0.9
group-threshold=1


[class-attrs-1]
pre-cluster-threshold=0.9
eps=0.9
group-threshold=1

[class-attrs-2]
pre-cluster-threshold=0.2
eps=0.25
group-threshold=1

[class-attrs-3]
pre-cluster-threshold=0.9
eps=0.9
group-threshold=1
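As a quick sanity check on a config like the one above, the following is a minimal sketch that lists the file paths in the [property] section and reports whether each one exists on disk. It assumes the nvinfer config can be read as an INI file with Python's configparser; the helper names `referenced_files` and `missing_files` and the shortened `sample` text are illustrative only and are not part of DeepStream.

```python
import configparser
import os

# Keys from the [property] section above that point at files on disk.
PATH_KEYS = ("onnx-file", "model-engine-file", "labelfile-path", "int8-calib-file")


def referenced_files(config_text):
    """Return {key: path} for the file-path keys present in [property]."""
    # strict=False tolerates repeated keys; interpolation=None keeps '%' literal.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(config_text)
    section = parser["property"]
    return {key: section[key] for key in PATH_KEYS if key in section}


def missing_files(config_text):
    """Return the keys whose paths are not regular files on this machine."""
    paths = referenced_files(config_text)
    return [key for key, path in paths.items() if not os.path.isfile(path)]


# Shortened copy of the config posted in this thread (hypothetical excerpt).
sample = """
[property]
gpu-id=0
#tlt-model-key=tlt_encode
onnx-file = /opt/nvidia/deepstream/deepstream/nvodin24/models/assets/person/resnet18_trafficcamnet_pruned.onnx
model-engine-file=/opt/nvidia/deepstream/deepstream/nvodin24/models/assets/person/person.engine
infer-dims=3;544;960
"""

for key, path in referenced_files(sample).items():
    state = "exists" if os.path.isfile(path) else "missing"
    print(f"{key}: {path} ({state})")
```

If model-engine-file is reported as missing, that would be consistent with the "open error" warning in the first log, although it does not by itself explain the empty layer info.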

The loaded model is the resnet18_trafficcamnet model which is renamed to person.engine.

When it tries to rebuild the engine, I believe the build succeeds, but when the engine loads it prints this:
INFO: …/nvdsinfer/nvdsinfer_model_builder.cpp:327 [Implicit Engine Info]: layers num: 0

It should print the layer names, such as input and output, and their dimensions, right?

  1. Please set batch-size=1 in the cfg.
  2. If it still does not work, could you run “export NVDSINFER_LOG_LEVEL=3” first, then run the application to get more log output? Thanks!
  3. Did you only rename /opt/nvidia/deepstream/deepstream-7.1/nvodin24/models/assets/person/resnet18_trafficcamnet_pruned.onnx_b1_gpu0_int8.engine to /opt/nvidia/deepstream/deepstream/nvodin24/models/assets/person/person.engine? Could you use md5sum to check whether the two files are identical?
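The comparison asked for in step 3 can also be done from Python when md5sum is not at hand. This is a minimal sketch using the standard hashlib module. The helper names `md5_of` and `same_content` are illustrative, and the demo uses throwaway temporary files rather than the real engine files.

```python
import hashlib
import os
import tempfile


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files hash to the same MD5 digest."""
    return md5_of(path_a) == md5_of(path_b)


# Demo on temporary stand-in files (not real TensorRT engines).
with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "generated.engine")
    renamed = os.path.join(tmp, "person.engine")
    other = os.path.join(tmp, "other.engine")
    for path, payload in ((original, b"abc"), (renamed, b"abc"), (other, b"abd")):
        with open(path, "wb") as handle:
            handle.write(payload)
    identical = same_content(original, renamed)
    different = same_content(original, other)
    print("copy matches:", identical)
    print("other matches:", different)
```

For the case in this thread, `same_content` would be called with the generated `..._b1_gpu0_int8.engine` path and the `person.engine` path.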

Well that is the second step.

When the engine is generated, the layer info does not list any inputs or outputs:
0:00:33.497159746 43444 0x62a029a064c0 INFO nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger: NvDsInferContext[UID 5]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2138> [UID = 5]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-7.1/nvodin24/models/assets/person/resnet18_trafficcamnet_pruned.onnx_b1_gpu0_int8.engine successfully
Implicit layer support has been deprecated
INFO: …/nvdsinfer/nvdsinfer_model_builder.cpp:327 [Implicit Engine Info]: layers num: 0

Right after the success message the layer information normally gets printed, but in this case nothing is printed.
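One detail visible in the logs posted so far is that nvinfer tries to load one file name and writes the rebuilt engine under another. The sketch below pulls both paths out of a log so they can be compared. The regular expressions are assumptions based only on the message wording seen in this thread, and `engine_paths` is an illustrative helper, not a DeepStream API.

```python
import os
import re

# Patterns derived from the log lines quoted in this thread (assumed wording).
LOAD_RE = re.compile(r"deserialize engine from file :(\S+) failed")
SAVE_RE = re.compile(r"serialize cuda engine to file: (\S+) successfully")


def engine_paths(log_text):
    """Return (paths nvinfer failed to load, paths nvinfer wrote)."""
    return LOAD_RE.findall(log_text), SAVE_RE.findall(log_text)


# Two messages copied from the thread, with the timestamp prefixes trimmed.
sample_log = (
    "[UID = 5]: deserialize engine from file :"
    "/opt/nvidia/deepstream/deepstream-7.1/nvodin24/models/assets/person/"
    "person.engine failed\n"
    "[UID = 5]: serialize cuda engine to file: "
    "/opt/nvidia/deepstream/deepstream-7.1/nvodin24/models/assets/person/"
    "resnet18_trafficcamnet_pruned.onnx_b1_gpu0_int8.engine successfully\n"
)

requested, written = engine_paths(sample_log)
for path in requested:
    print("requested:", os.path.basename(path))
for path in written:
    print("written:  ", os.path.basename(path))
```

The written name carries a `_b1_` tag while the posted config sets batch-size=8. That is only an observation from the log and config, not a confirmed cause of the empty layer info.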

DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(block_4b_relu_1/Relu:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00452901
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00738574
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00444547
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu_1/Relu:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00888026 seconds. Fastest Tactic: 0x0000000000000000 Time: 0.00444547
DEBUG: [TRT]: *************** Autotuning Reformat: Int8(4080,2040:32,60,1) -> Float(28560,1:4,840,14) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(block_4b_relu_1/Relu:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00478049
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00732522
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00472338
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu_1/Relu:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00855875 seconds. Fastest Tactic: 0x0000000000000000 Time: 0.00472338
DEBUG: [TRT]: *************** Autotuning Reformat: Int8(4080,2040:32,60,1) -> Int8(28560,2040:4,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(block_4b_relu_1/Relu:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00374471
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00767224
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00344435
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu_1/Relu:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00951262 seconds. Fastest Tactic: 0x0000000000000000 Time: 0.00344435
DEBUG: [TRT]: *************** Autotuning Reformat: Int8(4080,2040:32,60,1) -> Int8(8160,1:16,240,4) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(block_4b_relu_1/Relu:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00520566
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00760097
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.0051671
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu_1/Relu:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00827865 seconds. Fastest Tactic: 0x0000000000000000 Time: 0.0051671
DEBUG: [TRT]: =============== Computing reformatting costs for available format set
DEBUG: [TRT]: =============== Computing reformatting costs: 
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(244800,2040,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x0000000000000000, 0.00382473 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(244800,2040,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x0000000000000000, 0.00382703 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(244800,2040,60,1) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00340419 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(244800,2040,60,1) -> Int8(16320,1:16,480,8)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00745437 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(244800,2040,60,1) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00748515 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(244800,1,7200,120) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00412215 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(244800,1,7200,120) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x0000000000000000, 0.0061442 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(244800,1,7200,120) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00737716 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(244800,1,7200,120) -> Int8(16320,1:16,480,8)] got cached result: Reformat, tactic 0x0000000000000000, 0.00679936 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(244800,1,7200,120) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00787225 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(61200,1:4,1800,30) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00386403 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(61200,1:4,1800,30) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00616185 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(61200,1:4,1800,30) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.010239 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(61200,1:4,1800,30) -> Int8(16320,1:16,480,8)] got cached result: Reformat, tactic 0x0000000000000000, 0.00727768 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(61200,1:4,1800,30) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00735884 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(8160,2040:32,60,1) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00718525 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(8160,2040:32,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x0000000000000000, 0.00631427 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(8160,2040:32,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x0000000000000000, 0.00679573 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(8160,2040:32,60,1) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00796826 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(8160,2040:32,60,1) -> Int8(16320,1:16,480,8)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00716164 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Float(8160,2040:32,60,1) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00724357 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(244800,2040,60,1) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00402866 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(244800,2040,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x0000000000000000, 0.00861484 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(244800,2040,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00729577 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(244800,2040,60,1) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00333952 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(244800,2040,60,1) -> Int8(16320,1:16,480,8)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00748492 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(244800,2040,60,1) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.0103318 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(61200,2040:4,60,1) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00421106 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(61200,2040:4,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00702266 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(61200,2040:4,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00749819 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(61200,2040:4,60,1) -> Int8(16320,1:16,480,8)] got cached result: Reformat, tactic 0x00000000000003ea, 0.0077166 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(61200,2040:4,60,1) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.0037299 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(16320,1:16,480,8) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00716732 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(16320,1:16,480,8) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00672661 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(16320,1:16,480,8) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00722428 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(16320,1:16,480,8) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00757745 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(16320,1:16,480,8) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00757855 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(8160,2040:32,60,1) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00736927 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(8160,2040:32,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00670955 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(8160,2040:32,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x0000000000000000, 0.00720499 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(8160,2040:32,60,1) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00373606 ms
DEBUG: [TRT]: Optimizer Reformat(block_4a_relu/Relu:0 -> <out>) [Int8(8160,2040:32,60,1) -> Int8(16320,1:16,480,8)] got cached result: Reformat, tactic 0x00000000000003ea, 0.0074688 ms
DEBUG: [TRT]: =============== Computing reformatting costs for available format set
DEBUG: [TRT]: =============== Computing reformatting costs: 
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(244800,2040,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x0000000000000000, 0.00382473 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(244800,2040,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x0000000000000000, 0.00382703 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(244800,2040,60,1) -> Float(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00718093 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(244800,2040,60,1) -> Int8(16320,1:16,480,8)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00745437 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(244800,2040,60,1) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00748515 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(244800,1,7200,120) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00412215 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(244800,1,7200,120) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x0000000000000000, 0.0061442 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(244800,1,7200,120) -> Float(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00809854 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(244800,1,7200,120) -> Int8(16320,1:16,480,8)] got cached result: Reformat, tactic 0x0000000000000000, 0.00679936 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(244800,1,7200,120) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00787225 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(61200,1:4,1800,30) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00386403 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(61200,1:4,1800,30) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00616185 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(61200,1:4,1800,30) -> Float(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00751621 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(61200,1:4,1800,30) -> Int8(16320,1:16,480,8)] got cached result: Reformat, tactic 0x0000000000000000, 0.00727768 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(61200,1:4,1800,30) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00735884 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(8160,2040:32,60,1) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00718525 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(8160,2040:32,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x0000000000000000, 0.00631427 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(8160,2040:32,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x0000000000000000, 0.00679573 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(8160,2040:32,60,1) -> Int8(16320,1:16,480,8)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00716164 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Float(8160,2040:32,60,1) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00724357 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Int8(16320,1:16,480,8) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00716732 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Int8(16320,1:16,480,8) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00672661 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Int8(16320,1:16,480,8) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00722428 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Int8(16320,1:16,480,8) -> Float(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00782139 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Int8(16320,1:16,480,8) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00757855 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Int8(8160,2040:32,60,1) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00736927 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Int8(8160,2040:32,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00670955 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Int8(8160,2040:32,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x0000000000000000, 0.00720499 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Int8(8160,2040:32,60,1) -> Float(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.0071537 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_bn_2/batchnorm/add_1:0 -> <out>) [Int8(8160,2040:32,60,1) -> Int8(16320,1:16,480,8)] got cached result: Reformat, tactic 0x00000000000003ea, 0.0074688 ms
DEBUG: [TRT]: =============== Computing reformatting costs for available format set
DEBUG: [TRT]: =============== Computing reformatting costs: 
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(244800,2040,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x0000000000000000, 0.00423372 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(244800,2040,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x0000000000000000, 0.00385165 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(244800,2040,60,1) -> Float(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.010022 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(244800,2040,60,1) -> Int8(244800,2040,60,1)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00389767 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(244800,2040,60,1) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.0035355 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(244800,2040,60,1) -> Int8(16320,1:16,480,8)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00730782 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(244800,2040,60,1) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00740127 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(244800,1,7200,120) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.0042012 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(244800,1,7200,120) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00613586 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(244800,1,7200,120) -> Float(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00796597 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(244800,1,7200,120) -> Int8(244800,2040,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.0074093 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(244800,1,7200,120) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00776434 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(244800,1,7200,120) -> Int8(16320,1:16,480,8)] got cached result: Reformat, tactic 0x0000000000000000, 0.00674816 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(244800,1,7200,120) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00724823 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(61200,1:4,1800,30) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00394667 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(61200,1:4,1800,30) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00611685 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(61200,1:4,1800,30) -> Float(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00726006 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(61200,1:4,1800,30) -> Int8(244800,2040,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00714576 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(61200,1:4,1800,30) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00744225 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(61200,1:4,1800,30) -> Int8(16320,1:16,480,8)] got cached result: Reformat, tactic 0x0000000000000000, 0.00726979 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(61200,1:4,1800,30) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.0101133 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(8160,2040:32,60,1) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00719909 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(8160,2040:32,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x0000000000000000, 0.00629635 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(8160,2040:32,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00680729 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(8160,2040:32,60,1) -> Int8(244800,2040,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00729229 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(8160,2040:32,60,1) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00784248 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(8160,2040:32,60,1) -> Int8(16320,1:16,480,8)] got cached result: Reformat, tactic 0x0000000000000000, 0.00731084 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Float(8160,2040:32,60,1) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00785042 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Int8(16320,1:16,480,8) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00722065 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Int8(16320,1:16,480,8) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x0000000000000000, 0.00666688 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Int8(16320,1:16,480,8) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x00000000000003e8, 0.0072633 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Int8(16320,1:16,480,8) -> Float(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00714553 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Int8(16320,1:16,480,8) -> Int8(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.009392 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Int8(16320,1:16,480,8) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00743111 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Int8(16320,1:16,480,8) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00787547 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Int8(8160,2040:32,60,1) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00713963 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Int8(8160,2040:32,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00675968 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Int8(8160,2040:32,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00727884 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Int8(8160,2040:32,60,1) -> Float(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00781867 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Int8(8160,2040:32,60,1) -> Int8(244800,2040,60,1)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00801092 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Int8(8160,2040:32,60,1) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00338651 ms
DEBUG: [TRT]: Optimizer Reformat(<in> -> block_4b_relu/Relu:0) [Int8(8160,2040:32,60,1) -> Int8(16320,1:16,480,8)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00747994 ms
DEBUG: [TRT]: =============== Computing reformatting costs for available format set
DEBUG: [TRT]: =============== Computing reformatting costs: 
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(244800,2040,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x0000000000000000, 0.00382473 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(244800,2040,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x0000000000000000, 0.00382703 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(244800,2040,60,1) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00340419 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(244800,2040,60,1) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00748515 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(244800,1,7200,120) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00412215 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(244800,1,7200,120) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x0000000000000000, 0.0061442 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(244800,1,7200,120) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00737716 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(244800,1,7200,120) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00787225 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(61200,1:4,1800,30) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00386403 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(61200,1:4,1800,30) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00616185 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(61200,1:4,1800,30) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.010239 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(61200,1:4,1800,30) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00735884 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(8160,2040:32,60,1) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00718525 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(8160,2040:32,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x0000000000000000, 0.00631427 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(8160,2040:32,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x0000000000000000, 0.00679573 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(8160,2040:32,60,1) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00796826 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(8160,2040:32,60,1) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00724357 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(244800,2040,60,1) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00402866 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(244800,2040,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x0000000000000000, 0.00861484 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(244800,2040,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00729577 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(244800,2040,60,1) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00333952 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(244800,2040,60,1) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.0103318 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(61200,2040:4,60,1) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00421106 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(61200,2040:4,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00702266 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(61200,2040:4,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00749819 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(61200,2040:4,60,1) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.0037299 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(16320,1:16,480,8) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00716732 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(16320,1:16,480,8) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00672661 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(16320,1:16,480,8) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00722428 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(16320,1:16,480,8) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00757745 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(16320,1:16,480,8) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00757855 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(8160,2040:32,60,1) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00736927 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(8160,2040:32,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00670955 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(8160,2040:32,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x0000000000000000, 0.00720499 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(8160,2040:32,60,1) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00373606 ms
DEBUG: [TRT]: =============== Computing reformatting costs for available format set
DEBUG: [TRT]: =============== Computing reformatting costs: 
DEBUG: [TRT]: *************** Autotuning Reformat: Float(8160,2040,60,1) -> Float(8160,1,240,4) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00271818
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00763636
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00282102
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00790828 seconds. Fastest Tactic: 0x00000000000003e8 Time: 0.00271818
DEBUG: [TRT]: *************** Autotuning Reformat: Float(8160,2040,60,1) -> Float(1,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00267708
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00751336
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00399378
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00811529 seconds. Fastest Tactic: 0x00000000000003e8 Time: 0.00267708
DEBUG: [TRT]: *************** Autotuning Reformat: Float(8160,2040,60,1) -> Float(2040,1:4,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00277854
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00761988
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00274763
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.0078024 seconds. Fastest Tactic: 0x0000000000000000 Time: 0.00274763
DEBUG: [TRT]: *************** Autotuning Reformat: Float(8160,2040,60,1) -> Float(2040,2040:32,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.0157828
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00964327
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.0157198
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00683264 seconds. Fastest Tactic: 0x00000000000003ea Time: 0.00964327
DEBUG: [TRT]: *************** Autotuning Reformat: Float(8160,2040,60,1) -> Float(1:4,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.0031359
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00778443
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00358733
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00840164 seconds. Fastest Tactic: 0x00000000000003e8 Time: 0.0031359
DEBUG: [TRT]: *************** Autotuning Reformat: Float(8160,1,240,4) -> Float(8160,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00305144
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00721044
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00270624
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00837696 seconds. Fastest Tactic: 0x0000000000000000 Time: 0.00270624
DEBUG: [TRT]: *************** Autotuning Reformat: Float(8160,1,240,4) -> Float(1,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00276029
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00723154
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00306456
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00814129 seconds. Fastest Tactic: 0x00000000000003e8 Time: 0.00276029
DEBUG: [TRT]: *************** Autotuning Reformat: Float(8160,1,240,4) -> Float(2040,1:4,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00277942
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.008448
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00416786
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00754402 seconds. Fastest Tactic: 0x00000000000003e8 Time: 0.00277942
DEBUG: [TRT]: *************** Autotuning Reformat: Float(8160,1,240,4) -> Float(2040,2040:32,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.0157115
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00765188
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.0160752
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00708275 seconds. Fastest Tactic: 0x00000000000003ea Time: 0.00765188
DEBUG: [TRT]: *************** Autotuning Reformat: Float(8160,1,240,4) -> Float(1:4,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00418946
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.0102781
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00416933
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00844639 seconds. Fastest Tactic: 0x0000000000000000 Time: 0.00416933
DEBUG: [TRT]: *************** Autotuning Reformat: Float(2040,1:4,60,1) -> Float(8160,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00312593
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00726979
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00268262
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00827451 seconds. Fastest Tactic: 0x0000000000000000 Time: 0.00268262
DEBUG: [TRT]: *************** Autotuning Reformat: Float(2040,1:4,60,1) -> Float(8160,1,240,4) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.0027632
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00785315
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00355166
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00775072 seconds. Fastest Tactic: 0x00000000000003e8 Time: 0.0027632
DEBUG: [TRT]: *************** Autotuning Reformat: Float(2040,1:4,60,1) -> Float(1,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00368058
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00733843
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00282192
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.0078718 seconds. Fastest Tactic: 0x0000000000000000 Time: 0.00282192
DEBUG: [TRT]: *************** Autotuning Reformat: Float(2040,1:4,60,1) -> Float(2040,2040:32,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.0157023
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00775665
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.0156504
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.0069692 seconds. Fastest Tactic: 0x00000000000003ea Time: 0.00775665
DEBUG: [TRT]: *************** Autotuning Reformat: Float(2040,1:4,60,1) -> Float(1:4,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.0041936
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.0102387
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.0041956
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00848138 seconds. Fastest Tactic: 0x00000000000003e8 Time: 0.0041936
DEBUG: [TRT]: *************** Autotuning Reformat: Float(2040,2040:32,60,1) -> Float(8160,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00288616
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00770424
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00312175
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00827199 seconds. Fastest Tactic: 0x00000000000003e8 Time: 0.00288616
DEBUG: [TRT]: *************** Autotuning Reformat: Float(2040,2040:32,60,1) -> Float(8160,1,240,4) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00395005
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00823441
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00285748
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00787961 seconds. Fastest Tactic: 0x0000000000000000 Time: 0.00285748
DEBUG: [TRT]: *************** Autotuning Reformat: Float(2040,2040:32,60,1) -> Float(1,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00322246
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00716595
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00289527
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00852998 seconds. Fastest Tactic: 0x0000000000000000 Time: 0.00289527
DEBUG: [TRT]: *************** Autotuning Reformat: Float(2040,2040:32,60,1) -> Float(2040,1:4,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00275526
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.0102083
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.0028377
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00752346 seconds. Fastest Tactic: 0x00000000000003e8 Time: 0.00275526
DEBUG: [TRT]: *************** Autotuning Reformat: Float(2040,2040:32,60,1) -> Float(1:4,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00424425
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.0103952
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00437499
DEBUG: [TRT]: Optimizer Reformat(output_cov/BiasAdd:0 -> <out>) (Reformat[0x80000006]) profiling completed in 0.00868213 seconds. Fastest Tactic: 0x00000000000003e8 Time: 0.00424425
DEBUG: [TRT]: =============== Computing reformatting costs for available format set
DEBUG: [TRT]: =============== Computing reformatting costs: 
DEBUG: [TRT]: *************** Autotuning Reformat: Float(8160,1,240,4) -> Float(8160,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(<in> -> output_cov/Sigmoid:0) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00403213
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00778245
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00273171
DEBUG: [TRT]: Optimizer Reformat(<in> -> output_cov/Sigmoid:0) (Reformat[0x80000006]) profiling completed in 0.00815749 seconds. Fastest Tactic: 0x0000000000000000 Time: 0.00273171
DEBUG: [TRT]: *************** Autotuning Reformat: Float(1,2040,60,1) -> Float(8160,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(<in> -> output_cov/Sigmoid:0) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00255772
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00767006
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00401851
DEBUG: [TRT]: Optimizer Reformat(<in> -> output_cov/Sigmoid:0) (Reformat[0x80000006]) profiling completed in 0.00834542 seconds. Fastest Tactic: 0x00000000000003e8 Time: 0.00255772
DEBUG: [TRT]: *************** Autotuning Reformat: Float(2040,1:4,60,1) -> Float(8160,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(<in> -> output_cov/Sigmoid:0) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00275436
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00714984
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00268092
DEBUG: [TRT]: Optimizer Reformat(<in> -> output_cov/Sigmoid:0) (Reformat[0x80000006]) profiling completed in 0.00815799 seconds. Fastest Tactic: 0x0000000000000000 Time: 0.00268092
DEBUG: [TRT]: *************** Autotuning Reformat: Float(2040,2040:32,60,1) -> Float(8160,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(<in> -> output_cov/Sigmoid:0) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00288662
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00955093
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00311586
DEBUG: [TRT]: Optimizer Reformat(<in> -> output_cov/Sigmoid:0) (Reformat[0x80000006]) profiling completed in 0.00786296 seconds. Fastest Tactic: 0x00000000000003e8 Time: 0.00288662
DEBUG: [TRT]: *************** Autotuning Reformat: Float(1:4,2040,60,1) -> Float(8160,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(<in> -> output_cov/Sigmoid:0) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00400609
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.009424
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00402326
DEBUG: [TRT]: Optimizer Reformat(<in> -> output_cov/Sigmoid:0) (Reformat[0x80000006]) profiling completed in 0.00906201 seconds. Fastest Tactic: 0x00000000000003e8 Time: 0.00400609
DEBUG: [TRT]: =============== Computing reformatting costs for available format set
DEBUG: [TRT]: =============== Computing reformatting costs: 
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(244800,2040,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x0000000000000000, 0.00382473 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(244800,2040,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x0000000000000000, 0.00382703 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(244800,2040,60,1) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00340419 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(244800,2040,60,1) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00748515 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(244800,1,7200,120) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00412215 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(244800,1,7200,120) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x0000000000000000, 0.0061442 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(244800,1,7200,120) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00737716 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(244800,1,7200,120) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00787225 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(61200,1:4,1800,30) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00386403 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(61200,1:4,1800,30) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00616185 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(61200,1:4,1800,30) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.010239 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(61200,1:4,1800,30) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00735884 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(8160,2040:32,60,1) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00718525 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(8160,2040:32,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x0000000000000000, 0.00631427 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(8160,2040:32,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x0000000000000000, 0.00679573 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(8160,2040:32,60,1) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00796826 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Float(8160,2040:32,60,1) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00724357 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(244800,2040,60,1) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00402866 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(244800,2040,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x0000000000000000, 0.00861484 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(244800,2040,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00729577 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(244800,2040,60,1) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00333952 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(244800,2040,60,1) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.0103318 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(61200,2040:4,60,1) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00421106 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(61200,2040:4,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00702266 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(61200,2040:4,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00749819 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(61200,2040:4,60,1) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.0037299 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(16320,1:16,480,8) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00716732 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(16320,1:16,480,8) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00672661 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(16320,1:16,480,8) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00722428 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(16320,1:16,480,8) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00757745 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(16320,1:16,480,8) -> Int8(8160,2040:32,60,1)] got cached result: Reformat, tactic 0x00000000000003ea, 0.00757855 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(8160,2040:32,60,1) -> Float(244800,2040,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00736927 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(8160,2040:32,60,1) -> Float(244800,1,7200,120)] got cached result: Reformat, tactic 0x00000000000003e8, 0.00670955 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(8160,2040:32,60,1) -> Float(61200,1:4,1800,30)] got cached result: Reformat, tactic 0x0000000000000000, 0.00720499 ms
DEBUG: [TRT]: Optimizer Reformat(block_4b_relu/Relu:0 -> <out>) [Int8(8160,2040:32,60,1) -> Int8(61200,2040:4,60,1)] got cached result: Reformat, tactic 0x0000000000000000, 0.00373606 ms
DEBUG: [TRT]: =============== Computing reformatting costs for available format set
DEBUG: [TRT]: =============== Computing reformatting costs: 
DEBUG: [TRT]: *************** Autotuning Reformat: Float(32640,1,960,16) -> Float(32640,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(<in> -> output_bbox/BiasAdd:0) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00321908
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00725101
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00271627
DEBUG: [TRT]: Optimizer Reformat(<in> -> output_bbox/BiasAdd:0) (Reformat[0x80000006]) profiling completed in 0.00864864 seconds. Fastest Tactic: 0x0000000000000000 Time: 0.00271627
DEBUG: [TRT]: *************** Autotuning Reformat: Float(8160,1:4,240,4) -> Float(32640,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(<in> -> output_bbox/BiasAdd:0) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.0033091
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00982933
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00274518
DEBUG: [TRT]: Optimizer Reformat(<in> -> output_bbox/BiasAdd:0) (Reformat[0x80000006]) profiling completed in 0.00897286 seconds. Fastest Tactic: 0x0000000000000000 Time: 0.00274518
DEBUG: [TRT]: *************** Autotuning Reformat: Float(2040,2040:32,60,1) -> Float(32640,2040,60,1) ***************
DEBUG: [TRT]: --------------- Timing Runner: Optimizer Reformat(<in> -> output_bbox/BiasAdd:0) (Reformat[0x80000006])
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003e8 Time: 0.00333011
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x00000000000003ea Time: 0.00716959
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for 
DEBUG: [TRT]: Tactic: 0x0000000000000000 Time: 0.00345138
DEBUG: [TRT]: Optimizer Reformat(<in> -> output_bbox/BiasAdd:0) (Reformat[0x80000006]) profiling completed in 0.00927021 seconds. Fastest Tactic: 0x00000000000003e8 Time: 0.00333011
DEBUG: [TRT]: Adding reformat layer: Reformatted Input Tensor 0 to conv1/convolution + Reshape__113:0 + ONNXTRT_Broadcast + conv1/BiasAdd + bn_conv1/batchnorm/mul__10 + bn_conv1/batchnorm/mul_1 + bn_conv1/batchnorm/sub__11 + bn_conv1/batchnorm/add_1 + activation_1/Relu (input_1:0) from Float(1566720,522240,960,1) to Int8(522240,522240:4,960,1)
DEBUG: [TRT]: Adding reformat layer: Reformatted Input Tensor 0 to PWN(output_cov/Sigmoid) (output_cov/BiasAdd:0) from Float(8160,2040,60,1) to Float(1,2040,60,1)
DEBUG: [TRT]: Adding reformat layer: Reformatted Output Tensor 0 to PWN(output_cov/Sigmoid) (output_cov/Sigmoid:0) from Float(1,2040,60,1) to Float(8160,2040,60,1)
DEBUG: [TRT]: Formats and tactics selection completed in 29.9672 seconds.
DEBUG: [TRT]: After reformat layers: 31 layers
DEBUG: [TRT]: Total number of blocks in pre-optimized block assignment: 30
DEBUG: [TRT]: Detected 1 inputs and 2 output network tensors.
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for Reformatting CopyNode for Input Tensor 0 to PWN(output_cov/Sigmoid)
DEBUG: [TRT]: Setting a default quantization params because quantization data is missing for Reformatting CopyNode for Output Tensor 0 to PWN(output_cov/Sigmoid)
DEBUG: [TRT]: Layer: conv1/convolution + Reshape__113:0 + ONNXTRT_Broadcast + conv1/BiasAdd + bn_conv1/batchnorm/mul__10 + bn_conv1/batchnorm/mul_1 + bn_conv1/batchnorm/sub__11 + bn_conv1/batchnorm/add_1 + activation_1/Relu Host Persistent: 4896 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_1a_conv_1/convolution + Reshape__115:0 + ONNXTRT_Broadcast_1 + block_1a_conv_1/BiasAdd + block_1a_bn_1/batchnorm/mul__14 + block_1a_bn_1/batchnorm/mul_1 + block_1a_bn_1/batchnorm/sub__15 + block_1a_bn_1/batchnorm/add_1 + block_1a_relu_1/Relu Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_1a_conv_2/convolution + Reshape__117:0 + ONNXTRT_Broadcast_3 + block_1a_conv_2/BiasAdd + block_1a_bn_2/batchnorm/mul__20 + block_1a_bn_2/batchnorm/mul_1 + block_1a_bn_2/batchnorm/sub__21 + block_1a_bn_2/batchnorm/add_1 Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_1a_conv_shortcut/BiasAdd + block_1a_bn_shortcut/batchnorm/mul__22 + block_1a_bn_shortcut/batchnorm/mul_1 + block_1a_bn_shortcut/batchnorm/sub__23 + block_1a_bn_shortcut/batchnorm/add_1 + add_1/add + block_1a_relu/Relu Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_1b_conv_1/convolution + Reshape__119:0 + ONNXTRT_Broadcast_5 + block_1b_conv_1/BiasAdd + block_1b_bn_1/batchnorm/mul__26 + block_1b_bn_1/batchnorm/mul_1 + block_1b_bn_1/batchnorm/sub__27 + block_1b_bn_1/batchnorm/add_1 + block_1b_relu_1/Relu Host Persistent: 4384 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_1b_conv_2/convolution + Reshape__121:0 + ONNXTRT_Broadcast_7 + block_1b_conv_2/BiasAdd + block_1b_bn_2/batchnorm/mul__32 + block_1b_bn_2/batchnorm/mul_1 + block_1b_bn_2/batchnorm/sub__33 + block_1b_bn_2/batchnorm/add_1 Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_1b_conv_shortcut/BiasAdd + block_1b_bn_shortcut/batchnorm/mul__34 + block_1b_bn_shortcut/batchnorm/mul_1 + block_1b_bn_shortcut/batchnorm/sub__35 + block_1b_bn_shortcut/batchnorm/add_1 + add_2/add + block_1b_relu/Relu Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_2a_conv_1/convolution + Reshape__123:0 + ONNXTRT_Broadcast_9 + block_2a_conv_1/BiasAdd + block_2a_bn_1/batchnorm/mul__38 + block_2a_bn_1/batchnorm/mul_1 + block_2a_bn_1/batchnorm/sub__39 + block_2a_bn_1/batchnorm/add_1 + block_2a_relu_1/Relu Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_2a_conv_2/convolution + Reshape__125:0 + ONNXTRT_Broadcast_11 + block_2a_conv_2/BiasAdd + block_2a_bn_2/batchnorm/mul__44 + block_2a_bn_2/batchnorm/mul_1 + block_2a_bn_2/batchnorm/sub__45 + block_2a_bn_2/batchnorm/add_1 Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_2a_conv_shortcut/BiasAdd + block_2a_bn_shortcut/batchnorm/mul__46 + block_2a_bn_shortcut/batchnorm/mul_1 + block_2a_bn_shortcut/batchnorm/sub__47 + block_2a_bn_shortcut/batchnorm/add_1 + add_3/add + block_2a_relu/Relu Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_2b_conv_1/convolution + Reshape__127:0 + ONNXTRT_Broadcast_13 + block_2b_conv_1/BiasAdd + block_2b_bn_1/batchnorm/mul__50 + block_2b_bn_1/batchnorm/mul_1 + block_2b_bn_1/batchnorm/sub__51 + block_2b_bn_1/batchnorm/add_1 + block_2b_relu_1/Relu Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_2b_conv_2/convolution + Reshape__129:0 + ONNXTRT_Broadcast_15 + block_2b_conv_2/BiasAdd + block_2b_bn_2/batchnorm/mul__56 + block_2b_bn_2/batchnorm/mul_1 + block_2b_bn_2/batchnorm/sub__57 + block_2b_bn_2/batchnorm/add_1 Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_2b_conv_shortcut/BiasAdd + block_2b_bn_shortcut/batchnorm/mul__58 + block_2b_bn_shortcut/batchnorm/mul_1 + block_2b_bn_shortcut/batchnorm/sub__59 + block_2b_bn_shortcut/batchnorm/add_1 + add_4/add + block_2b_relu/Relu Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_3a_conv_1/convolution + Reshape__131:0 + ONNXTRT_Broadcast_17 + block_3a_conv_1/BiasAdd + block_3a_bn_1/batchnorm/mul__62 + block_3a_bn_1/batchnorm/mul_1 + block_3a_bn_1/batchnorm/sub__63 + block_3a_bn_1/batchnorm/add_1 + block_3a_relu_1/Relu Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_3a_conv_2/convolution + Reshape__133:0 + ONNXTRT_Broadcast_19 + block_3a_conv_2/BiasAdd + block_3a_bn_2/batchnorm/mul__68 + block_3a_bn_2/batchnorm/mul_1 + block_3a_bn_2/batchnorm/sub__69 + block_3a_bn_2/batchnorm/add_1 Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_3a_conv_shortcut/BiasAdd + block_3a_bn_shortcut/batchnorm/mul__70 + block_3a_bn_shortcut/batchnorm/mul_1 + block_3a_bn_shortcut/batchnorm/sub__71 + block_3a_bn_shortcut/batchnorm/add_1 + add_5/add + block_3a_relu/Relu Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_3b_conv_1/convolution + Reshape__135:0 + ONNXTRT_Broadcast_21 + block_3b_conv_1/BiasAdd + block_3b_bn_1/batchnorm/mul__74 + block_3b_bn_1/batchnorm/mul_1 + block_3b_bn_1/batchnorm/sub__75 + block_3b_bn_1/batchnorm/add_1 + block_3b_relu_1/Relu Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_3b_conv_2/convolution + Reshape__137:0 + ONNXTRT_Broadcast_23 + block_3b_conv_2/BiasAdd + block_3b_bn_2/batchnorm/mul__80 + block_3b_bn_2/batchnorm/mul_1 + block_3b_bn_2/batchnorm/sub__81 + block_3b_bn_2/batchnorm/add_1 Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_3b_conv_shortcut/BiasAdd + block_3b_bn_shortcut/batchnorm/mul__82 + block_3b_bn_shortcut/batchnorm/mul_1 + block_3b_bn_shortcut/batchnorm/sub__83 + block_3b_bn_shortcut/batchnorm/add_1 + add_6/add + block_3b_relu/Relu Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_4a_conv_1/convolution + Reshape__139:0 + ONNXTRT_Broadcast_25 + block_4a_conv_1/BiasAdd + block_4a_bn_1/batchnorm/mul__86 + block_4a_bn_1/batchnorm/mul_1 + block_4a_bn_1/batchnorm/sub__87 + block_4a_bn_1/batchnorm/add_1 + block_4a_relu_1/Relu Host Persistent: 4384 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_4a_conv_2/convolution + Reshape__141:0 + ONNXTRT_Broadcast_27 + block_4a_conv_2/BiasAdd + block_4a_bn_2/batchnorm/mul__92 + block_4a_bn_2/batchnorm/mul_1 + block_4a_bn_2/batchnorm/sub__93 + block_4a_bn_2/batchnorm/add_1 Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_4a_conv_shortcut/BiasAdd + block_4a_bn_shortcut/batchnorm/mul__94 + block_4a_bn_shortcut/batchnorm/mul_1 + block_4a_bn_shortcut/batchnorm/sub__95 + block_4a_bn_shortcut/batchnorm/add_1 + add_7/add + block_4a_relu/Relu Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_4b_conv_1/convolution + Reshape__143:0 + ONNXTRT_Broadcast_29 + block_4b_conv_1/BiasAdd + block_4b_bn_1/batchnorm/mul__98 + block_4b_bn_1/batchnorm/mul_1 + block_4b_bn_1/batchnorm/sub__99 + block_4b_bn_1/batchnorm/add_1 + block_4b_relu_1/Relu Host Persistent: 4384 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_4b_conv_2/convolution + Reshape__145:0 + ONNXTRT_Broadcast_31 + block_4b_conv_2/BiasAdd + block_4b_bn_2/batchnorm/mul__104 + block_4b_bn_2/batchnorm/mul_1 + block_4b_bn_2/batchnorm/sub__105 + block_4b_bn_2/batchnorm/add_1 Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: block_4b_conv_shortcut/BiasAdd + block_4b_bn_shortcut/batchnorm/mul__106 + block_4b_bn_shortcut/batchnorm/mul_1 + block_4b_bn_shortcut/batchnorm/sub__107 + block_4b_bn_shortcut/batchnorm/add_1 + add_8/add + block_4b_relu/Relu Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: output_cov/convolution + Reshape__147:0 + ONNXTRT_Broadcast_33 + output_cov/BiasAdd Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: PWN(output_cov/Sigmoid) Host Persistent: 308 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Layer: output_bbox/convolution + Reshape__149:0 + ONNXTRT_Broadcast_36 + output_bbox/BiasAdd Host Persistent: 5152 bytes Device Persistent: 0 bytes Scratch Memory: 0 bytes
DEBUG: [TRT]: Skipped printing memory information for 3 layers with 0 memory size i.e. Host Persistent + Device Persistent + Scratch Memory == 0.
DEBUG: [TRT]: Total Host Persistent Memory: 136864 bytes
DEBUG: [TRT]: Total Device Persistent Memory: 0 bytes
DEBUG: [TRT]: Max Scratch Memory: 0 bytes
DEBUG: [TRT]: [BlockAssignment] Started assigning block shifts. This will take 29 steps to complete.
DEBUG: [TRT]: STILL ALIVE: Started step 26 of 29
DEBUG: [TRT]: [BlockAssignment] Algorithm ShiftNTopDown took 0.170428ms to assign 4 blocks to 29 nodes requiring 8356352 bytes.
DEBUG: [TRT]: Total number of blocks in optimized block assignment: 3
DEBUG: [TRT]: Total Activation Memory: 8355840 bytes
DEBUG: [TRT]: Total Weights Memory: 1571392 bytes
DEBUG: [TRT]: Finalize: conv1/convolution + Reshape__113:0 + ONNXTRT_Broadcast + conv1/BiasAdd + bn_conv1/batchnorm/mul__10 + bn_conv1/batchnorm/mul_1 + bn_conv1/batchnorm/sub__11 + bn_conv1/batchnorm/add_1 + activation_1/Relu Set kernel index: 0
DEBUG: [TRT]: Finalize: block_1a_conv_1/convolution + Reshape__115:0 + ONNXTRT_Broadcast_1 + block_1a_conv_1/BiasAdd + block_1a_bn_1/batchnorm/mul__14 + block_1a_bn_1/batchnorm/mul_1 + block_1a_bn_1/batchnorm/sub__15 + block_1a_bn_1/batchnorm/add_1 + block_1a_relu_1/Relu Set kernel index: 1
DEBUG: [TRT]: Finalize: block_1a_conv_2/convolution + Reshape__117:0 + ONNXTRT_Broadcast_3 + block_1a_conv_2/BiasAdd + block_1a_bn_2/batchnorm/mul__20 + block_1a_bn_2/batchnorm/mul_1 + block_1a_bn_2/batchnorm/sub__21 + block_1a_bn_2/batchnorm/add_1 Set kernel index: 2
DEBUG: [TRT]: Finalize: block_1a_conv_shortcut/BiasAdd + block_1a_bn_shortcut/batchnorm/mul__22 + block_1a_bn_shortcut/batchnorm/mul_1 + block_1a_bn_shortcut/batchnorm/sub__23 + block_1a_bn_shortcut/batchnorm/add_1 + add_1/add + block_1a_relu/Relu Set kernel index: 3
DEBUG: [TRT]: Finalize: block_1b_conv_1/convolution + Reshape__119:0 + ONNXTRT_Broadcast_5 + block_1b_conv_1/BiasAdd + block_1b_bn_1/batchnorm/mul__26 + block_1b_bn_1/batchnorm/mul_1 + block_1b_bn_1/batchnorm/sub__27 + block_1b_bn_1/batchnorm/add_1 + block_1b_relu_1/Relu Set kernel index: 4
DEBUG: [TRT]: Finalize: block_1b_conv_2/convolution + Reshape__121:0 + ONNXTRT_Broadcast_7 + block_1b_conv_2/BiasAdd + block_1b_bn_2/batchnorm/mul__32 + block_1b_bn_2/batchnorm/mul_1 + block_1b_bn_2/batchnorm/sub__33 + block_1b_bn_2/batchnorm/add_1 Set kernel index: 5
DEBUG: [TRT]: Finalize: block_1b_conv_shortcut/BiasAdd + block_1b_bn_shortcut/batchnorm/mul__34 + block_1b_bn_shortcut/batchnorm/mul_1 + block_1b_bn_shortcut/batchnorm/sub__35 + block_1b_bn_shortcut/batchnorm/add_1 + add_2/add + block_1b_relu/Relu Set kernel index: 6
DEBUG: [TRT]: Finalize: block_2a_conv_1/convolution + Reshape__123:0 + ONNXTRT_Broadcast_9 + block_2a_conv_1/BiasAdd + block_2a_bn_1/batchnorm/mul__38 + block_2a_bn_1/batchnorm/mul_1 + block_2a_bn_1/batchnorm/sub__39 + block_2a_bn_1/batchnorm/add_1 + block_2a_relu_1/Relu Set kernel index: 7
DEBUG: [TRT]: Finalize: block_2a_conv_2/convolution + Reshape__125:0 + ONNXTRT_Broadcast_11 + block_2a_conv_2/BiasAdd + block_2a_bn_2/batchnorm/mul__44 + block_2a_bn_2/batchnorm/mul_1 + block_2a_bn_2/batchnorm/sub__45 + block_2a_bn_2/batchnorm/add_1 Set kernel index: 8
DEBUG: [TRT]: Finalize: block_2a_conv_shortcut/BiasAdd + block_2a_bn_shortcut/batchnorm/mul__46 + block_2a_bn_shortcut/batchnorm/mul_1 + block_2a_bn_shortcut/batchnorm/sub__47 + block_2a_bn_shortcut/batchnorm/add_1 + add_3/add + block_2a_relu/Relu Set kernel index: 9
DEBUG: [TRT]: Finalize: block_2b_conv_1/convolution + Reshape__127:0 + ONNXTRT_Broadcast_13 + block_2b_conv_1/BiasAdd + block_2b_bn_1/batchnorm/mul__50 + block_2b_bn_1/batchnorm/mul_1 + block_2b_bn_1/batchnorm/sub__51 + block_2b_bn_1/batchnorm/add_1 + block_2b_relu_1/Relu Set kernel index: 7
DEBUG: [TRT]: Finalize: block_2b_conv_2/convolution + Reshape__129:0 + ONNXTRT_Broadcast_15 + block_2b_conv_2/BiasAdd + block_2b_bn_2/batchnorm/mul__56 + block_2b_bn_2/batchnorm/mul_1 + block_2b_bn_2/batchnorm/sub__57 + block_2b_bn_2/batchnorm/add_1 Set kernel index: 7
DEBUG: [TRT]: Finalize: block_2b_conv_shortcut/BiasAdd + block_2b_bn_shortcut/batchnorm/mul__58 + block_2b_bn_shortcut/batchnorm/mul_1 + block_2b_bn_shortcut/batchnorm/sub__59 + block_2b_bn_shortcut/batchnorm/add_1 + add_4/add + block_2b_relu/Relu Set kernel index: 10
DEBUG: [TRT]: Finalize: block_3a_conv_1/convolution + Reshape__131:0 + ONNXTRT_Broadcast_17 + block_3a_conv_1/BiasAdd + block_3a_bn_1/batchnorm/mul__62 + block_3a_bn_1/batchnorm/mul_1 + block_3a_bn_1/batchnorm/sub__63 + block_3a_bn_1/batchnorm/add_1 + block_3a_relu_1/Relu Set kernel index: 11
DEBUG: [TRT]: Finalize: block_3a_conv_2/convolution + Reshape__133:0 + ONNXTRT_Broadcast_19 + block_3a_conv_2/BiasAdd + block_3a_bn_2/batchnorm/mul__68 + block_3a_bn_2/batchnorm/mul_1 + block_3a_bn_2/batchnorm/sub__69 + block_3a_bn_2/batchnorm/add_1 Set kernel index: 12
DEBUG: [TRT]: Finalize: block_3a_conv_shortcut/BiasAdd + block_3a_bn_shortcut/batchnorm/mul__70 + block_3a_bn_shortcut/batchnorm/mul_1 + block_3a_bn_shortcut/batchnorm/sub__71 + block_3a_bn_shortcut/batchnorm/add_1 + add_5/add + block_3a_relu/Relu Set kernel index: 13
DEBUG: [TRT]: Finalize: block_3b_conv_1/convolution + Reshape__135:0 + ONNXTRT_Broadcast_21 + block_3b_conv_1/BiasAdd + block_3b_bn_1/batchnorm/mul__74 + block_3b_bn_1/batchnorm/mul_1 + block_3b_bn_1/batchnorm/sub__75 + block_3b_bn_1/batchnorm/add_1 + block_3b_relu_1/Relu Set kernel index: 12
DEBUG: [TRT]: Finalize: block_3b_conv_2/convolution + Reshape__137:0 + ONNXTRT_Broadcast_23 + block_3b_conv_2/BiasAdd + block_3b_bn_2/batchnorm/mul__80 + block_3b_bn_2/batchnorm/mul_1 + block_3b_bn_2/batchnorm/sub__81 + block_3b_bn_2/batchnorm/add_1 Set kernel index: 14
DEBUG: [TRT]: Finalize: block_3b_conv_shortcut/BiasAdd + block_3b_bn_shortcut/batchnorm/mul__82 + block_3b_bn_shortcut/batchnorm/mul_1 + block_3b_bn_shortcut/batchnorm/sub__83 + block_3b_bn_shortcut/batchnorm/add_1 + add_6/add + block_3b_relu/Relu Set kernel index: 15
DEBUG: [TRT]: Finalize: block_4a_conv_1/convolution + Reshape__139:0 + ONNXTRT_Broadcast_25 + block_4a_conv_1/BiasAdd + block_4a_bn_1/batchnorm/mul__86 + block_4a_bn_1/batchnorm/mul_1 + block_4a_bn_1/batchnorm/sub__87 + block_4a_bn_1/batchnorm/add_1 + block_4a_relu_1/Relu Set kernel index: 4
DEBUG: [TRT]: Finalize: block_4a_conv_2/convolution + Reshape__141:0 + ONNXTRT_Broadcast_27 + block_4a_conv_2/BiasAdd + block_4a_bn_2/batchnorm/mul__92 + block_4a_bn_2/batchnorm/mul_1 + block_4a_bn_2/batchnorm/sub__93 + block_4a_bn_2/batchnorm/add_1 Set kernel index: 16
DEBUG: [TRT]: Finalize: block_4a_conv_shortcut/BiasAdd + block_4a_bn_shortcut/batchnorm/mul__94 + block_4a_bn_shortcut/batchnorm/mul_1 + block_4a_bn_shortcut/batchnorm/sub__95 + block_4a_bn_shortcut/batchnorm/add_1 + add_7/add + block_4a_relu/Relu Set kernel index: 15
DEBUG: [TRT]: Finalize: block_4b_conv_1/convolution + Reshape__143:0 + ONNXTRT_Broadcast_29 + block_4b_conv_1/BiasAdd + block_4b_bn_1/batchnorm/mul__98 + block_4b_bn_1/batchnorm/mul_1 + block_4b_bn_1/batchnorm/sub__99 + block_4b_bn_1/batchnorm/add_1 + block_4b_relu_1/Relu Set kernel index: 17
DEBUG: [TRT]: Finalize: block_4b_conv_2/convolution + Reshape__145:0 + ONNXTRT_Broadcast_31 + block_4b_conv_2/BiasAdd + block_4b_bn_2/batchnorm/mul__104 + block_4b_bn_2/batchnorm/mul_1 + block_4b_bn_2/batchnorm/sub__105 + block_4b_bn_2/batchnorm/add_1 Set kernel index: 16
DEBUG: [TRT]: Finalize: block_4b_conv_shortcut/BiasAdd + block_4b_bn_shortcut/batchnorm/mul__106 + block_4b_bn_shortcut/batchnorm/mul_1 + block_4b_bn_shortcut/batchnorm/sub__107 + block_4b_bn_shortcut/batchnorm/add_1 + add_8/add + block_4b_relu/Relu Set kernel index: 18
DEBUG: [TRT]: Finalize: output_cov/convolution + Reshape__147:0 + ONNXTRT_Broadcast_33 + output_cov/BiasAdd Set kernel index: 19
DEBUG: [TRT]: Finalize: PWN(output_cov/Sigmoid) Set kernel index: 20
DEBUG: [TRT]: Finalize: output_bbox/convolution + Reshape__149:0 + ONNXTRT_Broadcast_36 + output_bbox/BiasAdd Set kernel index: 19
DEBUG: [TRT]: Total number of generated kernels selected for the engine: 21
DEBUG: [TRT]: Kernel: 0 CASK_STATIC
DEBUG: [TRT]: Kernel: 1 CASK_STATIC
DEBUG: [TRT]: Kernel: 2 CASK_STATIC
DEBUG: [TRT]: Kernel: 3 CASK_STATIC
DEBUG: [TRT]: Kernel: 4 CASK_STATIC
DEBUG: [TRT]: Kernel: 5 CASK_STATIC
DEBUG: [TRT]: Kernel: 6 CASK_STATIC
DEBUG: [TRT]: Kernel: 7 CASK_STATIC
DEBUG: [TRT]: Kernel: 8 CASK_STATIC
DEBUG: [TRT]: Kernel: 9 CASK_STATIC
DEBUG: [TRT]: Kernel: 10 CASK_STATIC
DEBUG: [TRT]: Kernel: 11 CASK_STATIC
DEBUG: [TRT]: Kernel: 12 CASK_STATIC
DEBUG: [TRT]: Kernel: 13 CASK_STATIC
DEBUG: [TRT]: Kernel: 14 CASK_STATIC
DEBUG: [TRT]: Kernel: 15 CASK_STATIC
DEBUG: [TRT]: Kernel: 16 CASK_STATIC
DEBUG: [TRT]: Kernel: 17 CASK_STATIC
DEBUG: [TRT]: Kernel: 18 CASK_STATIC
DEBUG: [TRT]: Kernel: 19 CASK_STATIC
DEBUG: [TRT]: Kernel: 20 TRT_SERIALIZABLE:generatedNativePointwise
DEBUG: [TRT]: Disabling unused tactic source: JIT_CONVOLUTIONS
DEBUG: [TRT]: Engine generation completed in 31.2342 seconds.
DEBUG: [TRT]: Engine Layer Information:
Layer(Reformat): Reformatting CopyNode for Input Tensor 0 to conv1/convolution + Reshape__113:0 + ONNXTRT_Broadcast + conv1/BiasAdd + bn_conv1/batchnorm/mul__10 + bn_conv1/batchnorm/mul_1 + bn_conv1/batchnorm/sub__11 + bn_conv1/batchnorm/add_1 + activation_1/Relu, Tactic: 0x0000000000000000, input_1:0 (Float[1,3,544,960]) -> Reformatted Input Tensor 0 to conv1/convolution + Reshape__113:0 + ONNXTRT_Broadcast + conv1/BiasAdd + bn_conv1/batchnorm/mul__10 + bn_conv1/batchnorm/mul_1 + bn_conv1/batchnorm/sub__11 + bn_conv1/batchnorm/add_1 + activation_1/Relu (Int8[1,3:4,544,960])
Layer(CaskConvolution): conv1/convolution + Reshape__113:0 + ONNXTRT_Broadcast + conv1/BiasAdd + bn_conv1/batchnorm/mul__10 + bn_conv1/batchnorm/mul_1 + bn_conv1/batchnorm/sub__11 + bn_conv1/batchnorm/add_1 + activation_1/Relu, Tactic: 0x77e275948c7dace9, Reformatted Input Tensor 0 to conv1/convolution + Reshape__113:0 + ONNXTRT_Broadcast + conv1/BiasAdd + bn_conv1/batchnorm/mul__10 + bn_conv1/batchnorm/mul_1 + bn_conv1/batchnorm/sub__11 + bn_conv1/batchnorm/add_1 + activation_1/Relu (Int8[1,3:4,544,960]) -> activation_1/Relu:0 (Int8[1,24:32,272,480])
Layer(CaskConvolution): block_1a_conv_1/convolution + Reshape__115:0 + ONNXTRT_Broadcast_1 + block_1a_conv_1/BiasAdd + block_1a_bn_1/batchnorm/mul__14 + block_1a_bn_1/batchnorm/mul_1 + block_1a_bn_1/batchnorm/sub__15 + block_1a_bn_1/batchnorm/add_1 + block_1a_relu_1/Relu, Tactic: 0x70ccdad7e8ced9ab, activation_1/Relu:0 (Int8[1,24:32,272,480]) -> block_1a_relu_1/Relu:0 (Int8[1,48:32,136,240])
Layer(CaskConvolution): block_1a_conv_2/convolution + Reshape__117:0 + ONNXTRT_Broadcast_3 + block_1a_conv_2/BiasAdd + block_1a_bn_2/batchnorm/mul__20 + block_1a_bn_2/batchnorm/mul_1 + block_1a_bn_2/batchnorm/sub__21 + block_1a_bn_2/batchnorm/add_1, Tactic: 0xef01fb6e433afa50, block_1a_relu_1/Relu:0 (Int8[1,48:32,136,240]) -> block_1a_bn_2/batchnorm/add_1:0 (Int8[1,64:32,136,240])
Layer(CaskConvolution): block_1a_conv_shortcut/BiasAdd + block_1a_bn_shortcut/batchnorm/mul__22 + block_1a_bn_shortcut/batchnorm/mul_1 + block_1a_bn_shortcut/batchnorm/sub__23 + block_1a_bn_shortcut/batchnorm/add_1 + add_1/add + block_1a_relu/Relu, Tactic: 0xfa5f2e15625aa266, activation_1/Relu:0 (Int8[1,24:32,272,480]), block_1a_bn_2/batchnorm/add_1:0 (Int8[1,64:32,136,240]) -> block_1a_relu/Relu:0 (Int8[1,64:32,136,240])
Layer(CaskConvolution): block_1b_conv_1/convolution + Reshape__119:0 + ONNXTRT_Broadcast_5 + block_1b_conv_1/BiasAdd + block_1b_bn_1/batchnorm/mul__26 + block_1b_bn_1/batchnorm/mul_1 + block_1b_bn_1/batchnorm/sub__27 + block_1b_bn_1/batchnorm/add_1 + block_1b_relu_1/Relu, Tactic: 0x214f03e23f252333, block_1a_relu/Relu:0 (Int8[1,64:32,136,240]) -> block_1b_relu_1/Relu:0 (Int8[1,64:32,136,240])
Layer(CaskConvolution): block_1b_conv_2/convolution + Reshape__121:0 + ONNXTRT_Broadcast_7 + block_1b_conv_2/BiasAdd + block_1b_bn_2/batchnorm/mul__32 + block_1b_bn_2/batchnorm/mul_1 + block_1b_bn_2/batchnorm/sub__33 + block_1b_bn_2/batchnorm/add_1, Tactic: 0x26a29d5b8b3f62af, block_1b_relu_1/Relu:0 (Int8[1,64:32,136,240]) -> block_1b_bn_2/batchnorm/add_1:0 (Int8[1,64:32,136,240])
Layer(CaskConvolution): block_1b_conv_shortcut/BiasAdd + block_1b_bn_shortcut/batchnorm/mul__34 + block_1b_bn_shortcut/batchnorm/mul_1 + block_1b_bn_shortcut/batchnorm/sub__35 + block_1b_bn_shortcut/batchnorm/add_1 + add_2/add + block_1b_relu/Relu, Tactic: 0x483ad1560c6e5e27, block_1a_relu/Relu:0 (Int8[1,64:32,136,240]), block_1b_bn_2/batchnorm/add_1:0 (Int8[1,64:32,136,240]) -> block_1b_relu/Relu:0 (Int8[1,64:32,136,240])
Layer(CaskConvolution): block_2a_conv_1/convolution + Reshape__123:0 + ONNXTRT_Broadcast_9 + block_2a_conv_1/BiasAdd + block_2a_bn_1/batchnorm/mul__38 + block_2a_bn_1/batchnorm/mul_1 + block_2a_bn_1/batchnorm/sub__39 + block_2a_bn_1/batchnorm/add_1 + block_2a_relu_1/Relu, Tactic: 0xd277f13d771603ee, block_1b_relu/Relu:0 (Int8[1,64:32,136,240]) -> block_2a_relu_1/Relu:0 (Int8[1,80:32,68,120])
Layer(CaskConvolution): block_2a_conv_2/convolution + Reshape__125:0 + ONNXTRT_Broadcast_11 + block_2a_conv_2/Bias
DEBUG: [TRT]: [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 0 MiB, GPU 36 MiB
DEBUG: [TRT]: Adding 1 engine(s) to plan file.
DEBUG: [TRT]: Adding 1 engine weights(s) to plan file.
DEBUG: [TRT]: Loaded engine size: 4 MiB
DEBUG: [TRT]: Deserialization required 5416 microseconds.
DEBUG: [TRT]: Adding 1 engine(s) to plan file.
DEBUG: [TRT]: Adding 1 engine weights(s) to plan file.
0:00:33.966084301 43894 0x5f55deba2910 INFO                 nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger:<person> NvDsInferContext[UID 5]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2138> [UID = 5]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-7.1/nvodin24/models/assets/person/resnet18_trafficcamnet_pruned.onnx_b1_gpu0_int8.engine successfully
DEBUG: [TRT]: Total per-runner device persistent memory is 0
DEBUG: [TRT]: Total per-runner host persistent memory is 136864
DEBUG: [TRT]: Allocated device scratch memory of size 8355840
DEBUG: [TRT]: - Runner scratch: 8355840 bytes
DEBUG: [TRT]: [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +8, now: CPU 0, GPU 9 (MiB)
DEBUG: [TRT]: CUDA lazy loading is enabled.
Implicit layer support has been deprecated
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:327 [Implicit Engine Info]: layers num: 0

This is what we have to fix

Isn’t it deprecated?
Implicit layer support has been deprecated

I noticed you are testing resnet18_trafficcamnet_pruned.onnx, which is used in deepstream-test1. Can you reproduce this issue using deepstream-test1 without any modifications?

Same issue when tested with the DeepStream Python apps.

When I use this, the engine doesn’t get created, and after a while the process gets killed.

DEBUG: [TRT]: output_bbox/convolution [Conv] inputs: [block_4b_relu/Relu:0 → (-1, 120, 34, 60)[FLOAT]], [output_bbox/kernel/read__108 → (16, 120, 1, 1)[FLOAT]],
DEBUG: [TRT]: Convolution input dimensions: (-1, 120, 34, 60)
DEBUG: [TRT]: Registering layer: output_bbox/convolution for ONNX node: output_bbox/convolution
DEBUG: [TRT]: Using kernel: (1, 1), strides: (1, 1), prepadding: (0, 0), postpadding: (0, 0), dilations: (1, 1), numOutputs: 16, nbGroups: 1
DEBUG: [TRT]: Convolution output dimensions: (-1, 16, 34, 60)
DEBUG: [TRT]: Registering tensor: output_bbox/convolution:0 for ONNX tensor: output_bbox/convolution:0
DEBUG: [TRT]: output_bbox/convolution [Conv] outputs: [output_bbox/convolution:0 → (-1, 16, 34, 60)[FLOAT]],
DEBUG: [TRT]: Static check for parsing node: output_bbox/BiasAdd [Add]
DEBUG: [TRT]: Parsing node: output_bbox/BiasAdd [Add]
DEBUG: [TRT]: Searching for input: output_bbox/convolution:0
DEBUG: [TRT]: Searching for input: Reshape__149:0
DEBUG: [TRT]: output_bbox/BiasAdd [Add] inputs: [output_bbox/convolution:0 → (-1, 16, 34, 60)[FLOAT]], [Reshape__149:0 → (16, 1, 1)[FLOAT]],
DEBUG: [TRT]: Registering layer: Reshape__149:0 required by ONNX-TRT
DEBUG: [TRT]: Registering layer: ONNXTRT_ShapeShuffle_35 required by ONNX-TRT
DEBUG: [TRT]: Registering layer: ONNXTRT_Broadcast_36 required by ONNX-TRT
DEBUG: [TRT]: Registering layer: output_bbox/BiasAdd for ONNX node: output_bbox/BiasAdd
DEBUG: [TRT]: Registering tensor: output_bbox/BiasAdd:0_37 for ONNX tensor: output_bbox/BiasAdd:0
DEBUG: [TRT]: output_bbox/BiasAdd [Add] outputs: [output_bbox/BiasAdd:0 → (-1, 16, 34, 60)[FLOAT]],
DEBUG: [TRT]: Marking output_cov/Sigmoid:0_34 as output: output_cov/Sigmoid:0
DEBUG: [TRT]: Marking output_bbox/BiasAdd:0_37 as output: output_bbox/BiasAdd:0
Implicit layer support has been deprecated

I can’t reproduce this issue on my side.

  1. On the first run, the engine was generated, according to the log "serialize cuda engine to file: …onnx_b1_gpu0_int8.engine successfully". Can you see the output video with bboxes?
  2. In my test, the log “Implicit layer support has been deprecated” also appears after the first run, so this is not a fatal issue. Please refer to my test log test1.txt (2.1 KB).
  3. What is the GPU model? Are you testing in Docker?

Since the model is used in deepstream-test1, please use the same configurations as this file. Are you sure the application was killed? On the first run, it takes a long time to generate the TRT engine. From the log, I did not see any error.
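One way to answer both questions (was the engine written, and was the process really killed) is to wrap the run in a small script. The following is only a sketch under assumptions: the helper names (`find_serialized_engines`, `describe_exit`, `run_and_report`) are made up for illustration, the regex is based on the "serialize cuda engine to file: … successfully" message shown in the log above, and the exit-code handling relies on the POSIX behavior of Python's `subprocess` module, which reports termination by signal N as return code -N.

```python
import re
import signal
import subprocess
from pathlib import Path

# Pattern based on the nvinfer message seen in the log above (an assumption
# that the wording is the same on your setup).
ENGINE_RE = re.compile(r"serialize cuda engine to file: (\S+) successfully")


def find_serialized_engines(log_text):
    """Return the engine paths that the log reports as serialized."""
    return ENGINE_RE.findall(log_text)


def describe_exit(returncode):
    """Turn a subprocess return code into a readable status (POSIX only)."""
    if returncode < 0:
        # A negative value means the child was terminated by that signal.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with code {returncode}"


def run_and_report(cmd):
    """Run cmd, then report serialized engines and how the process ended."""
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    for path in find_serialized_engines(proc.stdout):
        state = "exists" if Path(path).is_file() else "missing on disk"
        print(f"{path}: {state}")
    print(describe_exit(proc.returncode))
    return proc.returncode
```

For example, `run_and_report(["python3", "your_app.py", "your_stream.h264"])`, where both arguments are placeholders for your own command. A result of "killed by SIGKILL" would suggest the process was terminated from outside (possibly by the kernel's out-of-memory killer, though that is not the only cause), while a normal exit code with a long first run points to engine generation simply taking time.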

0:00:00.396806048 354 0x579884768d40 INFO nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2092> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-7.1/samples/models/Primary_Detector/resnet18_trafficcamnet_pruned.onnx_b1_gpu0_int8.engine
Implicit layer support has been deprecated
INFO: …/nvdsinfer/nvdsinfer_model_builder.cpp:327 [Implicit Engine Info]: layers num: 0

0:00:00.396845164 354 0x579884768d40 INFO nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2195> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-7.1/samples/models/Primary_Detector/resnet18_trafficcamnet_pruned.onnx_b1_gpu0_int8.engine
0:00:00.404660954 354 0x579884768d40 INFO nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus: [UID 1]: Load new model:dstest1_pgie_config.txt sucessfully

The model is not loaded, so I don’t see any FPS at all.

In my case, if I enable the implicit layer, then my engine creation fails.
From your log, I don’t see the layer info; I only see "layers num: 0".


Timestamp                                 : Thu Mar 13 09:00:38 2025
Driver Version                            : 560.35.05
CUDA Version                              : 12.6

Attached GPUs                             : 1
GPU 00000000:01:00.0
    Product Name                          : NVIDIA GeForce RTX 3070 Laptop GPU
    Product Brand                         : GeForce
    Product Architecture                  : Ampere
    Display Mode                          : Enabled
    Display Active                        : Enabled
    Persistence Mode                      : Disabled
    Addressing Mode                       : None
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : Disabled
    Accounting Mode Buffer Size           : 4000
    Driver Model
        Current                           : N/A
        Pending                           : N/A
    Serial Number                         : N/A
    GPU UUID                              : GPU-78794fb9-79a7-a224-9bd0-4557ea530ae9
    Minor Number                          : 0
    VBIOS Version                         : 94.04.81.00.20
    MultiGPU Board                        : No
    Board ID                              : 0x100
    Board Part Number                     : N/A
    GPU Part Number                       : 24DD-750-A1
    FRU Part Number                       : N/A
    Module ID                             : 1
    Inforom Version
        Image Version                     : G001.0000.94.01
        OEM Object                        : 2.0
        ECC Object                        : N/A
        Power Management Object           : N/A
    Inforom BBX Object Flush
        Latest Timestamp                  : N/A
        Latest Duration                   : N/A
    GPU Operation Mode
        Current                           : N/A
        Pending                           : N/A
    GPU C2C Mode                          : N/A
    GPU Virtualization Mode
        Virtualization Mode               : None
        Host VGPU Mode                    : N/A
        vGPU Heterogeneous Mode           : N/A
    GPU Reset Status
        Reset Required                    : No
        Drain and Reset Recommended       : No
    GSP Firmware Version                  : 560.35.05
    IBMNPU
        Relaxed Ordering Mode             : N/A
    PCI
        Bus                               : 0x01
        Device                            : 0x00
        Domain                            : 0x0000
        Base Classcode                    : 0x3
        Sub Classcode                     : 0x0
        Device Id                         : 0x24DD10DE
        Bus Id                            : 00000000:01:00.0
        Sub System Id                     : 0x3E8417AA
        GPU Link Info
            PCIe Generation
                Max                       : 4
                Current                   : 1
                Device Current            : 1
                Device Max                : 4
                Host Max                  : 4
            Link Width
                Max                       : 16x
                Current                   : 16x
        Bridge Chip
            Type                          : N/A
            Firmware                      : N/A
        Replays Since Reset               : 0
        Replay Number Rollovers           : 0
        Tx Throughput                     : 78 KB/s
        Rx Throughput                     : 447 KB/s
        Atomic Caps Outbound              : N/A
        Atomic Caps Inbound               : N/A
    Fan Speed                             : N/A
    Performance State                     : P8
    Clocks Event Reasons
        Idle                              : Active
        Applications Clocks Setting       : Not Active
        SW Power Cap                      : Not Active
        HW Slowdown                       : Not Active
            HW Thermal Slowdown           : Not Active
            HW Power Brake Slowdown       : Not Active
        Sync Boost                        : Not Active
        SW Thermal Slowdown               : Not Active
        Display Clock Setting             : Not Active
    Sparse Operation Mode                 : N/A
    FB Memory Usage
        Total                             : 8192 MiB
        Reserved                          : 322 MiB
        Used                              : 76 MiB
        Free                              : 7796 MiB
    BAR1 Memory Usage
        Total                             : 8192 MiB
        Used                              : 3 MiB
        Free                              : 8189 MiB
    Conf Compute Protected Memory Usage
        Total                             : 0 MiB
        Used                              : 0 MiB
        Free                              : 0 MiB
    Compute Mode                          : Default
    Utilization
        Gpu                               : 7 %
        Memory                            : 4 %
        Encoder                           : 0 %
        Decoder                           : 0 %
        JPEG                              : 0 %
        OFA                               : 0 %
    Encoder Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    FBC Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    ECC Mode
        Current                           : N/A
        Pending                           : N/A
    ECC Errors
        Volatile
            SRAM Correctable              : N/A
            SRAM Uncorrectable Parity     : N/A
            SRAM Uncorrectable SEC-DED    : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
        Aggregate
            SRAM Correctable              : N/A
            SRAM Uncorrectable Parity     : N/A
            SRAM Uncorrectable SEC-DED    : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
            SRAM Threshold Exceeded       : N/A
        Aggregate Uncorrectable SRAM Sources
            SRAM L2                       : N/A
            SRAM SM                       : N/A
            SRAM Microcontroller          : N/A
            SRAM PCIE                     : N/A
            SRAM Other                    : N/A
    Retired Pages
        Single Bit ECC                    : N/A
        Double Bit ECC                    : N/A
        Pending Page Blacklist            : N/A
    Remapped Rows                         : N/A
    Temperature
        GPU Current Temp                  : 49 C
        GPU T.Limit Temp                  : N/A
        GPU Shutdown Temp                 : 101 C
        GPU Slowdown Temp                 : 98 C
        GPU Max Operating Temp            : 105 C
        GPU Target Temperature            : 87 C
        Memory Current Temp               : N/A
        Memory Max Operating Temp         : N/A
    GPU Power Readings
        Power Draw                        : 13.65 W
        Current Power Limit               : 115.00 W
        Requested Power Limit             : 115.00 W
        Default Power Limit               : 115.00 W
        Min Power Limit                   : 1.00 W
        Max Power Limit                   : 140.00 W
    GPU Memory Power Readings 
        Power Draw                        : N/A
    Module Power Readings
        Power Draw                        : N/A
        Current Power Limit               : N/A
        Requested Power Limit             : N/A
        Default Power Limit               : N/A
        Min Power Limit                   : N/A
        Max Power Limit                   : N/A
    Clocks
        Graphics                          : 210 MHz
        SM                                : 210 MHz
        Memory                            : 405 MHz
        Video                             : 555 MHz
    Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Default Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Deferred Clocks
        Memory                            : N/A
    Max Clocks
        Graphics                          : 2100 MHz
        SM                                : 2100 MHz
        Memory                            : 7001 MHz
        Video                             : 1950 MHz
    Max Customer Boost Clocks
        Graphics                          : N/A
    Clock Policy
        Auto Boost                        : N/A
        Auto Boost Default                : N/A
    Voltage
        Graphics                          : 631.250 mV
    Fabric
        State                             : N/A
        Status                            : N/A
        CliqueId                          : N/A
        ClusterUUID                       : N/A
        Health
            Bandwidth                     : N/A
    Processes
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 2718
            Type                          : G
            Name                          : 
            Used GPU Memory               : 65 MiB
    Capabilities
        EGM                               : disabled

Yes, I am running inside the DeepStream 7.1 Docker container.