Faster-RCNN engine (TensorRT-8.2) failed to run inference on Jetson TX2 NX

Description

I tried to convert a Faster-RCNN model into a TensorRT engine (torch → onnx → trtexec) for Jetson TX2 NX. I cross-compiled trtexec and a custom plugin on a GTX 1080 machine targeting the TX2 NX. After porting all these components onto the TX2, trtexec could build an engine, but it crashed during inference with an Internal Error (Assertion status == kSTATUS_SUCCESS failed.). Following a similar procedure, I could successfully deploy my model on a GTX 1080 and an RTX 2080, but not on the Jetson TX2 NX.

Environment

TensorRT Version: 8.2.1 (JetPack 4.6.3)
GPU Type: Jetson TX2 NX
CUDA Version: 10.2
CUDNN Version: 8.0.0
Operating System + Version: Ubuntu 18.04
Python Version: 3.10.8
PyTorch Version: 1.13.1
Torchvision Version: 0.14.1

Relevant Files

I am happy to provide the relevant files (ONNX model, TensorRT OSS package, etc.) via DM.

Steps To Reproduce

Context:

My model is a torchvision Faster R-CNN in which I replaced the backbone with ResNet10 and configured the detection head to predict boxes of a single category (plus background). The trained model was first exported to ONNX via torch.onnx.export(). I mainly tested with opset_version=11 (I also experimented with other versions, but all led to the same result).

Jetson TX2 is officially supported only up to TensorRT 8.2 (JetPack 4.6.3), which does not natively support the RoiAlign op used by Faster-RCNN. Therefore, I manually back-ported roiAlignPlugin from the official TensorRT OSS release/8.5 branch and then recompiled the relevant .so libraries and trtexec.
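For background on what the back-ported plugin computes: RoiAlign pools each proposal box into a fixed-size grid by bilinearly sampling the feature map. A minimal single-channel, pure-Python sketch of that sampling (simplified to one sample per output bin, with no aligned half-pixel offset; all names here are illustrative, not the plugin's actual API):

```python
# Minimal single-channel RoIAlign sketch: one bilinear sample per output bin.
# Illustrative only -- the real roiAlignPlugin additionally handles batching,
# multiple channels, a configurable sampling_ratio, and the aligned offset.

def bilinear(feat, y, x):
    """Bilinearly sample feat (a list of rows) at continuous coords (y, x)."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    return (feat[y0][x0] * (1 - dy) * (1 - dx)
            + feat[y0][x1] * (1 - dy) * dx
            + feat[y1][x0] * dy * (1 - dx)
            + feat[y1][x1] * dy * dx)

def roi_align(feat, box, out_h, out_w, spatial_scale=1.0):
    """Pool box = (x1, y1, x2, y2) from feat into an out_h x out_w grid."""
    x1, y1, x2, y2 = [c * spatial_scale for c in box]
    bin_h = (y2 - y1) / out_h
    bin_w = (x2 - x1) / out_w
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Sample once at the centre of each output bin.
            cy = y1 + (i + 0.5) * bin_h
            cx = x1 + (j + 0.5) * bin_w
            row.append(bilinear(feat, cy, cx))
        out.append(row)
    return out
```

For example, pooling the full extent of a 4x4 ramp feature map into a 2x2 grid picks the bilinear values at the four bin centres.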

Step 1 - Test on GTX 1080
I adapted the code of roiAlignPlugin and the ONNX parser from TensorRT OSS 8.5 into my TensorRT OSS 8.2 tree (i.e., $TRT_OSSPATH). I ran the following commands to build the new libnvinfer_plugin.so.8, libnvonnxparser.so.8, and trtexec:

cd $TRT_OSSPATH
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_LIBPATH -DTRT_OUT_DIR=`pwd`/out -DCUDA_VERSION=11.8 -DGPU_ARCHS="61"
make -j$(nproc)

where TRT_LIBPATH corresponds to the path of the downloaded TensorRT-8.2.1.8.Linux.x86_64-gnu package. I confirmed that the recompiled trtexec and plugin from Step 1 produced correct detections on my GTX 1080 (and also on another RTX 2080 machine).
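For clarity on the -DGPU_ARCHS values in these builds: the value is the device's CUDA compute capability with the dot removed (the GTX 1080 is 6.1, hence "61" above; the TX2 is 6.2, as the device log later confirms, hence "62" in Step 2). A trivial, purely illustrative helper:

```python
def gpu_archs(compute_capability):
    """Turn a compute capability string like '6.2' into a GPU_ARCHS value '62'."""
    major, minor = compute_capability.split(".")
    return major + minor

assert gpu_archs("6.1") == "61"  # GTX 1080 (Pascal)
assert gpu_archs("6.2") == "62"  # Jetson TX2 (Tegra X2)
```

Building with the wrong GPU_ARCHS for the target device is a common source of cross-compilation problems, so it is worth double-checking.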

Step 2 – Cross-compilation targeting Jetson TX2 NX
To deploy my model on Jetson TX2 NX, I chose cross-compilation from the GTX 1080 machine, following the "Example: Ubuntu 18.04 Cross-Compile for Jetson (aarch64) with cuda-10.2 (JetPack)" section of the NVIDIA/TensorRT GitHub repository.

cd $TRT_OSSPATH

mkdir -p build && cd build

cmake .. -DCMAKE_TOOLCHAIN_FILE=$TRT_OSSPATH/cmake/toolchains/cmake_aarch64_jetson.toolchain -DTRT_LIB_DIR=$TRT_LIBPATH/lib -DTRT_OUT_DIR=`pwd`/out -DCUDA_VERSION=10.2 -DCUDNN_LIB=$TX2_CUDA_PATH/lib/libcudnn.so -DCUBLAS_LIB=$TX2_CUDA_PATH/lib/libcublas.so.10 -DCUBLASLT_LIB=$TX2_CUDA_PATH/lib/libcublasLt.so.10 -DCUDA_TOOLKIT_ROOT_DIR=$TX2_CUDA_PATH -DCUDNN_ROOT_DIR=$TX2_CUDNN_PATH -DCUDART_LIB=$TX2_CUDA_PATH/lib/libcudart.so -DCMAKE_CUDA_COMPILER=$TX2_CUDA_PATH/bin/nvcc -DCUDA_INCLUDE_DIRS=$TX2_CUDA_PATH/include -DGPU_ARCHS="62" -DTRT_PLATFORM_ID=aarch64

make -j$(nproc)

I had to specify many more build variables before the build succeeded. Eventually I was able to build new libnvinfer_plugin.so.8, libnvonnxparser.so.8, and trtexec targeting Jetson TX2.

Step 3 – Test on Jetson TX2 NX.
I copied the above components onto the Jetson device and ran trtexec --onnx=model.onnx --saveEngine=model.trt. I obtained the message below: the engine was built successfully, but trtexec crashed at the inference stage.

[09/06/2022-13:00:11] [I] === Model Options ===

[09/06/2022-13:00:11] [I] Format: ONNX

[09/06/2022-13:00:11] [I] Model: model.onnx

[09/06/2022-13:00:11] [I] Output:

[09/06/2022-13:00:11] [I] === Build Options ===

[09/06/2022-13:00:11] [I] Max batch: explicit batch

[09/06/2022-13:00:11] [I] Workspace: 16 MiB

[09/06/2022-13:00:11] [I] minTiming: 1

[09/06/2022-13:00:11] [I] avgTiming: 8

[09/06/2022-13:00:11] [I] Precision: FP32

[09/06/2022-13:00:11] [I] Calibration:

[09/06/2022-13:00:11] [I] Refit: Disabled

[09/06/2022-13:00:11] [I] Sparsity: Disabled

[09/06/2022-13:00:11] [I] Safe mode: Disabled

[09/06/2022-13:00:11] [I] DirectIO mode: Disabled

[09/06/2022-13:00:11] [I] Restricted mode: Disabled

[09/06/2022-13:00:11] [I] Save engine:

[09/06/2022-13:00:11] [I] Load engine:

[09/06/2022-13:00:11] [I] Profiling verbosity: 0

[09/06/2022-13:00:11] [I] Tactic sources: Using default tactic sources

[09/06/2022-13:00:11] [I] timingCacheMode: local

[09/06/2022-13:00:11] [I] timingCacheFile:

[09/06/2022-13:00:11] [I] Input(s)s format: fp32:CHW

[09/06/2022-13:00:11] [I] Output(s)s format: fp32:CHW

[09/06/2022-13:00:11] [I] Input build shapes: model

[09/06/2022-13:00:11] [I] Input calibration shapes: model

[09/06/2022-13:00:11] [I] === System Options ===

[09/06/2022-13:00:11] [I] Device: 0

[09/06/2022-13:00:11] [I] DLACore:

[09/06/2022-13:00:11] [I] Plugins:

[09/06/2022-13:00:11] [I] === Inference Options ===

[09/06/2022-13:00:11] [I] Batch: Explicit

[09/06/2022-13:00:11] [I] Input inference shapes: model

[09/06/2022-13:00:11] [I] Iterations: 10

[09/06/2022-13:00:11] [I] Duration: 3s (+ 200ms warm up)

[09/06/2022-13:00:11] [I] Sleep time: 0ms

[09/06/2022-13:00:11] [I] Idle time: 0ms

[09/06/2022-13:00:11] [I] Streams: 1

[09/06/2022-13:00:11] [I] ExposeDMA: Disabled

[09/06/2022-13:00:11] [I] Data transfers: Enabled

[09/06/2022-13:00:11] [I] Spin-wait: Disabled

[09/06/2022-13:00:11] [I] Multithreading: Disabled

[09/06/2022-13:00:11] [I] CUDA Graph: Disabled

[09/06/2022-13:00:11] [I] Separate profiling: Disabled

[09/06/2022-13:00:11] [I] Time Deserialize: Disabled

[09/06/2022-13:00:11] [I] Time Refit: Disabled

[09/06/2022-13:00:11] [I] Skip inference: Disabled

[09/06/2022-13:00:11] [I] Inputs:

[09/06/2022-13:00:11] [I] === Reporting Options ===

[09/06/2022-13:00:11] [I] Verbose: Disabled

[09/06/2022-13:00:11] [I] Averages: 10 inferences

[09/06/2022-13:00:11] [I] Percentile: 99

[09/06/2022-13:00:11] [I] Dump refittable layers:Disabled

[09/06/2022-13:00:11] [I] Dump output: Disabled

[09/06/2022-13:00:11] [I] Profile: Disabled

[09/06/2022-13:00:11] [I] Export timing to JSON file:

[09/06/2022-13:00:11] [I] Export output to JSON file:

[09/06/2022-13:00:11] [I] Export profile to JSON file:

[09/06/2022-13:00:11] [I]

[09/06/2022-13:00:11] [I] === Device Information ===

[09/06/2022-13:00:11] [I] Selected Device: NVIDIA Tegra X2

[09/06/2022-13:00:11] [I] Compute Capability: 6.2

[09/06/2022-13:00:11] [I] SMs: 2

[09/06/2022-13:00:11] [I] Compute Clock Rate: 1.3 GHz

[09/06/2022-13:00:11] [I] Device Global Memory: 3825 MiB

[09/06/2022-13:00:11] [I] Shared Memory per SM: 64 KiB

[09/06/2022-13:00:11] [I] Memory Bus Width: 128 bits (ECC disabled)

[09/06/2022-13:00:11] [I] Memory Clock Rate: 1.3 GHz

[09/06/2022-13:00:11] [I]

[09/06/2022-13:00:11] [I] TensorRT version: 8.2.5

[09/06/2022-13:00:13] [I] [TRT] [MemUsageChange] Init CUDA: CPU +267, GPU +0, now: CPU 285, GPU 1692 (MiB)

[09/06/2022-13:00:13] [I] [TRT] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 285 MiB, GPU 1720 MiB

[09/06/2022-13:00:13] [I] [TRT] [MemUsageSnapshot] End constructing builder kernel library: CPU 314 MiB, GPU 1749 MiB

[09/06/2022-13:00:13] [I] Start parsing network model

[09/06/2022-13:00:14] [I] [TRT] ----------------------------------------------------------------

[09/06/2022-13:00:14] [I] [TRT] Input filename: model.onnx

[09/06/2022-13:00:14] [I] [TRT] ONNX IR version: 0.0.8

[09/06/2022-13:00:14] [I] [TRT] Opset version: 11

[09/06/2022-13:00:14] [I] [TRT] Producer name: pytorch

[09/06/2022-13:00:14] [I] [TRT] Producer version: 1.13.1

[09/06/2022-13:00:14] [I] [TRT] Domain:

[09/06/2022-13:00:14] [I] [TRT] Model version: 0

[09/06/2022-13:00:14] [I] [TRT] Doc string:

[09/06/2022-13:00:14] [I] [TRT] ----------------------------------------------------------------

[09/06/2022-13:00:14] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:370: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.

[09/06/2022-13:00:14] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:396: One or more weights outside the range of INT32 was clamped

[09/06/2022-13:00:14] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:396: One or more weights outside the range of INT32 was clamped

[09/06/2022-13:00:14] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:396: One or more weights outside the range of INT32 was clamped

[09/06/2022-13:00:14] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:396: One or more weights outside the range of INT32 was clamped

[09/06/2022-13:00:14] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:396: One or more weights outside the range of INT32 was clamped

[09/06/2022-13:00:14] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:396: One or more weights outside the range of INT32 was clamped

[09/06/2022-13:00:14] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:396: One or more weights outside the range of INT32 was clamped

[09/06/2022-13:00:14] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:396: One or more weights outside the range of INT32 was clamped

[09/06/2022-13:00:14] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:396: One or more weights outside the range of INT32 was clamped

[09/06/2022-13:00:14] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:396: One or more weights outside the range of INT32 was clamped

[09/06/2022-13:00:14] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:396: One or more weights outside the range of INT32 was clamped

[09/06/2022-13:00:14] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:396: One or more weights outside the range of INT32 was clamped

[09/06/2022-13:00:14] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:396: One or more weights outside the range of INT32 was clamped

[09/06/2022-13:00:14] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:396: One or more weights outside the range of INT32 was clamped

[09/06/2022-13:00:14] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:396: One or more weights outside the range of INT32 was clamped

[09/06/2022-13:00:14] [I] Finish parsing network model

[09/06/2022-13:00:14] [I] [TRT] ---------- Layers Running on DLA ----------

[09/06/2022-13:00:14] [I] [TRT] ---------- Layers Running on GPU ----------

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /transform/Constant_output_0

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /transform/Constant_1_output_0

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Constant_output_0

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Gather_17_output_0

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Constant_24_output_0 + (Unnamed Layer* 68) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Constant_29_output_0 + (Unnamed Layer* 71) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Constant_34_output_0 + (Unnamed Layer* 74) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Constant_39_output_0 + (Unnamed Layer* 77) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 80) [Constant] + (Unnamed Layer* 82) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Constant_40_output_0 + (Unnamed Layer* 83) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 86) [Constant] + (Unnamed Layer* 88) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Constant_41_output_0 + (Unnamed Layer* 89) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_11_output_0

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_13_output_0

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_12_output_0

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_14_output_0

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Constant_output_0_0

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_11_output_0_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_13_output_0_2

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Gather_21_output_0

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Constant_42_output_0 + (Unnamed Layer* 113) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Constant_43_output_0 + (Unnamed Layer* 116) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Gather_17_output_0_3

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] onnx::Max_553 + (Unnamed Layer* 149) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] onnx::Max_553_4 + (Unnamed Layer* 152) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Cast_6_output_0 + (Unnamed Layer* 155) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Cast_7_output_0 + (Unnamed Layer* 158) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] onnx::Add_575

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] onnx::Gather_586

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 186) [Constant] + (Unnamed Layer* 187) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Constant_2_output_0

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Constant_output_0_6

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /transform/Constant_6_output_0

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Constant_1_output_0

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Constant_3_output_0 + (Unnamed Layer* 216) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Constant_4_output_0 + (Unnamed Layer* 219) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/box_roi_pool/Constant_4_output_0

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/box_roi_pool/Constant_5_output_0

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] roi_heads.box_head.fc6.weight

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] roi_heads.box_head.fc6.bias + (Unnamed Layer* 238) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] roi_heads.box_head.fc7.weight

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] roi_heads.box_head.fc7.bias + (Unnamed Layer* 244) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] roi_heads.box_predictor.cls_score.weight

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] roi_heads.box_predictor.cls_score.bias + (Unnamed Layer* 250) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] roi_heads.box_predictor.bbox_pred.weight

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] roi_heads.box_predictor.bbox_pred.bias + (Unnamed Layer* 255) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Constant_9_output_0 + (Unnamed Layer* 280) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Constant_14_output_0 + (Unnamed Layer* 283) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Constant_19_output_0 + (Unnamed Layer* 286) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Constant_24_output_0 + (Unnamed Layer* 289) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 293) [Constant] + (Unnamed Layer* 295) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Constant_25_output_0 + (Unnamed Layer* 296) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 299) [Constant] + (Unnamed Layer* 301) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Constant_26_output_0 + (Unnamed Layer* 302) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Reshape_3_output_0

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Constant_27_output_0 + (Unnamed Layer* 327) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Constant_28_output_0 + (Unnamed Layer* 330) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] onnx::Max_553_7 + (Unnamed Layer* 362) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] onnx::Max_553_8 + (Unnamed Layer* 365) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Cast_6_output_0_9 + (Unnamed Layer* 370) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Cast_7_output_0_10 + (Unnamed Layer* 373) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Squeeze

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /transform/Sub

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /transform/Div

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /transform/Unsqueeze

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /transform/Resize

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /transform/Gather

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /transform/Pad

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /transform/Unsqueeze_12

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /transform/Unsqueeze_12_output_0 copy

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /backbone/backbone.0/conv/conv/Conv + /backbone/backbone.0/conv/activ/Relu

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /backbone/backbone.0/pool/MaxPool

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /backbone/backbone.1/unit1/body/conv1/conv/Conv + /backbone/backbone.1/unit1/body/conv1/activ/Relu

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /backbone/backbone.1/unit1/body/conv2/conv/Conv + /backbone/backbone.1/unit1/Add + /backbone/backbone.1/unit1/activ/Relu

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /backbone/backbone.2/unit1/body/conv1/conv/Conv + /backbone/backbone.2/unit1/body/conv1/activ/Relu

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /backbone/backbone.2/unit1/body/conv2/conv/Conv

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /backbone/backbone.2/unit1/identity_conv/conv/Conv + /backbone/backbone.2/unit1/Add + /backbone/backbone.2/unit1/activ/Relu

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /backbone/backbone.3/unit1/body/conv1/conv/Conv + /backbone/backbone.3/unit1/body/conv1/activ/Relu

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /backbone/backbone.3/unit1/body/conv2/conv/Conv

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /backbone/backbone.3/unit1/identity_conv/conv/Conv + /backbone/backbone.3/unit1/Add + /backbone/backbone.3/unit1/activ/Relu

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /backbone/backbone.4/unit1/body/conv1/conv/Conv + /backbone/backbone.4/unit1/body/conv1/activ/Relu

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /backbone/backbone.4/unit1/body/conv2/conv/Conv

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /backbone/backbone.4/unit1/identity_conv/conv/Conv + /backbone/backbone.4/unit1/Add + /backbone/backbone.4/unit1/activ/Relu

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/head/conv/conv.0/conv.0.0/Conv + /rpn/head/conv/conv.0/conv.0.1/Relu

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/head/bbox_pred/Conv || /rpn/head/cls_logits/Conv

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Reshape + /rpn/Transpose

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Reshape_2 + /rpn/Transpose_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Reshape_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Reshape_3

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Reshape_1_output_0 copy

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Reshape_3_output_0 copy

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Reshape_4 + /rpn/Reshape_5

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Flatten + /rpn/Reshape_8

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Gather_19

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Slice

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Slice_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Slice_2

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Slice_3

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/TopK

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Div_2

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Div_3

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Div_4

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Div_5

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Mul_4

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Mul_5

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 84) [ElementWise]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 90) [ElementWise]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Gather_18

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Add_2

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Add_3

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 85) [ElementWise]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 91) [ElementWise]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Gather_20

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Gather_22

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Exp

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Exp_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_25

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Mul_6

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Mul_7

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] PWN(/rpn/Sigmoid)

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Mul_9

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Mul_8

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Sub_2

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Add_4

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Sub_3

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Add_5

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] Cast_464

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_15

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_17

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_16

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_18

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_15_output_0 copy

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_16_output_0 copy

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_17_output_0 copy

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_18_output_0 copy

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Squeeze_1 + Unsqueeze_471 + Unsqueeze_472 + NonMaxSuppression_475

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Flatten_1 + /rpn/Reshape_6 + /rpn/Reshape_7

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Gather_23

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Gather_24

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Slice_5

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Slice_6

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Max

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Max_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Min

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Min_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_28

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_29

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_28_output_0 copy

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Unsqueeze_29_output_0 copy

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Reshape_11

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] ReduceMax_463

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] Add_466

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 167) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] Mul_467

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] Unsqueeze_468

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] Add_469

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] Unsqueeze_470

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] NonMaxSuppression_475_5

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] Gather_477

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] Squeeze_478

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /rpn/Gather_27

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Cast

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Gather_2

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Gather_3

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Gather_4

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Gather_5

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/box_roi_pool/ConstantOfShape

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Sub

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Sub_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Mul

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Unsqueeze_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Mul_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Unsqueeze_3

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/box_roi_pool/Concat_1_output_0 copy

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/box_roi_pool/Concat_output_0 copy

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Add

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Add_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/box_roi_pool/Gather

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/box_roi_pool/Gather_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Unsqueeze_2

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Unsqueeze_4

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/box_roi_pool/Squeeze

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/box_roi_pool/Cast

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/box_roi_pool/RoiAlign

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/box_head/Flatten

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/box_head/fc6/Gemm

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 239) [ElementWise] + /roi_heads/box_head/Relu

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/box_head/fc7/Gemm

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 245) [ElementWise] + /roi_heads/box_head/Relu_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/box_predictor/cls_score/Gemm

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/box_predictor/bbox_pred/Gemm

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 251) [ElementWise]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 256) [ElementWise]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Softmax

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Slice

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Slice_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Slice_2

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Slice_3

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Div

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Div_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Div_2

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Div_3

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Mul_2

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Mul_3

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 297) [ElementWise]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 303) [ElementWise]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Add_2

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Add_3

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 298) [ElementWise]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 304) [ElementWise]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Reshape_5

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Exp

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Exp_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Mul_4

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Mul_5

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Mul_7

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Mul_6

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Expand

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Sub_2

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Add_4

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Sub_3

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Add_5

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Unsqueeze_5

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Unsqueeze_7

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Unsqueeze_6

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Unsqueeze_8

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Reshape_6

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Unsqueeze_5_output_0 copy

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Unsqueeze_6_output_0 copy

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Unsqueeze_7_output_0 copy

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Unsqueeze_8_output_0 copy

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] Cast_670

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Slice_6

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Slice_7

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Max

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Max_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Min

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Min_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Unsqueeze_12

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Unsqueeze_13

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Unsqueeze_12_output_0 copy

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Unsqueeze_13_output_0 copy

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Reshape_2

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Slice_8

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Reshape_4

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] ReduceMax_669

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] onnx::Add_785

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] Add_672

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] (Unnamed Layer* 388) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] Mul_673

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] Unsqueeze_674

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] Add_675

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] Unsqueeze_676

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] Unsqueeze_677 + Unsqueeze_678 + NonMaxSuppression_681

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] NonMaxSuppression_681_11

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] onnx::Gather_796

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] Gather_683

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] Squeeze_684

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Gather_9

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Gather_10

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /roi_heads/Gather_11

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Squeeze_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Squeeze_2

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Squeeze_3

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Squeeze_4

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Div_1_output_0 + (Unnamed Layer* 411) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Mul

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Div_1_output_0_17 + (Unnamed Layer* 414) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Mul_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Div_output_0 + (Unnamed Layer* 417) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Mul_2

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Div_output_0_18 + (Unnamed Layer* 420) [Shuffle]

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Mul_3

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Unsqueeze

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Unsqueeze_1

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Unsqueeze_2

[09/06/2022-13:00:14] [I] [TRT] [GpuLayer] /Unsqueeze_3

[09/06/2022-13:00:15] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +167, GPU +162, now: CPU 613, GPU 2188 (MiB)

[09/06/2022-13:00:17] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +250, GPU +286, now: CPU 863, GPU 2474 (MiB)

[09/06/2022-13:00:17] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.

[09/06/2022-13:00:57] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.

[09/06/2022-13:01:52] [I] [TRT] Detected 1 inputs and 7 output network tensors.

[09/06/2022-13:01:52] [I] [TRT] Total Host Persistent Memory: 26752

[09/06/2022-13:01:52] [I] [TRT] Total Device Persistent Memory: 36447744

[09/06/2022-13:01:52] [I] [TRT] Total Scratch Memory: 512000

[09/06/2022-13:01:52] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 9 MiB, GPU 384 MiB

[09/06/2022-13:01:52] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 205.304ms to assign 15 blocks to 199 nodes requiring 83747843 bytes.

[09/06/2022-13:01:52] [I] [TRT] Total Activation Memory: 83747843

[09/06/2022-13:01:52] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1115, GPU 2955 (MiB)

[09/06/2022-13:01:52] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +1, GPU +0, now: CPU 1116, GPU 2955 (MiB)

[09/06/2022-13:01:52] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +0, GPU +256, now: CPU 0, GPU 256 (MiB)

[09/06/2022-13:01:52] [I] [TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 1251, GPU 3092 (MiB)

[09/06/2022-13:01:52] [I] [TRT] Loaded engine size: 137 MiB

[09/06/2022-13:01:52] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1252, GPU 3094 (MiB)

[09/06/2022-13:01:52] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 1252, GPU 3094 (MiB)

[09/06/2022-13:01:52] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +136, now: CPU 0, GPU 136 (MiB)

[09/06/2022-13:01:52] [I] Engine built in 101.427 sec.

[09/06/2022-13:01:52] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 954, GPU 2841 (MiB)

[09/06/2022-13:01:52] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 954, GPU 2841 (MiB)

[09/06/2022-13:01:52] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +115, now: CPU 0, GPU 251 (MiB)

[09/06/2022-13:01:52] [I] Using random values for input input0

[09/06/2022-13:01:52] [I] Created input binding for input0 with dimensions 1x3x480x640

[09/06/2022-13:01:52] [I] Using random values for output scores

[09/06/2022-13:01:52] [I] Created output binding for scores with dimensions 100

[09/06/2022-13:01:52] [I] Using random values for output labels

[09/06/2022-13:01:52] [I] Created output binding for labels with dimensions 100

[09/06/2022-13:01:52] [I] Using random values for output boxes

[09/06/2022-13:01:52] [I] Created output binding for boxes with dimensions 100x4

[09/06/2022-13:01:52] [I] Starting inference

[09/06/2022-13:01:52] [E] Error[2]: [pluginV2DynamicExtRunner.cpp::execute::115] Error Code 2: Internal Error (Assertion status == kSTATUS_SUCCESS failed. )

[09/06/2022-13:01:52] [E] Error occurred during inference

I also observed that even though Faster-RCNN has three outputs (scores, labels, and boxes, as shown toward the end of the log above), trtexec reported detecting 7 output tensors. When I ran the same command on my GTX 1080, the engine correctly reported 3 outputs.
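To quantify this discrepancy, one can compare the builder's reported output-tensor count with the number of output bindings trtexec actually created by scanning the log. A small stdlib sketch (the log line formats are assumed to be as shown above; the helper name is mine):

```python
import re

def output_count_mismatch(log_text):
    """Compare the builder's reported output-tensor count with the number of
    output bindings trtexec actually created. Returns (detected, bound)."""
    m = re.search(r"Detected \d+ inputs and (\d+) output network tensors",
                  log_text)
    detected = int(m.group(1)) if m else None
    bound = len(re.findall(r"Created output binding for", log_text))
    return detected, bound
```

On the TX2 log above this returns (7, 3), while on the GTX 1080 log both numbers agree at 3.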

[09/06/2022-13:01:52] [I] [TRT] Detected 1 inputs and 7 output network tensors.

I repeated Step 1 with multiple versions of TensorRT OSS, and all of them worked on the GTX 1080 and RTX 2080. Could this be a bug in JetPack 4.6.3? Could you look into the issue and let me know? I am happy to provide all relevant files (ONNX model, TensorRT OSS package, etc.) via DM. Many thanks in advance!

Hi,

This looks like a Jetson issue. Please refer to the related Jetson samples in case they are useful.

For any further assistance, we will move this post to the Jetson-related forum.

Thanks!

Hi,

Thank you for your prompt reply!

I took a look at the above resources but didn’t find information relevant to my specific error or use case.

Could you help move this post to the Jetson-specific forum?

Many thanks!

Dear @alphadadajuju2,
May I know if this topic still needs support?

Hi,

The issue has been resolved. Thank you for following up!

Best regards.