ONNX Model Inference on Jetson Nano - Segmentation fault

Description

Trying to code inference for an ONNX model (see attachment) on Jetson Nano (C++).
I fixed the batch size problem, but now I get a segmentation fault.
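For context, a frequent cause of this kind of segmentation fault is device buffers that do not match the engine's binding sizes. Below is a minimal sketch of sizing the buffers directly from the engine (names are illustrative and not taken from the attached code; all bindings are assumed to be FP32, as trtexec reports for this model):

// Hypothetical sketch, not the attached code: allocate one device buffer per
// engine binding, sized from the binding dimensions. Undersized buffers
// passed to enqueue() are a common cause of segmentation faults.
#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <vector>

static size_t volume(const nvinfer1::Dims& d)
{
    size_t v = 1;
    for (int i = 0; i < d.nbDims; ++i)
        v *= static_cast<size_t>(d.d[i]);
    return v;
}

// Assumes all bindings are FP32.
std::vector<void*> allocateBindings(const nvinfer1::ICudaEngine& engine)
{
    std::vector<void*> bindings(engine.getNbBindings(), nullptr);
    for (int i = 0; i < engine.getNbBindings(); ++i)
    {
        const size_t bytes = volume(engine.getBindingDimensions(i)) * sizeof(float);
        cudaMalloc(&bindings[i], bytes); // check the returned cudaError_t in real code
    }
    return bindings;
}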

Environment

Operating System + Version: JetPack 4.5 with all default CUDA and TensorRT versions!
TensorRT_src_version-RFB-640.onnx (123.1 KB)
TensorRT_src_face_det_alt.cpp (7.0 KB)
TensorRT_CMakeLists.txt (730 Bytes)

Thanks and have a nice day

Hi @toni.sedlmeier,

We recommend you post your concern on the Jetson-related forum to get better help.

Thank you.

Hi,

The model you shared doesn't look correct. (It contains webpage info rather than the model.)
Could you check it and upload the model again?

Thanks.

Hi,
Please share the ONNX model and the script if you have not already, so that we can assist you better.
In the meantime, you can try a few things:

1. Validate your model with the snippet below:

check_model.py

import sys
import onnx

# Usage: python check_model.py your_model.onnx
filename = sys.argv[1]
model = onnx.load(filename)
onnx.checker.check_model(model)
2. Try running your model with the trtexec command:
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec
If you are still facing the issue, please share the trtexec --verbose log for further debugging.
Thanks!

Looks like I shared a wrong version of the model.
I tested the script and it passed.
I am using a pretrained model from ONNX Model Zoo. I just want to evaluate some performance properties of TensorRT.
Here's the log from trtexec and the correct model file: version-RFB-640.onnx (1.5 MB)

log.txt (6.2 KB)

Thanks and have a nice day :D
Toni

Hi,

We just checked your newly uploaded model with trtexec.

The model can run successfully with TensorRT (JetPack 4.5.1).
Could you confirm whether it also works in your environment?

$ /usr/src/tensorrt/bin/trtexec --onnx=version-RFB-640.onnx
&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=version-RFB-640.onnx
[08/24/2021-12:12:37] [I] === Model Options ===
[08/24/2021-12:12:37] [I] Format: ONNX
[08/24/2021-12:12:37] [I] Model: version-RFB-640.onnx
[08/24/2021-12:12:37] [I] Output:
[08/24/2021-12:12:37] [I] === Build Options ===
[08/24/2021-12:12:37] [I] Max batch: 1
[08/24/2021-12:12:37] [I] Workspace: 16 MB
[08/24/2021-12:12:37] [I] minTiming: 1
[08/24/2021-12:12:37] [I] avgTiming: 8
[08/24/2021-12:12:37] [I] Precision: FP32
[08/24/2021-12:12:37] [I] Calibration:
[08/24/2021-12:12:37] [I] Safe mode: Disabled
[08/24/2021-12:12:37] [I] Save engine:
[08/24/2021-12:12:37] [I] Load engine:
[08/24/2021-12:12:37] [I] Builder Cache: Enabled
[08/24/2021-12:12:37] [I] NVTX verbosity: 0
[08/24/2021-12:12:37] [I] Inputs format: fp32:CHW
[08/24/2021-12:12:37] [I] Outputs format: fp32:CHW
[08/24/2021-12:12:37] [I] Input build shapes: model
[08/24/2021-12:12:37] [I] Input calibration shapes: model
[08/24/2021-12:12:37] [I] === System Options ===
[08/24/2021-12:12:37] [I] Device: 0
[08/24/2021-12:12:37] [I] DLACore:
[08/24/2021-12:12:37] [I] Plugins:
[08/24/2021-12:12:37] [I] === Inference Options ===
[08/24/2021-12:12:37] [I] Batch: 1
[08/24/2021-12:12:37] [I] Input inference shapes: model
[08/24/2021-12:12:37] [I] Iterations: 10
[08/24/2021-12:12:37] [I] Duration: 3s (+ 200ms warm up)
[08/24/2021-12:12:37] [I] Sleep time: 0ms
[08/24/2021-12:12:37] [I] Streams: 1
[08/24/2021-12:12:37] [I] ExposeDMA: Disabled
[08/24/2021-12:12:37] [I] Spin-wait: Disabled
[08/24/2021-12:12:37] [I] Multithreading: Disabled
[08/24/2021-12:12:37] [I] CUDA Graph: Disabled
[08/24/2021-12:12:37] [I] Skip inference: Disabled
[08/24/2021-12:12:37] [I] Inputs:
[08/24/2021-12:12:37] [I] === Reporting Options ===
[08/24/2021-12:12:37] [I] Verbose: Disabled
[08/24/2021-12:12:37] [I] Averages: 10 inferences
[08/24/2021-12:12:37] [I] Percentile: 99
[08/24/2021-12:12:37] [I] Dump output: Disabled
[08/24/2021-12:12:37] [I] Profile: Disabled
[08/24/2021-12:12:37] [I] Export timing to JSON file:
[08/24/2021-12:12:37] [I] Export output to JSON file:
[08/24/2021-12:12:37] [I] Export profile to JSON file:
[08/24/2021-12:12:37] [I]
----------------------------------------------------------------
Input filename:   version-RFB-640.onnx
ONNX IR version:  0.0.4
Opset version:    9
Producer name:    pytorch
Producer version: 1.3
Domain:
Model version:    0
Doc string:
----------------------------------------------------------------
[08/24/2021-12:12:39] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[08/24/2021-12:12:39] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[08/24/2021-12:12:39] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[08/24/2021-12:12:39] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[08/24/2021-12:12:39] [I] [TRT]
[08/24/2021-12:12:39] [I] [TRT] --------------- Layers running on DLA:
[08/24/2021-12:12:39] [I] [TRT]
[08/24/2021-12:12:39] [I] [TRT] --------------- Layers running on GPU:
[08/24/2021-12:12:39] [I] [TRT] (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation], (Unnamed Layer* 3) [Convolution] + (Unnamed Layer* 5) [Activation], (Unnamed Layer* 6) [Convolution] + (Unnamed Layer* 8) [Activation], (Unnamed Layer* 9) [Convolution] + (Unnamed Layer* 11) [Activation], (Unnamed Layer* 12) [Convolution] + (Unnamed Layer* 14) [Activation], (Unnamed Layer* 15) [Convolution] + (Unnamed Layer* 17) [Activation], (Unnamed Layer* 18) [Convolution] + (Unnamed Layer* 20) [Activation], (Unnamed Layer* 21) [Convolution] + (Unnamed Layer* 23) [Activation], (Unnamed Layer* 24) [Convolution] + (Unnamed Layer* 26) [Activation], (Unnamed Layer* 27) [Convolution] + (Unnamed Layer* 29) [Activation], (Unnamed Layer* 30) [Convolution] + (Unnamed Layer* 32) [Activation], (Unnamed Layer* 33) [Convolution] + (Unnamed Layer* 35) [Activation], (Unnamed Layer* 36) [Convolution] + (Unnamed Layer* 38) [Activation], (Unnamed Layer* 53) [Convolution] || (Unnamed Layer* 46) [Convolution] || (Unnamed Layer* 39) [Convolution], (Unnamed Layer* 41) [Convolution] + (Unnamed Layer* 43) [Activation], (Unnamed Layer* 48) [Convolution] + (Unnamed Layer* 50) [Activation], (Unnamed Layer* 55) [Convolution] + (Unnamed Layer* 57) [Activation], (Unnamed Layer* 58) [Convolution] + (Unnamed Layer* 60) [Activation], (Unnamed Layer* 51) [Convolution], (Unnamed Layer* 44) [Convolution], (Unnamed Layer* 61) [Convolution], (Unnamed Layer* 64) [Convolution], (Unnamed Layer* 66) [Convolution] + (Unnamed Layer* 68) [ElementWise] + (Unnamed Layer* 69) [Activation], (Unnamed Layer* 98) [Convolution] + (Unnamed Layer* 100) [Activation], (Unnamed Layer* 84) [Convolution] + (Unnamed Layer* 85) [Activation], (Unnamed Layer* 70) [Convolution] + (Unnamed Layer* 71) [Activation], (Unnamed Layer* 72) [Convolution], (Unnamed Layer* 73) [Shuffle] + (Unnamed Layer* 83) [Shuffle], (Unnamed Layer* 86) [Convolution], (Unnamed Layer* 87) [Shuffle] + (Unnamed Layer* 97) [Shuffle], (Unnamed Layer* 101) [Convolution] + (Unnamed Layer* 103) [Activation], (Unnamed Layer* 104) [Convolution] + (Unnamed Layer* 106) [Activation], (Unnamed Layer* 107) [Convolution] + (Unnamed Layer* 109) [Activation], (Unnamed Layer* 110) [Convolution] + (Unnamed Layer* 112) [Activation], (Unnamed Layer* 113) [Convolution] + (Unnamed Layer* 115) [Activation], (Unnamed Layer* 144) [Convolution] + (Unnamed Layer* 146) [Activation], (Unnamed Layer* 130) [Convolution] + (Unnamed Layer* 131) [Activation], (Unnamed Layer* 116) [Convolution] + (Unnamed Layer* 117) [Activation], (Unnamed Layer* 118) [Convolution], (Unnamed Layer* 119) [Shuffle] + (Unnamed Layer* 129) [Shuffle], (Unnamed Layer* 132) [Convolution], (Unnamed Layer* 133) [Shuffle] + (Unnamed Layer* 143) [Shuffle], (Unnamed Layer* 147) [Convolution] + (Unnamed Layer* 149) [Activation], (Unnamed Layer* 150) [Convolution] + (Unnamed Layer* 152) [Activation], (Unnamed Layer* 153) [Convolution] + (Unnamed Layer* 155) [Activation], (Unnamed Layer* 184) [Convolution] + (Unnamed Layer* 185) [Activation], (Unnamed Layer* 170) [Convolution] + (Unnamed Layer* 171) [Activation], (Unnamed Layer* 156) [Convolution] + (Unnamed Layer* 157) [Activation], (Unnamed Layer* 158) [Convolution], (Unnamed Layer* 159) [Shuffle] + (Unnamed Layer* 169) [Shuffle], (Unnamed Layer* 172) [Convolution], (Unnamed Layer* 173) [Shuffle] + (Unnamed Layer* 183) [Shuffle], (Unnamed Layer* 186) [Convolution] + (Unnamed Layer* 187) [Activation], (Unnamed Layer* 188) [Convolution] + (Unnamed Layer* 189) [Activation], (Unnamed Layer* 202) 
[Convolution] || (Unnamed Layer* 190) [Convolution], (Unnamed Layer* 203) [Shuffle] + (Unnamed Layer* 213) [Shuffle], 342 copy, 388 copy, 428 copy, 458 copy, (Unnamed Layer* 285) [Slice], (Unnamed Layer* 253) [Slice], (Unnamed Layer* 191) [Shuffle] + (Unnamed Layer* 201) [Shuffle], 328 copy, 374 copy, 414 copy, 446 copy, (Unnamed Layer* 225) [Shuffle], (Unnamed Layer* 226) [Softmax], (Unnamed Layer* 227) [Shuffle], (Unnamed Layer* 257) [Constant], (Unnamed Layer* 290) [Constant], PWN(PWN(PWN((Unnamed Layer* 286) [Constant] + (Unnamed Layer* 287) [Shuffle], (Unnamed Layer* 288) [ElementWise]), (Unnamed Layer* 289) [Unary]), (Unnamed Layer* 291) [ElementWise]), (Unnamed Layer* 259) [Constant], PWN(PWN(PWN((Unnamed Layer* 254) [Constant] + (Unnamed Layer* 255) [Shuffle], (Unnamed Layer* 256) [ElementWise]), (Unnamed Layer* 258) [ElementWise]), (Unnamed Layer* 260) [ElementWise]), 468 copy, 474 copy, (Unnamed Layer* 396) [Slice], (Unnamed Layer* 371) [Slice], (Unnamed Layer* 342) [Slice], (Unnamed Layer* 317) [Slice], PWN(PWN((Unnamed Layer* 397) [Constant] + (Unnamed Layer* 398) [Shuffle], (Unnamed Layer* 399) [ElementWise]), (Unnamed Layer* 400) [ElementWise]), PWN(PWN((Unnamed Layer* 343) [Constant] + (Unnamed Layer* 344) [Shuffle], (Unnamed Layer* 345) [ElementWise]), (Unnamed Layer* 346) [ElementWise]), 480 copy, 485 copy,
[08/24/2021-12:12:47] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[08/24/2021-12:13:43] [I] [TRT] Detected 1 inputs and 4 output network tensors.
[08/24/2021-12:13:43] [I] Starting inference threads
[08/24/2021-12:13:46] [I] Warmup completed 53 queries over 200 ms
[08/24/2021-12:13:46] [I] Timing trace has 786 queries over 3.00927 s
[08/24/2021-12:13:46] [I] Trace averages of 10 runs:
[08/24/2021-12:13:46] [I] Average on 10 runs - GPU latency: 3.6301 ms - Host latency: 3.76711 ms (end to end 3.77763 ms, enqueue 1.52328 ms)
...
[08/24/2021-12:13:46] [I] Average on 10 runs - GPU latency: 3.67217 ms - Host latency: 3.81064 ms (end to end 3.82107 ms, enqueue 1.36304 ms)
[08/24/2021-12:13:46] [I] Host Latency
[08/24/2021-12:13:46] [I] min: 3.74438 ms (end to end 3.75781 ms)
[08/24/2021-12:13:46] [I] max: 3.88342 ms (end to end 3.89673 ms)
[08/24/2021-12:13:46] [I] mean: 3.8181 ms (end to end 3.82853 ms)
[08/24/2021-12:13:46] [I] median: 3.82004 ms (end to end 3.82956 ms)
[08/24/2021-12:13:46] [I] percentile: 3.86523 ms at 99% (end to end 3.87598 ms at 99%)
[08/24/2021-12:13:46] [I] throughput: 261.193 qps
[08/24/2021-12:13:46] [I] walltime: 3.00927 s
[08/24/2021-12:13:46] [I] Enqueue Time
[08/24/2021-12:13:46] [I] min: 1.16895 ms
[08/24/2021-12:13:46] [I] max: 2.15942 ms
[08/24/2021-12:13:46] [I] median: 1.38501 ms
[08/24/2021-12:13:46] [I] GPU Compute
[08/24/2021-12:13:46] [I] min: 3.60962 ms
[08/24/2021-12:13:46] [I] max: 3.74109 ms
[08/24/2021-12:13:46] [I] mean: 3.67907 ms
[08/24/2021-12:13:46] [I] median: 3.68103 ms
[08/24/2021-12:13:46] [I] percentile: 3.72363 ms at 99%
[08/24/2021-12:13:46] [I] total compute time: 2.89175 s
&&&& PASSED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=version-RFB-640.onnx

Thanks.

Hi everyone,

I fixed the bug and now I get predictions of the right shape.
However, the problem is now with postprocessing.
In the GitHub Model Zoo they say to apply some NMS: models/vision/body_analysis/ultraface at main · onnx/models · GitHub
I found an interesting plugin: TensorRT/plugin/batchedNMSPlugin at master · NVIDIA/TensorRT · GitHub
Can I use this plugin for my specific purpose?
And if not, is there an NMS code snippet for Jetson platforms?

Thanks and have a nice day!

Toni

Hi,

Since it is open source, you can modify it for your use case.
There is already a prebuilt version on Jetson:

$ ll /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so*
lrwxrwxrwx 1 root root      26  Jun  6  2020 /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so -> libnvinfer_plugin.so.7.1.3
lrwxrwxrwx 1 root root      26  Jun  6  2020 /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7 -> libnvinfer_plugin.so.7.1.3
-rw-r--r-- 1 root root 5630344  Jun  6  2020 /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7.1.3
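
As a sketch of how the prebuilt plugins can be used (assuming you link against libnvinfer_plugin; the helper name here is illustrative), calling initLibNvInferPlugins before building or deserializing the engine registers the built-in plugins, BatchedNMS_TRT among them, with the plugin registry:

// Sketch: register TensorRT's built-in plugins with the plugin registry.
// Assumes linking against libnvinfer_plugin and an existing ILogger instance.
#include <NvInfer.h>
#include <NvInferPlugin.h>

bool registerPlugins(nvinfer1::ILogger& logger)
{
    // "" selects the default plugin namespace.
    return initLibNvInferPlugins(&logger, "");
}

If you would rather do the suppression on the CPU, here is a minimal hard-NMS sketch along the lines of the UltraFace postprocessing (the corner-coordinate box layout and the threshold values are assumptions for illustration, not taken from your code):

// Minimal CPU hard-NMS sketch, not the NVIDIA plugin.
#include <algorithm>
#include <vector>

struct Detection
{
    float x1, y1, x2, y2; // assumed normalized corner coordinates
    float score;          // face-class confidence
};

static float iou(const Detection& a, const Detection& b)
{
    const float w = std::max(0.0f, std::min(a.x2, b.x2) - std::max(a.x1, b.x1));
    const float h = std::max(0.0f, std::min(a.y2, b.y2) - std::max(a.y1, b.y1));
    const float inter = w * h;
    const float uni = (a.x2 - a.x1) * (a.y2 - a.y1)
                    + (b.x2 - b.x1) * (b.y2 - b.y1) - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Keep the highest-scoring box, suppress everything overlapping it by more
// than iouThreshold, then repeat with the next surviving box.
std::vector<Detection> hardNms(std::vector<Detection> dets,
                               float scoreThreshold, float iouThreshold)
{
    dets.erase(std::remove_if(dets.begin(), dets.end(),
                   [&](const Detection& d) { return d.score < scoreThreshold; }),
               dets.end());
    std::sort(dets.begin(), dets.end(),
              [](const Detection& a, const Detection& b) { return a.score > b.score; });

    std::vector<Detection> kept;
    std::vector<bool> suppressed(dets.size(), false);
    for (size_t i = 0; i < dets.size(); ++i)
    {
        if (suppressed[i])
            continue;
        kept.push_back(dets[i]);
        for (size_t j = i + 1; j < dets.size(); ++j)
            if (!suppressed[j] && iou(dets[i], dets[j]) > iouThreshold)
                suppressed[j] = true;
    }
    return kept;
}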

Thanks.