Trtexec stuck on jetson nano while converting onnx to TensorRT

wang_xg · February 3, 2021, 7:58am

I have converted yolov4.pth to ONNX, but when I run trtexec on the Jetson Nano, the process gets stuck for hours.

> $ trtexec --onnx=yolov4_-1_3_416_416_dynamic.onnx --minShapes=input:1x3x416x416 --optShapes=input:8x3x416x416 --maxShapes=input:8x3x416x416 --workspace=2048  --saveEngine=yolov4-uniform-dynamic-max8.engine --fp16
> &&&& RUNNING TensorRT.trtexec # trtexec --onnx=yolov4_-1_3_416_416_dynamic.onnx --minShapes=input:1x3x416x416 --optShapes=input:8x3x416x416 --maxShapes=input:8x3x416x416 --workspace=2048 --saveEngine=yolov4-uniform-dynamic-max8.engine --fp16
> [02/03/2021-13:56:38] [I] === Model Options ===
> [02/03/2021-13:56:38] [I] Format: ONNX
> [02/03/2021-13:56:38] [I] Model: yolov4_-1_3_416_416_dynamic.onnx
> [02/03/2021-13:56:38] [I] Output:
> [02/03/2021-13:56:38] [I] === Build Options ===
> [02/03/2021-13:56:38] [I] Max batch: explicit
> [02/03/2021-13:56:38] [I] Workspace: 2048 MB
> [02/03/2021-13:56:38] [I] minTiming: 1
> [02/03/2021-13:56:38] [I] avgTiming: 8
> [02/03/2021-13:56:38] [I] Precision: FP32+FP16
> [02/03/2021-13:56:38] [I] Calibration: 
> [02/03/2021-13:56:38] [I] Safe mode: Disabled
> [02/03/2021-13:56:38] [I] Save engine: yolov4-uniform-dynamic-max8.engine
> [02/03/2021-13:56:38] [I] Load engine: 
> [02/03/2021-13:56:38] [I] Builder Cache: Enabled
> [02/03/2021-13:56:38] [I] NVTX verbosity: 0
> [02/03/2021-13:56:38] [I] Inputs format: fp32:CHW
> [02/03/2021-13:56:38] [I] Outputs format: fp32:CHW
> [02/03/2021-13:56:38] [I] Input build shape: input=1x3x416x416+8x3x416x416+8x3x416x416
> [02/03/2021-13:56:38] [I] Input calibration shapes: model
> [02/03/2021-13:56:38] [I] === System Options ===
> [02/03/2021-13:56:38] [I] Device: 0
> [02/03/2021-13:56:38] [I] DLACore: 
> [02/03/2021-13:56:38] [I] Plugins:
> [02/03/2021-13:56:38] [I] === Inference Options ===
> [02/03/2021-13:56:38] [I] Batch: Explicit
> [02/03/2021-13:56:38] [I] Input inference shape: input=8x3x416x416
> [02/03/2021-13:56:38] [I] Iterations: 10
> [02/03/2021-13:56:38] [I] Duration: 3s (+ 200ms warm up)
> [02/03/2021-13:56:38] [I] Sleep time: 0ms
> [02/03/2021-13:56:38] [I] Streams: 1
> [02/03/2021-13:56:38] [I] ExposeDMA: Disabled
> [02/03/2021-13:56:38] [I] Spin-wait: Disabled
> [02/03/2021-13:56:38] [I] Multithreading: Disabled
> [02/03/2021-13:56:38] [I] CUDA Graph: Disabled
> [02/03/2021-13:56:38] [I] Skip inference: Disabled
> [02/03/2021-13:56:38] [I] Inputs:
> [02/03/2021-13:56:38] [I] === Reporting Options ===
> [02/03/2021-13:56:38] [I] Verbose: Disabled
> [02/03/2021-13:56:38] [I] Averages: 10 inferences
> [02/03/2021-13:56:38] [I] Percentile: 99
> [02/03/2021-13:56:38] [I] Dump output: Disabled
> [02/03/2021-13:56:38] [I] Profile: Disabled
> [02/03/2021-13:56:38] [I] Export timing to JSON file: 
> [02/03/2021-13:56:38] [I] Export output to JSON file: 
> [02/03/2021-13:56:38] [I] Export profile to JSON file: 
> [02/03/2021-13:56:38] [I] 
> ----------------------------------------------------------------
> Input filename:   yolov4_-1_3_416_416_dynamic.onnx
> ONNX IR version:  0.0.6
> Opset version:    11
> Producer name:    pytorch
> Producer version: 1.7
> Domain:           
> Model version:    0
> Doc string:       
> ----------------------------------------------------------------
> [02/03/2021-13:56:41] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
> [02/03/2021-13:56:41] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
> [02/03/2021-13:56:41] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
> [02/03/2021-13:56:41] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
> [02/03/2021-13:56:41] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
> [02/03/2021-13:56:42] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
> [02/03/2021-13:56:42] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
> [02/03/2021-13:56:42] [W] [TRT] Output type must be INT32 for shape outputs
> [02/03/2021-13:56:42] [W] [TRT] Output type must be INT32 for shape outputs
> [02/03/2021-13:56:42] [W] [TRT] Output type must be INT32 for shape outputs
> [02/03/2021-13:56:42] [W] [TRT] Output type must be INT32 for shape outputs
> [02/03/2021-14:03:36] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.

I checked with the top command, but it shows the CPU usage of trtexec as 0%:

9089 mcc 20 0 10.221g 471768 216444 S 0.0 11.6 12:46.87 trtexec

AastaLLL · February 4, 2021, 2:45am

Hi,

On the first launch, TensorRT evaluates the model and picks a fast algorithm for each layer based on the hardware and layer information.
This procedure takes several minutes and runs on the GPU.

You can check the GPU utilization with tegrastats.

$ sudo tegrastats
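
If the full tegrastats line is too noisy to watch, a small filter can help. This is only a sketch: it assumes each tegrastats line contains a field of the form `GR3D_FREQ <n>%` (the GPU load), and `gpu_load` is a hypothetical helper name, not part of any NVIDIA tool.

```shell
# Print only the GPU load percentage from each tegrastats line read on stdin.
# Assumption: the line contains a field of the form "GR3D_FREQ <n>%".
# Lines without that field produce no output.
gpu_load() {
  sed -n 's/.*GR3D_FREQ \([0-9][0-9]*\)%.*/\1/p'
}

# Usage on the device (needs tegrastats, so run it only there):
#   sudo tegrastats | gpu_load
```

A value that stays well above 0 while trtexec is running suggests the build is still progressing on the GPU, even when top reports 0% CPU.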

Thanks.

936214531 · July 27, 2021, 8:22am

--verbose prints the detailed log to your terminal. And by the way, the trtexec build is actually very slow, as you have shown.
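
As a rough sketch of that suggestion, the command from the first post can be rerun with `--verbose` added and the output saved for later inspection. The log file name `trtexec_build.log` is only an illustrative choice, and the guard assumes trtexec is on the PATH as in the original post.

```shell
# Same build as in the original post, with --verbose added.
build_cmd="trtexec --onnx=yolov4_-1_3_416_416_dynamic.onnx \
--minShapes=input:1x3x416x416 --optShapes=input:8x3x416x416 \
--maxShapes=input:8x3x416x416 --workspace=2048 \
--saveEngine=yolov4-uniform-dynamic-max8.engine --fp16 --verbose"

# Run only where trtexec is installed; keep a copy of the log
# (trtexec_build.log is a hypothetical file name).
if command -v trtexec >/dev/null 2>&1; then
  $build_cmd 2>&1 | tee trtexec_build.log
fi
```

With verbose logging, new lines keep appearing while the builder is timing tactics, which makes it easier to tell a slow build from a hung one.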
