Pruning an .onnx model and converting it to .engine

I already have an .onnx model exported from yolov5_6.1. It was customized from the pretrained 5m model: I added a CABlock and used GhostConv instead of Conv.
Now on the NX, the inference time is 200ms and NMS takes 6.4ms. My goal is to reduce the total time to below 10-12ms.
What do I need to do to prune it and convert it to TensorRT on the Jetson NX?
Jetson NX, JP4.5


Is retraining an option for you?
If so, you can try our TAO Toolkit to get a pruned model.

Please note that you can maximize the device performance with the following commands:

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

Then you can convert the model into TensorRT with our trtexec tool directly.

$ /usr/src/tensorrt/bin/trtexec --onnx=[your/model]

For better performance, you can also try converting it into an fp16 (--fp16) or int8 (--int8) model.
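
As a rough illustration of the conversion step above, here is a minimal Python sketch that assembles the trtexec command line for a chosen precision and saves the resulting engine. The function name `build_trtexec_cmd`, its parameters, and the file names `best.onnx` / `best.engine` are hypothetical and only for illustration; `--onnx`, `--saveEngine`, `--fp16`, `--int8`, and `--workspace` are trtexec options. Note that, as far as I know, `--int8` without calibration data is mainly useful for timing, not for checking accuracy.

```python
import shlex

# trtexec location as given earlier in this thread.
TRTEXEC = "/usr/src/tensorrt/bin/trtexec"


def build_trtexec_cmd(onnx_path, engine_path, precision="fp16", workspace_mb=None):
    """Return the trtexec argument list that builds and saves an engine.

    This helper is a hypothetical convenience wrapper, not part of TensorRT.
    """
    if precision not in ("fp32", "fp16", "int8"):
        raise ValueError(f"unsupported precision: {precision}")
    cmd = [TRTEXEC, f"--onnx={onnx_path}", f"--saveEngine={engine_path}"]
    if precision != "fp32":
        # FP32 is the default, so only reduced precisions need a flag.
        cmd.append(f"--{precision}")
    if workspace_mb is not None:
        # Builder workspace size in MB (the log in this thread shows 16 MB by default).
        cmd.append(f"--workspace={workspace_mb}")
    return cmd


if __name__ == "__main__":
    cmd = build_trtexec_cmd("best.onnx", "best.engine", precision="fp16")
    # Print the command; on the Jetson it could be run with
    # subprocess.run(cmd, check=True).
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the argument list rather than a single string keeps it safe to pass straight to `subprocess.run` without shell quoting issues.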

On Jetson NX, JetPack 4.5:
&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=best.onnx
[05/06/2022-08:10:29] [I] === Model Options ===
[05/06/2022-08:10:29] [I] Format: ONNX
[05/06/2022-08:10:29] [I] Model: best.onnx
[05/06/2022-08:10:29] [I] Output:
[05/06/2022-08:10:29] [I] === Build Options ===
[05/06/2022-08:10:29] [I] Max batch: 1
[05/06/2022-08:10:29] [I] Workspace: 16 MB
[05/06/2022-08:10:29] [I] minTiming: 1
[05/06/2022-08:10:29] [I] avgTiming: 8
[05/06/2022-08:10:29] [I] Precision: FP32
[05/06/2022-08:10:29] [I] Calibration:
[05/06/2022-08:10:29] [I] Safe mode: Disabled
[05/06/2022-08:10:29] [I] Save engine:
[05/06/2022-08:10:29] [I] Load engine:
[05/06/2022-08:10:29] [I] Builder Cache: Enabled
[05/06/2022-08:10:29] [I] NVTX verbosity: 0
[05/06/2022-08:10:29] [I] Inputs format: fp32:CHW
[05/06/2022-08:10:29] [I] Outputs format: fp32:CHW
[05/06/2022-08:10:29] [I] Input build shapes: model
[05/06/2022-08:10:29] [I] Input calibration shapes: model
[05/06/2022-08:10:29] [I] === System Options ===
[05/06/2022-08:10:29] [I] Device: 0
[05/06/2022-08:10:29] [I] DLACore:
[05/06/2022-08:10:29] [I] Plugins:
[05/06/2022-08:10:29] [I] === Inference Options ===
[05/06/2022-08:10:29] [I] Batch: 1

[05/06/2022-08:10:31] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
terminate called after throwing an instance of 'std::out_of_range'
  what():  Attribute not found: axes
Aborted (core dumped)

Could you please tell me how to solve this?


Attribute not found: axes

It looks like the error is caused by an unsupported layer definition.
Could you try JetPack 4.6.1 (TensorRT 8.2) to see if it works?
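
To narrow down which layer is involved, one thing worth checking is the ONNX opset of the exported model. This is an assumption, not something confirmed in this thread: from opset 13 onward, `Squeeze`, `Unsqueeze`, and `ReduceSum` take `axes` as an input instead of an attribute, so an older parser that looks for the attribute may fail with a message like the one above. A minimal sketch, assuming the `onnx` Python package is installed; the helper names `nodes_missing_axes` and `main` are hypothetical:

```python
import sys

# Ops whose `axes` moved from an attribute to an input in ONNX opset 13.
AXES_AS_INPUT_SINCE_13 = {"Squeeze", "Unsqueeze", "ReduceSum"}


def nodes_missing_axes(nodes, op_types=AXES_AS_INPUT_SINCE_13):
    """Return (op_type, name) for nodes of the given types without an `axes` attribute."""
    missing = []
    for node in nodes:
        if node.op_type not in op_types:
            continue
        if not any(attr.name == "axes" for attr in node.attribute):
            missing.append((node.op_type, node.name))
    return missing


def main(path):
    import onnx  # imported here so the helper above works without the package

    model = onnx.load(path)
    # The default domain is stored as an empty string.
    for opset in model.opset_import:
        print(f"opset domain={opset.domain or 'ai.onnx'} version={opset.version}")
    for op_type, name in nodes_missing_axes(model.graph.node):
        print(f"{op_type} node {name!r} has no 'axes' attribute")


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])  # e.g. python check_axes.py best.onnx
```

A listed node is only a hint, since `axes` is optional for `Squeeze` in older opsets too. If the reported opset is 13 or higher, re-exporting with a lower opset (YOLOv5's export.py has an `--opset` option) may be an alternative to upgrading, but I have not verified that for this model.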


