Converting a custom yolo_model.onnx to int8 engine

Hardware Platform (Jetson / GPU): Orin Nano
DeepStream Version: 6.3
JetPack Version (valid for Jetson only): 5.1.2-b104
TensorRT Version: 8.5.2-1+cuda11.4
Issue Type: Question

I have a working yolo_v4_tiny ONNX model. When DeepStream runs, it converts the model to an FP16 engine, but this pushes the 6 GB RAM limit of the Jetson Orin Nano and the pipeline slows down or crashes.

I would like to create an INT8 engine from model.onnx. What are the steps for the easiest way to do this? Best regards

You can use the command below to convert the ONNX model to an engine file. You need the calibration cache file generated during the calibration process:

```
trtexec --onnx=<model.onnx> --saveEngine=<model_int8.engine> --int8 --calib=<calibration.cache>
```

Thank you. I will run it on nvcr.io/nvidia/tensorrt:24.01-py3-igpu. How can I create the calibration.cache? Many thanks :)

DeepStream is only an inference framework. There is no calibration function in DeepStream.

There is an INT8 calibration API in TensorRT. Please see the Developer Guide :: NVIDIA Deep Learning TensorRT Documentation and scroll to chapter 7.3.
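To give an idea of what that chapter covers, below is a minimal Python sketch of an IInt8EntropyCalibrator2 that produces calibration.cache while building an INT8 engine. The calib_images folder, the 3x416x416 input shape, the batch size of 1, and the simple resize/normalize preprocessing are all assumptions; use your own representative images and the same preprocessing your model expects.

```
import os

import numpy as np
import pycuda.driver as cuda
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
import tensorrt as trt


class YoloCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds preprocessed images to TensorRT and writes calibration.cache."""

    def __init__(self, image_dir, cache_file="calibration.cache", input_shape=(3, 416, 416)):
        trt.IInt8EntropyCalibrator2.__init__(self)
        self.cache_file = cache_file
        self.input_shape = input_shape
        self.files = sorted(os.path.join(image_dir, f) for f in os.listdir(image_dir))
        self.index = 0
        # One device buffer for a single image (batch size 1 assumed)
        self.device_input = cuda.mem_alloc(int(np.prod(input_shape)) * np.dtype(np.float32).itemsize)

    def get_batch_size(self):
        return 1

    def get_batch(self, names):
        if self.index >= len(self.files):
            return None  # no more data -> calibration finishes
        img = self._load(self.files[self.index])
        self.index += 1
        cuda.memcpy_htod(self.device_input, np.ascontiguousarray(img, dtype=np.float32))
        return [int(self.device_input)]

    def _load(self, path):
        # Placeholder preprocessing: replace with the same resize/normalization
        # that your model was trained with and that DeepStream is configured for.
        import cv2
        c, h, w = self.input_shape
        img = cv2.resize(cv2.imread(path), (w, h)).astype(np.float32) / 255.0
        return img.transpose(2, 0, 1)

    def read_calibration_cache(self):
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()
        return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)


# Build once with INT8 enabled; TensorRT calls the calibrator and writes the cache.
logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:
    parser.parse(f.read())
config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)
config.int8_calibrator = YoloCalibrator("calib_images")
serialized = builder.build_serialized_network(network, config)
with open("model_int8.engine", "wb") as f:
    f.write(serialized)
```

Once calibration.cache exists, you can pass it to the trtexec command above. Note that a TensorRT engine is tied to the GPU and TensorRT version it was built with, so the final engine for DeepStream 6.3 (TensorRT 8.5.2) should be built on the Orin Nano itself; the calibration cache can generally be reused across builds.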

For more on TensorRT calibration API usage, please refer to the TensorRT forum: Latest Deep Learning (Training & Inference)/TensorRT topics - NVIDIA Developer Forums