TX2 NX ONNX Convert TensorRT Engine

When I convert my ONNX models to TensorRT engines, the resulting engine file is larger than the original model.

My object detection model: 22 MB (ONNX) => 30 MB (TensorRT engine)
My pose estimation model: 83 MB (ONNX) => 125 MB (TensorRT engine)

When I use a Xavier NX, the engine is always smaller than the original model.
I also tested a Nano, and its TensorRT engine is likewise larger than the original model.

I found the reason: on the Nano and TX2 NX modules, the conversion defaults to FP32, so I must explicitly request FP16 with the following command.

/usr/src/tensorrt/bin/trtexec --onnx=pose_estimation.onnx --fp16 --saveEngine=pose_estimation.onnx_b1_gpu0_fp16.engine
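When converting several models at different precisions, the trtexec invocation above can be assembled in a script. A minimal pure-Python sketch; the trtexec path and the engine naming scheme follow the command above, while the batch/GPU fields in the name are assumptions:

```python
# Sketch: build a trtexec command line for a given ONNX model and precision.
# The binary path and engine-name pattern mirror the command shown above;
# batch size and GPU index in the name are assumptions for illustration.
TRTEXEC = "/usr/src/tensorrt/bin/trtexec"

def trtexec_cmd(onnx_path, precision="fp16", batch=1, gpu=0):
    """Return the argv list for converting onnx_path at the given precision."""
    if precision not in ("fp32", "fp16", "int8"):
        raise ValueError("precision must be fp32, fp16, or int8")
    engine = f"{onnx_path}_b{batch}_gpu{gpu}_{precision}.engine"
    cmd = [TRTEXEC, f"--onnx={onnx_path}", f"--saveEngine={engine}"]
    if precision != "fp32":
        # FP32 is trtexec's default, so no flag is needed for it.
        cmd.append(f"--{precision}")
    return cmd

print(" ".join(trtexec_cmd("pose_estimation.onnx")))
```

Passing the returned list to subprocess.run would perform the conversion on the device.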

By the way, when I used trtexec to compare engine sizes, why is the INT8 engine larger than the FP16 engine? Is it because the TX2 NX and Nano don't support INT8?


The engine size depends on the GPU architecture.

During the build, TensorRT selects the fastest kernels available for that architecture.
So the serialized engine file will vary from platform to platform.
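On the INT8 question specifically: the Nano (Maxwell) and TX2 NX (Pascal) GPUs have no fast INT8 kernels, so requesting --int8 there does not give the expected size reduction; TensorRT falls back to higher-precision kernels. A sketch of this behavior with a hard-coded capability table; the values come from public Jetson specs, the helper names are mine, and real code would query builder.platform_has_fast_fp16 / platform_has_fast_int8 from the TensorRT Python API instead:

```python
# Sketch: which Jetson modules have fast FP16 / INT8 kernels.
# Hard-coded here so the example runs without TensorRT installed;
# on-device, query builder.platform_has_fast_int8 etc. instead.
JETSON_CAPS = {
    # module: (compute capability, fast fp16, fast int8)
    "nano":      ("5.3", True, False),  # Maxwell
    "tx2_nx":    ("6.2", True, False),  # Pascal
    "xavier_nx": ("7.2", True, True),   # Volta, with Tensor Cores
}

def effective_precision(module, requested):
    """Precision TensorRT can actually use on the module (simplified)."""
    _, fast_fp16, fast_int8 = JETSON_CAPS[module]
    if requested == "int8" and not fast_int8:
        # No fast INT8 path: TensorRT falls back, so the engine
        # does not shrink the way an INT8 build normally would.
        return "fp16" if fast_fp16 else "fp32"
    if requested == "fp16" and not fast_fp16:
        return "fp32"
    return requested

print(effective_precision("tx2_nx", "int8"))    # falls back
print(effective_precision("xavier_nx", "int8")) # native INT8
```

This matches the behavior reported above: FP16 helps on all three modules, but INT8 only pays off on the Xavier NX.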