Why is the model engine file so much larger with TensorRT 7.1 compared to 5.1 or 6.0?


I’m using Jetpack on the Jetson Nano to run a TensorRT version of YOLO. I’m finding that the serialised model engine is much larger with TensorRT 7.1 (Jetpack 4.4 DP) compared to TensorRT 6.0 (Jetpack 4.3) and TensorRT 5.1 (Jetpack 4.2.1), even though the model is the same in each case.

The sizes are as follows:
TensorRT 5.1: 184 MB
TensorRT 6.0: 184 MB
TensorRT 7.1: 302 MB

I initially thought that perhaps a kFLOAT model was being created instead of the kHALF model that I wanted. However, when I generated a kFLOAT model using TensorRT 7.1 it was 600 MB, so that doesn’t seem to be the cause of what I’m seeing. The performance of the three models is similar in the few tests that I have run.
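For reference, here is a minimal sketch (not my exact code) of how I request the kHALF engine in TensorRT 7 via `IBuilderConfig` before serialising it. The `builder` and `network` are assumed to be already created and populated elsewhere (in my case, from the YOLO config parser), and the function name is just for illustration:

```cpp
#include <NvInfer.h>
#include <fstream>

// Sketch: build and serialise an FP16 engine with TensorRT 7's
// buildEngineWithConfig(). Assumes `builder` and `network` have
// already been created and populated.
void buildAndSaveFp16Engine(nvinfer1::IBuilder* builder,
                            nvinfer1::INetworkDefinition* network,
                            const char* enginePath)
{
    nvinfer1::IBuilderConfig* config = builder->createBuilderConfig();
    config->setMaxWorkspaceSize(1 << 28);           // 256 MB workspace
    config->setFlag(nvinfer1::BuilderFlag::kFP16);  // request kHALF precision

    nvinfer1::ICudaEngine* engine =
        builder->buildEngineWithConfig(*network, *config);

    // Serialise the plan; the size of this file is what differs
    // so much across TensorRT versions.
    nvinfer1::IHostMemory* plan = engine->serialize();
    std::ofstream out(enginePath, std::ios::binary);
    out.write(static_cast<const char*>(plan->data()), plan->size());

    plan->destroy();
    engine->destroy();
    config->destroy();
}
```

(This obviously needs the TensorRT runtime and a Jetson/GPU to actually run; I include it only to show how the precision flag is set.)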

I’d appreciate it if anyone could shed light on this.

I’m no expert, but if it’s not the precision, then it must be that the weights have changed.


The TensorRT engine will differ depending on the chosen inference algorithms.
Since we introduced many new accelerations in the new release, the file size will be different.

Would you mind sharing the model and the command to reproduce this issue?
We want to feed this issue back to our internal team first.


Thanks @AastaLLL. The command used to build the model is IBuilder::buildEngineWithConfig().

Is it the serialised plan file that you want, or the code that was used to generate the model? The code is based on NVIDIA sample code from a trt-yolo sample app that came with the DeepStream 3 repository. I don’t think it’s available online any more. It builds up the YOLO model from scratch based on a YOLO config file. I have made some changes from the original sample, though.

Thanks for the information.
We will try it and share more information with you later.


OK - thank you. If it would be helpful I could probably condense my code into a sample app that would build the model under the different TensorRT versions for comparison.