Strange changes in file size when deployed with tensorrt

648976749 · July 13, 2022, 6:41am

Description

When I used TF-TRT to deploy a TensorFlow model at FP32, FP16, and INT8 precision, the resulting file sizes varied greatly and were all much larger than the original TensorFlow model. The INT8 deployment produced the largest file. In theory, lower precision stores each weight in fewer bits, so the file should get smaller, not larger. I have two questions.

Why is the deployed model much larger than the original TensorFlow model?
Why does the file get larger as the deployment precision gets lower?

Environment

TensorRT Version: 6.0.1.5
GPU Type: Titan Xp
Nvidia Driver Version: 440.36
CUDA Version: 10.1
CUDNN Version: 7.6.5.32
Operating System + Version: Ubuntu18.04
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable): 2.3.0

Relevant Files

tensorflow model: 45M
FP32-deploymodel: 87M
FP16-deploymodel: 87M
INT8-deploymodel: 121M
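
To see where the extra size sits, it can help to break each SavedModel directory down by its top-level entries (a standard SavedModel contains saved_model.pb and a variables/ directory, and may contain assets/). The following is a minimal sketch using only the Python standard library; the function names are illustrative and not part of TF-TRT, and the directories to inspect are passed on the command line.

```python
import os
import sys


def size_by_top_level_entry(model_dir):
    """Return {top-level entry name: total size in bytes} for model_dir."""
    totals = {}
    for entry in os.scandir(model_dir):
        if entry.is_file():
            totals[entry.name] = entry.stat().st_size
        elif entry.is_dir():
            # Sum every file below this sub-directory (e.g. variables/, assets/).
            total = 0
            for root, _dirs, files in os.walk(entry.path):
                for name in files:
                    total += os.path.getsize(os.path.join(root, name))
            totals[entry.name] = total
    return totals


def print_report(model_dir):
    """Print one line per top-level entry, largest first, in MiB."""
    totals = size_by_top_level_entry(model_dir)
    print(model_dir)
    for name, size in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"  {name:<20} {size / 2**20:8.1f} MiB")


if __name__ == "__main__":
    # Example: python sizes.py <original_model_dir> <converted_model_dir> ...
    for path in sys.argv[1:]:
        print_report(path)
```

Comparing the report for the original model with the reports for the converted models shows whether the growth is in saved_model.pb, in variables/, or in assets/.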

spolisetty · July 13, 2022, 7:36am

Hi,

It looks like you’re using a very old version of TensorRT. We recommend trying the latest TensorRT version; please let us know if you still face this issue.

Thank you.

648976749 · July 13, 2022, 8:22am

I installed JetPack 5.0 DP on a Jetson AGX Xavier. Its environment is:
TensorRT Version : 8.4.0.11
CUDA Version : 11.4.4
CUDNN Version : 8.3.2
Operating System + Version : Ubuntu20.04
Python Version (if applicable) : 3.8.10
TensorFlow Version (if applicable) : 2.8.0

After running the same TF-TRT conversion experiment, the resulting files still show both of the problems above. The file sizes after deployment are:
tensorflow model: 45M
FP32-deploymodel: 87M
FP16-deploymodel: 87M
INT8-deploymodel: 99M

Could you please tell me why this happens?

spolisetty · July 13, 2022, 5:13pm

Hi,

We will get back to you on your queries. Could you please share the repro script and model with us, either here or via DM, for better debugging?

Thank you.

648976749 · July 14, 2022, 8:13am

Thank you for your reply. The TensorFlow model and the deployed models for the ResNet18 classification model I used are attached below.
tf_model.zip (39.7 MB)
tf_model_FP32.zip (79.2 MB)
tf_model_FP16.zip (79.2 MB)
tf_model_INT8.zip (89.1 MB)

The TF-TRT deployment code and the demo image files I used are attached below:
tf-trt.zip (21.3 KB)

spolisetty · July 15, 2022, 12:42pm

Hi,

Currently, both the TF and TRT portions of the network are included in the saved model. That is why the model size is expected to be larger.

Even if INT8 precision is enabled, there is no guarantee that it will be used (TRT is allowed to use FP16/FP32 if that is faster), so getting the same engine size in all cases is possible. INT8 models also save the calibration table, which could further increase the size. We are verifying this to confirm.

Thank you.

648976749 · July 18, 2022, 2:11am

Thank you very much for your reply. I have a general understanding of the reason. If there is a relevant detailed description published, please inform me.

spolisetty · July 19, 2022, 5:16am

@648976749,

Thank you so much for bringing this issue to our attention.
Actually, the calibration table is lightweight. There is some other reason in TF-TRT for the large saved model size.
We are tracking this issue internally and have also created an issue in the TensorFlow GitHub repository: "Variables saved in converted model", Issue #305, tensorflow/tensorrt.

Note that for the FP32 and FP16 conversions, the model was not built (which means the TRT engines were not saved to disk; they are created on the fly). If we call converter.build(input_fn) (where input_fn can be the same function that was used for calibration) before converter.save(), then we should see FP32 model size >= FP16 model size >= INT8 model size.
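
The order of calls described above (convert, then build, then save) could look roughly like the sketch below. It is not the original poster's script: the helper names, the random input data, and the (1, 224, 224, 3) input shape are assumptions for illustration, and real INT8 calibration should use representative images. TensorFlow is imported lazily inside the functions, so the file can be loaded on a machine without TF-TRT.

```python
def needs_calibration(precision_mode):
    """INT8 is the only TF-TRT precision mode that takes calibration data."""
    return precision_mode.upper() == "INT8"


def make_input_fn(shape=(1, 224, 224, 3), num_batches=8):
    """Return a generator function yielding random batches (assumed shape)."""
    def input_fn():
        import tensorflow as tf  # lazy import: needs a TensorFlow install
        for _ in range(num_batches):
            # TF-TRT expects each item to be a tuple/list of input tensors.
            yield (tf.random.uniform(shape),)
    return input_fn


def convert_build_save(saved_model_dir, output_dir, precision_mode, input_fn):
    """Convert a SavedModel with TF-TRT and build the engines before saving."""
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    params = trt.TrtConversionParams(
        precision_mode=precision_mode,
        use_calibration=needs_calibration(precision_mode),
    )
    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir=saved_model_dir,
        conversion_params=params,
    )
    if needs_calibration(precision_mode):
        converter.convert(calibration_input_fn=input_fn)
    else:
        converter.convert()
    # Build the TRT engines now so that save() writes them to disk
    # instead of leaving them to be created on the fly at load time.
    converter.build(input_fn=input_fn)
    converter.save(output_dir)


# Hypothetical usage (directory names are placeholders):
# for mode in ("FP32", "FP16", "INT8"):
#     convert_build_save("tf_model", "tf_model_" + mode, mode, make_input_fn())
```

Running the three conversions this way and comparing the output directories is one way to check the expected ordering of sizes on a given setup.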

Thank you.
