Why is the calibrated INT8 .pb file so large?

Hi,

I downloaded the calibration sample from here:

https://developer.download.nvidia.com/devblogs/tftrt_sample.tar.xz?spm=a2c4e.11153940.blogcont579985.9.2c9030d0Z0Lock&file=tftrt_sample.tar.xz

I ran the .sh script in the sample, but the generated resnetV150_TRTINT8.pb and resnetV150_TRTINT8Calib.pb are both 205.1 MB. That seems unreasonable, since the frozen .pb file is only 102.6 MB. Has anyone else run into the same issue?
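
For context, the sample's INT8 path boils down to roughly the following sketch of the TF 1.x contrib API; the frozen-graph filename, output node name, and batch/workspace sizes below are placeholders, not values taken from the sample script:

```python
# Minimal sketch of the TF-TRT INT8 workflow with the TF 1.12 contrib API.
# Filenames, the output node name, and the sizes are placeholders.
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

frozen_graph = tf.GraphDef()
with open("resnetV150_frozen.pb", "rb") as f:
    frozen_graph.ParseFromString(f.read())

# Step 1: build a calibration graph (still FP32, instrumented for calibration).
calib_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=["logits"],
    max_batch_size=8,
    max_workspace_size_bytes=1 << 30,
    precision_mode="INT8")

# Step 2: run representative batches through calib_graph in a tf.Session
# to collect activation ranges (omitted here).

# Step 3: convert the calibrated graph into the final INT8 inference graph.
int8_graph = trt.calib_graph_to_infer_graph(calib_graph)
with open("resnetV150_TRTINT8.pb", "wb") as f:
    f.write(int8_graph.SerializeToString())
```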

My system environment is as below:
OS: Ubuntu 16.04
CUDA: 9.0
cuDNN: 7.0.5
Python: 3.5.2
TensorFlow: 1.12.0
TensorRT: 4.0.1.6
GPU: K40

I have two questions about this issue:

  1. What could be the reason for the unexpectedly large calibrated model files?
  2. I want to inspect the calibrated weight values in the .pb file. The network parameters appear to be packed into the TRTEngineOp. How can I export the calibrated INT8 weights from it?

Hello,

Per engineering:

  • Regarding the size: yes, the size is expected. The file holds one copy of the weights for TensorRT and one copy for native TensorFlow execution.

  • Regarding the INT8 weights: does TF-TRT write the new .pb file with a TRTEngineOp? If that is the case, it should store INT8 weights.

I am not aware of any method for extracting the INT8 weights from the engine.

Note: at the network-level API, TensorRT consumes weights in FP32 precision. Once the engine is created, the weights are quantized to INT8 and stored.
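
If it helps, one way to see where the bytes live is to walk the GraphDef: the serialized TensorRT engine sits in an attribute of each TRTEngineOp node, while the native TensorFlow fallback segment sits in the graph's function library. A sketch follows; the attribute names are taken from the TF 1.x TRTEngineOp definition, so treat them as assumptions and verify them against your exact TF build:

```python
# Sketch: locate the two weight copies inside a TF-TRT .pb file.
# Attribute names ("serialized_segment", "segment_funcdef_name") follow
# the TF 1.x TRTEngineOp definition; verify them for your TF version.
import tensorflow as tf

graph_def = tf.GraphDef()
with open("resnetV150_TRTINT8.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

engine_bytes = 0
for node in graph_def.node:
    if node.op == "TRTEngineOp":
        # The serialized TensorRT engine (with INT8 weights) lives here.
        engine_bytes += len(node.attr["serialized_segment"].s)
        print(node.name, "falls back to function:",
              node.attr["segment_funcdef_name"].s.decode())

print("bytes in serialized TRT engines:", engine_bytes)
# The FP32 fallback copy lives in the graph's function library,
# which is why the converted file is roughly twice the frozen size.
print("functions in library:", len(graph_def.library.function))
```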

Hello,

Thanks for your kind reply. Your answer helped me a lot!

TF-TRT did write a new .pb file with a TRTEngineOp in my test. It runs much faster than the FP32 version, which indicates that the INT8 calibration took effect.
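
For anyone who wants to reproduce the comparison, a rough timing loop like this is enough; the input/output tensor names and the input shape are placeholders for whatever your graph actually uses:

```python
# Rough latency check for a frozen .pb. Tensor names and the input
# shape are placeholders -- substitute the ones from your graph.
import time
import numpy as np
import tensorflow as tf

def average_latency(pb_path, n_runs=50):
    graph_def = tf.GraphDef()
    with open(pb_path, "rb") as f:
        graph_def.ParseFromString(f.read())
    with tf.Graph().as_default() as g:
        tf.import_graph_def(graph_def, name="")
        inp = g.get_tensor_by_name("input:0")    # placeholder name
        out = g.get_tensor_by_name("logits:0")   # placeholder name
        batch = np.random.rand(8, 224, 224, 3).astype(np.float32)
        with tf.Session(graph=g) as sess:
            sess.run(out, {inp: batch})  # warm-up run
            start = time.time()
            for _ in range(n_runs):
                sess.run(out, {inp: batch})
            return (time.time() - start) / n_runs

print("FP32:", average_latency("resnetV150_frozen.pb"))
print("INT8:", average_latency("resnetV150_TRTINT8.pb"))
```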

Hello,

I just read this post, and I want to ask where I can find other TensorRT examples. I can download this example, but I can't access the website that hosts it!

Thanks for the help!