Why is the calibrated INT8 .pb file so large?


I downloaded the calibration sample from here:


I ran the .sh script from the sample, but the generated resnetV150_TRTINT8.pb and resnetV150_TRTINT8Calib.pb are both 205.1 MB. That seems unreasonable, since the frozen .pb file is only 102.6 MB. Has anyone else run into the same issue?

My system environment is as below:
OS: Ubuntu 16.04
CUDA: 9.0
cuDNN: 7.0.5
Python: 3.5.2
TensorFlow: 1.12.0
GPU: K40

I have two questions about this issue:

  1. What could explain the unexpected size of the calibrated model file?
  2. I want to inspect the calibrated weight values in the .pb file. The network parameters appear to be packed into the TRTEngineOp node. How can I export the calibrated INT8 weights from it?
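For question 2, a first step is simply confirming that the converted graph actually contains engine nodes. Below is a dependency-free sketch that scans the serialized GraphDef bytes; it assumes the TF-TRT op type string is `TRTEngineOp` (as in TF 1.x), and a proper inspection would instead parse the file with `tf.GraphDef().ParseFromString`:

```python
def summarize_frozen_graph(path, op_marker=b"TRTEngineOp"):
    """Report the .pb size and a rough count of embedded engine ops.

    This is a crude byte-level scan of the serialized GraphDef; it
    avoids importing TensorFlow, but may over-count if the marker
    string also happens to appear in node names.
    """
    with open(path, "rb") as f:
        data = f.read()
    return {
        "size_mb": round(len(data) / 2**20, 1),
        "engine_op_hits": data.count(op_marker),
    }
```

Even after locating an engine node this way, its attributes (the serialized-segment attribute in TF 1.x TF-TRT, if I recall the name correctly) hold an opaque TensorRT engine blob, so this only tells you the engine is there, not what the INT8 weights are.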


Per engineering:

  • Regarding the size: yes, the size is expected. The file holds one copy of the network for TRT and one copy for native TF execution.

  • Regarding the INT8 weights: does TF-TRT write the new .pb file with a TRTEngineOp? If that is the case, it should store the INT8 weights.
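The size point in the first bullet can be sanity-checked with the numbers reported in this thread: two copies of a 102.6 MB network land almost exactly on the observed 205.1 MB.

```python
frozen_fp32_mb = 102.6  # size of the original frozen FP32 graph

# The converted graph keeps the original TF subgraph (for falling back
# to native TF execution) alongside the data handed to TRT, i.e.
# roughly two copies of the network:
converted_mb = 2 * frozen_fp32_mb

print(converted_mb)  # ~205.2 MB, close to the observed 205.1 MB
```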

I am not aware of any method to extract the INT8 weights from the engine.

Note: TensorRT's network-level API consumes weights in FP32 precision. Once the engine is created, the weights are compressed to INT8 and stored.
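The compression in that note is essentially the storage-width change per weight. A rough sketch of the arithmetic (the parameter count is only an approximation for ResNet-50 V1.5, and real engine files also carry calibration scales and other metadata, so actual sizes differ):

```python
num_params = 25_500_000      # ~ResNet-50 V1.5 parameter count (approx.)

fp32_bytes = num_params * 4  # what the network-level API consumes
int8_bytes = num_params * 1  # per-weight storage once the engine is built

print(fp32_bytes / int8_bytes)  # 4.0x smaller weight payload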


Thanks for your kind reply. Your answer helped me a lot!

TF-TRT did write a new .pb file with a TRTEngineOp in my test. It runs much faster than the FP32 version, which indicates that the INT8 calibration took effect.


I just read this post, and I would like to ask where I can find other TensorRT examples.
I can only download this example, but I can't access the website that hosts it!

Thanks for the help!