Difference between TRT engine file size for FP16 and FP32

Linux version : Ubuntu 16.04 LTS
GPU type : GeForce GTX 1080
nvidia driver version : 396.44
CUDA version : 9.0
CUDNN version : 7.0.5
Python version [if using python] : 3.5.2
Tensorflow version : tensorflow-gpu 1.9
TensorRT version :

I have created TRT engines for a ResNet model using Python. There is not much difference between the fp16 and fp32 trt_engine file sizes.

Is this expected in TRT 5.0.2?

In TRT 3.0.2 I observed a large difference between the fp16 and fp32 trt_engine files (for ResNet, the fp16 file is almost half the size of the fp32 file).

Thanks in advance


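It’s hard to say whether this is “expected”. The size of a serialized engine depends on many variables: the builder times candidate kernels for each layer and keeps the fastest, and those timings vary between builds, GPUs, and TensorRT versions. Enabling FP16 mode only allows the builder to pick FP16 kernels; it does not force them, so an engine built with FP16 enabled can still contain layers (and their weights) in FP32.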
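
If it helps to put numbers on the comparison, a minimal sketch like the one below reports the on-disk sizes of the two serialized engines and their ratio. The function name and the file paths in the usage comment are placeholders for illustration, not part of the TensorRT API. As a rough reading, a ratio near 0.5 would suggest most weights were stored in FP16, while a ratio near 1.0 would suggest most layers stayed in FP32.

```python
import os


def engine_size_report(fp32_path, fp16_path):
    """Return (fp32_bytes, fp16_bytes, fp16/fp32 ratio) for two engine files.

    Both arguments are paths to serialized engine files already on disk.
    The names are hypothetical; use whatever paths your build script wrote.
    """
    fp32_bytes = os.path.getsize(fp32_path)
    fp16_bytes = os.path.getsize(fp16_path)
    if fp32_bytes == 0:
        raise ValueError("FP32 engine file is empty")
    return fp32_bytes, fp16_bytes, fp16_bytes / fp32_bytes


# Example usage (paths are placeholders):
# fp32_bytes, fp16_bytes, ratio = engine_size_report(
#     "resnet_fp32.engine", "resnet_fp16.engine")
# print("fp32: %d bytes, fp16: %d bytes, ratio: %.2f"
#       % (fp32_bytes, fp16_bytes, ratio))
```

Running this on engines built by both TensorRT versions on the same GPU would show whether the size gap really changed between releases or only between builds.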