TF-TRT vs TensorRT RAM usage

Hello all,

Here are some observations I have made while working with the TensorRT platform.

I was able to convert several Tensorflow models to both pure TensorRT and TF-TRT (Tensorflow-TensorRT).

I observed more than double the RAM usage at inference with the TF-TRT version of the same model.

E.g. ssd_inception_v2 from the model zoo needs around 500 MB to run inference in pure TensorRT form, while the TF-TRT version requires more than 1.2 GB.
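For reference, here is roughly how I measure the process RAM at inference time (an assumption on my part: resident set size read from /proc, so Linux only; the function name rss_mb is just my helper):

```python
def rss_mb():
    # Read the resident set size (VmRSS) of the current process
    # from /proc/self/status (Linux-specific).
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                # Value is reported in kB; convert to MB.
                return int(line.split()[1]) / 1024.0
    return 0.0
```

I call this right after a warm-up inference, when allocations have settled.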

I understand that this might be related to Tensorflow-internal overhead, but is such a large difference expected?

When running on embedded devices, RAM usage can be a critical performance metric, so I am looking forward to any feedback.


TF-TRT runs on top of Tensorflow, which by default grabs as much GPU memory as it can. Are you sure the entire 1.2 GB is actually in use, rather than just reserved by the allocator? What is the value of “per_process_gpu_memory_fraction” in both cases?
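To make the comparison fair, you can cap Tensorflow's allocator instead of letting it reserve the whole GPU. A minimal sketch with the TF 1.x session API (which the ssd_inception_v2-era TF-TRT workflow uses; the 0.4 fraction is an arbitrary example value, and the graph-loading step is elided):

```python
import tensorflow as tf  # TF 1.x API; under TF 2.x use tf.compat.v1

# Limit how much GPU memory the Tensorflow allocator may claim.
gpu_options = tf.GPUOptions(
    per_process_gpu_memory_fraction=0.4,  # at most 40% of GPU memory
    allow_growth=True,                    # allocate lazily, not all up front
)
config = tf.ConfigProto(gpu_options=gpu_options)

with tf.Session(config=config) as sess:
    # ... load the TF-TRT-converted graph and run inference here ...
    pass
```

With a cap like this, the memory you then observe is closer to what the model actually needs, rather than what the allocator chose to reserve.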