TensorRT not improving FPS on GTX 1080 Ti


I am trying to work with TensorRT and TensorFlow.
Pure TF with SSD Inception V2: around 12 fps
TF + TensorRT with SSD Inception V2: around 12 fps

My config is:
CUDA: V9.0.176
TensorFlow: 1.12 (tried with TF 1.9 and TF 1.8)
NVIDIA driver: 410.72
TensorRT: 4.1.2-1+cuda9.0
Ubuntu 16.04

The init of my TF graph is:

# Reconstructed to be runnable; the output node names below are the
# standard TF object detection API outputs (an assumption on my part).
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(MODEL['PATH_TO_CKPT'], 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
    trt_graph = trt.create_inference_graph(
        input_graph_def=od_graph_def,
        outputs=['num_detections', 'detection_boxes', 'detection_scores', 'detection_classes'],
        max_batch_size=1,
        max_workspace_size_bytes=1 << 25,
        precision_mode='FP32')
    tf_config = tf.ConfigProto()
    tf_config.gpu_options.allow_growth = True
    self.sess = tf.Session(config=tf_config)
    tf.import_graph_def(trt_graph, name='')

Note that I get this warning:

Engine buffer is full. buffer limit=1, current entries=300, requested batch=1917

When setting max_batch_size to 1917, I run out of memory.
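For context, a guess of my own (not from the logs): 1917 matches the total anchor-box count of SSD 300x300, so the converter may be treating the anchors dimension as the batch dimension. A quick check of that count, assuming the standard TF object detection API configuration (3 anchors per cell on the first feature map, 6 on the rest):

```python
# Guess: 1917 = total SSD 300x300 anchor boxes, assuming the usual
# feature-map sizes and anchors-per-cell of the TF object detection API.
feature_maps = [(19, 3), (10, 6), (5, 6), (3, 6), (2, 6), (1, 6)]  # (side, anchors per cell)
total_anchors = sum(side * side * anchors for side, anchors in feature_maps)
print(total_anchors)  # 1917
```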

Thank you,


What type of GPU are you using? Also, it'd help us if you have a small repro that demonstrates the perf comparison between TF and TRT.

Also, what happens if you lower the batch size? How's performance then?

I use a GTX 1080 Ti.
I created a small repo with a very simplified version of the code. You can set TRT_mode to true or false. Don't forget to set PATH_TO_CKPT and PATH_TO_LABELS too.
Link: https://github.com/jcRisch/TMP_NVIDIA/blob/master/DEMO_NVIDIA.py

Also, lowering the batch size doesn't change anything.
Maybe my install of TRT is wrong (though I got no error when using it); is there a way to check the installation?

Thank you,

Regarding TensorRT installation: I usually recommend trying the TensorRT container, which removes a lot of the dependency issues. The containers are available from NVIDIA GPU Cloud (NGC), and accounts are free: https://www.nvidia.com/en-us/gpu-cloud/
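Another quick way to check whether TF-TRT is actually doing anything (a sketch of my own, not an official diagnostic): count the TRTEngineOp nodes in the converted graph. Zero converted engines would mean the graph still runs pure TF, which would explain identical FPS with and without TRT:

```python
# Sketch: count TRTEngineOp nodes in a converted GraphDef.
# If this returns 0, TF-TRT converted nothing and FPS will match pure TF.
def count_trt_engine_ops(graph_def):
    return sum(1 for node in graph_def.node if node.op == "TRTEngineOp")

# Usage with the trt_graph returned by create_inference_graph:
# print(count_trt_engine_ops(trt_graph))
```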

Also, can you share the model files you are using?


NVIDIA Enterprise Support

I downloaded frozen_inference_graph.pb from: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md (ssd_inception_v2_coco)

mscoco_label_map.pbtxt can be found here: https://github.com/tensorflow/models/blob/master/research/object_detection/data/mscoco_label_map.pbtxt

Thank you,


While we triage this issue, can you try TRT 5 GA to see if you get a performance increase?


We are working on this; it has been addressed here: https://github.com/tensorflow/models/issues/5695

Per Engineering, you can run the SSD network in native TRT; we have a Python and a C++ sample demonstrating that. The Python sample also has the ability to compare with native TF.

See sampleUffSSD for C++ and the uff_ssd sample for Python.

I tried TRT 5 GA, but it doesn't work.
Error:
tensorflow.python.framework.errors_impl.NotFoundError: libnvinfer.so.4: cannot open shared object file: No such file or directory

I think TensorFlow is asking for TRT 4 but finding TRT 5:
me@me /opt/TensorRT-$ ls
libnvcaffe_parser.a libnvonnxparser_runtime.so.0
libnvcaffe_parser.so libnvonnxparser_runtime.so.0.1.0
libnvcaffe_parser.so.5 libnvonnxparser_runtime_static.a
libnvcaffe_parser.so.5.0.2 libnvonnxparser.so
libnvinfer_plugin.so libnvonnxparser.so.0
libnvinfer_plugin.so.5 libnvonnxparser.so.0.1.0
libnvinfer_plugin.so.5.0.2 libnvonnxparser_static.a
libnvinfer_plugin_static.a libnvparsers.so
libnvinfer.so libnvparsers.so.5
libnvinfer.so.5 libnvparsers.so.5.0.2
libnvinfer.so.5.0.2 libnvparsers_static.a
libnvinfer_static.a libprotobuf.a
libnvonnxparser_runtime.so libprotobuf-lite.a
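To double-check which libnvinfer major versions the dynamic loader can actually resolve (a quick diagnostic sketch of my own, not an official tool):

```python
import ctypes

# Probe which libnvinfer major versions the dynamic loader can find.
# A TF build linked against TRT 4 needs libnvinfer.so.4 at runtime,
# even if a TRT 5 installation is present on the machine.
def probe_libnvinfer(versions=("4", "5")):
    found = {}
    for ver in versions:
        try:
            ctypes.CDLL("libnvinfer.so." + ver)
            found[ver] = True
        except OSError:
            found[ver] = False
    return found

print(probe_libnvinfer())
```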

Also, when looking at tf/contrib/tensorrt (https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/tensorrt), we can see a doc for TRT 3. When looking at the official TF installation guide (https://www.tensorflow.org/install/gpu), TF asks for TRT 4.

Is there any compatibility table between TRT and TF versions?

Regarding your last comment: OK, I will wait for the next releases for SSD!

Thanks !

Hello, I think this is a TensorFlow-TensorRT integration issue, which has been addressed in a newer version of TensorFlow. The issue was addressed very recently (https://github.com/tensorflow/models/issues/5695), so I think you'll have to wait for something like TF 1.12 or later.