TensorRT not improving FPS on GTX 1080 Ti

Hello,

I am trying to work with TensorRT and TensorFlow.
Pure TF with SSD Inception V2: around 12 fps
TF + TensorRT with SSD Inception V2: around 12 fps

My config is:
CUDA: V9.0.176
TensorFlow: 1.12 (tried with TF 1.9 and TF 1.8)
NVIDIA driver: 410.72
TensorRT: 4.1.2-1+cuda9.0
Ubuntu 16.04

The init of my TF graph is (this runs inside my detector class, hence self.sess):

import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

# 'tensors' holds the output node names of the frozen SSD graph
# (the standard TF Object Detection API outputs):
tensors = ['detection_boxes', 'detection_scores',
           'detection_classes', 'num_detections']

detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(MODEL['PATH_TO_CKPT'], 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)

        trt_graph = trt.create_inference_graph(
            input_graph_def=od_graph_def,
            outputs=tensors,
            max_batch_size=300,
            minimum_segment_size=10,
            max_workspace_size_bytes=1 << 25,
            precision_mode='FP32',
            is_dynamic_op=True
        )
        tf_config = tf.ConfigProto()
        tf_config.gpu_options.allow_growth = True
        self.sess = tf.Session(config=tf_config)
        tf.import_graph_def(trt_graph, name='')
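
Inference then runs roughly like this (a sketch rather than my exact code: image_np is a single HxWx3 frame, and the ':0' tensor names are the standard Object Detection API ones):

# Look up the input and output tensors by name and run one frame.
image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
fetches = [detection_graph.get_tensor_by_name(name + ':0') for name in tensors]
boxes, scores, classes, num = self.sess.run(
    fetches, feed_dict={image_tensor: image_np[None, ...]})  # add batch dim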

Note that I get this warning:

Engine buffer is full. buffer limit=1, current entries=300, requested batch=1917

When setting max_batch_size to 1917, I run out of memory.
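
If I understand max_batch_size correctly (I may be wrong), it should match the batch dimension I actually feed at inference, i.e. 1 image here; the 1917 in the warning presumably comes from an SSD anchor-box dimension inside a converted segment rather than from the image batch. Something like this is what I would have expected to be right:

# Hypothetical retry: size the engine for the real per-run batch (1 image),
# keep small post-processing subgraphs in TF via a larger minimum_segment_size,
# and give TRT a bigger workspace than 32 MB.
trt_graph = trt.create_inference_graph(
    input_graph_def=od_graph_def,
    outputs=tensors,
    max_batch_size=1,
    minimum_segment_size=50,
    max_workspace_size_bytes=1 << 30,
    precision_mode='FP32',
    is_dynamic_op=True
)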

Thank you,

Hello,

What type of GPU are you using? Also, it'd help us if you have a small repro that demonstrates the perf comparison between TF and TRT.

Also, what happens if you lower the batch size? How's performance then?

Hello,
I use a GTX 1080 Ti.
I created a small repo with a very simplified version of the code. You can set TRT_mode to true or false. Don't forget to set PATH_TO_CKPT and PATH_TO_LABELS too.
Link : https://github.com/jcRisch/TMP_NVIDIA/blob/master/DEMO_NVIDIA.py

Also, lowering the batch size doesn't change anything.
Maybe my TensorRT install is wrong (though I got no error when using it); is there a way to check the installation?
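
The only sanity check I could think of (not sure it's the canonical one) is counting how many TRTEngineOp nodes end up in the converted graph:

# If the conversion worked, some subgraphs are replaced by TRTEngineOp nodes;
# zero here would mean nothing was actually converted to TensorRT.
trt_engine_ops = [n for n in trt_graph.node if n.op == 'TRTEngineOp']
print('TRTEngineOp nodes: %d' % len(trt_engine_ops))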

Thank you,

Regarding TensorRT installation: I usually recommend trying the TensorRT container, which removes a lot of the dependencies. The containers are available from NVIDIA GPU Cloud (NGC), and accounts are free: https://www.nvidia.com/en-us/gpu-cloud/
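
With nvidia-docker set up, pulling and running the TensorFlow container looks roughly like this (the tag below is only illustrative; pick a current one from NGC):

# Pull and start an NGC TensorFlow container with GPU support.
docker pull nvcr.io/nvidia/tensorflow:18.10-py3
docker run --runtime=nvidia -it --rm nvcr.io/nvidia/tensorflow:18.10-py3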

Also, can you share

frozen_inference_graph.pb
mscoco_label_map.pbtxt

thanks
NVIDIA Enterprise Support

I downloaded frozen_inference_graph.pb from : https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md (ssd_inception_v2_coco)

mscoco_label_map.pbtxt can be found here : https://github.com/tensorflow/models/blob/master/research/object_detection/data/mscoco_label_map.pbtxt

Thank you,
JC

Hello,

While we triage this issue, can you try TRT 5 GA to see if you see a performance increase?

Hello,

We are working on this; it has been addressed in "which network is suitalble for integrate tensorRT with tensorflow(TFTRT) optimiztion ?" · Issue #5695 · tensorflow/models · GitHub.

Per Engineering, you can run the SSD network in native TRT; we have a Python and a C++ sample demonstrating that. The Python sample also has the ability to compare with native TF.

See sampleUffSSD for C++ and the Python uff_ssd sample.
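
With a tar install, building and running the C++ sample goes roughly like this (paths assume TensorRT is unpacked under /opt/TensorRT-5.0.2.6; the Python sample lives under samples/python/uff_ssd):

# Build the sample in place; the Makefile drops the binary in the bin directory.
cd /opt/TensorRT-5.0.2.6/samples/sampleUffSSD
make
../../bin/sample_uff_ssd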

I tried TRT 5 GA, but it doesn't work.
Error:
tensorflow.python.framework.errors_impl.NotFoundError: libnvinfer.so.4: cannot open shared object file: No such file or directory

I think TensorFlow is asking for TRT 4 but only finding TRT 5:
me@me /opt/TensorRT-5.0.2.6/lib$ ls
libnvcaffe_parser.a libnvonnxparser_runtime.so.0
libnvcaffe_parser.so libnvonnxparser_runtime.so.0.1.0
libnvcaffe_parser.so.5 libnvonnxparser_runtime_static.a
libnvcaffe_parser.so.5.0.2 libnvonnxparser.so
libnvinfer_plugin.so libnvonnxparser.so.0
libnvinfer_plugin.so.5 libnvonnxparser.so.0.1.0
libnvinfer_plugin.so.5.0.2 libnvonnxparser_static.a
libnvinfer_plugin_static.a libnvparsers.so
libnvinfer.so libnvparsers.so.5
libnvinfer.so.5 libnvparsers.so.5.0.2
libnvinfer.so.5.0.2 libnvparsers_static.a
libnvinfer_static.a libprotobuf.a
libnvonnxparser_runtime.so libprotobuf-lite.a
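
To confirm which libnvinfer my TF wheel was linked against, I checked the TF-TRT op library (the exact path below is my guess for TF 1.12):

# Find the TensorFlow install directory:
python -c "import tensorflow as tf, os; print(os.path.dirname(tf.__file__))"
# Then inspect the TF-TRT op library's dependencies:
ldd <tf_dir>/contrib/tensorrt/python/ops/_trt_engine_op.so | grep libnvinfer

This is consistent with the libnvinfer.so.4 error above.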

Also, when looking at tf/contrib/tensorrt (https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/tensorrt), we can see a doc for TRT 3, while the official TF installation guide ("Install TensorFlow with pip") asks for TRT 4.

Is there any compatibility table between TRT and TF versions?

OK regarding your last comment, I will wait for the next releases for SSD!

Thanks !
JC

Hello, I think this is a TensorFlow-TensorRT integration issue, which has been addressed in a newer version of TensorFlow. The fix landed very recently ("which network is suitalble for integrate tensorRT with tensorflow(TFTRT) optimiztion ?" · Issue #5695 · tensorflow/models · GitHub), so I think you'll have to wait for TF 1.12 or later?