TensorRT not improving FPS on GTX 1080 Ti

Hello,

I am trying to work with TensorRT and TensorFlow.
Pure TF with SSD Inception V2: around 12 fps
TF + TensorRT with SSD Inception V2: around 12 fps

My config is:
CUDA: V9.0.176
TensorFlow: 1.12 (tried with TF 1.9 and TF 1.8)
NVIDIA driver: 410.72
TensorRT: 4.1.2-1+cuda9.0
Ubuntu 16.04

The init of my TF graph is (this runs inside my detector class, hence self.sess):

import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

# 'tensors' holds the output node names of the frozen SSD graph
# (the standard TF Object Detection API outputs):
tensors = ['detection_boxes', 'detection_scores',
           'detection_classes', 'num_detections']

detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(MODEL['PATH_TO_CKPT'], 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)

        trt_graph = trt.create_inference_graph(
            input_graph_def=od_graph_def,
            outputs=tensors,
            max_batch_size=300,
            minimum_segment_size=10,
            max_workspace_size_bytes=1 << 25,
            precision_mode='FP32',
            is_dynamic_op=True
        )
        tf_config = tf.ConfigProto()
        tf_config.gpu_options.allow_growth = True
        self.sess = tf.Session(config=tf_config)
        tf.import_graph_def(trt_graph, name='')
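
Inference then runs roughly like this (a sketch rather than my exact code: image_np is a single HxWx3 frame, and the ':0' tensor names are the standard Object Detection API ones):

# Look up the input and output tensors by name and run one frame.
image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
fetches = [detection_graph.get_tensor_by_name(name + ':0') for name in tensors]
boxes, scores, classes, num = self.sess.run(
    fetches, feed_dict={image_tensor: image_np[None, ...]})  # add batch dim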

Note that I get this warning:

Engine buffer is full. buffer limit=1, current entries=300, requested batch=1917

When setting max_batch_size to 1917, I run out of memory.
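
If I understand max_batch_size correctly (I may be wrong), it should match the batch dimension I actually feed at inference, i.e. 1 image here; the 1917 in the warning presumably comes from an SSD anchor-box dimension inside a converted segment rather than from the image batch. Something like this is what I would have expected to be right:

# Hypothetical retry: size the engine for the real per-run batch (1 image),
# keep small post-processing subgraphs in TF via a larger minimum_segment_size,
# and give TRT a bigger workspace than 32 MB.
trt_graph = trt.create_inference_graph(
    input_graph_def=od_graph_def,
    outputs=tensors,
    max_batch_size=1,
    minimum_segment_size=50,
    max_workspace_size_bytes=1 << 30,
    precision_mode='FP32',
    is_dynamic_op=True
)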

Thank you,

Hello,

What type of GPU are you using? Also, it'd help us if you have a small repro that demonstrates the perf comparison between TF and TRT.

Also, what happens if you lower the batch size? How's performance then?

Hello,
I use a GTX 1080 Ti.
I created a small repo with a very simplified version of the code. You can set TRT_mode to true or false. Don't forget to set PATH_TO_CKPT and PATH_TO_LABELS too.
Link : https://github.com/jcRisch/TMP_NVIDIA/blob/master/DEMO_NVIDIA.py

Also, lowering the batch size doesn't change anything.
Maybe my TensorRT install is wrong (though I got no error when using it); is there a way to check the installation?
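
The only sanity check I could think of (not sure it's the canonical one) is counting how many TRTEngineOp nodes end up in the converted graph:

# If the conversion worked, some subgraphs are replaced by TRTEngineOp nodes;
# zero here would mean nothing was actually converted to TensorRT.
trt_engine_ops = [n for n in trt_graph.node if n.op == 'TRTEngineOp']
print('TRTEngineOp nodes: %d' % len(trt_engine_ops))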

Thank you,

Regarding TensorRT installation: I usually recommend trying the TensorRT container, which removes a lot of the dependencies. The containers are available from NVIDIA GPU Cloud (NGC), and accounts are free: https://www.nvidia.com/en-us/gpu-cloud/
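
With nvidia-docker set up, pulling and running the TensorFlow container looks roughly like this (the tag below is only illustrative; pick a current one from NGC):

# Pull and start an NGC TensorFlow container with GPU support.
docker pull nvcr.io/nvidia/tensorflow:18.10-py3
docker run --runtime=nvidia -it --rm nvcr.io/nvidia/tensorflow:18.10-py3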

Also, can you share

frozen_inference_graph.pb
mscoco_label_map.pbtxt

thanks
NVIDIA Enterprise Support

I downloaded frozen_inference_graph.pb from : https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md (ssd_inception_v2_coco)

mscoco_label_map.pbtxt can be found here : https://github.com/tensorflow/models/blob/master/research/object_detection/data/mscoco_label_map.pbtxt

Thank you,
JC

Hello,

While we triage this issue, can you try TRT 5 GA to see if you see a performance increase?

Hello,

We are working on this; it has been addressed in "which network is suitalble for integrate tensorRT with tensorflow(TFTRT) optimiztion ?" · Issue #5695 · tensorflow/models · GitHub.

Per Engineering, you can run the SSD network in native TRT; we have a Python and a C++ sample demonstrating that. The Python sample also has the ability to compare with native TF.

See sampleUffSSD for C++ and the Python uff_ssd sample.
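
With a tar install, building and running the C++ sample goes roughly like this (paths assume TensorRT is unpacked under /opt/TensorRT-5.0.2.6; the Python sample lives under samples/python/uff_ssd):

# Build the sample in place; the Makefile drops the binary in the bin directory.
cd /opt/TensorRT-5.0.2.6/samples/sampleUffSSD
make
../../bin/sample_uff_ssd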

I tried TRT 5 GA, but it doesn't work.
Error:
tensorflow.python.framework.errors_impl.NotFoundError: libnvinfer.so.4: cannot open shared object file: No such file or directory

I think TensorFlow is asking for TRT 4 but only finding TRT 5:
me@me /opt/TensorRT-5.0.2.6/lib$ ls
libnvcaffe_parser.a libnvonnxparser_runtime.so.0
libnvcaffe_parser.so libnvonnxparser_runtime.so.0.1.0
libnvcaffe_parser.so.5 libnvonnxparser_runtime_static.a
libnvcaffe_parser.so.5.0.2 libnvonnxparser.so
libnvinfer_plugin.so libnvonnxparser.so.0
libnvinfer_plugin.so.5 libnvonnxparser.so.0.1.0
libnvinfer_plugin.so.5.0.2 libnvonnxparser_static.a
libnvinfer_plugin_static.a libnvparsers.so
libnvinfer.so libnvparsers.so.5
libnvinfer.so.5 libnvparsers.so.5.0.2
libnvinfer.so.5.0.2 libnvparsers_static.a
libnvinfer_static.a libprotobuf.a
libnvonnxparser_runtime.so libprotobuf-lite.a
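
To confirm which libnvinfer my TF wheel was linked against, I checked the TF-TRT op library (the exact path below is my guess for TF 1.12):

# Find the TensorFlow install directory:
python -c "import tensorflow as tf, os; print(os.path.dirname(tf.__file__))"
# Then inspect the TF-TRT op library's dependencies:
ldd <tf_dir>/contrib/tensorrt/python/ops/_trt_engine_op.so | grep libnvinfer

This is consistent with the libnvinfer.so.4 error above.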

Also, when looking at tf/contrib/tensorrt (https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/tensorrt), we can see a doc for TRT 3, while the official TF installation guide ("Install TensorFlow with pip") asks for TRT 4.

Is there any compatibility table between TRT and TF versions?

OK regarding your last comment, I will wait for the next releases for SSD!

Thanks !
JC

Hello, I think this is a TensorFlow-TensorRT integration issue, which has been addressed in a newer version of TensorFlow. The fix landed very recently ("which network is suitalble for integrate tensorRT with tensorflow(TFTRT) optimiztion ?" · Issue #5695 · tensorflow/models · GitHub), so I think you'll have to wait for TF 1.12 or later?