I am trying to deploy various TensorFlow models (Object Detection, DeepLab) with TensorFlow C++ on the Drive PX2. Inference when deployed with TensorFlow is much slower (almost 4x slower) than a similar setup on an x86 Linux system with a GTX 1060.
Running the TensorRT samples gives good results, so I assume there are some issues with the way TensorFlow is managing the GPU processes. Since we only have TensorRT 4 on the PX2, these models do not seem to be easily convertible to UFF for deployment with the TensorRT C++ API, if it is possible at all, which is why I am still trying to work with TensorFlow.
Yes, I have compiled TensorFlow according to that post.
That is not preferable but will be my last resort.
I have actually already done this. The difference in inference speed between optimized and non-optimized models is similar to that on an x86 Linux setup, so the issue likely lies with the layers in TensorFlow. It is strange that there are no problems with compiling and running TensorFlow, right up until the slow inference speed when actually deploying a model.
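For reference, the speed comparison above can be done along these lines. This is a minimal sketch, not the actual deployment code: the two `*_infer` functions are placeholder workloads standing in for session-run calls on the optimized and non-optimized models.

```python
import time

def benchmark(infer, n_warmup=10, n_runs=100):
    """Average wall-clock latency of one inference call, in milliseconds."""
    for _ in range(n_warmup):
        infer()  # warm-up: exclude one-off allocation/initialization costs
    start = time.perf_counter()
    for _ in range(n_runs):
        infer()
    return (time.perf_counter() - start) / n_runs * 1e3

# Placeholder workloads; in practice these would wrap sess.run(...) on the
# non-optimized and optimized graphs respectively (hypothetical stand-ins).
def baseline_infer():
    sum(i * i for i in range(20000))

def optimized_infer():
    sum(i * i for i in range(5000))

baseline = benchmark(baseline_infer)
optimized = benchmark(optimized_infer)
print(f"baseline: {baseline:.3f} ms, optimized: {optimized:.3f} ms, "
      f"speedup: {baseline / optimized:.1f}x")
```

Warm-up runs matter here: the first few TensorFlow session calls include graph initialization and GPU memory allocation, which would otherwise skew the average.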
Hopefully someone else has come across this issue and managed to discover the underlying cause. Thanks for your suggestions regardless.
Based on experiment no. 2, most of your layers may not be supported by TensorRT and instead fall back to the TensorFlow implementation.
Would you mind checking our support matrix for your model first:
Support Matrix :: NVIDIA Deep Learning TensorRT Documentation
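One way to check this is to diff the op types in the frozen graph against the support matrix. A minimal sketch, with two stated assumptions: `SUPPORTED_OPS` below is an illustrative subset, not the full matrix for any TensorRT version, and `graph_ops` stands in for op types extracted from a frozen GraphDef (e.g. `{node.op for node in graph_def.node}`).

```python
# Illustrative subset of op types from the support matrix; the real set
# depends on the TensorRT version installed (TensorRT 4 on the PX2 here).
SUPPORTED_OPS = {"Conv2D", "MatMul", "Relu", "MaxPool", "BiasAdd",
                 "Softmax", "ConcatV2", "AvgPool"}

def unsupported_ops(graph_ops):
    """Return op types that would fall back to the TensorFlow runtime."""
    return sorted(set(graph_ops) - SUPPORTED_OPS)

# Hypothetical op list; Object Detection graphs typically contain
# postprocessing ops like these that are not TensorRT-convertible.
graph_ops = ["Conv2D", "Relu", "NonMaxSuppressionV3",
             "ResizeBilinear", "Softmax"]
print(unsupported_ops(graph_ops))
```

If the list of unsupported ops is long, most of the graph runs in TensorFlow rather than TensorRT, which would explain the slowdown.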