Performance of TensorFlow (1.5) on Jetson TX2 slower than expected

Hello

I set up the Jetson with JetPack 3.2 (CUDA 9 and everything).
Then I installed TensorFlow 1.5 on it (built from source, following the tutorials from JetsonHacks and TensorFlow itself).
Wheel files -> https://github.com/MatthiasRoelandts/tensorflow_jetson_tx2
Now I’m trying to do object detection with the ‘ssd_mobilenet_v1_coco’ network.

But the performance is very slow. On my host PC (no GPU) I get almost real-time performance, but on the Jetson TX2 it’s three times slower, even after running ‘sudo ./jetson_clocks.sh’.
With ‘sudo ./tegrastats.sh’ I see that GPU utilization is only 6% while my program runs.

If I run a TensorRT demo from jetson-inference (https://github.com/dusty-nv/jetson-inference), GPU utilization is 99% and object detection runs in real time.
Unfortunately I can’t use TensorRT because my network contains layers that TensorRT does not support yet.

Does anyone have suggestions for improving the performance of my TensorFlow application?

  • Is it possible that I left some kind of debug mode enabled when building TensorFlow?
  • Is loading the images from the camera the bottleneck? (I already tried doing this in a separate thread, with no improvement)
  • Am I missing a TensorFlow setting that would make better use of the GPU?
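
To narrow down the camera question, timing each stage of the loop separately might help. This is only a minimal sketch: `grab_frame` and `run_inference` are hypothetical stand-ins (simulated with sleeps here) for the real camera read and the real `sess.run(...)` call, and `StageTimer` is a made-up helper, not part of TensorFlow.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named pipeline stage."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def mean_ms(self, name):
        # Average time per call of this stage, in milliseconds.
        return 1000.0 * self.totals[name] / self.counts[name]


def grab_frame():
    # Hypothetical stand-in for the camera read.
    time.sleep(0.002)
    return "frame"


def run_inference(frame):
    # Hypothetical stand-in for running the detection graph.
    time.sleep(0.01)
    return []


timer = StageTimer()
for _ in range(5):
    with timer.stage("capture"):
        frame = grab_frame()
    with timer.stage("inference"):
        detections = run_inference(frame)

for name in ("capture", "inference"):
    print("%s: %.1f ms per frame" % (name, timer.mean_ms(name)))
```

If the capture stage turns out to be small compared to inference, the camera is probably not the limiting factor.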

Thanks

Hi,

Here is a previous discussion of this issue:
https://devtalk.nvidia.com/default/topic/1027819/jetson-tx2/object-detection-performance-jetson-tx2-slower-than-expected/

The low GPU utilization is caused by the tf.where operation in the post-processing stage.
The current workaround is to place the ‘map’-related part of the network on the CPU, which improves performance.

Check the link shared above for the example script.
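
As a rough illustration of that idea (not the script from the linked thread), the sketch below pins nodes to the CPU by name scope. The function name `assign_devices`, the choice of matching on a scope component called "map", and the example node names are all assumptions made for illustration; `FakeNode` only mimics the `name` and `device` fields of a GraphDef node so the sketch runs without TensorFlow.

```python
CPU = "/device:CPU:0"


def assign_devices(nodes, cpu_markers=("map",)):
    """Pin nodes whose name scope contains a marker to the CPU.

    `nodes` is any iterable of objects with `name` and `device`
    attributes, which is the shape of `graph_def.node` in a GraphDef.
    Nodes that do not match are left untouched, so TensorFlow keeps
    choosing their device itself.
    """
    pinned = 0
    for node in nodes:
        scopes = node.name.split("/")
        if any(marker in scopes for marker in cpu_markers):
            node.device = CPU
            pinned += 1
    return pinned


class FakeNode:
    """Stand-in for a GraphDef node (assumption, for demonstration only)."""

    def __init__(self, name):
        self.name = name
        self.device = ""


# Illustrative node names, not copied from the real graph.
nodes = [
    FakeNode("FeatureExtractor/Conv2d_0/Relu6"),
    FakeNode("Postprocessor/map/while/Where"),
]
pinned = assign_devices(nodes)
for n in nodes:
    print(repr(n.device), n.name)
```

With TensorFlow 1.x, the same function would presumably be applied to `graph_def.node` after parsing the frozen graph and before calling `tf.import_graph_def`.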

Thanks.

OK, thanks AastaLLL,
I hadn’t found that forum topic.
I will check it out!