Performance of TensorFlow (1.5) on Jetson TX2 slower than expected


I set up the Jetson with JetPack 3.2 (CUDA 9 and everything).
Then I installed TensorFlow 1.5 on it (built from source, the same way as in the tutorials from JetsonHacks or TensorFlow itself).
Wheel files ->
Now I’m trying to do object detection with the ‘ssd_mobilenet_v1_coco’ network.

But the performance is very slow. On my host PC (no GPU) I get almost real-time performance, but on the Jetson TX2 it’s three times slower (with ‘./sudo’)!
With ‘sudo ./’ I see that GPU utilization is only 6% while my program runs.

If I run a TensorRT demo from jetson-inference, the GPU is used at 99% and object detection also runs in real time.
Unfortunately I can’t use TensorRT because my network contains layers not yet supported by TensorRT.

Does anyone have suggestions for what I can do to improve the performance of my TensorFlow application?

  • Is it possible that I left some kind of debug state on during the TensorFlow installation?
  • Is loading the images from the camera the bottleneck? (I already tried doing this in a separate thread, with no improvement.)
  • Am I missing a TensorFlow setting to make better use of the GPU?
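
One way to narrow down the camera question is to time each stage separately. The snippet below is only a rough sketch: `time_stages`, `fake_capture` and `fake_inference` are made-up names, and the two placeholder functions just sleep, so they would have to be replaced by the real camera read and the real `session.run` call.

```python
import time


def time_stages(stages, n_iter=20):
    """Return the average seconds per call for each named stage.

    `stages` maps a label to a zero-argument callable.
    """
    averages = {}
    for label, fn in stages.items():
        start = time.perf_counter()
        for _ in range(n_iter):
            fn()
        averages[label] = (time.perf_counter() - start) / n_iter
    return averages


# Placeholder stages (assumptions): swap in the real frame grab and inference.
def fake_capture():
    time.sleep(0.002)


def fake_inference():
    time.sleep(0.02)


timings = time_stages({"capture": fake_capture, "inference": fake_inference}, n_iter=5)
slowest = max(timings, key=timings.get)
print(timings, "slowest stage:", slowest)
```

If the capture stage turns out to be small compared to inference, moving it to another thread would not be expected to change much.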



Here is a previous discussion of this issue:

Low GPU utilization is caused by the tf.where operation in post-processing.
The current workaround is to place the ‘map’-related part of the network on the CPU for better performance.

Check the link shared above for the example script.
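
For reference, the idea behind the workaround can be sketched roughly as follows. This is not the script from the linked topic: `assign_devices` is a hypothetical helper, and the name fragments and demo node names are illustrative assumptions, so inspect your own graph to find the actual post-processing ops. The helper only relies on each node having `name` and `device` attributes, which is why stand-in objects are used here instead of a real graph.

```python
from types import SimpleNamespace

# Assumed name fragments for the post-processing ops; verify against your graph.
CPU_NAME_FRAGMENTS = ("Postprocessor", "/map/")


def assign_devices(nodes, cpu_fragments=CPU_NAME_FRAGMENTS,
                   cpu="/device:CPU:0", gpu="/device:GPU:0"):
    """Set node.device in place and return the names pinned to the CPU."""
    pinned = []
    for node in nodes:
        if any(frag in node.name for frag in cpu_fragments):
            node.device = cpu
            pinned.append(node.name)
        else:
            node.device = gpu
    return pinned


# Stand-in nodes so the sketch runs without TensorFlow installed.
demo_nodes = [
    SimpleNamespace(name="FeatureExtractor/MobilenetV1/Conv2d_0/Conv2D", device=""),
    SimpleNamespace(
        name="Postprocessor/BatchMultiClassNonMaxSuppression/map/while/Where",
        device=""),
]
pinned_names = assign_devices(demo_nodes)
print(pinned_names)
```

With a real frozen graph, the same function could be applied to `graph_def.node` before calling `tf.import_graph_def`. Creating the session with `tf.ConfigProto(allow_soft_placement=True)` is probably needed as well, since not every op has a GPU kernel.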


OK, thanks AastaLLL.
I hadn’t found that forum topic.
I will check it out!