Object detection models are very slow


I need to do real-time object detection with a Jetson TX2 that has the latest released JetPack version (4.3) and TensorFlow 1.15 in a virtualenv, following the steps on this site:

I downloaded the TensorFlow Object Detection API (branch r1.13) and it’s working perfectly.

The problem is that the models take too long for inference. I am testing SSD MobileNet and it takes about 8 seconds with the default images that this repository provides.

I think the model is using the GPU, because I used tensorflow-gpu==1.15 in the installation and can see my GPU with


Could someone help me?

Hi camposromeromiguel,

Please try setting the system to maximum performance mode with the script below to see if it improves:

sudo jetson_clocks
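On the TX2, jetson_clocks pins the clocks within the currently selected power model, so it can also help to first switch to the maximum-performance power model with nvpmodel. A minimal sketch, assuming a stock JetPack install:

```shell
# Select the maximum-performance power model (mode 0, MAXN on the TX2),
# then lock all clocks at their highest frequencies.
sudo nvpmodel -m 0
sudo jetson_clocks
```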

Or you can experiment with the demos provided in the jetson-inference GitHub repo: https://github.com/dusty-nv/jetson-inference.

Or refer to these other threads, which may help:

Hello Kaycc

I set the system to maximum performance mode, but inference only improved from 10 to 9 seconds.
I would like to share with you the notebook that I am using:


Moreover, I found this topic, which says that some layers of the API run on the CPU, and that is the reason for the poor performance on Jetson:


Also, I checked my GPU and CPU usage at inference time: the GPU was at about 30% while the CPU was at 100%.

I also ran the Jetson examples after the JetPack installation and they seem okay, so it is not an NVIDIA driver problem (I used sdkmanager).

Source: https://i.ibb.co/VWpM85m/Screenshot-from-2020-01-21-11-53-43.png
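For anyone who wants to watch the same utilization numbers live, JetPack ships a Jetson-specific tegrastats tool that prints CPU and GPU (GR3D) load; this is just a sketch of one way to run it:

```shell
# Print CPU and GPU utilization once per second while inference runs
sudo tegrastats --interval 1000
```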

Here I am sharing my CPU and GPU usage while running an inference. I modified the code in this way to be sure the inference runs long enough to measure:

for i in range(50):
    output_dict = sess.run(tensor_dict,
                           feed_dict={image_tensor: np.expand_dims(image, 0)})
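When timing a loop like this, it also helps to exclude the first call and average the rest, since TensorFlow pays one-time setup costs on the first run. A minimal, framework-agnostic sketch of that timing pattern, where `run_inference` is a hypothetical stand-in for the `sess.run` call:

```python
import time

def run_inference(image):
    # Hypothetical stand-in for sess.run(tensor_dict, feed_dict=...),
    # so the timing harness can be shown on its own.
    return {"detections": len(image)}

image = [0] * 300  # placeholder for a decoded image

# Warm-up: the first call pays one-time costs (graph setup, kernel
# autotuning), so it is excluded from the measurement.
run_inference(image)

n_runs = 50
start = time.perf_counter()
for _ in range(n_runs):
    output_dict = run_inference(image)
elapsed = (time.perf_counter() - start) / n_runs
print(f"average inference time: {elapsed:.6f} s")
```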


It’s known that TensorFlow may not achieve optimal performance in the Jetson environment.
If your model is listed in this tutorial, you can try converting it into our TensorRT engine for better performance:
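A sketch of one way to do that conversion with the TF-TRT API that ships in TensorFlow 1.15; the frozen-graph path and output node names are hypothetical, since they depend on how the model was exported:

```python
def build_trt_graph(frozen_graph_path, output_node_names):
    """Rewrite a frozen TF 1.x graph so supported subgraphs run as
    TensorRT engines (TF-TRT). Both arguments are placeholders: the
    real path and node names depend on the exported model."""
    # TensorFlow is imported inside the function so this sketch can be
    # loaded on machines that don't have it installed.
    import tensorflow as tf
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    graph_def = tf.compat.v1.GraphDef()
    with tf.io.gfile.GFile(frozen_graph_path, "rb") as f:
        graph_def.ParseFromString(f.read())

    converter = trt.TrtGraphConverter(
        input_graph_def=graph_def,
        nodes_blacklist=output_node_names,  # keep outputs out of TRT segments
        max_batch_size=1,
        precision_mode="FP16",  # the TX2 GPU handles FP16 well
        minimum_segment_size=3,
    )
    return converter.convert()
```

The returned GraphDef can then be imported with tf.import_graph_def and run in a normal session, like the original frozen graph.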