Performance difference between jetson-inference & object-detection-tensorrt-example on TX2

Hi,

I am not sure if I am doing this right, but I have some questions about TX2 performance for object detection.

Following this example from jetson-inference (jetson-inference/detectnet-camera-2.md at master · dusty-nv/jetson-inference · GitHub), I saw that the TX2 is capable of running inference at around 50-70 FPS. (I just followed the example and didn’t change anything.)

On the other hand, when I go through this example - object-detection-tensorrt-example/detect_objects_webcam.py at master · NVIDIA/object-detection-tensorrt-example · GitHub - I get about 1.63 FPS (I used the imutils package from pip to measure the FPS).
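For reference, I measured the FPS roughly like this with imutils (a simplified sketch; the actual TensorRT detection call is omitted):

```python
import cv2
from imutils.video import FPS

cap = cv2.VideoCapture(0)  # webcam, as in detect_objects_webcam.py
fps = FPS().start()

for _ in range(200):  # measure over a fixed number of frames
    ok, frame = cap.read()
    if not ok:
        break
    # ... TensorRT SSD inference on `frame` happens here ...
    fps.update()

fps.stop()
print("elapsed: {:.2f}s".format(fps.elapsed()))
print("approx. FPS: {:.2f}".format(fps.fps()))
cap.release()
```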

My main goal in using the above example (SSD_Model) was to be able to use the TensorRT model generated by this guide - Integrating TAO Models into DeepStream — TAO Toolkit 3.22.05 documentation. The model I am trying to use is an FP16 TensorRT engine built with DetectNet_v2.

As 1.63 FPS seems extremely slow, I wanted to check whether this is normal or whether I implemented something incorrectly.

Could this be because the example uses an SSD model rather than DetectNet? If I wanted to test with DetectNet instead, what would I have to do? Also, would you recommend using jetson-inference for production? Any help would be appreciated.

Thank you

Hi,

The SSD sample reads the camera through the default OpenCV capture path, which is slow since it is a CPU implementation.
It’s recommended to try our DeepStream SDK first for better performance:

/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_SSD/
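For comparison, a default capture like cv2.VideoCapture(0) keeps every frame on the CPU. For cameras that do have GStreamer support, you can instead hand OpenCV a hardware-accelerated pipeline (a sketch, assuming OpenCV is built with GStreamer; a CSI camera via nvarguscamerasrc is used as the example here):

```python
import cv2

# Hardware-accelerated capture: the sensor output stays in NVMM memory and
# nvvidconv performs the color conversion, so the CPU only touches the final
# BGR frame that is handed to OpenCV through appsink.
pipeline = (
    "nvarguscamerasrc ! "
    "video/x-raw(memory:NVMM), width=1280, height=720, framerate=30/1 ! "
    "nvvidconv ! video/x-raw, format=BGRx ! "
    "videoconvert ! video/x-raw, format=BGR ! appsink"
)

cap = cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)
ok, frame = cap.read()  # frame is a regular numpy BGR image
```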

Thanks.

@AastaLLL

Thanks for sharing the info.

I looked into the DeepStream documentation and searched the developer forum to figure out how to hook up a FLIR Machine Vision USB3 camera (using FLIR’s PySpin SDK) to DeepStream.

It seems like FLIR cameras are not compatible with DeepStream, as they lack GStreamer support, and DeepStream expects USB cameras exposed through V4L2 (e.g. /dev/video0).
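For context, my capture path with PySpin looks roughly like this, so every frame lands in ordinary CPU memory before any inference can start (a simplified sketch based on the standard Spinnaker acquisition flow):

```python
import PySpin

system = PySpin.System.GetInstance()
cam_list = system.GetCameras()
cam = cam_list.GetByIndex(0)

cam.Init()
cam.BeginAcquisition()

image = cam.GetNextImage()          # frame arrives in host (CPU) memory
if not image.IsIncomplete():
    frame = image.GetNDArray()      # numpy array, still CPU-side
    # ... inference would start from this CPU buffer ...
image.Release()

cam.EndAcquisition()
cam.DeInit()
del cam
cam_list.Clear()
system.ReleaseInstance()
```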

Would you have any other suggestions for improving the FPS on the TX2 board?

Thank you

@AastaLLL

Is there a way to boost performance without using DeepStream?

Hi,

The bottleneck comes from the CPU-based camera reader.

To get better performance, you usually need to fetch the camera data directly into a GPU-accessible buffer.
This saves memory-copy time and also allows the pre-processing to be accelerated on the GPU.

However, it seems like your camera does not support writing into a GPU-accessible buffer (either a GPU buffer or a pinned CPU buffer).
So it’s recommended to confirm this with your camera vendor first.
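For reference, a pinned (page-locked) CPU buffer looks like the sketch below with PyCUDA. If the camera SDK could write frames into such a buffer directly, each frame could reach the GPU with a single asynchronous DMA copy (the buffer names here are only for illustration):

```python
import numpy as np
import pycuda.autoinit          # creates a CUDA context
import pycuda.driver as cuda

H, W = 720, 1280

# Pinned (page-locked) host buffer: the DMA engine can read it directly,
# unlike a pageable numpy array, which needs an extra internal staging copy.
pinned_frame = cuda.pagelocked_empty((H, W, 3), dtype=np.uint8)

d_frame = cuda.mem_alloc(pinned_frame.nbytes)   # device buffer
stream = cuda.Stream()

# If the camera driver filled `pinned_frame` directly, this async copy would
# be the only host-to-device transfer per frame, and it can overlap with GPU
# pre-processing/inference issued on the same stream.
cuda.memcpy_htod_async(d_frame, pinned_frame, stream)
stream.synchronize()
```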

Thanks and please let us know the results.