Inference speed optimization on Jetson AGX

We would like guidance on inference speed optimization techniques for Jetson AGX.

We are using the Jetson AGX Xavier platform
and are working to reduce inference time.

Environment:
Jetson AGX Xavier
GStreamer: 1.14.5
JetPack: 4.6
CUDA Version: cuda_10.2_r440
Operating System + Version: Ubuntu 18.04.6 LTS
TensorRT Version: 8.0.1-1+cuda10.2
Python Version: 3.6.9

We would like to know if there are projects or applications
(for models like YOLOv3, MobileNet, etc.) that demonstrate
performance optimization and provide benchmark data using:

  1. Batching
  2. Streaming
  3. Usage of DLA
  4. Layer Fusion

There seems to be very little information, and very few applications,
showcasing performance optimizations on NVIDIA platforms.

Thank you and regards

Hi,

This topic would be better served in the Jetson category. I will go ahead and move it over for you.

Cheers,
Tom K

Hi,

Please check whether the repository below meets your requirements:

You can also find the table of results tested with the above source below:

Thanks.
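
As a rough starting point for experiments, the four items in the question map fairly directly onto options of the trtexec tool that ships with TensorRT (on JetPack it is normally installed under /usr/src/tensorrt/bin). The Python sketch below only assembles a trtexec command line; it is not taken from the repository mentioned above. The helper name build_trtexec_cmd, the model file yolov3.onnx and the input tensor name "input" are assumptions made for illustration, and --shapes only applies to ONNX models exported with a dynamic batch dimension. Layer fusion is done automatically by the TensorRT builder, so the sketch only adds --verbose and --dumpProfile so that the build log and per-layer timings can be inspected.

```python
# Hypothetical helper: assembles a trtexec command line for benchmarking.
# File names and the tensor name "input" are placeholders, not real values.

def build_trtexec_cmd(onnx_path, batch_shape=None, streams=1, dla_core=None,
                      fp16=True, profile=False,
                      trtexec="/usr/src/tensorrt/bin/trtexec"):
    """Return the trtexec invocation as a list of arguments."""
    cmd = [trtexec, "--onnx=" + onnx_path]

    # 1. Batching: set the input shape, e.g. "input:8x3x416x416"
    #    (requires a model with a dynamic batch dimension).
    if batch_shape:
        cmd.append("--shapes=" + batch_shape)

    # 2. Streaming: run inference on several CUDA streams in parallel.
    if streams > 1:
        cmd.append("--streams=" + str(streams))

    # DLA only supports reduced precision, so FP16 is forced when DLA is used.
    if fp16 or dla_core is not None:
        cmd.append("--fp16")

    # 3. DLA: pick a DLA core and let unsupported layers fall back to the GPU.
    if dla_core is not None:
        cmd.append("--useDLACore=" + str(dla_core))
        cmd.append("--allowGPUFallback")

    # 4. Layer fusion is automatic; these flags only expose the build log
    #    and per-layer timing so the fused layers can be inspected.
    if profile:
        cmd.append("--verbose")
        cmd.append("--dumpProfile")

    return cmd


if __name__ == "__main__":
    example = build_trtexec_cmd("yolov3.onnx", batch_shape="input:8x3x416x416",
                                streams=2, dla_core=0, profile=True)
    print(" ".join(example))
```

The printed command can be run on the device directly, or passed to subprocess.run, and repeated with different batch sizes, stream counts and DLA settings to collect your own throughput and latency numbers.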