Request for guidance on inference speed optimization techniques on Jetson AGX Xavier
We are using the Jetson AGX Xavier platform.
We are working to reduce inference time.
Environment:
Jetson AGX Xavier
GStreamer: 1.14.5
JetPack: 4.6
CUDA Version: cuda_10.2_r440
Operating System + Version: Ubuntu 18.04.6 LTS
TensorRT Version: 8.0.1-1+cuda10.2
Python Version: 3.6.9
We would like to know if there are projects/applications
(for models like YOLOv3, MobileNet, etc.) which demonstrate
performance optimization and provide benchmark data using:
- Batching
- Streaming
- Use of the DLA (Deep Learning Accelerator)
- Layer Fusion
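To make the request concrete, here is a rough sketch of the kind of benchmark run we have in mind. It assumes the `trtexec` tool that ships with TensorRT and an ONNX export of the model. The helper name `build_trtexec_command`, the file name `yolov3.onnx`, the input tensor name `input`, and the 3x416x416 input size are all placeholders for illustration, not taken from a real project. The `--shapes` option is only meaningful for a model exported with a dynamic batch dimension. As far as we understand, layer fusion is applied automatically by the TensorRT builder, so it has no flag here.

```python
import shlex


def build_trtexec_command(onnx_path, input_name=None, batch_size=None,
                          input_chw=(3, 416, 416), dla_core=None,
                          streams=1, fp16=True):
    """Assemble a trtexec argument list for one benchmark configuration.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    args = ["trtexec", f"--onnx={onnx_path}"]
    if fp16:
        # DLA only runs FP16 or INT8 layers, so reduced precision is needed there.
        args.append("--fp16")
    if batch_size is not None and input_name:
        # Batching: fix the input shape for a model with a dynamic batch dimension.
        c, h, w = input_chw
        args.append(f"--shapes={input_name}:{batch_size}x{c}x{h}x{w}")
    if dla_core is not None:
        # DLA: run supported layers on the chosen core, fall back to GPU otherwise.
        args.append(f"--useDLACore={dla_core}")
        args.append("--allowGPUFallback")
    if streams > 1:
        # Streaming: issue inference on several CUDA streams concurrently.
        args.append(f"--streams={streams}")
    return args


if __name__ == "__main__":
    # Placeholder model and tensor names; substitute the real ones.
    cmd = build_trtexec_command("yolov3.onnx", input_name="input",
                                batch_size=4, dla_core=0, streams=2)
    print(" ".join(shlex.quote(a) for a in cmd))
```

Running the printed command on the device for several batch sizes, with and without the DLA, would give the throughput and latency figures we are hoping to compare against published benchmarks.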
There seems to be very little information, and very few applications, showcasing
performance optimizations on NVIDIA platforms.
Thank you and regards