High OpenCV RTSP decoding time while using TensorRT inference

@Fiona.Chen
Please find below the complete information for our setup.

  • Hardware Platform - GPU (RTX-series NVIDIA GPU, 16 GB RAM, i5-7500 CPU)
  • DeepStream Version - 5.1.0
  • TensorRT Version - 7.2.3.4
  • NVIDIA GPU Driver Version - 450.51, with CUDA 11.0 and cuDNN 8.0.5
  • Issue Type - questions
  • Requirement details -

We are seeing high get_frame() times (RTSP decoding using cv2.VideoCapture()) after building TensorRT and DeepStream.
Before building TensorRT:
get_frame() ~ 5 ms
After building TensorRT:
get_frame() ~ 20 ms
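
For reference, here is a minimal sketch of how the per-frame read time is measured (the RTSP URL and the number of frames are placeholders):

```python
import time

import cv2

# Placeholder RTSP URL; replace with the actual camera stream.
RTSP_URL = "rtsp://user:pass@camera-ip:554/stream"


def get_frame(capture):
    """Read one frame and return it together with the time spent in read()."""
    start = time.perf_counter()
    ok, frame = capture.read()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return ok, frame, elapsed_ms


cap = cv2.VideoCapture(RTSP_URL)
times = []
for _ in range(1000):  # arbitrary number of frames for the measurement
    ok, frame, ms = get_frame(cap)
    if not ok:
        break
    times.append(ms)
cap.release()

if times:
    print(f"frames: {len(times)}, mean read time: {sum(times) / len(times):.1f} ms")
```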

  • We uninstalled DeepStream.
  • We uninstalled the OpenCV CUDA build and installed the CPU-only OpenCV version.
  • We downgraded the GStreamer version.

However, the get_frame() time is still high.
We have observed that for the first 500-600 frames the time is 1-3 ms,
but after some time it increases to 30+ ms and sometimes even 80 ms (maybe this is a memory issue). Can you help us with potential actions for optimizing the RTSP decoding time, which appears to suffer from the high TensorRT memory usage?
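
One way to test the memory hypothesis is to move cv2.VideoCapture into its own process and check whether the read times still degrade while inference runs elsewhere; below is a minimal sketch of that idea (the RTSP URL, queue size, and frame count are placeholders):

```python
import time
from multiprocessing import Process, Queue

import cv2

RTSP_URL = "rtsp://user:pass@camera-ip:554/stream"  # placeholder


def capture_worker(url, frame_queue):
    """Read frames in an isolated process and report per-frame read times."""
    cap = cv2.VideoCapture(url)
    while True:
        start = time.perf_counter()
        ok, frame = cap.read()
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        if not ok:
            break
        # Drop frames when the consumer is slow so the capture never blocks.
        if not frame_queue.full():
            frame_queue.put((frame, elapsed_ms))
    cap.release()


if __name__ == "__main__":
    queue = Queue(maxsize=8)
    worker = Process(target=capture_worker, args=(RTSP_URL, queue), daemon=True)
    worker.start()

    # Consume frames in the main (inference) process. If read times stay at a
    # few milliseconds here while they degrade in the single-process setup,
    # the slowdown is likely memory/CPU pressure from inference.
    for _ in range(1000):
        frame, ms = queue.get()
        print(f"read time in capture process: {ms:.1f} ms")
```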

Hi @rehan2,

We request you to share scripts and the model that reproduce the issue so we can debug it further.

Thank you.

Thanks for getting back, @spolisetty.
We converted the PyTorch model to TensorRT using TRTorch for inference.
It reduced the prediction time but increased CPU usage and memory consumption. Because of this, cv2.VideoCapture is unable to allocate memory for frames, and reading frames is taking much longer.
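
For context, the conversion roughly follows the standard TRTorch flow; here is a minimal sketch (the model, input shape, and FP16 precision are placeholders, and the compile-spec keys shown are from TRTorch 0.x, which later Torch-TensorRT releases renamed):

```python
import torch
import torchvision
import trtorch

# Placeholder model; our actual network differs.
model = torchvision.models.resnet50(pretrained=True).eval().cuda()

# TRTorch compiles TorchScript, so trace (or script) the model first.
example = torch.randn(1, 3, 224, 224).cuda()
scripted = torch.jit.trace(model, example)

# Compile-spec keys as in the TRTorch 0.x docs; newer Torch-TensorRT
# versions use "inputs" / "enabled_precisions" instead.
compile_settings = {
    "input_shapes": [(1, 3, 224, 224)],
    "op_precision": torch.half,  # run the engine in FP16
}
trt_model = trtorch.compile(scripted, compile_settings)

# With FP16 precision, TRTorch expects half-precision inputs.
with torch.no_grad():
    output = trt_model(example.half())
```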

Hi @rehan2,

We recommend that you share an inference script that reproduces the issue so we can assist you better.

Thank you.