How to process audio with my pipeline on the Jetson Xavier NX

I am kind of stuck. What I am trying to accomplish is this:

  1. Read in an RTSP stream (1080p, 30 fps).
  2. Use jetson-inference to do object detection and determine the area of the video I want to zoom in on.
  3. Crop to the detected area.
  4. Resize the cropped area to match the original RTSP video size (1080p, 30 fps), keeping the aspect ratio.
  5. Send the new video feed back out as an RTSP stream (1080p, 30 fps).
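
For steps 3 and 4, here is a minimal sketch of one way to keep the aspect ratio: grow the detection box to the frame's 16:9 shape before cropping, so the later resize to 1080p does not distort the image. The function name `zoom_window` and the `(left, top, right, bottom)` box layout are assumptions made for illustration and are not taken from jetson-inference. It also assumes the box lies inside the frame.

```python
def zoom_window(box, frame_w=1920, frame_h=1080):
    """Expand a detection box to the frame's aspect ratio, kept inside the frame.

    box is (left, top, right, bottom) in pixels (hypothetical layout).
    Returns the crop window as integer (left, top, right, bottom).
    """
    left, top, right, bottom = box
    w = right - left
    h = bottom - top
    if w <= 0 or h <= 0:
        raise ValueError("box must have positive width and height")

    aspect = frame_w / frame_h
    # Grow the shorter side so the window has the same shape as the frame.
    if w / h < aspect:
        w = h * aspect
    else:
        h = w / aspect

    # Never ask for more than the full frame.
    w = min(w, frame_w)
    h = min(h, frame_h)

    # Centre the window on the detection, then shift it back inside the frame.
    cx = (left + right) / 2
    cy = (top + bottom) / 2
    x0 = min(max(cx - w / 2, 0), frame_w - w)
    y0 = min(max(cy - h / 2, 0), frame_h - h)
    return round(x0), round(y0), round(x0 + w), round(y0 + h)


if __name__ == "__main__":
    # A square 200x200 detection becomes a roughly 16:9 window around it.
    print(zoom_window((800, 400, 1000, 600)))
```

The returned window could then be cropped and scaled back to 1920x1080, for example with NumPy slicing and `cv2.resize` if the frame is an OpenCV image.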

I currently have this working, with a few glitches, on a Jetson Xavier NX using Python, GStreamer, jetson-inference, OpenCV, and RidgeRun (GstRtspSink). I am now trying to get the audio part working, and I am stumped on how to go about it.

Does anyone have any idea how I should go about getting audio working in my process, since OpenCV only handles video?

We don’t have much experience with audio in OpenCV, so we would suggest asking on the OpenCV forum.

For a pure GStreamer pipeline, you may refer to this post: