Efficient RTSP video decoding

Our framework requires converting RTSP video to frames and passing them to the local inference server.

I read that omxh264dec is being deprecated, so I tried nvv4l2decoder instead.

P.S. Is there an efficient way of converting the frames? I want to use as little CPU as possible during the process.

Working pipeline string

#gst_str_1 = ("rtspsrc location=rtsp://admin:123456@192.168.10.51/stream0  latency=0 ! rtph265depay ! h265parse ! omxh265dec ! 'video/x-raw(memory:NVMM) , format=(string)RGBA' ! videoconvert ! appsink")

Full script

import cv2

# Create a VideoCapture object and read from the RTSP stream
#gst_str_1 = ("rtsp://192.168.10.78/media/live/2/1")

gst_str_1 = ("rtspsrc location=rtsp://admin:123456@192.168.10.51/stream0 latency=0 "
             "! rtph265depay ! h265parse ! nvv4l2decoder ! nvvidconv "
             "! video/x-raw, format=(string)BGRx ! videoconvert ! appsink")

cap = cv2.VideoCapture(gst_str_1, cv2.CAP_GSTREAMER)

# Check if the stream opened successfully
if not cap.isOpened():
    print("Error opening video stream or file")

# Read until the stream ends
while cap.isOpened():
    # Capture frame by frame
    ret, frame = cap.read()
    if not ret:
        break

    # Display the resulting frame
    #cv2.imshow('Frame', frame)

    # Press Q on the keyboard to exit
    if cv2.waitKey(25) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Result

python camera_1080p_51.py 
Failed to query video capabilities: Inappropriate ioctl for device
Opening in BLOCKING MODE 
NvMMLiteOpen : Block : BlockType = 279 
NVMEDIA: Reading sys.display-size : status: 6 
NvMMLiteBlockCreate : Block : BlockType = 279

Hi,
It is a known issue on r32.1:
https://devtalk.nvidia.com/default/topic/1055288/jetson-nano/can-t-get-accelerated-gstreamer-pipeline-rtspsrc-rtph264depay-h264parse-nvv4l2decoder-to-wo-/post/5349565/#5349565

When running GStreamer with OpenCV, appsink only accepts CPU buffers, so there is a memcpy() to copy video/x-raw(memory:NVMM) into video/x-raw. To reduce CPU loading, we suggest using tegra_multimedia_api instead of OpenCV.
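To make that explanation concrete, here is a sketch of where the GPU/CPU boundary falls in such a pipeline. The camera URL and credentials are placeholders from the posts above, and `build_pipeline` is a hypothetical helper, not part of any NVIDIA sample:

```python
# Sketch: annotate which pipeline stages run on the GPU vs. the CPU.
# The RTSP URL/credentials are placeholders; adapt them to your camera.

def build_pipeline(url, latency=0):
    """Build an RTSP decode pipeline for cv2.VideoCapture(..., cv2.CAP_GSTREAMER)."""
    return (
        "rtspsrc location={} latency={} "   # network receive (CPU)
        "! rtph265depay ! h265parse "       # depacketize/parse (CPU, cheap)
        "! nvv4l2decoder "                  # H.265 decode (hardware, NVMM buffers)
        "! nvvidconv "                      # NVMM -> CPU memory: this is the memcpy()
        "! video/x-raw, format=(string)BGRx "
        "! videoconvert "                   # BGRx -> BGR for OpenCV (CPU)
        "! appsink"                         # hands CPU buffers to cap.read()
    ).format(url, latency)

pipeline = build_pipeline("rtsp://admin:123456@192.168.10.51/stream0")
# cap = cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)
```

Everything up to nvvidconv stays in NVMM memory; only the final conversion and the appsink hand-off cost CPU time, which is why the copy cannot be avoided as long as appsink is the consumer.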

The CPU loading of omxh264dec and nvv4l2decoder is identical.

Can you provide reference code in Python?
https://docs.nvidia.com/jetson/l4t-multimedia/l4t_mm_video_decode_cuda.html

Hi,
tegra_multimedia_api samples are in C/C++. Python is not supported.

Then what is the optimal way in Python?


Hi,
For Python + OpenCV, your pipeline is optimal. To fully leverage the GPU, you may consider tegra_multimedia_api.
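One small tweak that can still help on the Python + OpenCV side: if inference runs slower than the camera frame rate, appsink queues stale frames, and draining that queue wastes CPU and adds latency. The appsink properties `drop` and `max-buffers` (standard GStreamer appsink options) keep only the newest frame. A sketch, with the same placeholder camera URL as above:

```python
# Keep appsink from queueing stale frames when the consumer is slower than
# the camera: drop=true with max-buffers=1 retains only the newest frame,
# and sync=false stops the sink from pacing output to the clock.
# (drop/max-buffers are appsink properties, sync is a basesink property.)

pipeline = (
    "rtspsrc location=rtsp://admin:123456@192.168.10.51/stream0 latency=0 "
    "! rtph265depay ! h265parse ! nvv4l2decoder ! nvvidconv "
    "! video/x-raw, format=(string)BGRx ! videoconvert "
    "! appsink drop=true max-buffers=1 sync=false"
)
# cap = cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)
```

This does not remove the NVMM-to-CPU copy, but it avoids converting frames that would be thrown away anyway.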

@Ravik
Were you able to solve the problem with nvv4l2decoder using Python + OpenCV?

Yeah, we use DeepStream.

We also built our own GStreamer Python binding, which is around 20x faster.

@Ravik
1 - Do you use this code? link
2 - Can you share your decoder code?
3 - In deepstream-python-apps, in the deepstream-imagedata-multistream sample, how can I stop the visualization? I don't want it to show any visualization, only do the decoding. I also want to use nvv4l2decoder instead of decodebin because I want to use the drop-frame-interval option.
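For question 3, one approach outside the DeepStream samples (a sketch, not the samples' own code) is to decode straight into appsink with no renderer in the pipeline at all, and set nvv4l2decoder's drop-frame-interval so only every Nth frame is decoded; in the DeepStream samples themselves, visualization is typically disabled by swapping the display sink for fakesink. drop-frame-interval is documented for nvv4l2decoder on Jetson L4T releases, but verify it on your release with `gst-inspect-1.0 nvv4l2decoder`. `build_decode_only_pipeline` is a hypothetical helper:

```python
# Sketch: hardware decode keeping every Nth frame, with no on-screen sink.
# drop-frame-interval is an nvv4l2decoder property on Jetson L4T; confirm
# with `gst-inspect-1.0 nvv4l2decoder` before relying on it.

def build_decode_only_pipeline(url, drop_interval=5):
    """Decode-only pipeline: frames go to appsink, nothing is rendered."""
    return (
        "rtspsrc location={} latency=0 "
        "! rtph265depay ! h265parse "
        "! nvv4l2decoder drop-frame-interval={} "  # decode only every Nth frame
        "! nvvidconv ! video/x-raw, format=(string)BGRx "
        "! videoconvert "
        "! appsink drop=true max-buffers=1"        # no display sink at all
    ).format(url, drop_interval)

pipeline = build_decode_only_pipeline("rtsp://admin:123456@192.168.10.51/stream0")
# cap = cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)
```

With drop-frame-interval=5 the decoder outputs roughly one frame in five, which cuts both decode load and the downstream CPU conversion cost proportionally.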