Need advice: 4K video capture & writing performance with OpenCV

I need some help improving the frames-per-second performance of my OpenCV motion-detection C++ program.

Here is my setup:

  • Jetpack 4.6
  • new build of OpenCV 4.5.4-dev (see attached for verbose build info)
  • with GStreamer 1.14.5
  • NV Power Mode 6 [20W 2 core]

I want to process an incoming 4K (3840x2160) RTSP video stream at 12 fps from an IP camera: decode it with hardware-assisted H.264 decoding through GStreamer, run motion detection in OpenCV using contour detection within a selected ROI, draw a bounding box around the detected motion, and then write the video frames out to an .MP4 file using hardware-assisted H.264 (or H.265) encoding through GStreamer.

I am using a GStreamer pipeline to reduce CPU load: the 'nvv4l2decoder' element uses the Xavier hardware for H.264 frame decoding, and 'nvvidconv' is a hardware-accelerated converter.

`//===========================
// Capture: RTSP -> hardware H.264 decode -> nvvidconv (to BGRx) -> videoconvert -> appsink
std::string pipe =
    "rtspsrc location=rtsp://admin:passwd@192.168.1.34:554/Streaming/Channels/101 ! rtph264depay "
    "! nvv4l2decoder ! video/x-raw(memory:NVMM),format=NV12 ! nvvidconv ! video/x-raw,format=BGRx,width=3840,height=2160 "
    "! videoconvert ! appsink";

cv::VideoCapture capture(pipe, cv::CAP_GSTREAMER);
//===========================
const int frame_width = 3840;
const int frame_height = 2160;
const int frames_per_second = 12;
cv::Size frame_size(frame_width, frame_height);

// Writer: appsrc -> videoconvert (to BGRx) -> nvvidconv (to NV12 in NVMM) -> hardware H.264 encode -> MP4
std::string motion_writer_pipe =
    "appsrc ! videoconvert ! video/x-raw,format=BGRx,width=3840,height=2160,framerate=12/1 ! "
    "nvvidconv ! video/x-raw(memory:NVMM),width=(int)3840,height=(int)2160,format=NV12,framerate=(fraction)12/1 ! "
    "nvv4l2h264enc preset-level=0 iframeinterval=60 control-rate=1 bitrate=5000000 ! "
    "h264parse ! qtmux ! filesink location=" + savedMotionVideoFullPath;

// fourcc is 0 because the encoder is chosen by the GStreamer pipeline itself
cv::VideoWriter motion_writer(motion_writer_pipe, cv::CAP_GSTREAMER, 0, frames_per_second, frame_size, true);

//===========================
`
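One way to keep the caps consistent on both sides of nvvidconv is to build the writer pipeline string from a single set of dimensions. The following is only a sketch: `make_writer_pipeline` is a hypothetical helper name (it is not part of OpenCV or GStreamer), and the element names and encoder properties are copied from the writer pipeline above without any tuning.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not an OpenCV or GStreamer API): assembles the writer
// pipeline so that width, height and framerate are stated once and reused.
std::string make_writer_pipeline(int width, int height, int fps, int bitrate_bps,
                                 const std::string& out_path) {
    assert(width > 0 && height > 0 && fps > 0 && bitrate_bps > 0);

    // Shared caps fragment used before and after nvvidconv.
    const std::string size = "width=" + std::to_string(width) +
                             ",height=" + std::to_string(height) +
                             ",framerate=" + std::to_string(fps) + "/1";

    return "appsrc ! videoconvert ! video/x-raw,format=BGRx," + size +
           " ! nvvidconv ! video/x-raw(memory:NVMM),format=NV12," + size +
           " ! nvv4l2h264enc preset-level=0 iframeinterval=60 control-rate=1 bitrate=" +
           std::to_string(bitrate_bps) +
           " ! h264parse ! qtmux ! filesink location=" + out_path;
}
```

The returned string would be passed to cv::VideoWriter with cv::CAP_GSTREAMER in the same way as motion_writer_pipe.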
All is working, to a degree, but the fps is far too low, and the output MP4 video quality is not always acceptable. Nearly all of the motion-detection OpenCV C++ calls run on the CPU, since most of them have no corresponding GPU version, so the motion-detection portion of the program is effectively all CPU.

When I remove all of the motion-detection code and simply read in the RTSP video stream and write it back out using the GStreamer code above, CPU usage is 80+% and the resulting MP4 video often has many compression artifacts, appearing as little blurry squares across large areas of the written frame. (See the attached cropped captures of a portion of the frame.)

Is this poor performance (low fps, unacceptable encoding) due to insufficient CPU speed or to limits of the Xavier NX itself?
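As a rough sanity check on the encoder settings, the average bit budget per pixel can be worked out from the numbers in the writer pipeline (3840x2160, 12 fps, bitrate=5000000). This is only a back-of-the-envelope sketch with hypothetical helper names. Whether a given budget is enough depends heavily on scene content and encoder settings, so it only indicates that the bitrate may be worth experimenting with. It is not a diagnosis of the artifacts.

```cpp
#include <cstdint>

// Pixels the encoder has to cover each second at a given resolution and frame rate.
constexpr std::int64_t pixels_per_second(int width, int height, int fps) {
    return static_cast<std::int64_t>(width) * height * fps;
}

// Average number of encoded bits available per pixel for a target bitrate (bits/s).
constexpr double bits_per_pixel(std::int64_t bitrate_bps, int width, int height, int fps) {
    return static_cast<double>(bitrate_bps) /
           static_cast<double>(pixels_per_second(width, height, fps));
}

// Values from the writer pipeline: 3840x2160 at 12 fps, bitrate=5000000.
constexpr std::int64_t kPixelRate = pixels_per_second(3840, 2160, 12);  // 99,532,800 pixels/s
constexpr double kBudget = bits_per_pixel(5000000, 3840, 2160, 12);     // about 0.05 bits/pixel
```

Raising the bitrate property raises this budget proportionally, which makes it a simple thing to try when the output looks blocky.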

What kind of hardware system do I need to successfully process 4k video with OpenCV and achieve high-quality MP4 output?

Thanks for all your help…. much appreciated.
Dave
OpenCV 4.5.rtf (8.3 KB)


Hi,
Since BGR is not supported by the hardware converter (VIC engine), there is an additional software conversion and data copy on the CPU when running with OpenCV. We would suggest using the DeepStream SDK for an optimal solution. You can install the package through SDKManager, and it is at

/opt/nvidia/deepstream/deepstream-6.0

Documents are in
NVIDIA Metropolis Documentation

When running with OpenCV, the performance bottleneck is very likely to be CPU capability. You can execute sudo tegrastats to check the system load.
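To get a feel for how much data the CPU-side conversion has to touch, here is a minimal sketch of the arithmetic, assuming 4 bytes per pixel for BGRx, 3 bytes per pixel for BGR, and the 3840x2160 at 12 fps stream from the original post. The helper names are hypothetical, and the figures are raw buffer sizes only, not measured memory traffic.

```cpp
#include <cstdint>

// Size in bytes of one uncompressed frame with a fixed number of bytes per pixel.
constexpr std::int64_t bytes_per_frame(int width, int height, int bytes_per_pixel) {
    return static_cast<std::int64_t>(width) * height * bytes_per_pixel;
}

// Bytes per second that must be read or written to move every frame once.
constexpr std::int64_t bytes_per_second(int width, int height, int bytes_per_pixel, int fps) {
    return bytes_per_frame(width, height, bytes_per_pixel) * fps;
}

// Assumed layouts: BGRx = 4 bytes/pixel (nvvidconv output), BGR = 3 bytes/pixel (OpenCV Mat).
constexpr std::int64_t kBgrxFrame = bytes_per_frame(3840, 2160, 4);       // 33,177,600 bytes
constexpr std::int64_t kBgrFrame  = bytes_per_frame(3840, 2160, 3);       // 24,883,200 bytes
constexpr std::int64_t kBgrxRate  = bytes_per_second(3840, 2160, 4, 12);  // 398,131,200 bytes/s
constexpr std::int64_t kBgrRate   = bytes_per_second(3840, 2160, 3, 12);  // 298,598,400 bytes/s
```

Each BGRx-to-BGR conversion reads one buffer and writes the other, and the capture and writer pipelines each contain one such videoconvert step, so the CPU handles these volumes in both directions.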