Problem merging two camera streams with nvcompositor and forwarding the merged video to Darknet

Hi, I am trying to merge two RPi HD camera streams into one and feed the resulting stream into Darknet. Apologies in advance if some technical terms or concepts are wrong; I am new to GStreamer and video pipelines.

At this moment:

  • using a single camera works.
  • merging the streams and previewing with nvoverlaysink works.

Here is an example of a pipeline that works for previewing.

gst-launch-1.0 nvarguscamerasrc sensor-id=0 ! 'video/x-raw(memory:NVMM),width=640,height=720,framerate=30/1,format=NV12' ! nvvidconv ! 'video/x-raw(memory:NVMM),width=320,height=360' ! comp. nvarguscamerasrc sensor-id=1 ! 'video/x-raw(memory:NVMM),width=640,height=720,framerate=30/1,format=NV12' ! nvvidconv ! 'video/x-raw(memory:NVMM),width=320,height=360' ! comp. nvcompositor name=comp sink_0::xpos=0 sink_0::ypos=0 sink_0::width=320 sink_0::height=360 sink_1::xpos=320 sink_1::ypos=0 sink_1::width=320 sink_1::height=360 ! videoconvert ! 'video/x-raw(memory:NVMM),format=RGBA' ! nvoverlaysink

The problem arises when I switch the sink to appsink and pass the pipeline string to Darknet.

./darknet detector demo cfg/coco.data cfg/yolov4-tiny.cfg yolov4-tiny.weights "nvarguscamerasrc sensor-id=0 ! video/x-raw(memory:NVMM),width=640, height=720, framerate=30/1, format=NV12 ! nvvidconv ! video/x-raw(memory:NVMM), width=320, height=360 ! comp. nvarguscamerasrc sensor-id=1 ! video/x-raw(memory:NVMM),width=640, height=720, framerate=30/1, format=NV12 ! nvvidconv ! video/x-raw(memory:NVMM), width=320, height=360 ! comp. nvcompositor name=comp sink_0::xpos=0 sink_0::ypos=0 sink_0::width=320 sink_0::height=360 sink_1::xpos=320 sink_1::ypos=0 sink_1::width=320 sink_1::height=360 ! videoconvert ! video/x-raw(memory:NVMM), format=RGBA ! appsink"

GST_ARGUS: Setup Complete, Starting captures for 0 seconds
GST_ARGUS: Starting repeat capture requests.
CONSUMER: Producer has connected; continuing.
CONSUMER: Producer has connected; continuing.
[ WARN:0] global /tmp/build_opencv/opencv/modules/videoio/src/cap_gstreamer.cpp (1761) handleMessage OpenCV | GStreamer warning: Embedded video playback halted; module nvarguscamerasrc1 reported: Internal data stream error.
GST_ARGUS: Cleaning up
CONSUMER: Done Success
GST_ARGUS: Done Success
CONSUMER: Done Success
GST_ARGUS: Cleaning up
GST_ARGUS: Done Success
[ WARN:0] global /tmp/build_opencv/opencv/modules/videoio/src/cap_gstreamer.cpp (888) open OpenCV | GStreamer warning: unable to start pipeline
[ WARN:0] global /tmp/build_opencv/opencv/modules/videoio/src/cap_gstreamer.cpp (480) isPipelinePlaying OpenCV | GStreamer warning: GStreamer: pipeline have not been created
[ERROR:0] global /tmp/build_opencv/opencv/modules/videoio/src/cap.cpp (142) open VIDEOIO(CV_IMAGES): raised OpenCV exception:

OpenCV(4.4.0) /tmp/build_opencv/opencv/modules/videoio/src/cap_images.cpp:253: error: (-5:Bad argument) CAP_IMAGES: can’t find starting number (in the name of file): nvarguscamerasrc sensor-id=0 ! video/x-raw(memory:NVMM),width=640, height=720, framerate=30/1, format=NV12 ! nvvidconv ! video/x-raw(memory:NVMM), width=320, height=360 ! comp. nvarguscamerasrc sensor-id=1 ! video/x-raw(memory:NVMM),width=640, height=720, framerate=30/1, format=NV12 ! nvvidconv ! video/x-raw(memory:NVMM), width=320, height=360 ! comp. nvcompositor name=comp sink_0::xpos=0 sink_0::ypos=0 sink_0::width=320 sink_0::height=360 sink_1::xpos=320 sink_1::ypos=0 sink_1::width=320 sink_1::height=360 ! videoconvert ! video/x-raw(memory:NVMM), format=RGBA ! appsink in function ‘icvExtractPattern’

Video-stream stopped!
Video-stream stopped!
Video-stream stopped!

With single-camera input into Darknet, everything is fine as well:

./darknet detector demo cfg/coco.data cfg/yolov4-tiny.cfg yolov4-tiny.weights "nvarguscamerasrc ! video/x-raw(memory:NVMM),width=1280, height=720, framerate=30/1, format=NV12 ! nvvidconv ! video/x-raw, format=BGRx, width=640, height=360 ! videoconvert ! video/x-raw, format=BGR ! appsink"

What I could narrow it down to: there is possibly a problem with "nvcompositor" and "videoconvert" converting RGBA to BGR format. Whenever I modify the pipeline to output BGR to the appsink, I encounter an error. There is no problem if I don't use "nvcompositor".

WARNING: erroneous pipeline: could not link videoconvert0 to appsink0, videoconvert0 can’t handle caps video/x-raw(memory:NVMM), format=(string)BGR

Any help or tips on how to solve this would be appreciated.

Video after merging the two streams.

Hi,
Please check if you can run this command:

$ gst-launch-1.0 nvarguscamerasrc sensor-id=0 ! 'video/x-raw(memory:NVMM),width=640, height=720, framerate=30/1, format=NV12' ! nvvidconv ! 'video/x-raw(memory:NVMM), width=320, height=360' ! comp. nvarguscamerasrc sensor-id=1 ! 'video/x-raw(memory:NVMM),width=640, height=720, framerate=30/1, format=NV12' ! nvvidconv ! 'video/x-raw(memory:NVMM), width=320, height=360' ! comp. nvcompositor name=comp sink_0::xpos=0 sink_0::ypos=0 sink_0::width=320 sink_0::height=360 sink_1::xpos=320 sink_1::ypos=0 sink_1::width=320 sink_1::height=360 ! 'video/x-raw(memory:NVMM)' ! nvvidconv ! video/x-raw,format=RGBA ! videoconvert ! video/x-raw,format=BGR ! fakesink
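The difference from your failing command is the tail after nvcompositor: videoconvert runs on the CPU and only negotiates system-memory caps, so the NVMM buffers have to be copied out by nvvidconv before videoconvert and appsink see them. Side by side (pipeline fragments only, not runnable on their own):

```shell
# Failing tail: videoconvert is asked to OUTPUT into NVMM device
# memory, which it cannot negotiate -> "could not link".
BAD_TAIL='videoconvert ! video/x-raw(memory:NVMM),format=RGBA ! appsink'

# Working tail: nvvidconv copies the composited NVMM buffer into
# system memory first; videoconvert then does RGBA -> BGR on the CPU.
GOOD_TAIL='video/x-raw(memory:NVMM) ! nvvidconv ! video/x-raw,format=RGBA ! videoconvert ! video/x-raw,format=BGR ! appsink'

echo "$GOOD_TAIL"
```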

If yes, please try the string:
nvarguscamerasrc sensor-id=0 ! video/x-raw(memory:NVMM),width=640, height=720, framerate=30/1, format=NV12 ! nvvidconv ! video/x-raw(memory:NVMM), width=320, height=360 ! comp. nvarguscamerasrc sensor-id=1 ! video/x-raw(memory:NVMM),width=640, height=720, framerate=30/1, format=NV12 ! nvvidconv ! video/x-raw(memory:NVMM), width=320, height=360 ! comp. nvcompositor name=comp sink_0::xpos=0 sink_0::ypos=0 sink_0::width=320 sink_0::height=360 sink_1::xpos=320 sink_1::ypos=0 sink_1::width=320 sink_1::height=360 ! video/x-raw(memory:NVMM) ! nvvidconv ! video/x-raw,format=RGBA ! videoconvert ! video/x-raw,format=BGR ! appsink
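Since the string is long and easy to mistype, it can also be assembled programmatically. This is a small helper (hypothetical, not part of Darknet or GStreamer); the element names and geometry are the ones used in this thread:

```python
def dual_camera_pipeline(width=320, height=360, fps=30):
    """Build the dual-camera nvcompositor pipeline string suggested above.

    Each camera captures at 2x the target size (640x720 in the thread)
    and is scaled down by nvvidconv before compositing side by side.
    """
    def branch(sensor_id):
        return (
            f"nvarguscamerasrc sensor-id={sensor_id} ! "
            f"video/x-raw(memory:NVMM),width={2 * width},height={2 * height},"
            f"framerate={fps}/1,format=NV12 ! "
            f"nvvidconv ! video/x-raw(memory:NVMM),width={width},height={height} ! comp. "
        )

    tail = (
        "nvcompositor name=comp "
        f"sink_0::xpos=0 sink_0::ypos=0 sink_0::width={width} sink_0::height={height} "
        f"sink_1::xpos={width} sink_1::ypos=0 sink_1::width={width} sink_1::height={height} ! "
        # Copy out of NVMM memory before the CPU elements.
        "video/x-raw(memory:NVMM) ! nvvidconv ! video/x-raw,format=RGBA ! "
        "videoconvert ! video/x-raw,format=BGR ! appsink"
    )
    return branch(0) + branch(1) + tail
```

The returned string can be passed to Darknet as before, or tried first in OpenCV with `cv2.VideoCapture(dual_camera_pipeline(), cv2.CAP_GSTREAMER)` to isolate pipeline problems from Darknet itself.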

Thank you very much. I tried both commands and they solve the errors. Now I can run Darknet with both cameras.
But there is a new problem: the video in the preview window is extremely slow and seems to stop updating after a few frames, while Darknet reports an average of ~13 FPS for object detection.

Is there any performance penalty in using "videoconvert"?

videoconvert ! video/x-raw,format=BGR ! appsink

Or is it an issue with a non-standard resolution?

Hi,
Please refer to
[Gstreamer] nvvidconv, BGR as INPUT - #2 by DaneLLL
BGR format is not supported by most hardware engines, so the conversion has to be done on the CPU. This may not give good performance on Jetson Nano.
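To make the cost concrete: the conversion itself is only a per-pixel channel reorder, but since no hardware engine outputs BGR it all lands on the CPU. If the application can take frames in the hardware-converted RGBA format, the reorder can be expressed as a zero-copy NumPy view instead of a videoconvert pass. A sketch with a synthetic frame (the array shapes are illustrative, not from this thread):

```python
import numpy as np

# Synthetic 360x640 RGBA frame standing in for an appsink buffer.
h, w = 360, 640
frame_rgba = np.zeros((h, w, 4), dtype=np.uint8)
frame_rgba[..., 0] = 10   # R
frame_rgba[..., 1] = 20   # G
frame_rgba[..., 2] = 30   # B
frame_rgba[..., 3] = 255  # A

# RGBA -> BGR: reverse the first three channels and drop alpha.
# Basic slicing returns a view, so no pixel data is copied here.
frame_bgr = frame_rgba[:, :, 2::-1]
```

Note that a consumer requiring a contiguous buffer (as Darknet does) would still force a copy, so this only moves the cost around rather than eliminating it.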

We have the DeepStream SDK for running deep learning inference. By default it demonstrates ResNet10, and the following reference detectors are included:

/opt/nvidia/deepstream/deepstream-5.1/sources/objectDetector_FasterRCNN:
config_infer_primary_fasterRCNN.txt   labels.txt                        README
deepstream_app_config_fasterRCNN.txt  nvdsinfer_custom_impl_fasterRCNN

/opt/nvidia/deepstream/deepstream-5.1/sources/objectDetector_SSD:
config_infer_primary_ssd.txt   nvdsinfer_custom_impl_ssd
deepstream_app_config_ssd.txt  README

/opt/nvidia/deepstream/deepstream-5.1/sources/objectDetector_Yolo:
config_infer_primary_yoloV2_tiny.txt   deepstream_app_config_yoloV3.txt
config_infer_primary_yoloV2.txt        labels.txt
config_infer_primary_yoloV3_tiny.txt   nvdsinfer_custom_impl_Yolo
config_infer_primary_yoloV3.txt        prebuild.sh
deepstream_app_config_yoloV2_tiny.txt  README
deepstream_app_config_yoloV2.txt       yolov3-calibration.table.trt7.0
deepstream_app_config_yoloV3_tiny.txt

We would suggest checking these demonstrations and replacing the model with your Darknet network.