Detection on MJPEG stereocamera failure

• Hardware Platform (Jetson / GPU)
• DeepStream Version 5.0
• JetPack Version 4.4
• Issue Type: question

Hi, I’m having a problem adding a stereo camera as a source for DeepStream. The camera outputs MJPEG, with the images from both cameras combined into a single frame. I need to crop one of them and feed it to the nvinfer plugin.

Camera params:

v4l2-ctl -d /dev/video1 --list-formats-ext
ioctl: VIDIOC_ENUM_FMT
	Index       : 0
	Type        : Video Capture
	Pixel Format: 'MJPG' (compressed)
	Name        : Motion-JPEG
		Size: Discrete 1280x480
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 2560x960
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 640x240
			Interval: Discrete 0.033s (30.000 fps)

This is a working sample of the cropping pipeline:

gst-launch-1.0 v4l2src device=/dev/video1 io-mode=2 blocksize=800000 ! image/jpeg, width=2560, height=960 ! nvjpegdec ! "video/x-raw, format=I420, framerate=30/1" ! nvvidconv interpolation-method=0 ! "video/x-raw, width=2560, height=960, format=I420" ! queue ! nvvidconv left=0 right=1280 top=120 bottom=840 ! "video/x-raw(memory:NVMM), width=1280, height=720, pixel-aspect-ratio=1/1, format=(string)I420" ! nvegltransform ! nveglglessink
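The crop values in the pipeline above (left=0 right=1280 top=120 bottom=840) can be derived from the frame size rather than typed by hand. The following is a minimal Python sketch, not part of the original post. The function name `eye_crop` is hypothetical, and it assumes that the two camera images sit side by side in the frame, with the left half belonging to the left camera, and that the crop should be centred vertically.

```python
def eye_crop(frame_w, frame_h, eye="left", out_h=None):
    """Return nvvidconv-style crop coordinates for one half of a side-by-side frame.

    Assumption: the left half of the frame is the left camera.
    The values are coordinates, so right - left is the cropped width.
    """
    if eye not in ("left", "right"):
        raise ValueError("eye must be 'left' or 'right'")
    half_w = frame_w // 2
    out_h = frame_h if out_h is None else out_h
    if out_h > frame_h:
        raise ValueError("out_h cannot exceed the frame height")
    left = 0 if eye == "left" else half_w
    top = (frame_h - out_h) // 2  # centre the crop vertically
    return {"left": left, "right": left + half_w, "top": top, "bottom": top + out_h}


if __name__ == "__main__":
    crop = eye_crop(2560, 960, eye="left", out_h=720)
    # Prints: left=0 right=1280 top=120 bottom=840
    print(" ".join(f"{name}={value}" for name, value in crop.items()))
```

For the 2560x960 mode this reproduces the numbers used in the pipeline; passing eye="right" shifts the window to columns 1280 to 2560.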

This is a working pipeline, with the plugins for inference and picture saving added, that uses an mp4 video as the source:

gst-launch-1.0 filesrc location= /opt/nvidia/deepstream/deepstream-5.0/samples/streams/sample_720p.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! nvinfer config-file-path= /opt/nvidia/deepstream/deepstream-5.0/samples/configs/tlt_pretrained_models/config_infer_primary_facedetectir.txt batch-size=1 unique-id=1 ! nvvideoconvert ! dsexample full-frame=0 ! nvdsosd ! nvegltransform ! nveglglessink

But when I try to capture from the camera, the pipeline fails:

gst-launch-1.0 v4l2src device=/dev/video1 io-mode=2 blocksize=800000 ! image/jpeg, width=2560, height=960 ! nvjpegdec DeepStream=1 ! "video/x-raw, format=I420, framerate=30/1" ! nvvidconv interpolation-method=0 ! "video/x-raw, width=2560, height=960, format=NV12" ! queue ! nvvidconv left=0 right=1280 top=120 bottom=840 ! "video/x-raw(memory:NVMM), width=1280, height=720, pixel-aspect-ratio=1/1, format=(string)NV12" ! queue ! m.sink_0 nvstreammux name=m batch-size=1 batched-push-timeout=40000 width=1280 height=720 live-source=TRUE ! queue ! nvvidconv ! queue ! nvinfer config-file-path= /opt/nvidia/deepstream/deepstream-5.0/samples/configs/tlt_pretrained_models/config_infer_primary_facedetectir.txt batch-size=1 unique-id=1 ! nvvideoconvert ! dsexample full-frame=0 ! nvdsosd ! nvegltransform ! nveglglessink

Output is:

Using winsys: x11 
WARNING: [TRT]: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
0:00:04.868973002 19081   0x558dbfcd50 INFO                 nvinfer gstnvinfer.cpp:602:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1577> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-5.0/samples/models/tlt_pretrained_models/facedetectir/resnet18_facedetectir_pruned.etlt_b1_gpu0_fp16.engine
INFO: [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT input_1         3x240x384       
1   OUTPUT kFLOAT output_bbox/BiasAdd 4x15x24         
2   OUTPUT kFLOAT output_cov/Sigmoid 1x15x24         

0:00:04.869113943 19081   0x558dbfcd50 INFO                 nvinfer gstnvinfer.cpp:602:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1681> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-5.0/samples/models/tlt_pretrained_models/facedetectir/resnet18_facedetectir_pruned.etlt_b1_gpu0_fp16.engine
0:00:04.880207400 19081   0x558dbfcd50 INFO                 nvinfer gstnvinfer_impl.cpp:311:notifyLoadModelStatus:<nvinfer0> [UID 1]: Load new model:/opt/nvidia/deepstream/deepstream-5.0/samples/configs/tlt_pretrained_models/config_infer_primary_facedetectir.txt sucessfully
Pipeline is live and does not need PREROLL ...
Got context from element 'eglglessink0': gst.egl.EGLDisplay=context, display=(GstEGLDisplay)NULL;
Setting pipeline to PLAYING ...
New clock: GstSystemClock
nvbuf_utils: dmabuf_fd -1 mapped entry NOT found
nvbuf_utils: Can not get HW buffer from FD... Exiting...
Caught SIGSEGV
^CSpinning.  Please run 'gdb gst-launch-1.0 19081' to continue debugging, Ctrl-C to quit, or Ctrl-\ to dump core.
handling interrupt.
Interrupt: Stopping pipeline ...
Execution ended after 0:00:03.796977163
Setting pipeline to PAUSED ...
Setting pipeline to READY ...

Hi,
Please try this pipeline and see whether the video preview is shown:

gst-launch-1.0 v4l2src device=/dev/video1 io-mode=2 blocksize=800000 ! image/jpeg, width=2560, height=960 ! nvv4l2decoder mjpeg=1 ! nvoverlaysink

If the pipeline runs fine, please refer to

deepstream-5.0\sources\apps\sample_apps\deepstream-image-decode-test

The sample uses multifilesrc as its source. For your use case, you may change it to v4l2src and try.


Thanks for the answer.
I get unusual noise with your pipeline.

It seems the pipeline is having issues with the format.

Hi,
The JPEGs from the camera are most likely in YUV422. Please try the prebuilt library listed at
https://elinux.org/Jetson/L4T/r32.4.x_patches
under "[GSTREAMER] Prebuilt lib for decoding YUV422 MJPEG through nvv4l2decoder".


I have similar hardware and I’m using this pipeline:
"pipeline": "gst-launch-1.0 -v v4l2src device=/dev/video0 ! avdec_mjpeg ! videoconvert ! video/x-raw,format=RGB,height=480,framerate=30/1 ! appsink name = acquired_image "
Then I just use the CUDA crop function in Isaac SDK to split the feed.

I’d like to change my question to:
Why does the first pipeline run at full speed while the second shows only 1 or 2 frames?

Fast:
gst-launch-1.0 filesrc location= /opt/nvidia/deepstream/deepstream-5.0/samples/configs/tlt_pretrained_models/test.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! nvinfer config-file-path=/opt/nvidia/deepstream/deepstream-5.0/samples/configs/tlt_pretrained_models/config_infer_primary_facedetectir.txt batch-size=1 unique-id=1 ! nvtracker ll-lib-file=/opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_mot_klt.so ! nvmultistreamtiler rows=1 columns=1 width=1280 height=720 ! nvvideoconvert ! nvdsosd ! nvoverlaysink

Slow:
gst-launch-1.0 v4l2src device=/dev/video1 io-mode=2 blocksize=800000 ! image/jpeg, width=1280, height=480 ! nvv4l2decoder mjpeg=1 ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=480 ! nvinfer config-file-path=/opt/nvidia/deepstream/deepstream-5.0/samples/configs/tlt_pretrained_models/config_infer_primary_facedetectir.txt batch-size=1 unique-id=1 ! nvtracker ll-lib-file=/opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_mot_klt.so ! nvmultistreamtiler rows=1 columns=1 width=1280 height=480 ! nvvideoconvert ! nvdsosd ! nvoverlaysink

Hi,
You may try with sync=0:

... ! nvoverlaysink sync=0
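For context, sync=0 tells the sink to render each buffer as soon as it arrives instead of waiting for the buffer's timestamp on the pipeline clock. The thread does not state why that was needed for this camera, so treat the explanation as general GStreamer behaviour rather than a diagnosis. Below is a minimal Python sketch, not from the original replies, that assembles the "Slow" camera pipeline with the sink's sync property exposed as a parameter. The function name `camera_pipeline` and its defaults are hypothetical; the element names and paths are copied from the pipelines in this thread.

```python
CONFIG = ("/opt/nvidia/deepstream/deepstream-5.0/samples/configs/"
          "tlt_pretrained_models/config_infer_primary_facedetectir.txt")
TRACKER_LIB = "/opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_mot_klt.so"


def camera_pipeline(device="/dev/video1", width=1280, height=480, sync=False):
    """Build the gst-launch description of the camera pipeline from this thread.

    sync=False renders frames as they arrive (sync=0 on the sink).
    """
    elements = [
        f"v4l2src device={device} io-mode=2 blocksize=800000",
        f"image/jpeg, width={width}, height={height}",
        "nvv4l2decoder mjpeg=1",
        f"m.sink_0 nvstreammux name=m batch-size=1 width={width} height={height}",
        f"nvinfer config-file-path={CONFIG} batch-size=1 unique-id=1",
        f"nvtracker ll-lib-file={TRACKER_LIB}",
        f"nvmultistreamtiler rows=1 columns=1 width={width} height={height}",
        "nvvideoconvert",
        "nvdsosd",
        f"nvoverlaysink sync={int(sync)}",
    ]
    return " ! ".join(elements)


if __name__ == "__main__":
    # Paste the printed line into a terminal on the Jetson to try it.
    print("gst-launch-1.0 " + camera_pipeline())
```

Keeping the description in one function makes it easy to compare sync=0 and sync=1 without retyping the whole command.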

That was an unexpectedly powerful solution, thank you.

As a suggestion, I’d like the incompatibility of nvvidconv with the DeepStream plugins to be stated much more clearly for new developers. I wasted several months trying to convert video before inference, and I found only one message about this problem: https://forums.developer.nvidia.com/t/nvvidconv-vs-nvvideoconvert-ds-4-0-nano/78966

I’d like to see something like this in the DeepStream tutorial:
Do not use nvvidconv with any DeepStream plugin.
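Until the documentation says so, a rule like this can be checked mechanically before a pipeline is launched. The sketch below is only an illustration in Python, not an NVIDIA tool. The function name `deepstream_elements_mixed_with_nvvidconv` is hypothetical, and the set of DeepStream elements is limited to the ones that appear in this thread, so it is an assumption and certainly incomplete.

```python
import re

# Assumption: only the DeepStream elements mentioned in this thread are listed.
DEEPSTREAM_ELEMENTS = {
    "nvstreammux", "nvinfer", "nvtracker", "nvmultistreamtiler",
    "nvvideoconvert", "nvdsosd", "dsexample",
}


def deepstream_elements_mixed_with_nvvidconv(pipeline):
    """Return the DeepStream elements found in a pipeline that also uses nvvidconv.

    An empty list means the pipeline does not mix the two.
    """
    # Split on whitespace and the '!' link operator, then match whole tokens,
    # so 'nvvideoconvert' is never mistaken for 'nvvidconv'.
    tokens = set(re.split(r"[\s!]+", pipeline))
    if "nvvidconv" not in tokens:
        return []
    return sorted(tokens & DEEPSTREAM_ELEMENTS)


if __name__ == "__main__":
    bad = ("v4l2src ! nvjpegdec ! nvvidconv left=0 right=1280 ! "
           "m.sink_0 nvstreammux name=m ! nvinfer ! nvdsosd ! nveglglessink")
    for name in deepstream_elements_mixed_with_nvvidconv(bad):
        print(f"warning: nvvidconv is used together with {name}")
```

Run against the failing camera pipeline from the first post, this flags the combination; the mp4 pipeline, which uses only nvvideoconvert, passes.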

Is there any news about video flipping with nvvideoconvert?
The left and right cameras of my stereo camera are rotated so that they mirror each other.

For now I’m writing a Python app, and as far as I can see I’ll have to use the NvBufSurface API, as in the topic "Jetson Nano CSI Raspberry Pi Camera V2 upside down video". That is a bit frustrating when trying to build a fast prototype.