RTSP GStreamer Pipeline split into h264 and raster

Hi, I’m struggling to get this GStreamer pipeline working effectively. Any help would be greatly appreciated.

Our source is an RTSP camera streaming 4K h264 at 10fps. We want to split this stream into two: one branch will be saved as video, the other will be resized and saved as a raster for further processing. How can we optimise this and ensure the branches stay in sync, so that every source frame has both a matching h264 frame and a matching raster frame?

This is our original pipeline:

gst-launch-1.0 rtspsrc location="rtsp://192.168.1.91:554/av0_0" is-live=true protocols=tcp ! rtph264depay ! tee name=t \
  ! queue name=q1 max-size-buffers=10 max-size-bytes=0 max-size-time=0 silent=true \
  ! h264parse config-interval=-1 ! video/x-h264,alignment=au,stream-format=byte-stream \
  ! filesink location="/tmp/sourceToH26413305392323547781603.tmp" sync=false \
  t. ! queue name=q2 max-size-buffers=10 max-size-bytes=0 max-size-time=0 silent=true \
  ! h264parse ! nvv4l2decoder ! 'video/x-raw(memory:NVMM)' \
  ! nvvidconv interpolation-method=Smart ! 'video/x-raw(memory:NVMM),width=512,height=288' \
  ! nvvidconv interpolation-method=Smart ! video/x-raw,width=512,height=288 \
  ! videorate ! video/x-raw,framerate=10/1 \
  ! videobox autocrop=true ! video/x-raw,width=512,height=288 \
  ! videoconvert ! video/x-raw,format=BGR \
  ! filesink location="/tmp/sourceToRaster2123237059607176449.tmp" sync=false

I thought we could optimise by moving h264parse before the tee (so it runs once instead of once per branch) and setting sync=true, as follows. However, on some systems the pipeline stalls completely, which seems to be related to the new placement of h264parse:

gst-launch-1.0 rtspsrc location="rtsp://192.168.1.91:554/av0_0" is-live=true protocols=tcp ! rtph264depay ! h264parse config-interval=-1 ! tee name=t \
  ! queue name=q1 max-size-buffers=10 max-size-bytes=0 max-size-time=0 silent=true \
  ! video/x-h264,alignment=au,stream-format=byte-stream \
  ! filesink location="/tmp/sourceToH26413305392323547781603.tmp" sync=true \
  t. ! queue name=q2 max-size-buffers=10 max-size-bytes=0 max-size-time=0 silent=true \
  ! nvv4l2decoder ! 'video/x-raw(memory:NVMM)' \
  ! nvvidconv interpolation-method=Smart ! 'video/x-raw(memory:NVMM),width=512,height=288' \
  ! nvvidconv interpolation-method=Smart ! video/x-raw,width=512,height=288 \
  ! videorate ! video/x-raw,framerate=10/1 \
  ! videobox autocrop=true ! video/x-raw,width=512,height=288 \
  ! videoconvert ! video/x-raw,format=BGR \
  ! filesink location="/tmp/sourceToRaster2123237059607176449.tmp" sync=true

I also note that a drift develops over time between the two branches: one runs ahead of (or lags behind) the other, which makes sense given that one branch is computationally more expensive. The gap grows until the branches are eventually minutes apart. For our processing we need to match each h264 frame to its raster frame, so this out-of-sync behaviour is detrimental.

I’ve included the forced videorate in the raster branch to try to ensure we always receive 10fps, because the downstream processing can only handle 1 frame every 100ms. If more than 1 frame arrives within 100ms over an extended period, the processing pipeline eventually backs up with frames.
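As an aside, the one-frame-per-100ms budget can also be enforced on the consumer side rather than with videorate. A minimal sketch (plain Python, hypothetical names, independent of GStreamer; timestamps mirror buffer PTS in nanoseconds) that drops frames arriving closer together than a minimum interval:

```python
def throttle(frames, min_interval_ns=100_000_000):
    """Yield only frames whose timestamps are at least
    min_interval_ns apart, dropping the rest.

    `frames` is an iterable of (pts_ns, frame) pairs in
    presentation order.
    """
    last_kept = None
    for pts, frame in frames:
        if last_kept is None or pts - last_kept >= min_interval_ns:
            last_kept = pts
            yield pts, frame

# Example: a burst at 50 ms spacing is thinned back down to the
# 100 ms budget, keeping every other frame.
bursty = [(i * 50_000_000, f"frame{i}") for i in range(6)]
kept = list(throttle(bursty))
```

This keeps the decode branch simple and puts the drop policy next to the consumer that actually imposes the constraint.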

Any input on this at all would be highly appreciated.

Please provide complete information as applicable to your setup. Thanks
Hardware Platform (Jetson / GPU): Jetson Nano
DeepStream Version: N/A
JetPack Version (valid for Jetson only): 4.4 (L4T 32.4.3)
TensorRT Version: 7.1.3.0-1+cuda10.2
Issue Type (questions, new requirements, bugs): Questions

Thank you

What do you mean by “be in sync”?

If I had to display the output video from the two branches side-by-side I’d want them to stay in sync and not have one lag too far behind the other one.

How will you display them? Are you sending the video somewhere else and playing it with your own player?

So our actual use case is different from just displaying them. We pass the raster to a motion-detection algorithm, and when it detects motion we capture the corresponding frames from the h264 feed. If the two branches are out of sync, the capture misses the actual motion footage. Does this explain it better?

A live stream is quite different from a local file. If you save your raster as a local file, the timestamps may change. The key to synchronization is the timestamp: you need to design your own mechanism to manage the timestamps for your data and algorithms.
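Since both branches originate from the same depayloaded buffers, frames that belong together carry the same (or very close) PTS. One possible shape for such a mechanism, pairing raster frames with h264 frames by nearest timestamp (plain Python sketch; the tolerance value is an assumption, and both lists stand in for recorded buffer PTS values in nanoseconds):

```python
import bisect

def match_by_pts(h264_pts, raster_pts, tolerance_ns=50_000_000):
    """Pair each raster timestamp with the nearest h264 timestamp.

    Both lists are sorted PTS values in nanoseconds. Returns
    (raster_pts, h264_pts) pairs whose difference is within
    tolerance_ns; raster frames with no close match are skipped.
    """
    pairs = []
    for pts in raster_pts:
        i = bisect.bisect_left(h264_pts, pts)
        # Candidates: the neighbour on each side of the insertion point.
        best = min(
            h264_pts[max(i - 1, 0):i + 1],
            key=lambda c: abs(c - pts),
            default=None,
        )
        if best is not None and abs(best - pts) <= tolerance_ns:
            pairs.append((pts, best))
    return pairs

# 10 fps on both branches, raster timestamps skewed by 3 ms.
h264 = [i * 100_000_000 for i in range(5)]
raster = [i * 100_000_000 + 3_000_000 for i in range(5)]
pairs = match_by_pts(h264, raster)
```

Matching on PTS rather than arrival order means the pairing survives even when one branch falls minutes behind the other, as long as the frames are eventually recorded with their timestamps.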

If your key requirement is to capture the frame at the moment motion is detected, please refer to the /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-image-meta-test sample: the persons in the video are detected by the nvinfer model, and the frames containing them are saved according to the detector's output metadata.
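Outside DeepStream, the same idea can be approximated by keeping a short ring buffer of recent encoded frames keyed by PTS, and pulling the closest one out when the motion detector fires on the raster branch. A minimal sketch (plain Python, hypothetical names; the buffer depth is an assumption):

```python
from collections import OrderedDict

class FrameCache:
    """Keep the last `depth` encoded frames keyed by PTS so the
    matching frame can be retrieved when motion is detected on
    the raster branch."""

    def __init__(self, depth=32):
        self.depth = depth
        self.frames = OrderedDict()  # pts_ns -> encoded frame data

    def push(self, pts, frame):
        self.frames[pts] = frame
        while len(self.frames) > self.depth:
            self.frames.popitem(last=False)  # evict the oldest frame

    def on_motion(self, pts):
        # Return the (pts, frame) pair closest to the detection time,
        # or None if the cache is empty.
        return min(
            self.frames.items(),
            key=lambda kv: abs(kv[0] - pts),
            default=None,
        )

cache = FrameCache(depth=4)
for i in range(10):
    cache.push(i * 100_000_000, f"h264-au-{i}")
hit = cache.on_motion(820_000_000)  # motion detected near frame 8
```

The depth only needs to cover the raster branch's worst-case decode latency, which keeps memory bounded regardless of how long the pipeline runs.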