nvv4l2h264enc GStreamer pipeline delay

Hello everyone,

I am using a Jetson Orin NX 16 for traffic light detection. For this purpose I connected a RealSense camera that publishes images to a ROS 2 topic, which is then converted into an RTSP stream; here is a link in case someone is interested or needs more information on how it is done: https://github.com/45kmh/image2rtsp.git. When I run it on my PC, the delay is about 40-50 ms with this GStreamer pipeline:

appsrc name=imagesrc do-timestamp=true min-latency=0 max-latency=0 max-bytes=1000 is-live=true ! videoconvert ! video/x-raw,framerate=10/1 ! x264enc tune=zerolatency bitrate=1000 key-int-max=30 ! video/x-h264,profile=baseline ! rtph264pay name=pay0 pt=96

The problem is that I am experiencing a relatively large delay of about 90 ms on the Jetson with nvv4l2h264enc. As I am new to this technology, I assume that my Jetson GStreamer pipeline could be modified in some way to reduce the delay:

appsrc name=imagesrc do-timestamp=true min-latency=0 max-latency=0 max-bytes=1000 is-live=true ! videoconvert ! videorate ! nvvidconv ! video/x-raw(memory:NVMM), framerate=10/1 ! nvv4l2h264enc ! h264parse ! rtph264pay name=pay0 pt=96

Thanks in advance for any information!

Update:
I have also tried the default pipeline without nvv4l2h264enc, and it runs with 40-50 ms latency as well. Various configurations of the nvv4l2h264enc-related part of the second pipeline do not provide any significant latency decrease, so my assumption is that the bottleneck is in the transition from CPU to GPU memory. Is it possible to feed appsrc output directly to the GPU? I have found only one example, and it was for the JPEG format, whereas in my case I am pushing buffers with raw RGB8 images.
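In the meantime, the only cheap CPU-side change I can think of is spreading the color conversion across threads before handing buffers to nvvidconv. A sketch of what I mean (the explicit caps on appsrc are an assumption on my side, and the resolution values are placeholders for whatever the RealSense topic actually carries):

appsrc name=imagesrc do-timestamp=true min-latency=0 max-latency=0 max-bytes=1000 is-live=true ! video/x-raw,format=RGB,width=640,height=480,framerate=10/1 ! videoconvert n-threads=4 ! video/x-raw,format=I420 ! nvvidconv ! video/x-raw(memory:NVMM) ! nvv4l2h264enc ! h264parse ! rtph264pay name=pay0 pt=96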

Hi @dmalad,
Have you tried removing the videoconvert element?
So it would be:

appsrc name=imagesrc do-timestamp=true min-latency=0 max-latency=0 max-bytes=1000 is-live=true ! nvvidconv ! video/x-raw(memory:NVMM), framerate=10/1 ! nvv4l2h264enc ! h264parse ! rtph264pay name=pay0 pt=96

The videorate and videoconvert elements only process CPU buffers and are not accelerated, so they add a bit of latency.
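The encoder itself also exposes properties that are commonly used for low-latency streaming. As a sketch (the values below are only starting points, not a tested configuration; check gst-inspect-1.0 nvv4l2h264enc on your L4T version for the exact names and ranges):

appsrc name=imagesrc do-timestamp=true min-latency=0 max-latency=0 max-bytes=1000 is-live=true ! nvvidconv ! video/x-raw(memory:NVMM), framerate=10/1 ! nvv4l2h264enc maxperf-enable=1 insert-sps-pps=true iframeinterval=30 control-rate=1 bitrate=1000000 ! h264parse ! rtph264pay name=pay0 pt=96

Note that nvv4l2h264enc takes its bitrate in bits per second, unlike x264enc, which takes kbit/s.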
You can also add queues to try to get simultaneous processing, like:

appsrc name=imagesrc do-timestamp=true min-latency=0 max-latency=0 max-bytes=1000 is-live=true ! nvvidconv ! video/x-raw(memory:NVMM), framerate=10/1 ! queue ! nvv4l2h264enc ! h264parse ! rtph264pay name=pay0 pt=96
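If dropping an occasional frame is acceptable in your application, the queue can also be made leaky so it never builds up a backlog in front of the encoder (leaky=downstream discards the oldest buffers when the queue is full):

appsrc name=imagesrc do-timestamp=true min-latency=0 max-latency=0 max-bytes=1000 is-live=true ! nvvidconv ! video/x-raw(memory:NVMM), framerate=10/1 ! queue max-size-buffers=1 leaky=downstream ! nvv4l2h264enc ! h264parse ! rtph264pay name=pay0 pt=96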

Regards,
Andres
Embedded SW Engineer at RidgeRun
Contact us: support@ridgerun.com
Developers wiki: https://developer.ridgerun.com/
Website: www.ridgerun.com

Hello, @andres.artavia,

First of all, many thanks for your help! I tried to get rid of the videoconvert element, but in this case, when I try to access the stream, the following error appears:

[rtsp @ 0x56341c37e900] method DESCRIBE failed: 503 Service Unavailable
[ WARN:0] global ./modules/videoio/src/cap_gstreamer.cpp (2075) handleMessage OpenCV | GStreamer warning: Embedded video playback halted; module source reported: Unhandled error
[ WARN:0] global ./modules/videoio/src/cap_gstreamer.cpp (1053) open OpenCV | GStreamer warning: unable to start pipeline
[ WARN:0] global ./modules/videoio/src/cap_gstreamer.cpp (616) isPipelinePlaying OpenCV | GStreamer warning: GStreamer: pipeline have not been created
[ERROR:0] global ./modules/videoio/src/cap.cpp (164) open VIDEOIO(CV_IMAGES): raised OpenCV exception:

OpenCV(4.5.4) ./modules/videoio/src/cap_images.cpp:253: error: (-5:Bad argument) CAP_IMAGES: can't find starting number (in the name of file): rtsp://192.168.73.161:8554/rs in function 'icvExtractPattern'

It seems to me that nvvidconv is not able to convert the ROS 2 sensor messages directly, so it is more of a ROS-related problem. Probably I need to convert my images into JPEG format first…
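If I read the nvvidconv capabilities correctly, it accepts formats like RGBA and BGRx but not packed 24-bit RGB, which would explain why removing videoconvert breaks caps negotiation. So another variant I may try is keeping videoconvert but limiting it to a cheap RGB-to-RGBA step (again, the resolution values are placeholders):

appsrc name=imagesrc do-timestamp=true min-latency=0 max-latency=0 max-bytes=1000 is-live=true ! video/x-raw,format=RGB,width=640,height=480,framerate=10/1 ! videoconvert ! video/x-raw,format=RGBA ! nvvidconv ! video/x-raw(memory:NVMM),format=NV12 ! nvv4l2h264enc ! h264parse ! rtph264pay name=pay0 pt=96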

Hi,
Yeah, for default ROS you need host buffers. I believe you can play around with NITROS to share hardware buffers, but I personally don't have experience with it. I have shared images using this sample code in case it helps:

Regards,
Andres


Thanks! Now the problem is clearer to me. It doesn't seem like something I can solve quickly, because it heavily depends on the hardware architecture I am currently using. The source of the image topics is the RealSense camera ROS 2 wrapper. Applying NITROS to the wrapper or writing a new node would be very time-consuming, so I will leave it for later and use the CPU pipeline with its ~40 ms latency.