The first two suggestions are easy for me to understand, but the last one is a little confusing. Where should I insert this queue? Also, will the second and third suggestions reduce the video quality?
The first one does not affect video quality; the second and third might. For the third one, you can insert queues between any two elements in the pipeline. It is recommended to put them before and after computation-intensive elements that run on the CPU. 'rtph264depay', 'h264parse', and 'videoconvert' would be examples in your current pipeline.
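As a concrete illustration of that placement (a hedged sketch, not your exact pipeline: the URL is a placeholder and a software decoder is used for simplicity, so adapt the elements to your setup):

```shell
# Queues before/after the CPU-intensive elements let each pipeline
# segment run in its own thread; rtsp://<url> is a placeholder.
gst-launch-1.0 rtspsrc location=rtsp://<url> ! \
  queue ! rtph264depay ! queue ! h264parse ! avdec_h264 ! \
  queue ! videoconvert ! video/x-raw,format=BGR ! fakesink
```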
It might work if you have a GStreamer pipeline feeding a v4l2loopback node with BGRx, and then use the V4L2 API from OpenCV to read it in BGRx, but I haven't tried that (I'm away from any Jetson now). I'm not sure it would help with latency, though.
Have you tried decreasing the rtspsrc latency property?
You may also try setting a framerate.
In case it's not done yet, be sure to enable all cores with nvpmodel MAXN (m0) and boost the clocks with jetson_clocks.
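Concretely, that would be (standard Jetson commands; mode 0 is MAXN on most models, but check the output of the query to confirm for yours):

```shell
# Select the MAXN power model (mode 0) so all CPU cores are enabled,
# then lock clocks at their maximum. Both need root and a Jetson.
sudo nvpmodel -m 0
sudo jetson_clocks

# Verify the active power mode
sudo nvpmodel -q
```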
You could also develop a C++/Python application with an appsink, manage the buffers yourself, and launch the pipeline outside of OpenCV. Again, without using OpenCV's GStreamer capture.
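A minimal Python sketch of that idea, assuming a Jetson with GStreamer's Python bindings (python3-gi) installed; the RTSP URL is a placeholder and the processing step is left to you:

```python
#!/usr/bin/env python3
# Hypothetical sketch: pull decoded frames through an appsink callback,
# bypassing cv2.VideoCapture entirely. rtsp://<url> is a placeholder.
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst, GLib

Gst.init(None)

pipeline = Gst.parse_launch(
    'rtspsrc location=rtsp://<url> latency=0 ! rtph264depay ! h264parse ! '
    'nvv4l2decoder ! nvvidconv ! video/x-raw,format=BGRx ! '
    'appsink name=sink emit-signals=true max-buffers=1 drop=true sync=false')

def on_new_sample(appsink):
    # Called by the streaming thread for every decoded frame
    sample = appsink.emit('pull-sample')
    buf = sample.get_buffer()
    ok, info = buf.map(Gst.MapFlags.READ)
    if ok:
        # info.data holds one raw BGRx frame; hand it to your processing
        buf.unmap(info)
    return Gst.FlowReturn.OK

pipeline.get_by_name('sink').connect('new-sample', on_new_sample)
pipeline.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()
```

With max-buffers=1 and drop=true, the appsink always hands you the freshest frame instead of queuing stale ones, which is usually what you want for low latency.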
Is there a public repository with GstOpenCV? I failed to find more than the pdf you've sent. I've also seen some repos with a similar name, but they look many years old.
I'd say the main point is the final processing requirement. If it requires BGR (or RGB) processing, as most OpenCV algorithms expect, then you have to make the conversion, and I think it is better to do it with videoconvert in GStreamer (it may execute on a different core when preceded by a queue) than in OpenCV. AFAIK, there is no YUV (neither I420 nor NV12) to BGR conversion available with CUDA in OpenCV, and the CPU cv::cvtColor would just be a bit slower than GStreamer's videoconvert. Grabbing YUV frames in OpenCV would be efficient for YUV processing only.
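A sketch of what that looks like from OpenCV, assuming a Jetson build of OpenCV with GStreamer support; the URL is a placeholder, and the queue before videoconvert is the point being made above:

```python
import cv2

# Hypothetical pipeline: hardware decode, nvvidconv to BGRx, then
# videoconvert (behind a queue, so it can run on another core) for the
# final BGRx -> BGR step OpenCV expects. rtsp://<url> is a placeholder.
gst = ('rtspsrc location=rtsp://<url> latency=0 ! rtph264depay ! '
       'h264parse ! nvv4l2decoder ! nvvidconv ! '
       'video/x-raw,format=BGRx ! queue ! videoconvert ! '
       'video/x-raw,format=BGR ! appsink drop=true max-buffers=1')

cap = cv2.VideoCapture(gst, cv2.CAP_GSTREAMER)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # frame is already BGR, ready for the usual cv2 algorithms
cap.release()
```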
There is also some code published by @dusty_nv that does the conversion from YUV into BGR with CUDA, but it isn't straightforward for beginners.
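For reference, the per-pixel math that such a YUV-to-BGR conversion implements (BT.601 video range, the variant most decoders output) can be sketched in plain Python. This is just an illustration of the formula, not @dusty_nv's CUDA code:

```python
def yuv_to_bgr(y, u, v):
    """Convert one BT.601 video-range YUV pixel to a BGR tuple (0-255)."""
    c, d, e = y - 16, u - 128, v - 128
    clamp = lambda x: max(0, min(255, int(round(x))))
    r = clamp(1.164 * c + 1.596 * e)
    g = clamp(1.164 * c - 0.392 * d - 0.813 * e)
    b = clamp(1.164 * c + 2.017 * d)
    return (b, g, r)

# Video-range black (Y=16) and white (Y=235) map to full-range 0 and 255
print(yuv_to_bgr(16, 128, 128))   # (0, 0, 0)
print(yuv_to_bgr(235, 128, 128))  # (255, 255, 255)
```

The CUDA version applies exactly this arithmetic to every pixel in parallel, reading Y from the luma plane and U/V from the subsampled chroma planes of an I420 or NV12 frame.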
Sadly, RidgeRun's GstOpenCV is not open source; you can contact us if you are interested in a license. However, implementing your own element is not that difficult. There is already a GStreamer base class that we use for our plugin:
For some unknown reason, it takes longer to decode with the hardware decoder than with a single-threaded software decoder, regardless of the flags we've thrown at it, but YMMV.
@kelsius @nvidias @DaneLLL
Your method is working very well! I can pull the RTSP stream with almost no latency!
But sadly it doesn't use the NVDEC hardware. When I use NVDEC hardware decoding, no matter what params I set, I get big latency!
So, what is the problem? Can anyone from NVIDIA explain it? Please!