Video decoder frames latency between first frame inserted and first frame extracted

GiorgioPalladini · December 19, 2016, 4:06pm

Hello everybody!

I’m new to this forum, so I hope I’ve posted my questions in the right place (if not, please let me know which section is the correct one).

I use HW decoders to decode both H264 and MJPEG streams coming from video surveillance cameras.
My example “acquisition pipe” works the same way as the “cudaDecodeD3D9” example that ships with the SDK (typically in C:\ProgramData\NVIDIA Corporation\CUDA Samples\v8.0\3_Imaging\cudaDecodeD3D9), except for the VideoSource object. So we use the VideoParser and the VideoDecoder as the sample does.

Everything works perfectly with H264 intra-only and non-intra streams (GOP > 1), and with MJPEG streams.

The only thing I’ve noticed is that it takes several frames before I receive the first HandlePictureDisplay callback, even if my stream’s GOP is equal to 1 (MJPEG or H264 intra). The first frame passed to the parser and to the decoder is always a key frame (I-frame), even if GOP > 1.

For the first frame passed to the parser and to the decoder I set the packet flags (CUvideopacketflags) to CUVID_PKT_TIMESTAMP | CUVID_PKT_DISCONTINUITY, for the following frames to CUVID_PKT_TIMESTAMP, and for the last frame to CUVID_PKT_TIMESTAMP | CUVID_PKT_ENDOFSTREAM. In this scenario I was not able to get the first HandlePictureDisplay callback before submitting further frames to the decoder.
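The flag scheme described above can be sketched as a small helper. This is only an illustration: `PacketFlags` and `flagsForPacket` are hypothetical names, and the numeric values are stand-ins assumed to mirror `CUvideopacketflags` in nvcuvid.h, so real code should include that header and use its enum instead.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in values, assumed to mirror CUvideopacketflags in nvcuvid.h.
// In a real build, include nvcuvid.h and use CUVID_PKT_* directly.
enum PacketFlags : unsigned long {
    PKT_ENDOFSTREAM   = 0x01,
    PKT_TIMESTAMP     = 0x02,
    PKT_DISCONTINUITY = 0x04,
};

// Hypothetical helper: flags for packet number `index` in a stream of
// `count` packets, following the scheme described in the post.
unsigned long flagsForPacket(std::size_t index, std::size_t count) {
    assert(count > 0 && index < count);
    unsigned long flags = PKT_TIMESTAMP;      // every packet carries a timestamp
    if (index == 0) {
        flags |= PKT_DISCONTINUITY;           // first packet starts a new sequence
    }
    if (index + 1 == count) {
        flags |= PKT_ENDOFSTREAM;             // last packet asks the parser to flush
    }
    return flags;
}
```

For a three-packet stream this yields TIMESTAMP | DISCONTINUITY, then TIMESTAMP, then TIMESTAMP | ENDOFSTREAM. A single-packet stream gets all three flags, which corresponds to the ENDOFSTREAM-on-first-frame experiment.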

I also tried setting the CUVID_PKT_ENDOFSTREAM flag on the first frame. In that case the first frame comes out before I submit the next one, but it only works in some circumstances (sorry, I was not able to identify them).

So, I assumed that an entire GOP has to be decoded before receiving the first HandlePictureDisplay.

Is it correct to assume that?
Do all NVIDIA HW decoders behave this way?
Is there any way to make the first frame come out (HandlePictureDisplay) before submitting the second frame?

Thank you in advance, I hope someone can enlighten me on this subject.
Regards!

Giorgio

RodeBiet · January 24, 2017, 7:53pm

I have the same “problem”. I have live streams as well. I have to put in 5 frames before I get the first out. Is there a way to lower this latency?

I actually use ffmpeg for decoding; it has NVDECODE integrated. If I use the ffmpeg software H.264 decoder, it returns the first frame right after I pass in the first frame. If I use h264_cuvid, I have to pass 4 extra frames before it returns the first frame.

For testing, I added a one-second sleep between passing a frame (avcodec_send_packet(…)) and getting a decoded frame (frameFinished = avcodec_receive_frame(…)). The results are the same.

Thanks in advance.

philipl · January 24, 2017, 11:50pm

Internally, the hardware is pipelined, and yields maximum performance when handling multiple frames at different stages of the decoding process. I expect that the hardware requires the pipeline to be filled before it will start returning anything, and that this is independent of GOP or I-frames.

I would expect that setting ENDOFSTREAM would force it to decode the single frame (I don’t know what problems you saw, but they should be separate and solvable), but I’d also expect the performance to be bad - worse than realtime bad? I don’t know, but noticeable, to be sure.
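The pipelining hypothesis above can be illustrated with a toy model. This is not how NVDEC is implemented; `ToyPipelinedDecoder` and `framesUntilFirstOutput` are hypothetical names, and the fixed depth is an assumption chosen to match the "4 extra frames" reported earlier in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Toy model only: a decoder that keeps `depth` frames in flight and
// releases the oldest one only when a further frame is submitted.
class ToyPipelinedDecoder {
public:
    explicit ToyPipelinedDecoder(std::size_t depth) : depth_(depth) {}

    // Submit one frame id. Returns true and fills `out` once more than
    // `depth_` frames are queued; otherwise the frame is just buffered.
    bool submit(int frameId, int& out) {
        inFlight_.push_back(frameId);
        if (inFlight_.size() <= depth_) return false;  // pipeline not full yet
        out = popOldest();
        return true;
    }

    // Drain one buffered frame, as an end-of-stream flush would.
    bool flush(int& out) {
        if (inFlight_.empty()) return false;
        out = popOldest();
        return true;
    }

private:
    int popOldest() {
        assert(!inFlight_.empty());
        int id = inFlight_.front();
        inFlight_.pop_front();
        return id;
    }

    std::size_t depth_;
    std::deque<int> inFlight_;
};

// Number of frames that must be submitted before the first one comes out.
std::size_t framesUntilFirstOutput(std::size_t depth) {
    ToyPipelinedDecoder decoder(depth);
    int out = -1;
    std::size_t submitted = 0;
    while (!decoder.submit(static_cast<int>(submitted), out)) {
        ++submitted;
    }
    return submitted + 1;
}
```

With a depth of 4, `framesUntilFirstOutput(4)` is 5, which matches the "5 frames in before the first one out" observation. In the same model, `flush` returns the first frame immediately, which is the effect one would expect from an end-of-stream signal.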

RodeBiet · January 25, 2017, 4:17pm

Is there somebody from NVIDIA who can give a definitive answer on this? It would be much appreciated. Thanks in advance.

tarichuo · June 3, 2019, 9:35am

I have the same problem with SDK-9.0.22: the first 5 frames are buffered. Is there somebody that can explain this?

Thanks in advance.

mandar_godse · June 14, 2019, 6:00am

Hello.

We suggest the following steps to reduce latency during decoding:

1. If you know a priori (and are sure of it) that your application supplies exactly one frame of data per packet, set CUvideopacketflags::CUVID_PKT_ENDOFPICTURE on the packet passed to cuvidParseVideoData(). This flag signals the underlying driver to start decoding immediately.
2. Try setting ulMaxDisplayDelay = 0.
3. For the streams you are evaluating, is “num_reorder_frames” in the VUI set to zero? Syntax elements such as this one can force the parser to introduce latency.

We hope that #1 and #2 will solve the problem. If they don’t, please share the bitstream with us. For H264, will you be using an I-frame-only stream?
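Suggestions #1 and #2 can be sketched roughly as follows. The structs here are simplified stand-ins, assumed to mirror the relevant fields of CUVIDSOURCEDATAPACKET and CUVIDPARSERPARAMS in nvcuvid.h. The helper names are hypothetical, and the flag values should be checked against the SDK header. Whether these settings actually remove the latency depends on the stream and the driver.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in flag values, assumed to mirror CUvideopacketflags in nvcuvid.h.
enum : unsigned long {
    PKT_TIMESTAMP    = 0x02,
    PKT_ENDOFPICTURE = 0x08,
};

// Simplified stand-in for CUVIDSOURCEDATAPACKET (assumed field layout).
struct SourceDataPacket {
    unsigned long flags;
    unsigned long payload_size;
    const unsigned char* payload;
    long long timestamp;
};

// Simplified stand-in for the one CUVIDPARSERPARAMS field used here.
struct ParserParams {
    unsigned int ulMaxDisplayDelay;
};

// Suggestion #2: ask the parser not to hold frames back for display.
void requestZeroDisplayDelay(ParserParams& params) {
    params.ulMaxDisplayDelay = 0;
}

// Suggestion #1: the buffer is known to hold exactly one complete frame,
// so mark the packet as ending a picture.
SourceDataPacket makeWholeFramePacket(const unsigned char* data,
                                      std::size_t size,
                                      long long pts) {
    assert(data != nullptr && size > 0);
    SourceDataPacket pkt{};
    pkt.flags = PKT_TIMESTAMP | PKT_ENDOFPICTURE;
    pkt.payload_size = static_cast<unsigned long>(size);
    pkt.payload = data;
    pkt.timestamp = pts;
    return pkt;
}
```

In real code the parameters would be set before cuvidCreateVideoParser() and the packet would be passed to cuvidParseVideoData(). The ENDOFPICTURE flag is only safe when each buffer really holds one complete frame, as stated in #1.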

Thanks.

tarichuo · June 15, 2019, 10:33am

Hello Mandar!

I tried both setting CUvideopacketflags::CUVID_PKT_ENDOFPICTURE in cuvidParseVideoData() and setting ulMaxDisplayDelay to 0, but there is still a 5-frame latency, as before.
I’m not familiar with “num_reorder_frames”, but there is no latency with CPU decoding.

By the way, if I set CUvideopacketflags::CUVID_PKT_ENDOFSTREAM in cuvidParseVideoData(), the decoder outputs the frame immediately. But it only works with I-frames.

mandar_godse · June 20, 2019, 11:44am

Hi.
Can you share the failing stream with us so we can analyze it? Also, please help me understand what the use case is.

Thanks.

tarichuo · June 21, 2019, 2:46am

Hi mandar,

Sorry for the late reply. Actually, I’m working with a live RTP stream from an IP camera. I recorded a video from it, and the URL is below.

External Media

Thanks.

trild-vietnam · September 28, 2021, 12:04am

Hi mandar_godse. Do you have any update on this topic? I have a single-frame H264 stream and I decode this one frame. I used the settings you suggested, but I still need to push a second packet (even NULL data with size 0) to get the first decoded frame.
