Low latency decoding with NVCUVID

I am using an NVIDIA GTX 970 video card on 64-bit Ubuntu 14.04.

I am trying to do low-latency decoding with NVCUVID. I would like to achieve zero-frame latency, but the parser currently seems to always introduce one frame of latency.

I am using the NVCUVID library to decode a raw H264 stream made of only IDR frames (produced by an encoder configured for zero latency). To decode, I feed the parser (configured with ulMaxDisplayDelay=0 and ulErrorThreshold=100) several NAL units that together cover one entire video frame.
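To make "several NAL units that cover an entire video frame" concrete, here is a minimal sketch in C that walks such a buffer and counts the NAL units it contains. It assumes the raw stream is in H.264 Annex B byte-stream format (NAL units separated by 00 00 01 or 00 00 00 01 start codes), which the post does not state explicitly. The helper names next_nal_payload and count_nal_units are hypothetical and are not part of NVCUVID. A byte-stream parser generally cannot tell that a frame is complete until it sees the start of the next one, which may be why the callback only fires on the following call, but that is a guess rather than something confirmed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Find the next Annex B start code (00 00 01) at or after `pos`.
 * A 4-byte start code (00 00 00 01) is matched through its last 3 bytes.
 * Returns the index of the first byte after the start code,
 * or `len` if no start code is found. */
static size_t next_nal_payload(const uint8_t *buf, size_t len, size_t pos)
{
    for (size_t i = pos; i + 3 <= len; ++i) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
            return i + 3;
    }
    return len;
}

/* Count the NAL units in `buf` and report how many of them are
 * IDR slices (nal_unit_type == 5) through `idr_slices`. */
static size_t count_nal_units(const uint8_t *buf, size_t len, size_t *idr_slices)
{
    size_t count = 0;
    size_t pos = next_nal_payload(buf, len, 0);

    *idr_slices = 0;
    while (pos < len) {
        /* The low 5 bits of the NAL header byte hold nal_unit_type. */
        uint8_t nal_unit_type = buf[pos] & 0x1F;
        if (nal_unit_type == 5)
            ++*idr_slices;
        ++count;
        pos = next_nal_payload(buf, len, pos);
    }
    return count;
}
```

Running count_nal_units over the buffer handed to cuvidParseVideoData() is a quick way to check that each packet really holds one complete frame (for example SPS, PPS and the IDR slices) and nothing from the next one.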

When calling cuvidParseVideoData() the first time (giving it a CUVIDSOURCEDATAPACKET pointing to a buffer with the NAL units of the first IDR frame), the callback pfnDecodePicture() is not called. I need to either feed the parser another frame, or mark the buffer as “end of stream”. I would like the first call to cuvidParseVideoData() to call pfnDecodePicture() instead of waiting for the next call.

Is there a way to prevent the CUVID H264 parser from buffering one frame before transmitting the picture data to the decode callback?

Thank you!

(While always marking the source data packet as “end of stream” should work for streams with only IDR frames, I intend to use P frames later on, so that is not a long-term solution.)

You’ll want to be careful about not buffering decoded frames, because that buffering is used to improve device performance. That being said, I’m also interested in reducing the latency of decoding video. Have you gotten any leads?

I am curious: What kind of use cases require this kind of low latency decoding, and what are latency requirements in actual units of time, i.e. number of milliseconds?
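For anyone wanting to put the "one frame" figure into milliseconds, the conversion is simply frames × 1000 / frame rate. A minimal sketch in C follows; the function name frames_to_ms is hypothetical, and the frame rate is whatever the stream actually uses (the thread does not say).

```c
/* Convert a latency expressed in frames into milliseconds
 * for a stream running at `fps` frames per second. */
static double frames_to_ms(double frames, double fps)
{
    return frames * 1000.0 / fps;
}
```

As an illustration, one frame of parser latency on a 25 fps stream is frames_to_ms(1.0, 25.0), i.e. 40 ms; on a 50 fps stream it is 20 ms.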

I am using a GTX 860M on Windows 8.1 and have the same issue with NVCUVID.
The CUvideoparser introduces one frame of latency.
Does anyone know how to achieve zero-frame latency with NVCUVID?