NVDEC performance

I’m using NVDEC to decode a single HEVC stream, where each packet contains the data for exactly one frame. When I send frames to the decoder back to back, processing takes 10 ms per frame on average. When I insert a delay of ~7 ms between frames, the processing time for a single frame rises to 30 ms. What is going on? It looks as if CUDA goes into a “sleep mode” between frames.
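To make the two measurements comparable, here is a minimal sketch of the kind of timing loop described above. It is not the actual code: `average_frame_ms` and the `decode_one` callback are hypothetical names, and `decode_one` is assumed to submit one packet and block until that frame has been decoded. Only the decode call is timed, so the inter-frame delay itself is excluded from the average.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical harness (not part of NVDEC or the Video Codec SDK).
// `decode_one` is an assumed callback that submits one packet to the
// decoder and returns once the corresponding frame is ready.
double average_frame_ms(const std::function<void()>& decode_one,
                        int frame_count,
                        std::chrono::milliseconds gap) {
    assert(frame_count > 0);
    using clock = std::chrono::steady_clock;
    std::chrono::duration<double, std::milli> total{0.0};

    for (int i = 0; i < frame_count; ++i) {
        const auto start = clock::now();
        decode_one();
        total += clock::now() - start;  // time only the decode call

        if (gap.count() > 0) {
            // Idle period between frames; not included in `total`.
            std::this_thread::sleep_for(gap);
        }
    }
    return total.count() / frame_count;  // average milliseconds per frame
}
```

Calling it once with a gap of `std::chrono::milliseconds(0)` and once with `std::chrono::milliseconds(7)` on the same stream would reproduce the two cases: roughly 10 ms per frame in the first and 30 ms in the second.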