cudaDecodeGL playback is jerky and choppy: it pauses briefly every few frames

When running the sample 3_Imaging/cudaDecodeGL, the output pauses every few frames for about half a second. Any idea what is going on? I’ve tried running it on both a Tesla K40c and an NVS 810 with the same result. I’m using Ubuntu 16.04, the nvidia-378.13 driver, and CUDA 8.0.34-1.

The output with the -displayvideo option looks like this:

[CUDA/OpenGL Video Decode]
Command Line Arguments:
argv[0] = ./cudaDecodeGL
argv[1] = -displayvideo
[cudaDecodeGL]: input file: <../../../../3_Imaging/cudaDecodeGL/data/plush1_720p_10s.m2v>
	VideoCodec      : MPEG-2
	Frame rate      : 30000/1001fps ~ 29.97fps
	Sequence format : Progressive
	Coded frame size: [1280, 720]
	Display area    : [0, 0, 1280, 720]
	Chroma format   : 4:2:0
	Bitrate         : 14116kBit/s
	Aspect ratio    : 16:9


argv[0] = ./cudaDecodeGL
argv[1] = -displayvideo

> Device 0: <      Tesla K40c >, Compute SM 3.5 detected
reshape() glViewport(0, 0, 1280, 720)
>> initGL() creating window [1280 x 720]
> Using CUDA/GL Device [1]: NVS 810
> Using GPU Device: NVS 810 has SM 5.0 compute capability
  Total amount of global memory:     2046.5000 MB
>> modInitCTX<NV12ToARGB_drvapi64.ptx > initialized OK
>> modGetCudaFunction< CUDA file:              NV12ToARGB_drvapi64.ptx >
   CUDA Kernel Function (0x01a00680) = <   NV12ToARGB_drvapi >
>> modGetCudaFunction< CUDA file:              NV12ToARGB_drvapi64.ptx >
   CUDA Kernel Function (0x01a09d00) = <     Passthru_drvapi >
  Free memory:     1624.2148 MB
> VideoDecoder::cudaVideoCreateFlags = <1>Use CUDA decoder

setTextureFilterMode(GL_NEAREST,GL_NEAREST)
ImageGL::CUcontext = 0151cd10
ImageGL::CUdevice  = 00000001
reshape() glViewport(0, 0, 1280, 720)
[cudaDecodeGL] - [Frame: 0016, 00.0 fps, frame time: 93034766336.00 (ms) ]
[cudaDecodeGL] - [Frame: 0032, 16.1 fps, frame time: 62.11 (ms) ]
[cudaDecodeGL] - [Frame: 0048, 15.5 fps, frame time: 64.37 (ms) ]
[cudaDecodeGL] - [Frame: 0064, 15.3 fps, frame time: 65.50 (ms) ]
[cudaDecodeGL] - [Frame: 0080, 16.1 fps, frame time: 62.11 (ms) ]
[cudaDecodeGL] - [Frame: 0096, 15.6 fps, frame time: 63.96 (ms) ]
[cudaDecodeGL] - [Frame: 0112, 57.6 fps, frame time: 17.35 (ms) ]
[cudaDecodeGL] - [Frame: 0128, 15.3 fps, frame time: 65.51 (ms) ]
[cudaDecodeGL] - [Frame: 0144, 15.0 fps, frame time: 66.63 (ms) ]
[cudaDecodeGL] - [Frame: 0160, 15.3 fps, frame time: 65.50 (ms) ]
[cudaDecodeGL] - [Frame: 0176, 15.8 fps, frame time: 63.25 (ms) ]
[cudaDecodeGL] - [Frame: 0192, 16.3 fps, frame time: 61.23 (ms) ]
[cudaDecodeGL] - [Frame: 0208, 56.1 fps, frame time: 17.82 (ms) ]
[cudaDecodeGL] - [Frame: 0224, 16.4 fps, frame time: 60.98 (ms) ]
[cudaDecodeGL] - [Frame: 0240, 16.4 fps, frame time: 60.99 (ms) ]
[cudaDecodeGL] - [Frame: 0256, 16.4 fps, frame time: 60.99 (ms) ]
[cudaDecodeGL] - [Frame: 0272, 16.4 fps, frame time: 60.99 (ms) ]
[cudaDecodeGL] - [Frame: 0288, 16.7 fps, frame time: 59.86 (ms) ]
[cudaDecodeGL] - [Frame: 0304, 55.4 fps, frame time: 18.06 (ms) ]
[cudaDecodeGL] - [Frame: 0320, 16.4 fps, frame time: 60.98 (ms) ]

[cudaDecodeGL] statistics
	 Video Length (hh:mm:ss.msec)   = 00:00:18.067
	 Frames Presented (inc repeats) = 329
	 Average Present Rate     (fps) = 18.21
	 Frames Decoded   (hardware)    = 329
	 Average Rate of Decoding (fps) = 18.21

And without the -displayvideo flag:

[CUDA/OpenGL Video Decode]
Command Line Arguments:
argv[0] = ./cudaDecodeGL
[cudaDecodeGL]: input file: <../../../../3_Imaging/cudaDecodeGL/data/plush1_720p_10s.m2v>
	VideoCodec      : MPEG-2
	Frame rate      : 30000/1001fps ~ 29.97fps
	Sequence format : Progressive
	Coded frame size: [1280, 720]
	Display area    : [0, 0, 1280, 720]
	Chroma format   : 4:2:0
	Bitrate         : 14116kBit/s
	Aspect ratio    : 16:9


argv[0] = ./cudaDecodeGL

> Device 0: <      Tesla K40c >, Compute SM 3.5 detected
reshape() glViewport(0, 0, 1280, 720)
>> initGL() creating window [1280 x 720]
> Using CUDA/GL Device [1]: NVS 810
> Using GPU Device: NVS 810 has SM 5.0 compute capability
  Total amount of global memory:     2046.5000 MB
>> modInitCTX<NV12ToARGB_drvapi64.ptx > initialized OK
>> modGetCudaFunction< CUDA file:              NV12ToARGB_drvapi64.ptx >
   CUDA Kernel Function (0x0253d660) = <   NV12ToARGB_drvapi >
>> modGetCudaFunction< CUDA file:              NV12ToARGB_drvapi64.ptx >
   CUDA Kernel Function (0x02546ce0) = <     Passthru_drvapi >
  Free memory:     1624.2148 MB
> VideoDecoder::cudaVideoCreateFlags = <1>Use CUDA decoder

setTextureFilterMode(GL_NEAREST,GL_NEAREST)
ImageGL::CUcontext = 02059d10
ImageGL::CUdevice  = 00000001
reshape() glViewport(0, 0, 1280, 720)
[cudaDecodeGL] - [Frame: 0016, 00.0 fps, frame time: 93034774528.00 (ms) ]
[cudaDecodeGL] - [Frame: 0032, 333.4 fps, frame time: 3.00 (ms) ]
[cudaDecodeGL] - [Frame: 0048, 393.5 fps, frame time: 2.54 (ms) ]
[cudaDecodeGL] - [Frame: 0064, 16.0 fps, frame time: 62.37 (ms) ]
[cudaDecodeGL] - [Frame: 0080, 456.6 fps, frame time: 2.19 (ms) ]
[cudaDecodeGL] - [Frame: 0096, 598.3 fps, frame time: 1.67 (ms) ]
[cudaDecodeGL] - [Frame: 0112, 363.3 fps, frame time: 2.75 (ms) ]
[cudaDecodeGL] - [Frame: 0128, 355.8 fps, frame time: 2.81 (ms) ]
[cudaDecodeGL] - [Frame: 0144, 15.9 fps, frame time: 62.73 (ms) ]
[cudaDecodeGL] - [Frame: 0160, 530.3 fps, frame time: 1.89 (ms) ]
[cudaDecodeGL] - [Frame: 0176, 376.9 fps, frame time: 2.65 (ms) ]
[cudaDecodeGL] - [Frame: 0192, 388.8 fps, frame time: 2.57 (ms) ]
[cudaDecodeGL] - [Frame: 0208, 511.7 fps, frame time: 1.95 (ms) ]
[cudaDecodeGL] - [Frame: 0224, 865.2 fps, frame time: 1.16 (ms) ]
[cudaDecodeGL] - [Frame: 0240, 15.5 fps, frame time: 64.34 (ms) ]
[cudaDecodeGL] - [Frame: 0256, 402.8 fps, frame time: 2.48 (ms) ]
[cudaDecodeGL] - [Frame: 0272, 412.8 fps, frame time: 2.42 (ms) ]
[cudaDecodeGL] - [Frame: 0288, 403.8 fps, frame time: 2.48 (ms) ]
[cudaDecodeGL] - [Frame: 0304, 755.1 fps, frame time: 1.32 (ms) ]
[cudaDecodeGL] - [Frame: 0320, 600.6 fps, frame time: 1.67 (ms) ]

[cudaDecodeGL] statistics
	 Video Length (hh:mm:ss.msec)   = 00:00:04.668
	 Frames Presented (inc repeats) = 329
	 Average Present Rate     (fps) = 70.47
	 Frames Decoded   (hardware)    = 329
	 Average Rate of Decoding (fps) = 70.47

Thanks!

The problem appears to be this section of code in FrameQueue.cpp:

// Spins until frame becomes available or decoding
// gets canceled.
// If the requested frame is available the method returns true.
// If decoding was interrupted before the requested frame becomes
// available, the method returns false.
bool
FrameQueue::waitUntilFrameAvailable(int nPictureIndex)
{
    while (isInUse(nPictureIndex))
    {
        sleep(1);   // Decoder is getting too far ahead from display

        if (0 != bEndOfDecode_)
        {
            return false;
        }
    }

    return true;
}

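The “sleep(1)” is causing the pause. On Linux, sleep() takes its argument in seconds, so every time the decoder gets ahead of the display it stalls for a full second. That fits the log above: if the reported frame time is an average over each 16-frame reporting interval, about 62 ms per frame works out to roughly one second per interval. There has got to be a better way of handling this than just sleeping for 1 second. Can an interrupt from the decoder be used? Has anyone done this before?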
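
One possible direction, as a rough sketch rather than a patch for the sample: have the waiting thread block on a condition variable and let whichever code frees the frame slot wake it up. Everything below is hypothetical and standalone. FrameSlots, markInUse(), release(), endDecode() and demoRelease() are names made up for illustration and are not the sample's actual FrameQueue members; only the name and return convention of waitUntilFrameAvailable() are taken from the code above. A smaller change would be to keep the polling loop but sleep for about a millisecond (for example with usleep(1000) or std::this_thread::sleep_for) instead of a second.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the sample's frame bookkeeping (not the real FrameQueue).
class FrameSlots {
public:
    explicit FrameSlots(int count) : inUse_(count, false) { assert(count > 0); }

    // Decoder side: the slot now holds a frame that has not been displayed yet.
    void markInUse(int index) {
        std::lock_guard<std::mutex> lock(mutex_);
        inUse_.at(index) = true;
    }

    // Display side: free the slot and wake any thread waiting for it.
    void release(int index) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            inUse_.at(index) = false;
        }
        changed_.notify_all();
    }

    // Cancel decoding and wake waiters so they can give up.
    void endDecode() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            endOfDecode_ = true;
        }
        changed_.notify_all();
    }

    // Blocks until the slot is free (returns true) or decoding is
    // cancelled while the slot is still busy (returns false).
    bool waitUntilFrameAvailable(int index) {
        std::unique_lock<std::mutex> lock(mutex_);
        changed_.wait(lock, [&] { return !inUse_.at(index) || endOfDecode_; });
        return !inUse_.at(index);
    }

private:
    std::mutex mutex_;
    std::condition_variable changed_;
    std::vector<bool> inUse_;
    bool endOfDecode_ = false;
};

// Usage sketch: a "display" thread frees slot 0 after 20 ms, and the
// waiter wakes when notified instead of sleeping for a fixed interval.
bool demoRelease() {
    FrameSlots slots(4);
    slots.markInUse(0);
    std::thread display([&] {
        std::this_thread::sleep_for(std::chrono::milliseconds(20));
        slots.release(0);
    });
    bool ok = slots.waitUntilFrameAvailable(0);
    display.join();
    return ok;
}
```

With this shape the wake-up comes from the display side releasing the slot rather than from the decoder, so the wait lasts only as long as the slot is actually busy. Whether it drops cleanly into the sample depends on how the sample's threads share the queue, which I have not verified.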

Thanks