CUDA Video Decoder Linux

Hi all,

I recently tested the CUDA Video Decoder API on Windows and it works well, but I would like to run it on my Linux platform.
The documentation states that the VideoSource and the VideoParser don't work on Linux and that I should implement them myself.

My configuration:

  • Ubuntu 12.04 LTS 32bits
  • NVIDIA Driver 304.43
  • CUDA toolkit 4.2

Note that I am a beginner with the CUDA Video Decoder.

In my configuration, I can call the methods cuvidCreateVideoParser() and cuvidParseVideoData().
Does that mean the VideoParser works on Linux? If not, why not?

After some work, I replaced the VideoSource with my own implementation and kept the VideoParser from the Windows sample.
I was able to build an executable that takes a bitstream, feeds it to the parser, and then maps/unmaps the decoded frames.

What I want is to retrieve each frame in IYUV format. But when I get the NV12 frame from GPU memory and convert it to IYUV (deinterleaving the UV plane), I run into several problems:

  1. The first frame is correctly converted for the 960x560 video but not for the 384x288 video.
  2. It seems that only the I-frames are correctly converted.

My questions are:

  1. How can I retrieve each frame from GPU memory? For the moment I use:

g_pVideoDecoder->mapFrame(oDisplayInfo.picture_index, &pDecodedFrame[active_field], &nDecodedPitch, &oVideoProcessingParameters);
cuMemcpyDtoH(decoded_frame, pDecodedFrame[active_field], nDecodedPitch * g_nVideoHeight * 3 / 2);

... split decoded_frame into the 3 channels Y, U, V ...

Is this correct?

  2. Why are the I-frames correctly decoded and not the others? I use the same code!


  3. Why is the first frame of the 960x540 video correctly decoded but not the first frame of the 384x288 video? Is the memory layout on the GPU different at lower resolutions?

First frame 384x288:

A part of my code:

// copy the decoded NV12 frame from device to host
cuMemcpyDtoH(decoded_frame, pDecodedFrame[active_field], nDecodedPitch * g_nVideoHeight * 3 / 2);
chU = channelU;
chV = channelV;
// channel Y: copy g_nVideoWidth bytes per row, skipping the pitch padding
for (i = 0; i < g_nVideoHeight; i++) {
	T = decoded_frame + (i * nDecodedPitch);
	memcpy(channelY + (i * g_nVideoWidth), T, g_nVideoWidth);
}
// channels U and V: deinterleave the NV12 UV plane
for (i = 0; i < g_nVideoHeight / 2; i++) {
	T = decoded_frame + (g_nVideoHeight * nDecodedPitch) + i * nDecodedPitch;
	for (j = 0; j < g_nVideoWidth; j += 2, T += 2, chU++, chV++) {
		*chU = T[0];
		*chV = T[1];
	}
}
if (fpOut) {
	fwrite(channelY, 1, g_nVideoWidth * g_nVideoHeight, fpOut);
	fwrite(channelU, 1, (g_nVideoWidth / 2) * (g_nVideoHeight / 2), fpOut);
	fwrite(channelV, 1, (g_nVideoWidth / 2) * (g_nVideoHeight / 2), fpOut);
}
// unmap and release the frame, so the surface can be re-used by the decoder

Thanks in advance.