CUDA Video Decoder Linux

Hi all,

I recently tested the Video Decoder API on Windows and it works well, but I would like to run it on my Linux platform.
The documentation states that the VideoSource and the VideoParser don't work on Linux and that I should implement them myself.

My configuration:

  • Ubuntu 12.04 LTS, 32-bit
  • NVIDIA Driver 304.43
  • CUDA toolkit 4.2

Note that I am a beginner with the CUDA Video Decoder.

In my configuration, I can call the methods “cuvidCreateVideoParser()” and “cuvidParseVideoData()”.
Does that mean the VideoParser actually works on Linux? If not, why can I call them?

After some work, I replaced the VideoSource with my own implementation and kept the VideoParser code from the Windows sample.
I was able to build an executable that takes a bitstream, feeds it to the parser, and then maps/unmaps the decoded frames.

What I want is to retrieve each frame in IYUV format. But when I copy the NV12 frame from GPU memory and convert it to IYUV (de-interleaving the UV plane), I run into problems:

  1. The first frame is converted correctly for the 960x560 video, but not for the 384x288 video.
  2. Only the I-frames seem to be converted correctly.
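For clarity, here is a self-contained sketch of the pitch-aware NV12-to-IYUV split I am attempting. The function and parameter names are my own, not from the SDK, and `lumaRows` stands for the number of rows the luma plane occupies on the decoder surface, which I am assuming equals the display height:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// De-interleave a pitched NV12 buffer into three planar IYUV channels.
// nv12 points to the mapped frame copied to host memory; each row of the
// surface is `pitch` bytes wide, of which only `width` bytes are pixels.
// The interleaved UV plane is assumed to start at row `lumaRows`.
void nv12_to_iyuv(const uint8_t* nv12, size_t pitch,
                  int width, int height, int lumaRows,
                  uint8_t* y, uint8_t* u, uint8_t* v)
{
    // Y plane: copy `width` bytes out of each pitched row.
    for (int i = 0; i < height; ++i)
        std::memcpy(y + (size_t)i * width, nv12 + (size_t)i * pitch, width);

    // UV plane: NV12 stores U and V interleaved, U first.
    const uint8_t* uv = nv12 + (size_t)lumaRows * pitch;
    for (int i = 0; i < height / 2; ++i) {
        const uint8_t* row = uv + (size_t)i * pitch;
        for (int j = 0; j < width / 2; ++j) {
            *u++ = row[2 * j];      // U sample
            *v++ = row[2 * j + 1];  // V sample
        }
    }
}
```

If the decoder actually pads the luma plane to some aligned height, passing the display height as `lumaRows` would read the chroma samples from the wrong offset, which is the kind of error I suspect.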

My questions are:

  1. How do I correctly retrieve each frame from GPU memory? For the moment I use:

g_pVideoDecoder->mapFrame(oDisplayInfo.picture_index, &pDecodedFrame[active_field], &nDecodedPitch, &oVideoProcessingParameters);
cuMemcpyDtoH(decoded_frame, pDecodedFrame[active_field], nDecodedPitch * g_nVideoHeight * 3 / 2);

... split decoded_frame into the three channels Y, U, V ...

g_pVideoDecoder->unmapFrame(pDecodedFrame[active_field]);

Is this correct?

  2. Why are the I-frames decoded correctly but not the other frames? I use the same code for all of them!

I-frame:
[screenshot]

P-frame:
[screenshot]
  3. Why is the first frame of the 960x540 video decoded correctly, but not the first frame of the 384x288 one? Is the memory layout on the GPU different at a lower resolution?

First frame 384x288:
[screenshot]
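To sanity-check my offset arithmetic, here is a tiny stand-alone helper. It is based on my guess (not something I found in the documentation) that H.264 codes pictures in 16x16 macroblocks, so the decoder surface height could be the display height rounded up to a multiple of 16; in that case 540 would become 544 and the UV plane would start four rows of pitch later than where I currently read it:

```cpp
#include <cassert>
#include <cstddef>

// Round n up to the next multiple of `align` (align must be a power of two).
size_t align_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

// Byte offset of the interleaved UV plane in a pitched NV12 surface,
// given the number of rows the luma plane actually occupies.
size_t uv_plane_offset(size_t pitch, size_t lumaRows)
{
    return pitch * lumaRows;
}
```

With this assumption, align_up(540, 16) gives 544, while 288, 384, 560 and 960 are unchanged. Note that 288 is already a multiple of 16, so this alone would not explain the 384x288 case.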

A part of my code:

// copy the decoded NV12 frame from device to host
cuMemcpyDtoH(decoded_frame, pDecodedFrame[active_field], nDecodedPitch * g_nVideoHeight * 3 / 2);

chU = channelU;
chV = channelV;

// Y channel
for (i = 0; i < g_nVideoHeight; i++)
{
    T = decoded_frame + (i * nDecodedPitch);
    memcpy(channelY + (i * g_nVideoWidth), T, g_nVideoWidth);
}

// U and V channels (interleaved in NV12)
for (i = 0; i < g_nVideoHeight / 2; i++)
{
    T = decoded_frame + (g_nVideoHeight * nDecodedPitch) + i * nDecodedPitch;
    for (j = 0; j < g_nVideoWidth; j += 2, T += 2, chU++, chV++)
    {
        *chU = T[0];
        *chV = T[1];
    }
}

if (fpOut)
{
    fwrite(channelY, 1, g_nVideoWidth * g_nVideoHeight, fpOut);
    fwrite(channelU, 1, (g_nVideoWidth / 2) * (g_nVideoHeight / 2), fpOut);
    fwrite(channelV, 1, (g_nVideoWidth / 2) * (g_nVideoHeight / 2), fpOut);
}

// unmap the video frame
g_pVideoDecoder->unmapFrame(pDecodedFrame[active_field]);
// release the frame so it can be re-used by the decoder
g_pFrameQueue->releaseFrame(&oDisplayInfo);
g_DecodeFrameCount++;

Thanks in advance.