NVDEC decoded frame - trying a zero copy to NV12 d3d11 texture

Hi all, new dedicated thread.

I’m receiving CUVIDPARSERDISPINFO packets from NVDEC. What I want to do now is a zero copy into a D3D11 texture for rendering. The goal is to avoid overloading the CPU.

Test scenario: 40 independent H.264 streams, 704x576 @ 15 fps

What I’ve managed to do so far: I register two different textures (R8_UNorm + R8G8_UNorm, created at startup) with cuGraphicsD3D11RegisterResource, then call cuGraphicsMapResources, cuGraphicsSubResourceGetMappedArray (once), cuMemcpy2DAsync and cuGraphicsUnmapResources for both. This way I get my complete frame on screen, but average CPU usage is about 38%.

If I get rid of the chrominance texture, CPU drops to 25%.
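For reference, the two-plane copy described above can be sketched as a small layout helper. This is only a sketch under stated assumptions: PlaneCopy, Nv12Copies and describeNv12Copies are hypothetical names, and it assumes the decoded chroma plane starts at srcPitch * surfaceHeight bytes after the luma plane, where surfaceHeight is the decoder's surface height rounded up to an even value. The fields mirror the CUDA_MEMCPY2D members they would be assigned to; no CUDA call is made here.

```cpp
#include <cstddef>

// Hypothetical description of one plane copy. Each field corresponds to a
// CUDA_MEMCPY2D member (srcDevice offset, WidthInBytes, Height).
struct PlaneCopy {
    std::size_t srcOffsetBytes;  // added to the frame's base device pointer
    std::size_t widthInBytes;    // bytes copied per row
    std::size_t height;          // number of rows copied
};

struct Nv12Copies {
    PlaneCopy luma;
    PlaneCopy chroma;
};

// Assumption: the chroma plane begins srcPitch * surfaceHeight bytes into the
// decoded surface. Verify this against your decoder configuration.
inline Nv12Copies describeNv12Copies(std::size_t width,
                                     std::size_t lumaHeight,
                                     std::size_t surfaceHeight,
                                     std::size_t srcPitch)
{
    Nv12Copies c{};
    // Luma: one byte per pixel, full height.
    c.luma = PlaneCopy{0, width, lumaHeight};
    // Chroma: interleaved U/V pairs at half resolution in both directions,
    // so each row holds ceil(width / 2) pairs of 2 bytes, over half the rows.
    c.chroma = PlaneCopy{srcPitch * surfaceHeight,
                         ((width + 1) / 2) * 2,
                         (lumaHeight + 1) / 2};
    return c;
}
```

With the 704x576 streams from the test scenario, the chroma copy covers 704 bytes per row over 288 rows, so it moves half as many bytes as the luma copy.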

What I’m trying to do: I’ve created a single NV12 texture and registered it with cuGraphicsD3D11RegisterResource, and I’m now trying to copy the decoded NV12 frame into the mapped array in a single shot.

If I leave CUDA_MEMCPY2D unchanged, I obviously get only the luma plane, which gives a greenish frame.
Every attempt to modify these parameters leads to CUDA_ERROR_INVALID_VALUE from the subsequent cuMemcpy2DAsync call.

How should I modify the parameters to perform a single-shot copy?

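CUDA_MEMCPY2D m = { 0 };
m.srcMemoryType = CU_MEMORYTYPE_DEVICE;   // source: decoded frame in device memory
m.srcDevice = dpSrcFrame;
m.srcPitch = nSrcPitch;
m.dstMemoryType = CU_MEMORYTYPE_ARRAY;    // destination: array mapped from the D3D11 texture
m.dstArray = dstArray;
m.WidthInBytes = m_nWidth;
m.Height = m_nLumaHeight;                 // luma rows only, so the chroma plane is not copied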

Thanks,

Could I please get some help with this request? Is this possible somehow?

Are you creating and mapping a new texture for each frame? You can’t do that - it’s too slow and burns too much CPU as you’ve noticed. You have to map one texture (or a finite set of textures if you need more than one) and re-use it for each frame. And when I say “one texture”, I mean one per plane, etc.

No, I’m creating two textures, once: one for the luminance plane and the other for the chroma plane, so I need to make two cuMemcpy2DAsync calls.

I mean, are you creating new textures for each video frame that is passed to you by the decoder? That is what you can’t do. You need a fixed pool of textures allocated up front and re-use them over and over.
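The fixed-pool idea above can be sketched roughly as follows. FramePool and Slot are hypothetical names: Slot stands in for whatever is created once per texture (for example the D3D11 texture together with its registered CUDA graphics resource). The sketch only shows the reuse pattern; it does not synchronize with the GPU, so a real renderer would still need to make sure a slot is no longer in use before overwriting it.

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed-size pool: all N slots are constructed up front and
// handed out in round-robin order, so nothing is allocated per frame.
template <typename Slot, std::size_t N>
class FramePool {
public:
    // Returns the slot to fill for the next decoded frame, wrapping around
    // to the first slot after the last one has been used.
    Slot& acquireNext() {
        Slot& s = slots_[next_];
        next_ = (next_ + 1) % N;
        return s;
    }

    std::size_t capacity() const { return N; }

private:
    std::array<Slot, N> slots_{};
    std::size_t next_ = 0;
};
```

Each decoded frame would then call acquireNext() and map, copy into, and unmap that slot, rather than creating or registering anything new.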

No, I’m not creating new textures for each video frame.

Right, so you need to change your code to reuse textures for each video frame.