Hi all, new dedicated thread.
I’m receiving CUVIDPARSERDISPINFO packets from NVDEC. What I want to do now is a zero-copy transfer into a D3D11 texture for rendering; the goal is to keep CPU load low.
Test scenario: 40 independent H.264 streams, 704x576 @ 15 fps.
What I’ve managed to do so far: I register two different textures (R8_UNorm + R8G8_UNorm, created at startup) with cuGraphicsD3D11RegisterResource, then for both of them I call cuGraphicsMapResources, cuGraphicsSubResourceGetMappedArray (one time), cuMemcpy2DAsync and cuGraphicsUnmapResources. This way I get the complete frame on screen, but average CPU usage is about 38%.
If I get rid of the chrominance texture, CPU drops to 25%.
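For reference, here is a minimal sketch of the per-plane extents behind the two-texture copy described above. Nv12Plane, Nv12Layout and nv12Layout are hypothetical names used only for illustration, not CUDA or NVDEC API. The sketch assumes the decoded frame is a pitched NV12 surface whose interleaved UV plane starts right after the luma rows, with the surface height rounded up to an even number; the pitch and surface height actually reported by the decoder may differ from the values used here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type (not part of the CUDA or NVDEC API): one plane of
// a pitched NV12 surface, expressed as the extents a 2D copy would need.
struct Nv12Plane {
    std::size_t srcOffset;     // byte offset from the start of the frame
    std::size_t widthInBytes;  // bytes to copy per row
    std::size_t height;        // number of rows to copy
};

struct Nv12Layout {
    Nv12Plane luma;    // Y plane, 1 byte per pixel
    Nv12Plane chroma;  // interleaved UV plane, one 2-byte pair per 2x2 block
};

// Assumption: chroma begins after surfaceHeight luma rows (rounded up to an
// even count), and both planes share the same pitch.
inline Nv12Layout nv12Layout(std::size_t width, std::size_t lumaHeight,
                             std::size_t surfaceHeight, std::size_t pitch) {
    assert(width > 0 && lumaHeight > 0);
    assert(pitch >= width && surfaceHeight >= lumaHeight);

    const std::size_t evenSurfaceHeight = (surfaceHeight + 1) & ~std::size_t{1};
    const std::size_t chromaPairs = (width + 1) / 2;  // UV pairs per row

    Nv12Layout layout;
    layout.luma = {0, width, lumaHeight};
    layout.chroma = {pitch * evenSurfaceHeight,  // skip the whole luma plane
                     chromaPairs * 2,            // 2 bytes per UV pair
                     (lumaHeight + 1) / 2};      // half the luma rows
    return layout;
}
```

Under these assumptions, the luma entry corresponds to WidthInBytes and Height for the R8_UNorm copy, and the chroma entry gives the source offset to add to dpSrcFrame plus the extents for the R8G8_UNorm copy. It only describes the source side; it does not by itself say how a single NV12 destination array should be addressed.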
What I’m trying to do: I’ve created a single NV12 texture and registered it with cuGraphicsD3D11RegisterResource, and I’m now trying to copy the decoded NV12 frame into the mapped array in a single shot.
If I leave CUDA_MEMCPY2D unchanged, I obviously get only the luma plane, which shows up as a greenish frame.
Every attempt to modify these parameters leads to CUDA_ERROR_INVALID_VALUE from the subsequent cuMemcpy2DAsync call.
How should I modify the parameters to perform a single-shot copy?
CUDA_MEMCPY2D m = { 0 };
m.srcMemoryType = CU_MEMORYTYPE_DEVICE;
m.srcDevice = dpSrcFrame;
m.srcPitch = nSrcPitch;
m.dstMemoryType = CU_MEMORYTYPE_ARRAY;
m.dstArray = dstArray;
m.WidthInBytes = m_nWidth;
m.Height = m_nLumaHeight;
Thanks.