Understanding NVDecodeD3d11 sample

w.kelly · November 19, 2017, 4:31am

Hi,

I’m trying to understand the pipeline by which a decoded video frame is displayed.
My understanding is:

1. cuvidMapVideoFrame uses the pictureIndex of the decoded frame to fetch a device pointer (pDecodedFrame[active_field]) and pitch.
2. A CUDA kernel is launched to convert the NV12 pDecodedFrame[active_field] to the pre-allocated g_pRgba device array.
3. cuMemcpy2D is used to copy g_pRgba to g_backBufferArray, which is mapped to pTexture_[active_field] (a 2D texture).
4. context->CopyResource is used to copy from pTexture_[active_field] to pBackBuffer (which is buffer 0 of the swap chain).
5. The swap chain then presents the next buffer.
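
To make the data flow of the second and third steps concrete without the sample's class structure, here is a CPU-only C sketch. It is not the sample's code: nv12_to_rgba and copy_2d are hypothetical names standing in for the CUDA conversion kernel and for the row-by-row pitched copy that cuMemcpy2D performs. It assumes BT.601 limited-range coefficients and assumes the interleaved UV plane starts at pitch * height; the real decoder surface may place it at an aligned height instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Clamp an integer to the 0..255 range of an 8-bit channel. */
static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Stand-in for the conversion kernel (hypothetical name).
 * Reads a pitched NV12 frame and writes a pitched RGBA frame.
 * Assumption: BT.601 limited range, UV plane at src_pitch * height. */
static void nv12_to_rgba(const uint8_t *nv12, size_t src_pitch,
                         int width, int height,
                         uint8_t *rgba, size_t dst_pitch)
{
    assert(width % 2 == 0 && height % 2 == 0); /* NV12 is 4:2:0 subsampled */
    const uint8_t *luma = nv12;
    const uint8_t *chroma = nv12 + src_pitch * (size_t)height;

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            /* One U,V pair is shared by each 2x2 block of luma samples. */
            const uint8_t *uv = chroma + (size_t)(y / 2) * src_pitch
                                       + (size_t)(x / 2) * 2;
            int c = luma[(size_t)y * src_pitch + (size_t)x] - 16;
            int d = uv[0] - 128; /* U */
            int e = uv[1] - 128; /* V */
            uint8_t *px = rgba + (size_t)y * dst_pitch + (size_t)x * 4;
            px[0] = clamp_u8((298 * c + 409 * e + 128) / 256);           /* R */
            px[1] = clamp_u8((298 * c - 100 * d - 208 * e + 128) / 256); /* G */
            px[2] = clamp_u8((298 * c + 516 * d + 128) / 256);           /* B */
            px[3] = 255;                                                 /* A */
        }
    }
}

/* Stand-in for a pitched 2D copy (hypothetical name): source and
 * destination may have different row pitches, so rows are copied
 * one at a time and only width_bytes of each row are transferred. */
static void copy_2d(const uint8_t *src, size_t src_pitch,
                    uint8_t *dst, size_t dst_pitch,
                    size_t width_bytes, int height)
{
    for (int y = 0; y < height; ++y)
        memcpy(dst + (size_t)y * dst_pitch,
               src + (size_t)y * src_pitch, width_bytes);
}
```

In this sketch the staging buffer plays the role of g_pRgba and the destination of copy_2d plays the role of the array behind the texture. The two buffers are kept separate only to mirror the steps as listed; the sketch does not show whether the intermediate copy is required.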

I don’t understand why steps 3 & 4 are necessary. Why can’t the CUDA kernel write directly to the target back buffer/texture? It seems like unnecessary copying. I’m sure there’s a good reason; I’m just new to CUDA and Direct3D.

It’d be great if there were a bare-bones C example of video decode with D3D11.
All the C++ object orientation makes it difficult to see the essential sequence and flow of data between the key API calls.

Cheers, Wayne.