Using CUDA to modify video frames on GPU

I am examining the AppTrans example from the NVIDIA Video Codec SDK samples. The example sets bUseDeviceFrame=true when initializing the decoder.

This leads me to think that the decoder must return a struct with a reference to some device memory.

I would like to run a CUDA kernel on this memory, but I am unsure how to proceed as I don't know the structure. It's just handled as a uint8_t pointer.

Where is the struct defined so I can get a handle to the device memory?

Kind regards


The uint8_t pointer is the handle itself: it is an address (offset) in device memory, not a pointer to a struct. Most CUDA (and NPP) functions require GPU pointers, but some let you mix a CPU pointer and a GPU pointer, as in cuMemcpy2DAsync():

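CUDA_MEMCPY2D m;
bzero(&m, sizeof(m)); // zero every field, including the x/y offsets
m.srcMemoryType = CU_MEMORYTYPE_HOST;
m.srcHost = ...; // CPU_MEMORY_VM_POINTER
m.srcPitch = ...;

m.dstMemoryType = CU_MEMORYTYPE_DEVICE; // must be set, 0 is not a valid memory type
m.dstDevice = ...; // GPU_MEMORY_VM_POINTER
m.dstPitch = ...;

m.WidthInBytes = ...;
m.Height = ...;
CUDA_API_CALL(cuMemcpy2DAsync(&m, 0)); // 0 = default stream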

See the standard memory model.
See the unified memory model.

For in-stream CUDA (NPP) editing, also check the second answer.
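
To run your own kernel on the decoded frame, you can pass that same uint8_t pointer straight to the kernel. Below is a minimal sketch, not code from the SDK. It assumes the decoder outputs 8-bit NV12 (luma rows first, then an interleaved UV plane starting right after them) and that you know the frame's width, height and pitch in bytes. The names lumaOffset, chromaOffset, invertLuma and launchInvertLuma are made up for illustration. The offset helpers also compile as plain C++, so the layout arithmetic can be checked without a GPU.

```cpp
#include <cstddef>
#include <cstdint>

// The qualifiers only exist under nvcc; a plain C++ compiler sees nothing.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical helper: byte offset of luma sample (x, y) in a pitched buffer.
HOST_DEVICE inline size_t lumaOffset(int x, int y, size_t pitch) {
    return static_cast<size_t>(y) * pitch + static_cast<size_t>(x);
}

// Hypothetical helper: byte offset of the U sample covering pixel (x, y);
// V is the next byte. ASSUMPTION: NV12, with the UV plane starting
// immediately after `height` luma rows.
HOST_DEVICE inline size_t chromaOffset(int x, int y, size_t pitch, int height) {
    return (static_cast<size_t>(height) + static_cast<size_t>(y / 2)) * pitch
           + static_cast<size_t>(x / 2) * 2;
}

#ifdef __CUDACC__
// Example edit: invert the luma plane in place, one thread per pixel.
__global__ void invertLuma(uint8_t* frame, int width, int height, size_t pitch) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        size_t i = lumaOffset(x, y, pitch);
        frame[i] = static_cast<uint8_t>(255 - frame[i]);
    }
}

// `frame` is the device pointer handed back by the decoder.
void launchInvertLuma(uint8_t* frame, int width, int height, size_t pitch,
                      cudaStream_t stream) {
    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);
    invertLuma<<<grid, block, 0, stream>>>(frame, width, height, pitch);
}
#endif
```

Call launchInvertLuma from the thread where the decoder's CUDA context is current, and make sure the stream has finished before anything else reads the frame again.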