I am working on a video encode demo with CUDA 6.5 and C/C++ on an SM 3.0 device. The encoder allocates its input buffer with the NvEncCreateInputBuffer API. I want to reuse this buffer as the output memory of a resize kernel, so that the resized frame can be encoded directly. The idea is to get zero-copy by registering the buffer with cudaHostRegister. The process is:
//runtime API: set the device
//encoder (driver API): initialize CUDA and get the context with cuCtxGetCurrent
//initialize the encoder
//allocate the input buffer with NvEncCreateInputBuffer
//get the pointer to the input buffer
//use cudaHostRegister to turn the pointer into mapped pinned memory <----- this fails with an "invalid value" error
What is the exact type of the memory returned by NvEncCreateInputBuffer? It can be accessed from the host, because the usual demo copies the raw data into it with ::memcpy. Its address does not look like a heap address: it is much lower than the addresses returned by ::malloc and much closer to the stack addresses. Can zero-copy be implemented this way?
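Rather than guessing the memory type from the numeric value of the address, the pointer can be looked up in the process memory map. Below is a minimal C++ sketch that assumes Linux, since it reads /proc/self/maps. find_mapping is a hypothetical helper name, not part of CUDA or the NVENC SDK. It only shows which mapping backs an address and does not by itself tell whether cudaHostRegister will accept the buffer.

```cpp
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper (Linux only): returns the /proc/self/maps line whose
// address range contains p, or an empty string if no mapping contains it.
std::string find_mapping(const void* p) {
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    std::ifstream maps("/proc/self/maps");
    std::string line;
    while (std::getline(maps, line)) {
        // Each line begins with "start-end" in hexadecimal; end is exclusive.
        unsigned long long start = 0, end = 0;
        if (std::sscanf(line.c_str(), "%llx-%llx", &start, &end) != 2) {
            continue;  // skip lines that do not parse
        }
        if (addr >= start && addr < end) {
            return line;  // permissions and backing name follow the range
        }
    }
    return std::string();
}
```

Calling find_mapping on the pointer obtained from the input buffer, on a ::malloc result, and on the address of a local variable, then printing the three lines, shows whether the buffer lies in the heap, in the stack, or in a separate mapping.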