Allocating Shared Buffers with OpenCV3?

Hi All,

I’ve been working with OpenCV3 and CUDA, and I may get access to a TX1 soon. I wanted to see whether using unified buffers / shared memory (via the UMA, I assume) would give any performance increase on that device, but I’m not really sure how to allocate shared buffers and use them as Mats.

I see in the documentation that there’s a data structure called CudaMem. That structure has the option to allocate zero-copy memory, but to be honest I don’t know which header file to include in order to get access to that type. So I guess my first question is: what do I have to include to get access to the CudaMem type?

Assuming I do find the header and can allocate a SHARED CudaMem object, what would be the protocol for using it? I see that I can create a GpuMat header that maps CPU memory to GPU hardware, so I assume it would go something like this:

using namespace cv::cuda;
...

// Allocate shared memory
CudaMem data(CudaMem::SHARED); 

// Use h_DataHeader in host operations
cv::Mat h_DataHeader = data.createMatHeader();

// Use d_DataHeader in device operations
GpuMat d_DataHeader = data.createGpuMatHeader();

But I’m not really sure whether this is right, or whether it would require some sort of device synchronization, as buffers allocated via cudaMallocManaged do.
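To make the intent concrete, here is a plain C++ model of what I’m picturing: one allocation, and two lightweight headers that both point at it without copying. This is only a sketch of the idea. SharedBuffer, Header and makeHeader are names I made up for illustration; they are not OpenCV or CUDA API, and nothing here touches a GPU or says anything about synchronization.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a zero-copy allocation: the pixels live here once.
struct SharedBuffer {
    int rows;
    int cols;
    std::vector<std::uint8_t> storage;

    SharedBuffer(int r, int c)
        : rows(r), cols(c), storage(static_cast<std::size_t>(r) * c, 0) {}
};

// Hypothetical non-owning header: a pointer plus dimensions, no pixel copy.
struct Header {
    std::uint8_t* data;
    int rows;
    int cols;

    // Bounds-checked element access into the underlying buffer.
    std::uint8_t& at(int r, int c) {
        assert(r >= 0 && r < rows && c >= 0 && c < cols);
        return data[static_cast<std::size_t>(r) * cols + c];
    }
};

// In this model the "host" and "device" headers are built the same way.
// With real zero-copy memory, the device header would presumably carry a
// device-mapped pointer to the same physical pages instead.
Header makeHeader(SharedBuffer& buf) {
    return Header{buf.storage.data(), buf.rows, buf.cols};
}
```

In this model, building two headers from the same SharedBuffer and writing through one makes the change visible through the other, because both hold the same pointer. What I can’t tell from the model is when a write made by the device actually becomes visible to the host, which is the synchronization question above.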

Finally, I’m not sure how (or if) SHARED CudaMem objects would work with streams, or whether streams are reserved for page-locked memory. Does anyone know anything about this?

Thanks for your help. I know I asked a lot, but if you have any input on the above I’d love to hear it. Also feel free to yell at me for not being specific enough.

  • John

EDIT: Stack Overflow user talonmies got mad at me for calling them shared buffers, which is fair, since that’s easily confused with the shared memory used within kernel calls. I suppose I meant “zero-copy” or “unified” memory. My mistake.

Does anyone know this? I can probably figure out the rest myself; I just really need to know what to include in order to get access to the CudaMem type. Was it removed from the API?

Bump.

I’d also like to know whether OpenCV can take advantage of CUDA’s unified memory and, if so, how.