I’ve come across a puzzling issue when processing video frames in OpenCV. I’ve managed to get GStreamer and OpenCV playing nice together, up to a point.
The problem is that I’m getting “invalid argument” errors from CUDA calls when attempting very basic operations on the video frames.
What I think is happening is: the GStreamer video decoder pipeline is set to leave frame data in NVMM memory (and I admit I’m not completely certain what the status of that is, with respect to host or device). I can package that as a GpuMat, and basic filters like blurring work fine, so it looks like CUDA can access and process that memory in place. However, if I make another GpuMat and attempt to copy the video frame to it, or even clone it, I get an argument error. Since OpenCV’s clone/copyTo explicitly uses cudaMemcpyDeviceToDevice, I am wondering if cudaMemcpy2D decides that the NVMM is not actually device memory, even though to all intents and purposes it will work as if it is.
Does this make sense? What actually is the status of the hardware DMA buffers with respect to host/device memory? It’s not exactly clear (to me) from the documentation.
I’ve checked most of the other issues: for example, the DMA buffer GpuMat is continuous, which is unusual for GpuMat, and I explicitly make a continuous one to copy into. And it is significant that even .clone() on the GpuMat is failing.
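For reference, this is roughly the kind of pre-flight check I mean, as a minimal sketch rather than anything taken from OpenCV or CUDA. PitchedBuffer, checkCopy2DArgs and isContinuous are hypothetical names of my own; the only rule mirrored here is the documented cudaMemcpy2D constraint that the copy width in bytes must not exceed either pitch. It says nothing about what kind of memory the pointers refer to, which is the part I can’t settle.

```cpp
#include <cstddef>
#include <string>

// Hypothetical description of one pitched 2D buffer (not an OpenCV or CUDA type).
struct PitchedBuffer {
    const void* data;      // base pointer of the buffer
    std::size_t pitch;     // bytes between the starts of consecutive rows
    std::size_t rowBytes;  // bytes of pixel data per row (cols * element size)
    std::size_t rows;      // number of rows
};

// Returns an empty string if the arguments pass the basic checks,
// otherwise a short reason. This only covers pointer, size and pitch
// sanity; it cannot tell host memory from device memory.
std::string checkCopy2DArgs(const PitchedBuffer& src, const PitchedBuffer& dst) {
    if (!src.data || !dst.data) return "null pointer";
    if (src.rows != dst.rows || src.rowBytes != dst.rowBytes) return "size mismatch";
    if (src.rowBytes > src.pitch) return "width exceeds source pitch";
    if (dst.rowBytes > dst.pitch) return "width exceeds destination pitch";
    return "";
}

// A buffer is treated as continuous when its rows are packed with no padding.
bool isContinuous(const PitchedBuffer& b) {
    return b.rows <= 1 || b.pitch == b.rowBytes;
}
```

In my case both buffers pass checks of this kind, which is why I suspect the memory type rather than the geometry.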
Does anyone know the exact status of copying memory from DMA buffers with cudaMemcpy2D?
I don’t really want to leave the data in a Mat and then upload/download – the whole point of all this work is to make the video data transport as minimal as possible and maximize throughput.