Problems with cudaMemcpy3D


I’m having a hard time getting cudaMemcpy3D and cudaMemcpy3DPeer to do what I want. The function always fails with “error 11 invalid argument”. Rather than showing you the many versions of the code I have tried, I’m asking for a working example that uses simple (pitched) pointers rather than CUDA arrays. I could not find anything in the SDK or on the web; the examples always use cudaMalloc3DArray (for textures) instead of cudaMalloc3D. I want regular (pitched) memory, not a CUDA array, and I want to do 2D/3D communication using the DMA engines directly instead of writing a kernel to do it.

Thanks in advance!

I had the same error with cudaMemcpy3D. How do you configure the cudaExtent? I found that the width is set in elements, not in bytes.

Let us know if that’s the case for you.


Thanks for your reply, pQB.

I don’t think this is it. The documentation states that the width of the extent must be specified in bytes when no CUDA array takes part in the copy. This makes sense, since the entire CUDA API (except the texture stuff) is type agnostic.
Do you have cudaMemcpy3D working in your code without CUDA arrays? Could you post your configuration and your call to cudaMemcpy3D? Or could somebody else? There must be someone who has this function working in their code.

Thanks in advance!
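
For reference, here is a minimal sketch of the kind of example asked for above: cudaMalloc3D plus cudaMemcpy3D with pitched pointers only, no CUDA arrays. The volume size, the CHECK macro and the variable names are made up for illustration, and this is not a confirmed fix for the error reported in this thread. It assumes the extent width is given in bytes (as discussed above) and that every cudaMemcpy3DParms struct is zero-initialized before use, since the documentation requires the unused fields to be zero.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a readable message if a runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,   \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

int main() {
    // Illustrative volume size, in elements.
    const size_t nx = 64, ny = 32, nz = 16;
    const size_t rowBytes = nx * sizeof(float);

    std::vector<float> h_in(nx * ny * nz), h_out(nx * ny * nz, 0.0f);
    for (size_t i = 0; i < h_in.size(); ++i) h_in[i] = static_cast<float>(i);

    // No CUDA array is involved, so the extent width is in bytes.
    cudaExtent extent = make_cudaExtent(rowBytes, ny, nz);

    // Pitched device allocations; the runtime chooses the pitch.
    cudaPitchedPtr d_a, d_b;
    CHECK(cudaMalloc3D(&d_a, extent));
    CHECK(cudaMalloc3D(&d_b, extent));

    // Host -> device. The host buffer is densely packed, so its pitch
    // equals the row size in bytes.
    cudaMemcpy3DParms h2d = {};  // zero-init: arrays and positions stay 0
    h2d.srcPtr = make_cudaPitchedPtr(h_in.data(), rowBytes, rowBytes, ny);
    h2d.dstPtr = d_a;
    h2d.extent = extent;
    h2d.kind   = cudaMemcpyHostToDevice;
    CHECK(cudaMemcpy3D(&h2d));

    // Device -> device, pitched to pitched.
    cudaMemcpy3DParms d2d = {};
    d2d.srcPtr = d_a;
    d2d.dstPtr = d_b;
    d2d.extent = extent;
    d2d.kind   = cudaMemcpyDeviceToDevice;
    CHECK(cudaMemcpy3D(&d2d));

    // Device -> host, back into a densely packed buffer.
    cudaMemcpy3DParms d2h = {};
    d2h.srcPtr = d_b;
    d2h.dstPtr = make_cudaPitchedPtr(h_out.data(), rowBytes, rowBytes, ny);
    d2h.extent = extent;
    d2h.kind   = cudaMemcpyDeviceToHost;
    CHECK(cudaMemcpy3D(&d2h));

    // Compare the round-tripped data with the input.
    size_t mismatches = 0;
    for (size_t i = 0; i < h_in.size(); ++i)
        if (h_in[i] != h_out[i]) ++mismatches;
    std::printf("mismatches: %zu\n", mismatches);

    CHECK(cudaFree(d_a.ptr));
    CHECK(cudaFree(d_b.ptr));
    return mismatches == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

This should build with something like `nvcc example.cu -o example`. Two things to compare against a failing call: whether the parameter struct was zeroed before filling it in, and whether srcArray/dstArray are left at zero when srcPtr/dstPtr are used. For cudaMemcpy3DPeer the setup should be analogous, using cudaMemcpy3DPeerParms with srcDevice and dstDevice set instead of kind, but I have not shown that here.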

I opened a bug with the NVIDIA CUDA team. I will report back with any results that come from this.