When the destination array is smaller than the source array, cudaMemcpy3D() seems to check the source indices against the size of the destination array. This can produce an error even when every index involved in the copy fits within the allocated range of its own array. The attached file demonstrates the problem. Am I missing something obvious, or is this indeed a bug in cudaMemcpy3D()?
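To make the suspected behaviour concrete, here is a minimal host-only C++ sketch of the two bounds checks being contrasted. It calls no CUDA functions. The types Size3 and Pos3, the function names, and the array sizes (a 64x64x64 source and a 32x32x32 destination) are hypothetical and chosen purely for illustration; they are not taken from the attached file. suspectedCheck() only models the symptom described above and is an assumption, not the actual logic inside the driver or runtime.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper types for illustration; not part of the CUDA API.
struct Size3 { std::size_t w, h, d; };  // dimensions in elements
struct Pos3  { std::size_t x, y, z; };  // starting offset in elements

// True if a box of size `extent` starting at `pos` lies inside an array of size `size`.
bool fits(Pos3 pos, Size3 extent, Size3 size) {
    return pos.x + extent.w <= size.w &&
           pos.y + extent.h <= size.h &&
           pos.z + extent.d <= size.d;
}

// Expected rule: the source region is checked against the source array,
// and the destination region against the destination array.
bool expectedCheck(Size3 srcSize, Pos3 srcPos,
                   Size3 dstSize, Pos3 dstPos, Size3 extent) {
    return fits(srcPos, extent, srcSize) && fits(dstPos, extent, dstSize);
}

// Suspected rule (an assumption that models the reported symptom):
// the source region is checked against the destination array's size.
bool suspectedCheck(Size3 /*srcSize*/, Pos3 srcPos,
                    Size3 dstSize, Pos3 dstPos, Size3 extent) {
    return fits(srcPos, extent, dstSize) && fits(dstPos, extent, dstSize);
}

// Copy the upper 32^3 corner of a 64^3 source into a 32^3 destination.
// Returns true when the two rules disagree about this copy.
bool exampleShowsMismatch() {
    Size3 srcSize{64, 64, 64};
    Size3 dstSize{32, 32, 32};
    Size3 extent{32, 32, 32};
    Pos3 srcPos{32, 32, 32};
    Pos3 dstPos{0, 0, 0};

    // Sanity check: the destination region itself is in range.
    assert(fits(dstPos, extent, dstSize));

    return expectedCheck(srcSize, srcPos, dstSize, dstPos, extent) &&
           !suspectedCheck(srcSize, srcPos, dstSize, dstPos, extent);
}
```

In this scenario the source offset 32 plus the extent 32 reaches 64, which is inside the source array but outside the 32-element destination, so the expected rule accepts the copy while the suspected rule rejects it. That is the kind of discrepancy I believe I am seeing.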
My system:
Linux openSUSE-11.0 x86_64
Kernel 2.6.25.11
GeForce 8800 GTX
NVIDIA driver 177.13
CUDA-2.0beta2
Could somebody at NVIDIA please comment on this? This issue effectively renders cudaMemcpy3D() useless for copying data between CUDA arrays of different sizes.
I just checked with the new Linux driver 177.80 (openSUSE-11.0, x86_64), and cudaMemcpy3D() is still broken. Is this a deprecated function? If it is, what should be used instead?