How to use cudaMemcpy3D and cudaMemcpy3DParms in CUDA Fortran

Hello everyone,

I’m using CUDA Fortran to accelerate an existing application written in Fortran. I would like to use the API function cudaMemcpy3DAsync to asynchronously copy part of a 3D array. I see in the CUDA Fortran programming guide that this function takes a data structure called cudaMemcpy3DParms, unlike cudaMemcpy2D/cudaMemcpy2DAsync, which are more straightforward. However, I haven’t been able to find any documentation on this data structure in CUDA Fortran, or any examples of anyone using cudaMemcpy3DAsync in CUDA Fortran; the page in the API documentation covers C/C++ only. Am I missing something in the docs about how the API data structures work in Fortran? And does anyone have any examples/snippets showing how to use this function/data structure in Fortran, or suggestions for another way to asynchronously copy part of a 3D array?


We don’t really support cudaMemcpy3D in CUDA Fortran. It is very awkward to use. Are you using contiguous slices of the 3D data? If so, you can use one of the other methods we support.
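
As a rough illustration of the contiguous case, here is a minimal sketch, assuming the slice is a slab along the last dimension (a(:,:,k1:k2)), which is contiguous in Fortran's column-major layout. The array names, sizes, and slab bounds are made up for the example, and the host array is declared pinned, which asynchronous copies need. Note that the count passed to cudaMemcpyAsync in CUDA Fortran is in elements, not bytes.

```fortran
program slab_copy
  use cudafor
  implicit none

  ! Hypothetical sizes and slab bounds, for illustration only
  integer, parameter :: nx = 64, ny = 64, nz = 32
  integer, parameter :: k1 = 5, k2 = 12

  real, pinned, allocatable :: a_h(:,:,:)   ! pinned host array (required for async copies)
  real, device, allocatable :: a_d(:,:,:)   ! device array
  integer(kind=cuda_stream_kind) :: stream
  integer :: istat, nelem

  allocate(a_h(nx,ny,nz), a_d(nx,ny,nz))
  a_h = 1.0

  istat = cudaStreamCreate(stream)

  ! a(:,:,k1:k2) is one contiguous block of nx*ny*(k2-k1+1) elements,
  ! starting at element (1,1,k1). The count is given in elements.
  nelem = nx*ny*(k2-k1+1)
  istat = cudaMemcpyAsync(a_d(1,1,k1), a_h(1,1,k1), nelem, &
                          cudaMemcpyHostToDevice, stream)
  if (istat /= cudaSuccess) print *, cudaGetErrorString(istat)

  ! Other work can be queued here; wait before reusing a_h or a_d
  istat = cudaStreamSynchronize(stream)

  istat = cudaStreamDestroy(stream)
  deallocate(a_h, a_d)
end program slab_copy
```

If the slice is not contiguous (for example a sub-block in the first two dimensions), this single call does not apply as written.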