How to copy an array slice in CUDA Fortran

Hi,

I’m using CUDA Fortran to build an application and I’m running into a problem.

Assume there are two arrays, arr_a(m,n) and arr_b(m), where m and n are their dimensions. I want to copy a column of arr_a, e.g., arr_a(1:m,1), to arr_b. Can this be achieved with cudaMemcpy, and if so, how?

Furthermore, assume there are arr_c(m,n) on device 0 and arr_d(m,n) on device 1. Is it possible to copy a column of arr_c (e.g., arr_c(1:m,2)) to a column of arr_d (e.g., arr_d(1:m,3))?

Thanks a lot!

This is possible using cudaMemcpy2D on a single device, and cudaMemcpyPeer or cudaMemcpy3DPeer between devices (the runtime API has no cudaMemcpy2DPeer), but I cannot give you an example in Fortran.
Use the address of the first element of the column you want to copy, set the copy width to the size of one element, the height to the number of elements in the column, and the source pitch to the number of bytes that each row occupies.

Thank you! I was thinking about cudaMemcpy2D too, but I am a little confused about how to implement it.
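
For what it's worth, here is a minimal, untested sketch of how both copies might look in CUDA Fortran. One thing to keep in mind: Fortran stores arrays column-major, so a column such as arr_a(1:m,1) is already m contiguous elements, and a plain cudaMemcpy starting at arr_a(1,1) should be enough. The width/pitch recipe above describes the row-major (C) case, where a column is strided. The sketch assumes the NVIDIA HPC SDK compiler (nvfortran -cuda) and its cudafor module, where the copy count is given in elements rather than bytes. The sizes m = 4 and n = 3, the host arrays h_a, h_b, h_d and the program name are made up for illustration.

```fortran
program copy_column_sketch
  use cudafor
  implicit none
  integer, parameter :: m = 4, n = 3        ! illustrative sizes (assumption)
  real :: h_a(m,n), h_b(m), h_d(m,n)        ! host-side helpers (assumption)
  real, device, allocatable :: arr_a(:,:), arr_b(:)
  real, device, allocatable :: arr_c(:,:), arr_d(:,:)
  integer :: istat, ndev, i, j

  ! Fill the host matrix so that each column is recognizable.
  do j = 1, n
     do i = 1, m
        h_a(i,j) = 10.0*j + i
     end do
  end do

  ! Case 1: copy column 1 of arr_a into arr_b on the current device.
  allocate(arr_a(m,n), arr_b(m))
  arr_a = h_a                                ! host-to-device by assignment

  ! The column starts at arr_a(1,1) and is contiguous (column-major),
  ! so one 1D copy of m elements covers it. Count is in elements.
  istat = cudaMemcpy(arr_b, arr_a(1,1), m, cudaMemcpyDeviceToDevice)
  if (istat /= cudaSuccess) print *, trim(cudaGetErrorString(istat))

  h_b = arr_b                                ! device-to-host by assignment
  print *, 'arr_b      =', h_b
  print *, 'h_a(:,1)   =', h_a(:,1)          ! the two lines should match
  deallocate(arr_a, arr_b)

  ! Case 2: copy column 2 of arr_c (device 0) into column 3 of arr_d
  ! (device 1). Skipped when fewer than two devices are present.
  istat = cudaGetDeviceCount(ndev)
  if (ndev >= 2) then
     istat = cudaSetDevice(0)
     allocate(arr_c(m,n))                    ! allocated on device 0
     arr_c = h_a

     istat = cudaSetDevice(1)
     allocate(arr_d(m,n))                    ! allocated on device 1
     arr_d = 0.0

     ! Arguments: destination, destination device, source, source device,
     ! element count.
     istat = cudaMemcpyPeer(arr_d(1,3), 1, arr_c(1,2), 0, m)
     if (istat /= cudaSuccess) print *, trim(cudaGetErrorString(istat))

     h_d = arr_d                             ! device 1 is still current
     print *, 'arr_d(:,3) =', h_d(:,3)
     print *, 'h_a(:,2)   =', h_a(:,2)       ! the two lines should match

     deallocate(arr_d)
     istat = cudaSetDevice(0)
     deallocate(arr_c)
  end if
end program copy_column_sketch
```

A row slice such as arr_a(1,1:n) is the strided case in Fortran, and that is where cudaMemcpy2D with a pitch of m elements would come in. On a single device, the plain assignment arr_b = arr_a(:,1) between device arrays should also work and is probably the simplest option.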