How to use cudaMemcpy3DAsync?

steve.xu · April 4, 2012, 9:08am

Hi everyone,
I am trying to implement asynchronous data transfer between the GPU and the CPU in CUDA Fortran. My data is a 3D array, which I assume means I should use cudaMemcpy3DAsync. But the CUDA Fortran reference is too brief, and I do not know how to fill in the “cudaMemcpy3DParms” structure. Does anybody have experience with asynchronous transfers of 3D arrays?
By the way, if I want to copy a 4D array to the GPU asynchronously, must I split it into many 3D arrays, or are there alternative methods?

Thanks!

MatColgrove · April 4, 2012, 5:47pm

Hi Steve,

I don’t have an example offhand but could pull one together. Though, it’s probably not necessary to use the 3D functions. If you are copying the entire array, you can simply use cudaMemcpyAsync. Fortran arrays are contiguous, so just copy it as a 1-D array with a size of N*M*L. The same could be done with a 4-D array.

Mat
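
To illustrate why a whole array can be treated as one 1-D copy, here is a small sketch in plain Python (no CUDA calls) of column-major addressing, the layout Fortran uses. The function names and the 4 x 3 x 2 shape are made up for the example; only the arithmetic matters. The first element sits at offset 0 and the last at N*M*L - 1 with no gaps in between, so a single copy of N*M*L elements covers everything.

```python
def column_major_offset(idx, shape):
    """Zero-based linear offset of a 1-based Fortran index tuple."""
    offset, stride = 0, 1
    for i, n in zip(idx, shape):
        offset += (i - 1) * stride  # the first index varies fastest
        stride *= n
    return offset


def element_count(shape):
    """Number of elements to pass to a single 1-D copy of the whole array."""
    count = 1
    for n in shape:
        count *= n
    return count


shape = (4, 3, 2)  # hypothetical N, M, L
first = column_major_offset((1, 1, 1), shape)
last = column_major_offset(shape, shape)
print(first, last, element_count(shape))  # prints: 0 23 24
```

The same functions work unchanged for a 4-D shape, which matches the remark that a 4-D array can be copied the same way.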

steve.xu · April 10, 2012, 11:21am

Thanks Mat.
Actually, I am trying to implement a program that overlaps asynchronous data transfers with kernel execution. I think I have to divide my data (a 4D array) into several parts, so that each part can be transferred in a different stream and the kernel in that same stream can then run on it. Can I also use cudaMemcpyAsync to do this?

MatColgrove · April 11, 2012, 4:03pm

"Can I also use cudaMemcpyAsync to do this?"

Sure, provided that you block the data so it’s in contiguous sections. Otherwise, you will need to use the 3D routine.

Mat
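
One way to read "block the data so it's in contiguous sections" is to split along the last dimension only, since in column-major order every slab A(:,:,:,k) is one unbroken run of memory. The sketch below (plain Python, no CUDA calls) computes the element offset and element count of each chunk; each pair could then be handed to one 1-D asynchronous copy in its own stream. The function name, the example shape, and the chunk count are all assumptions for illustration.

```python
def last_dim_chunks(shape, n_chunks):
    """Split a column-major array into contiguous chunks along its last dimension.

    Returns a list of (element_offset, element_count) pairs.
    """
    slab = 1
    for n in shape[:-1]:
        slab *= n  # elements in one slice of the last dimension
    last = shape[-1]
    chunks = []
    start = 0
    for c in range(n_chunks):
        # Spread any remainder over the first few chunks
        size = last // n_chunks + (1 if c < last % n_chunks else 0)
        if size > 0:
            chunks.append((start * slab, size * slab))
        start += size
    return chunks


# Hypothetical 4-D array A(8,4,2,6) split into 3 chunks, e.g. one per stream
print(last_dim_chunks((8, 4, 2, 6), 3))
# prints: [(0, 128), (128, 128), (256, 128)]
```

Splitting along any other dimension would leave gaps inside each chunk, which is the case where a single 1-D copy no longer applies.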

steve.xu · April 16, 2012, 7:37am

Thanks Mat!
I just cannot find any examples of how to use cudaMemcpy3DAsync. I do need to copy only part of a 3D array (say A(N1,N2,N3)) each time to overlap communication and computation. For example, I need to copy a sub-block of size (N1/2,N2/2,N3/2), launch a kernel, then copy another part of A and launch the kernel again.
Can I use cudaMemcpyAsync to do this, or how would I do it with cudaMemcpy3DAsync?
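
For what it is worth, the layout arithmetic shows why this sub-block case differs from the contiguous one. The sketch below (plain Python, no CUDA calls; the function name and the 4 x 4 x 4 shape are made up) lists the contiguous runs that make up a sub-block starting at A(1,1,1). Each run is only n1 elements long, consecutive runs are N1 elements apart, and slices are N1*N2 elements apart. That row width, pitch, height, and depth pattern is what a pitched 3D copy describes, whereas a single 1-D copy can only cover one run.

```python
def subblock_runs(full_shape, sub_shape):
    """Contiguous runs covering the sub-block A(1:n1, 1:n2, 1:n3) of A(N1, N2, N3).

    Returns (element_offset, run_length) pairs for a column-major array.
    """
    N1, N2, N3 = full_shape
    n1, n2, n3 = sub_shape
    runs = []
    for k in range(n3):      # slices are N1*N2 elements apart
        for j in range(n2):  # rows within a slice are N1 elements apart
            runs.append((j * N1 + k * N1 * N2, n1))
    return runs


# Hypothetical A(4,4,4) with a (2,2,2) sub-block
print(subblock_runs((4, 4, 4), (2, 2, 2)))
# prints: [(0, 2), (4, 2), (16, 2), (20, 2)]
```

Four separate runs of two elements each means four 1-D copies, or one strided 3D copy, for this small example.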
