memcpy for arrays allocated on the device in openacc

_and · November 23, 2017, 2:45pm

I need to use memcpy to move arrays allocated on the GPU. I cannot use std::memcpy because it “has no acc routine” (compiler output). My code looks like this (Particle is a structure of 2 floats):

const int GL=100000;
Particle particles[GL];
int cp01[2][GL];
#pragma acc declare create(particles,cp01)
…

I read that cudaMemcpy can be used with OpenACC, so I tried calling it from the host (cp is a host variable):

#pragma acc data copy(cp)
{
cudaMemcpy(&particles[cp01[0][0]],&particles[cp01[1][0]],cp*sizeof(Particle),cudaMemcpyDeviceToDevice);
}

I include the header

#include <cuda_runtime.h>

to use CUDA, and build the project with

cmake …/src -DCMAKE_CXX_COMPILER=pgc++ -DCMAKE_CXX_FLAGS="-acc -Minfo=all -Mcuda=llvm"

The program compiles but does not work: it hangs with no output on the console. How can I move arrays allocated on the device (with cudaMemcpy, or is there some other way)?

MatColgrove · November 27, 2017, 7:31pm

Hi @and,

Since “cudaMemcpy” is a host side call where you want to pass in the device pointers, you’ll want to use a “host_data” directive. No need to copy “cp” since you’ll want to use the host value. Also make sure the host values of “cp01” are current. Something like the following:

#pragma acc host_data use_device(particles)
{
cudaMemcpy(&particles[cp01[0][0]],&particles[cp01[1][0]],cp*sizeof(Particle),cudaMemcpyDeviceToDevice);
}

Hope this helps,
Mat
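
As for the "some other way" part of the question, one alternative that avoids CUDA calls is to do the copy in an OpenACC loop over the device copy of the array. The following is only a minimal sketch, not something taken from the posts above: the helper name copy_particles_on_device and the Particle field names x and y are assumptions made for illustration, and the source and destination ranges are assumed not to overlap, since a parallel loop gives no ordering guarantee between iterations. The start indices are passed from the host, so the host values of cp01 would still need to be current.

```cpp
#include <cassert>
#include <cstddef>

// Field names are assumed; the original post only says "a structure of 2 floats".
struct Particle { float x, y; };

const int GL = 100000;
Particle particles[GL];
#pragma acc declare create(particles)

// Hypothetical helper: copy `count` particles starting at index `src`
// to the range starting at index `dst`, working on the device copy of
// `particles`. The two ranges must not overlap.
void copy_particles_on_device(int dst, int src, int count)
{
    assert(count >= 0 && dst >= 0 && src >= 0);
    assert(dst + count <= GL && src + count <= GL);
    assert(dst + count <= src || src + count <= dst);  // ranges must not overlap

    // `particles` is already on the device because of the declare directive,
    // so the loop only needs to state that it is present.
    #pragma acc parallel loop present(particles)
    for (int i = 0; i < count; ++i) {
        particles[dst + i] = particles[src + i];
    }
}
```

With the names from the question, the call would look like copy_particles_on_device(cp01[0][0], cp01[1][0], cp). If the code is compiled without OpenACC enabled, the pragmas are ignored and the same loop simply runs on the host, which makes the helper easy to check before moving to the GPU.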
