memcpy for arrays allocated on the device in OpenACC

I need to use memcpy to move arrays allocated on the GPU. I cannot use std::memcpy because it “has no acc routine” (compiler output). My code looks like this (Particle is a structure of 2 floats):

const int GL=100000;
Particle particles[GL];
int cp01[2][GL];
#pragma acc declare create(particles,cp01)

I read that cudaMemcpy can be used with OpenACC and tried calling it from the host (cp is a host variable):

#pragma acc data copy(cp)
{
cudaMemcpy(&particles[cp01[0][0]], &particles[cp01[1][0]], cp*sizeof(Particle), cudaMemcpyDeviceToDevice);
}

I include the header

#include <cuda_runtime.h>
for CUDA and build the project with
cmake …/src -DCMAKE_CXX_COMPILER=pgc++ -DCMAKE_CXX_FLAGS="-acc -Minfo=all -Mcuda=llvm"
The program compiles but does not work: it hangs with no output on the console. How can I move arrays allocated on the device (with cudaMemcpy, or is there some other way)?

Hi @and,

Since “cudaMemcpy” is a host-side call to which you want to pass device pointers, you’ll want to use a “host_data” directive, so that “particles” refers to its device address inside the region. No need to copy “cp”, since you’ll want to use the host value. Also make sure the host values of “cp01” are current, because the indices are read on the host. Something like the following:

#pragma acc host_data use_device(particles)
{
cudaMemcpy(&particles[cp01[0][0]], &particles[cp01[1][0]], cp*sizeof(Particle), cudaMemcpyDeviceToDevice);
}
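As for the "some other way" part of the question, one option that avoids the CUDA runtime entirely is a plain OpenACC copy loop that runs on the device. The following is only a minimal sketch under stated assumptions: the two-float layout of Particle and the helper name copy_particles_on_device are hypothetical, and the source and destination ranges are assumed not to overlap.

```cpp
// Assumed layout: the question only says Particle is a structure of 2 floats.
struct Particle { float x, y; };

const int GL = 100000;
Particle particles[GL];
int cp01[2][GL];
#pragma acc declare create(particles, cp01)

// Hypothetical helper (not from the original posts): copies `count`
// particles from index `src` to index `dst` in the device copy of
// `particles`. The ranges [src, src+count) and [dst, dst+count) must not
// overlap, because the iterations run in parallel.
void copy_particles_on_device(int dst, int src, int count)
{
    // `particles` is already on the device thanks to `declare create`,
    // so `present` only asserts that and moves no data.
    #pragma acc parallel loop present(particles)
    for (int i = 0; i < count; ++i) {
        particles[dst + i] = particles[src + i];
    }
}
```

With the variables from the question, the call would be copy_particles_on_device(cp01[0][0], cp01[1][0], cp). If cp01 was last written on the device, a "#pragma acc update self(cp01)" beforehand should bring the host values up to date. A compiler without OpenACC support ignores the pragmas, so the same function then simply copies within the host array.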

Hope this helps,
Mat