I need to use memcpy to move arrays allocated on the GPU. I cannot use std::memcpy because it “has no acc routine” (compiler output). My code looks like this (Particle is a structure of 2 floats):
const int GL=100000;
Particle particles[GL];
int cp01[2][GL];
#pragma acc declare create(particles,cp01)
…
I read that cudaMemcpy can be used with OpenACC, so I tried calling it from the host (cp is a host variable):
#pragma acc data copy(cp)
{
cudaMemcpy(&particles[cp01[0][0]], &particles[cp01[1][0]], cp*sizeof(Particle), cudaMemcpyDeviceToDevice);
}
I include the header
#include <cuda_runtime.h>
for CUDA, and build the project with:
cmake …/src -DCMAKE_CXX_COMPILER=pgc++ -DCMAKE_CXX_FLAGS="-acc -Minfo=all -Mcuda=llvm"
The program compiles but does not work: it hangs with no output on the console. How can I move arrays allocated on the device, using cudaMemcpy or some other way?
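For reference, one direction that avoids cudaMemcpy altogether is to do the copy in an OpenACC compute region, so the element assignments run on the device copy of the array. This is only a minimal sketch under assumptions, not a confirmed fix: the `Particle` layout (fields `x`, `y`) is guessed from "2 floats", the helper name `copy_particles_on_device` is hypothetical, the indices are passed as plain host integers instead of being read from cp01, and the source and destination ranges are assumed not to overlap (a parallel loop does not give memmove semantics). If cudaMemcpy is kept instead, my understanding is that the host addresses in the call would first have to be translated to device addresses, for example inside a `host_data use_device(particles)` region.

```cpp
#include <cassert>
#include <cstdlib>

// Assumed layout: the post only says Particle is "a structure of 2 floats".
struct Particle { float x, y; };

const int GL = 100000;
Particle particles[GL];
#pragma acc declare create(particles)

// Hypothetical helper: copy `count` particles from index `src` to index `dst`
// inside the device copy of `particles`.
// Assumption: the two ranges do not overlap (checked below).
void copy_particles_on_device(int dst, int src, int count)
{
    assert(count >= 0 && dst >= 0 && src >= 0);
    assert(dst + count <= GL && src + count <= GL);
    assert(std::abs(dst - src) >= count);  // ranges must not overlap

    // Each iteration is independent, so the loop can run in parallel.
    // Without an OpenACC compiler the pragma is ignored and this is a plain host loop.
    #pragma acc parallel loop present(particles)
    for (int i = 0; i < count; ++i)
        particles[dst + i] = particles[src + i];
}
```

A call corresponding to the one above might look like `copy_particles_on_device(dstIndex, srcIndex, cp)`, with the two indices taken from an up-to-date host copy of cp01. If the host copy of `particles` has to stay in sync, the call would be bracketed with `#pragma acc update device(particles)` before and `#pragma acc update self(particles)` after.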