There are several postings on this topic, but I could not find an exchange that addresses the problem I have. I have several 3-dimensional arrays in a C++ class that describe chemical concentrations and are updated every time step in an initial value problem. I would like to use the HPC SDK to run sections of the code on GPUs.
I’ve created a stripped-down version of the offending module:
void UpdateArrays(double*** A_T_array, double*** AX_array) {
    #pragma acc kernels
    {
        #pragma acc loop independent collapse(3)
        for (int jx = 0; jx < MaxPoints.x; jx++)
            for (int jy = 0; jy < MaxPoints.y; jy++)
                for (int jz = 0; jz < MaxPoints.z; jz++) {
                    // diffusion of free ATM
                    AX_array[jx][jy][jz]
                        = A_T_array[jx][jy][jz];
                }
    }
}
Here is the relevant portion of the compile log:
EulerIntegrationMethod::UpdateArrays(double ***, double ***):
    502, Generating implicit copyin(this[:]) [if not already present]
    505, Loop is parallelizable
    506, Loop is parallelizable
    507, Accelerator restriction: size of the GPU copy of A_T_array,AX_array is unknown
         Loop is parallelizable
         Generating Tesla code
        505, #pragma acc loop gang, vector(128) collapse(3) /* blockIdx.x threadIdx.x */
        506,   /* blockIdx.x threadIdx.x collapsed */
        507,   /* blockIdx.x threadIdx.x collapsed */
    507, Generating implicit copyout(AX_array[i1][i2][:]) [if not already present]
         Generating implicit copyin(A_T_array[i1][i2][:]) [if not already present]
nvc++ -acc -Minfo=accel -Mlarge_arrays -O3 -o SiteRepairATMC_1.X SiteRepairATMC_1.X.o -g -fopenmp -L/home/bkeister/g++/lib -lphys -lconfig++ -L /usr/local/lib -lm
When I run the executable, I get the following:
call to cuStreamSynchronize returned error 700: Illegal address during kernel execution
I ran cuda-memcheck on the executable and get the following:
========= CUDA-MEMCHECK
========= Invalid global read of size 8
========= at 0x00000d28 in /home/bkeister/g++/app/SiteRepair-GPU/SiteRepairATMC_1.X.cc:511:EulerIntegrationMethod::UpdateArrays_507_gpu(double***, double***)
========= by thread (48,0,0) in block (0,0,0)
========= Address 0x00000000 is out of bounds
I thought this might be a matter of running out of memory, but I ran the code with a dynamic array size of [7][3][7] and still got the error.
There is another function involving two similar arrays for which I don’t get this runtime error.
I’m new to this and not sure what I’ve missed.
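For reference, here is a minimal self-contained sketch of the same copy loop with the array shapes spelled out in the data clauses, which is what the "size of the GPU copy of A_T_array,AX_array is unknown" message seems to point at. It is an assumption, not a confirmed fix for the error above. The names Alloc3D, Free3D, and CopyArrays and the parameters nx, ny, nz are hypothetical stand-ins (nx, ny, nz play the role of MaxPoints.x, MaxPoints.y, MaxPoints.z), and the allocation scheme is only a guess at how the dynamic arrays are built.

```cpp
#include <cassert>

// Hypothetical helper: allocate an nx-by-ny-by-nz array as nested pointers,
// with every element zero-initialised.
double*** Alloc3D(int nx, int ny, int nz) {
    double*** a = new double**[nx];
    for (int i = 0; i < nx; i++) {
        a[i] = new double*[ny];
        for (int j = 0; j < ny; j++)
            a[i][j] = new double[nz]();
    }
    return a;
}

// Hypothetical helper: release an array created by Alloc3D.
void Free3D(double*** a, int nx, int ny) {
    for (int i = 0; i < nx; i++) {
        for (int j = 0; j < ny; j++)
            delete[] a[i][j];
        delete[] a[i];
    }
    delete[] a;
}

// Copy src into dst. The extents are passed as plain ints (rather than read
// from a class member) and every dimension is given in the data clauses, so
// the compiler does not have to guess the size of the device copies.
// Without OpenACC enabled the pragma is ignored and the loop runs on the host.
void CopyArrays(double*** src, double*** dst, int nx, int ny, int nz) {
    assert(nx > 0 && ny > 0 && nz > 0);
    #pragma acc parallel loop collapse(3) \
        copyin(src[0:nx][0:ny][0:nz]) copyout(dst[0:nx][0:ny][0:nz])
    for (int jx = 0; jx < nx; jx++)
        for (int jy = 0; jy < ny; jy++)
            for (int jz = 0; jz < nz; jz++)
                dst[jx][jy][jz] = src[jx][jy][jz];
}
```

With the [7][3][7] test size, this would be called as CopyArrays(A_T_array, AX_array, 7, 3, 7) on two arrays built with Alloc3D(7, 3, 7). In the real code the data region would presumably sit outside the time-step loop so the arrays are not transferred on every step.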