Hello, I am attempting to copy a dynamically allocated 2D array over to the GPU. The actual code is distributed among over 3,700 lines of C++, so I’ll attempt to summarize my issue.
Data is allocated progressively throughout the code in a manner similar to:
//in class definition
unsigned int **mapCoarseToFinerRows;
//somewhere else in the code
mapCoarseToFinerRows = (unsigned int**)malloc(sizeof(unsigned int*) * k);
//in yet another part of the code
mapCoarseToFinerRows[level] = (unsigned int*)malloc(sizeof(unsigned int) * (num_coarse_vtxs + 1));
When I attempt to copy in the data with a simple:
#pragma acc data copyin(mapCoarseToFinerRows[k][num_coarse_vtxs + 1])
I get an internal compiler error:
PGC++-S-0000-Internal compiler error. pragma: bad ilmopc 307 (csrk.cpp: 192)
Installing a new version of the compiler is not possible because I am compiling my code on the Alabama Supercomputer Authority’s Dense Memory Cluster, so I just need to work around this. I have attempted to manually move the memory with acc_memcpy_to_device, but I am encountering a runtime error:
FATAL ERROR: variable in data clause is partially present on the device: name=_T30787688_39593 file:/mnt/beegfs/home/uahpal001/spmv/acc-spmv-csrk/csrk.cpp _ZN10CSRk_GraphC1ElllPjS0_PfSsSsSsbiPi line:364
The error points to line 364, which seems to correspond to the following OpenACC directive on line 360:
#pragma acc enter data copyin(r_vec[N + 1], c_vec[NNZ], val[NNZ], N, NNZ, this)
This is where I copy in all of the data that I can copy in using the pragmas.
Is there an easier workaround for this? What am I doing wrong?
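To make the question concrete, below is a minimal sketch of the per-row ("manual deep copy") pattern commonly used for jagged arrays like this one. It is not taken from the actual code: the names Levels, allocLevels, copyLevelsToDevice, sumLevel, and freeLevels are hypothetical, and it assumes the compiler accepts unstructured enter data directives on each row and attaches the device row pointers as later OpenACC versions describe. It uses the [start:length] array section form. When OpenACC is not enabled the pragmas are ignored and everything runs on the host.

```cpp
#include <cstdlib>

// Hypothetical stand-in for the class member in the question.
struct Levels {
    unsigned int **rows;   // rows[level] points to that level's data
    unsigned int  *sizes;  // number of entries in rows[level]
    int            k;      // number of levels
};

// Allocate k rows of differing lengths and fill them with a test pattern.
Levels allocLevels(int k, const unsigned int *sizes) {
    Levels lv;
    lv.k = k;
    lv.rows  = (unsigned int **)malloc(sizeof(unsigned int *) * k);
    lv.sizes = (unsigned int *)malloc(sizeof(unsigned int) * k);
    for (int level = 0; level < k; ++level) {
        lv.sizes[level] = sizes[level];
        lv.rows[level]  = (unsigned int *)malloc(sizeof(unsigned int) * sizes[level]);
        for (unsigned int j = 0; j < sizes[level]; ++j)
            lv.rows[level][j] = level * 1000u + j;
    }
    return lv;
}

// Assumed pattern: create the pointer array on the device first,
// then copy in each row separately with its own length.
void copyLevelsToDevice(Levels &lv) {
    unsigned int **rows = lv.rows;
    int k = lv.k;
    #pragma acc enter data create(rows[0:k])
    for (int level = 0; level < k; ++level) {
        unsigned int n = lv.sizes[level];
        #pragma acc enter data copyin(rows[level][0:n])
    }
}

// Sum one row; with OpenACC enabled this reads the device copy.
unsigned long sumLevel(Levels &lv, int level) {
    unsigned int **rows = lv.rows;
    unsigned int n = lv.sizes[level];
    unsigned long total = 0;
    #pragma acc parallel loop present(rows) reduction(+:total)
    for (unsigned int j = 0; j < n; ++j)
        total += rows[level][j];
    return total;
}

// Release device copies (rows first, then the pointer array), then host memory.
void freeLevels(Levels &lv) {
    unsigned int **rows = lv.rows;
    int k = lv.k;
    for (int level = 0; level < k; ++level) {
        unsigned int n = lv.sizes[level];
        #pragma acc exit data delete(rows[level][0:n])
        free(rows[level]);
    }
    #pragma acc exit data delete(rows[0:k])
    free(lv.rows);
    free(lv.sizes);
    lv.rows = nullptr;
    lv.sizes = nullptr;
}
```

The point of the sketch is that each row is moved with its own directive and its own length, instead of describing the whole structure in a single copyin clause. Whether a given PGI release handles this correctly is something I have not verified.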
EDIT: In the case that going back to an older PGI compiler solves this issue, here are all the versions of PGI installed on the Dense Memory Cluster:
pgi/14.1 pgi/14.9 pgi/15.3 pgi/15.10 pgi/16.5 pgi/17.5 pgi/17.7 pgi/18.1 pgi/18.5 pgi/18.10 pgi/19.5 pgi/20.1