acc update device with arrays

Hi,

I’m porting part of a code from OpenMP 4.5 to OpenACC. The data is allocated and initialized on the device once (it isn’t re-allocated each time the loop executes), and values modified on the host are then pushed to the device with an update. However, modifying some existing macros to achieve loop-level parallelism is causing issues:

Example of OpenMP version:

 
    const int *_th_plo = loVect();                              \
    const int *_th_plen = length();                           \
    T* _th_p = dptr;                                                    \
    const int _ns = (ns);                                             \
    const int _nc = (nc);                                             \
    _Pragma("omp target update to(_th_p[_ns*_th_plen[2]:(_ns+_nc)*_th_plen[2]])") \
    _Pragma("omp target teams distribute parallel for collapse(3)")     \
    for(int _n = _ns; _n < _ns+_nc; ++_n) {              \
        for(int _k = 0; _k < _b_len[2]; ++_k) {             \
            for(int _j = 0; _j < _b_len[1]; ++_j) {            \
                int nR = _n; nR += 0;                                 \

OpenACC version:

    _Pragma("acc update device(_th_p[_ns*_th_plen[2]:(_ns+_nc)*_th_plen[2]])") \
    _Pragma("acc parallel loop collapse(3)")                                   \

I get the following error:

“…/…/Src/C_BaseLib/BaseFab.H”, line 1460: error: expected a “]”
ForAllThisBNN(T,bx,ns,num)
^
detected during:
instantiation of “void BaseFab::performSetVal(T, const Box &,
int, int) [with T=int]” at line 1074
instantiation of “void BaseFab::setVal(T) [with T=int]” at line
334 of “…/…/Src/C_BaseLib/FabArray.cpp”

And it will only compile with:

    _Pragma("acc update device(_th_p[_ns:(_ns+_nc])") \
    _Pragma("acc parallel loop collapse(3)")          \

But since the data isn’t updated correctly, this causes an illegal address error:

call to cuStreamSynchronize returned error 700: Illegal address during kernel execution
call to cuMemFreeHost returned error 700: Illegal address during kernel execution

Thanks,
Zahra

Hi Zahra,

My guess is that there’s something wrong with how the macro is getting preprocessed, but I’m not sure what. Can you post a small reproducer?

If not, try just preprocessing the file (-P) and look at the generated preprocessed file (.i) to see how the macro was expanded. It might give some clues. Note that if you have a “-o” option on the command line, remove it; otherwise the preprocessed output will be written to the file named by “-o”.

I find it odd that the second version worked, given that you’re missing a closing parenthesis in “(_ns+_nc])”.

Note that the illegal address error may be due to something else, such as needing to add “present(_th_p)” to the “parallel” directive. Though without an example of the code, I can’t be sure.
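
For reference, here is a minimal sketch of the pattern being discussed. It is not the ForAllThisBNN macro from BaseFab.H: the macro name ADD_TO_COMPONENTS, the helper add_to_components, and the flat layout of "ncomp components of stride values each" are all made up for illustration. It assumes the parser may be tripping over the nested subscripts inside the pragma string, so it hoists the bounds into plain scalars first; that is only a guess at the cause. It also keeps in mind that an OpenACC subarray is written [start:length], so the second value is a count, not an end index. Without an OpenACC compiler the pragmas are ignored and the loop simply runs on the host.

```cpp
#include <cassert>
#include <vector>

// Hypothetical macro (an assumption for illustration, not from BaseFab.H).
// Bounds are hoisted into scalars so the pragma strings contain no nested
// subscripts, and the subarray is given as [start:length].
#define ADD_TO_COMPONENTS(ptr, ns, nc, stride, val)                     \
    do {                                                                \
        double* _p = (ptr);                                             \
        const int _start = (ns) * (stride);                             \
        const int _len = (nc) * (stride);                               \
        const double _v = (val);                                        \
        _Pragma("acc update device(_p[_start:_len])")                   \
        _Pragma("acc parallel loop present(_p)")                        \
        for (int _i = _start; _i < _start + _len; ++_i) {               \
            _p[_i] += _v;                                               \
        }                                                               \
        _Pragma("acc update self(_p[_start:_len])")                     \
    } while (0)

// Hypothetical driver: allocates once on the device, changes the data on
// the host, then lets the macro push the changed section to the device,
// add `val` to components [ns, ns+nc), and copy that section back.
std::vector<double> add_to_components(int ncomp, int stride,
                                      int ns, int nc, double val) {
    assert(ns >= 0 && nc >= 0 && ns + nc <= ncomp);
    const int total = ncomp * stride;
    std::vector<double> data(total, 0.0);
    double* dptr = data.data();

    // One-time device allocation; the data stays resident afterwards.
    #pragma acc enter data copyin(dptr[0:total])

    // Host-side change made after the device copy was created.
    for (int i = 0; i < total; ++i) {
        dptr[i] = 1.0;
    }

    ADD_TO_COMPONENTS(dptr, ns, nc, stride, val);

    #pragma acc exit data delete(dptr[0:total])
    return data;
}
```

Calling add_to_components(3, 4, 1, 2, 5.0) touches elements 4 through 11 only; the first component keeps its host value. If the real macro still fails with the bounds hoisted like this, the preprocessed .i file is probably the quickest way to see what the compiler is actually being handed.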

-Mat