acc update device with arrays


I’m porting part of a code from OpenMP 4.5 to OpenACC. The data is allocated and initialized on the device once (it isn’t re-allocated each time the loop executes), and values that change on the host are then pushed to the device with an update. However, modifying some existing macros to achieve loop-level parallelism is causing issues:

Example of OpenMP version:

    const int *_th_plo = loVect();                              \
    const int *_th_plen = length();                           \
    T* _th_p = dptr;                                                    \
    const int _ns = (ns);                                             \
    const int _nc = (nc);                                             \
    _Pragma("omp target update to(_th_p[_ns*_th_plen[2]:(_ns+_nc)*_th_plen[2]])") \
    _Pragma("omp target teams distribute parallel for collapse(3)")     \
    for(int _n = _ns; _n < _ns+_nc; ++_n) {              \
        for(int _k = 0; _k < _b_len[2]; ++_k) {             \
            for(int _j = 0; _j < _b_len[1]; ++_j) {            \
                int nR = _n; nR += 0;                                 \

OpenACC version:

_Pragma("acc update device(_th_p[_ns*_th_plen[2]:(_ns+_nc)*_th_plen[2]])") \
_Pragma("acc parallel loop collapse(3)")						   \

I get the following error:

“…/…/Src/C_BaseLib/BaseFab.H”, line 1460: error: expected a “]”
detected during:
instantiation of “void BaseFab::performSetVal(T, const Box &,
int, int) [with T=int]” at line 1074
instantiation of “void BaseFab::setVal(T) [with T=int]” at line
334 of “…/…/Src/C_BaseLib/FabArray.cpp”

And will only compile with:

_Pragma("acc update device(_th_p[_ns:(_ns+_nc])") \
_Pragma("acc parallel loop collapse(3)")		  \

But since the data isn’t updated correctly, this causes an illegal address error:

call to cuStreamSynchronize returned error 700: Illegal address during kernel execution
call to cuMemFreeHost returned error 700: Illegal address during kernel execution


Hi Zahra,

My guess is that there’s something wrong with how the macro is getting preprocessed, but exactly what, I’m not sure. Can you post a small reproducer?

If not, try just preprocessing the file (-P) and look at the generated preprocessed file (.i) to see how the macro was expanded. It might give some clues. Note that if you have the “-o” option on the command line, remove it; otherwise the preprocessed output will be written to the file named by that option.

I find it odd that the second version compiled, given that the closing parenthesis in “(_ns+_nc])” is misplaced (it comes after the “]”).

Note that the illegal address error may be due to something else, such as needing to add “present(_th_p)” to the “parallel” directive. Though without an example of the code, I can’t be sure.