Hello,
I am porting some code to use OpenMP target offload, and I came across some surprising behaviour.
This small test code:
#include <stdio.h>
int main() {
  // Dummy address, never dereferenced; only its value is printed.
  double *ptr = reinterpret_cast<double*>(0xdeadbeef);
  printf("ptr=%p\n", ptr);
  // Ordering 1: is_device_ptr before if
  #pragma omp target parallel for simd is_device_ptr(ptr) if(true)
  for(int i = 0; i < 1; ++i) {
    printf("ptr=%p\n", ptr);
  }
  // Ordering 2: if before is_device_ptr
  #pragma omp target parallel for simd if(true) is_device_ptr(ptr)
  for(int i = 0; i < 1; ++i) {
    printf("ptr=%p\n", ptr);
  }
  return 0;
}
outputs the following when built with NVHPC:
ptr=0xdeadbeef
ptr=0xdeadbeef
ptr=(nil)
on my system, which has NVHPC 21.9, CUDA 11.4.2 and V100 GPUs.
The same code compiled with clang++ 12 always prints a non-null pointer, and I could not find anything in the OpenMP specification saying that the order of these clauses is meaningful.
If you agree that this is a bug then it would be great to get it fixed. In the meantime, we can just prefer the ordering with the if clause second (is_device_ptr first), which gives the correct answer.
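For anyone who wants to check a given compiler without reading printf output, here is a minimal sketch that reports whether both clause orderings see the same pointer value. The helper names (seen_device_ptr_first, seen_if_first, clause_orderings_agree) are hypothetical, and the extra map(tofrom: seen) clause is my own addition to get the value back to the host. I have not verified that the issue still reproduces on NVHPC 21.9 with that clause present, so treat this as a starting point only.

```cpp
#include <cstdint>

// Hypothetical helper: value of `ptr` as observed inside a target region
// that lists is_device_ptr BEFORE if (the ordering that printed the
// expected value in the test above).
std::uintptr_t seen_device_ptr_first(double* ptr) {
  std::uintptr_t seen = 0;
  #pragma omp target parallel for simd is_device_ptr(ptr) if(true) map(tofrom: seen)
  for (int i = 0; i < 1; ++i) {
    seen = reinterpret_cast<std::uintptr_t>(ptr);
  }
  return seen;
}

// Hypothetical helper: same region, but with if BEFORE is_device_ptr
// (the ordering that printed (nil) in the test above).
std::uintptr_t seen_if_first(double* ptr) {
  std::uintptr_t seen = 0;
  #pragma omp target parallel for simd if(true) is_device_ptr(ptr) map(tofrom: seen)
  for (int i = 0; i < 1; ++i) {
    seen = reinterpret_cast<std::uintptr_t>(ptr);
  }
  return seen;
}

// True when both orderings observe the host-side value of `ptr`.
// The pointer is only compared, never dereferenced.
bool clause_orderings_agree(double* ptr) {
  const std::uintptr_t expected = reinterpret_cast<std::uintptr_t>(ptr);
  return seen_device_ptr_first(ptr) == expected &&
         seen_if_first(ptr) == expected;
}
```

Calling clause_orderings_agree(reinterpret_cast<double*>(0xdeadbeef)) should return true on a compiler that treats the two orderings identically. A false result would point at the same kind of ordering sensitivity described above.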
Best, Olli