Bug: NVHPC 25.3 and checking unallocated Fortran arrays in OpenMP target loops

Hi,

Flagging what seems to be a bug in the latest compilers.

In some of our code, we call allocated() on an array inside a loop to control parts of the loop's behaviour. With nvfortran 25.3, if the array being checked is unallocated, I get incorrect behaviour: either the loop does not execute properly, or the program fails with a CUDA segfault error. Below is a minimal reproducible example (MRE):

program test

    implicit none
    integer, allocatable:: a(:)
    integer:: i, b(10)
    logical:: testarr(10)

    b(:) = 1
    testarr = .true.

    !$omp target loop map(tofrom: b, testarr) map(to: a) map(tofrom: testarr)
    do i = 1, 10
        testarr(i) = allocated(a)
        b(i) = b(i) + 2*b(i)
    enddo

    write(*, *) all(b(:) == 3)
    write(*, *) all(testarr)

end program test

The result I get on my laptop GPU (4060, Mint 22.1) is

F # should be T
T # should be F

I get the same result on my local HPC system (V100s, Rocky 8). Both were compiled with

nvfortran -mp=gpu -O0

When running with NV_ACC_NOTIFY=2, I get:

upload CUDA data  file=/home/edwardy/test.f90 function=test line=12 device=0 threadid=1 variable=b(:) bytes=40
upload CUDA data  file=/home/edwardy/test.f90 function=test line=12 device=0 threadid=1 variable=testarr(:) bytes=40
download CUDA data  file=/home/edwardy/test.f90 function=test line=18 device=0 threadid=1 variable=testarr(:) bytes=40
download CUDA data  file=/home/edwardy/test.f90 function=test line=18 device=0 threadid=1 variable=b(:) bytes=40

With older compiler versions, the code behaves as I would expect (i.e., I get T, F).
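
A possible workaround, sketched here under assumptions and not confirmed against 25.3: if the allocation status of the array cannot change inside the loop, the allocated() check can be evaluated once on the host and the result carried into the target region as a scalar, so the unallocated array is never mapped or referenced on the device. The name a_is_allocated is a hypothetical helper introduced for illustration; everything else follows the MRE above.

```fortran
program test_workaround

    implicit none
    integer, allocatable:: a(:)
    integer:: i, b(10)
    logical:: testarr(10)
    logical:: a_is_allocated  ! hypothetical helper, not part of the original MRE

    b(:) = 1
    testarr = .true.

    ! Evaluate the allocation status once on the host, before the target region.
    a_is_allocated = allocated(a)

    ! a is no longer referenced in the region, so it is not mapped.
    ! The scalar logical is implicitly firstprivate in the target region.
    !$omp target loop map(tofrom: b, testarr)
    do i = 1, 10
        testarr(i) = a_is_allocated
        b(i) = b(i) + 2*b(i)
    enddo

    write(*, *) all(b(:) == 3)
    write(*, *) all(testarr)

end program test_workaround
```

This only applies when the check is loop-invariant, as it is in the MRE. The intended output is T followed by F, matching what the older compilers give for the original code.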

Thanks for the report, edoy. I was able to reproduce the issue here and have sent a report, TPR#37397, to engineering for further investigation.

-Mat

Thanks, Mat!