Hi,
Flagging what seems to be a bug in the latest compilers.
In some of our code, we often check inside a loop whether an array is allocated, and use the result to control the loop's behaviour. However, with nvfortran 25.3, if the array being checked is unallocated, I get incorrect behaviour: either the loop does not work properly, or I get a CUDA segfault. Below is an MRE:
program test
  implicit none
  integer, allocatable:: a(:)
  integer:: i, b(10)
  logical:: testarr(10)

  b(:) = 1
  testarr = .true.

  !$omp target loop map(tofrom: b, testarr) map(to: a) map(tofrom: testarr)
  do i = 1, 10
    testarr(i) = allocated(a)
    b(i) = b(i) + 2*b(i)
  enddo

  write(*, *) all(b(:) == 3)
  write(*, *) all(testarr)
end program test
The result I get on my laptop GPU (4060, Linux Mint 22.1) is:
F # should be T
T # should be F
I get the same result on my local HPC (V100s, Rocky 8). Both were compiled with:
nvfortran -mp=gpu -O0
When running with NV_ACC_NOTIFY=2, I get:
upload CUDA data file=/home/edwardy/test.f90 function=test line=12 device=0 threadid=1 variable=b(:) bytes=40
upload CUDA data file=/home/edwardy/test.f90 function=test line=12 device=0 threadid=1 variable=testarr(:) bytes=40
download CUDA data file=/home/edwardy/test.f90 function=test line=18 device=0 threadid=1 variable=testarr(:) bytes=40
download CUDA data file=/home/edwardy/test.f90 function=test line=18 device=0 threadid=1 variable=b(:) bytes=40
With older versions, the code behaves as I would expect (i.e., I get T, F).
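A possible workaround, as an untested sketch: query the allocation status once on the host and pass the resulting scalar to the device, so the offloaded loop never touches the unallocated array. The variable `a_is_allocated` and the program name `test_workaround` are hypothetical names introduced here; everything else, including the directive, is kept as in the MRE. I have not confirmed that this avoids the problem in 25.3.

```fortran
program test_workaround
  implicit none
  integer, allocatable:: a(:)
  integer:: i, b(10)
  logical:: testarr(10)
  logical:: a_is_allocated  ! hypothetical helper, not part of the original MRE

  b(:) = 1
  testarr = .true.

  ! Evaluate allocated(a) on the host, before entering the target region.
  a_is_allocated = allocated(a)

  ! Same directive as the MRE, but 'a' itself is no longer mapped;
  ! only the host-evaluated logical scalar is sent to the device.
  !$omp target loop map(tofrom: b, testarr) map(to: a_is_allocated)
  do i = 1, 10
    testarr(i) = a_is_allocated
    b(i) = b(i) + 2*b(i)
  enddo

  write(*, *) all(b(:) == 3)
  write(*, *) all(testarr)
end program test_workaround
```

This only applies when the allocation status cannot change inside the loop, which is the case in the MRE. It would be compiled the same way as above (nvfortran -mp=gpu -O0).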