Undefined reference to `__pgi_ieee_is_nan_dev_r8'

Hi,
I have run into a strange problem with nvfortran (tested with nvhpc/23.7 and nvhpc/24.1). Using a present clause on a kernel breaks the resolution of the ieee_is_nan call at link time.

This code fails to link:

program bide

use, intrinsic :: ieee_arithmetic

implicit none

double precision, allocatable, dimension(:) :: A
integer :: i
logical :: ok = .true.

allocate(A(1024))

A = -99.0d0

print*,ieee_is_nan(A(1))
!$acc enter data copyin(A)
!$ACC parallel loop present(A) reduction(.and.:ok)
do i=1,1024
   ok=ok .and. ieee_is_nan(A(i))
enddo

end program bide

nvfortran -acc bide.f90
bide.f90:19: undefined reference to `__pgi_ieee_is_nan_dev_r8'

whereas this version (with implicit data movement) works:


program bide

use, intrinsic :: ieee_arithmetic

implicit none

double precision, allocatable, dimension(:) :: A
integer :: i
logical :: ok = .true.

allocate(A(1024))

A = -99.0d0

print*,ieee_is_nan(A(1))
!$ACC parallel loop reduction(.and.:ok)
do i=1,1024
   ok=ok .and. ieee_is_nan(A(i))
enddo

end program bide

(This small piece of code is inspired by a previous thread, but it does not show the same problem.)

Patrick

In order to use the intrinsics on the device with OpenACC, the compiler needs to inline the intrinsic. However, the way the array is presented can affect whether the inlining succeeds, which is likely what is happening here. I have an open issue report for MINLOC, which shows similar behavior.

The workaround is to add the “-cuda” flag to enable CUDA Fortran, which has interfaces for these intrinsics and thus allows them to be inlined correctly.

% nvfortran -acc test.F90
/usr/bin/ld: /tmp/nvfortranR-Kkeb2cW5v50.o: in function `MAIN_':
/local/home/mcolgrove/test.F90:19: undefined reference to `__pgi_ieee_is_nan_dev_r8'
pgacclnk: child process exit status 1: /usr/bin/ld
% nvfortran -acc test.F90 -cuda
%

Hi Mat,

OK, I understand the problem better now. Yes, adding the -cuda option works, but it does not seem to work in combination with additional options. On my laptop:

(base) bash-4.4$ mpifort -acc -cuda bide.f90 
(base) bash-4.4$ mpifort -O2 -g -acc=noautopar,gpu,host -gpu=cc75,lineinfo \
                                             -Minfo=accel  -cuda  bide.f90
bide:
     14, Generating enter data copyin(a(:))
     17, Generating present(a(:))
         Generating NVIDIA GPU code
         18, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
             Generating reduction(.and.:ok)
     17, Generating implicit copy(ok) [if not already present]
/tmp/nvfortranGSmcGJC5Z9gG.o: in function `MAIN_':
/home/begou/BOULOT/YALES2/aqat-gpu/R1_ARRAYS/ENLARGE-NOKEEP/bide.f90:19: undefined reference to `__pgi_ieee_is_nan_dev_r8'
pgacclnk: child process exit status 1: /usr/bin/ld

Any idea?

For people who run into the same problem with the ieee_is_nan intrinsic in a kernel: the workaround relies on the fact that, by definition, a NaN is not equal to itself. So it is possible to replace:

ok=ok .and. ieee_is_nan(A(i))

by

ok=ok .and. (A(i) /= A(i))

Patrick

It’s the “host” sub-option that’s interfering, given that CUDA Fortran can’t be applied to host code.

You’ll need to compile to target only the GPU, or use your workaround.
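
For readers who want to try the self-comparison workaround from earlier in the thread, here is a minimal sketch of a complete test program. It is only an illustration, not a confirmed fix from the compiler team: the program name nan_check, the variable has_nan, and the switch to an .or. reduction (to answer "does the array contain any NaN?") are my own choices and do not come from the original posts. It also assumes the compiler does not fold the comparison a(i) /= a(i) to .false., which aggressive floating-point optimization settings might do, so the result is worth checking with your own flags.

```fortran
program nan_check

   implicit none

   integer, parameter :: n = 1024
   double precision, allocatable, dimension(:) :: a
   integer :: i
   logical :: has_nan

   allocate(a(n))
   a = -99.0d0
   has_nan = .false.

   ! Explicit data region, as in the failing example from the thread.
   !$acc enter data copyin(a)

   !$acc parallel loop present(a) reduction(.or.:has_nan)
   do i = 1, n
      ! A NaN is the only value that compares unequal to itself,
      ! so no call to ieee_is_nan is needed on the device.
      has_nan = has_nan .or. (a(i) /= a(i))
   end do

   !$acc exit data delete(a)

   print *, 'NaN found: ', has_nan

   deallocate(a)

end program nan_check
```

Because the loop body no longer calls ieee_is_nan, there is no device routine to resolve at link time, so this form should not depend on -cuda and should be compatible with a host target in -acc. Without -acc the directives are treated as comments and the same loop runs on the host.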