PGI_ACC_TIME kills application

Hello,

When I set PGI_ACC_TIME=1, my application ends with:

[csc27:32581] *** Process received signal ***
[csc27:32581] Signal: Floating point exception (8)
[csc27:32581] Signal code: Invalid floating point operation (7)
[csc27:32581] Failing at address: 0x7ff51cbda5fd
[csc27:32581] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7ff5ee666340]
[csc27:32581] [ 1] /opt/pgi/linux86-64/2016/cuda/7.0/lib64/libcupti.so(+0xaf5fd)[0x7ff51cbda5fd]
[csc27:32581] [ 2] /opt/pgi/linux86-64/2016/cuda/7.0/lib64/libcupti.so(+0xaefd1)[0x7ff51cbd9fd1]
[csc27:32581] [ 3] /opt/pgi/linux86-64/2016/cuda/7.0/lib64/libcupti.so(+0xaf2ef)[0x7ff51cbda2ef]
[csc27:32581] [ 4] /opt/pgi/linux86-64/2016/cuda/7.0/lib64/libcupti.so(+0x2cafc9)[0x7ff51cdf5fc9]
[csc27:32581] [ 5] /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7ff5ee65e182]
[csc27:32581] [ 6] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7ff5edbf547d]
[csc27:32581] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 32581 on node csc27 exited on signal 8 (Floating point exception).
--------------------------------------------------------------------------

With PGI_ACC_TIME=0, everything works as expected.

The code with the !$acc directive:

!$acc kernels      
      DO itmp2 = 1, 8
         DO itmp1 = 1, KLON
            ZZ=SQRT  (PUU(itmp1,itmp2))
            ZXD=PGB(itmp1,itmp2,1)+ZZ*(PGB(itmp1,itmp2,2)+ZZ)
            ZXN=PGA(itmp1,itmp2,1)+ZZ*(PGA(itmp1,itmp2,2))
            PTT(itmp1,itmp2)=ZXN/ZXD
         ENDDO
      ENDDO
!$acc end kernels

Output of compiler:

mpif90   -c -I ../CBS -I /opt/pgi/linux86-64/2016/mpi/openmpi/include -Mcuda -Minfo=accel -module ../CODE -acc -c -byteswapio -O2 -Mvect=nosse  -Ktrap=fp -Mr8 -g ../CODE/lwtt.f
lwtt:
    103, Generating copyin(puu(:klon,:8),pgb(:klon,:,:),pga(:klon,:,:))
         Generating copyout(ptt(:klon,:8))
    104, Loop is parallelizable
    105, Loop is parallelizable
         Accelerator kernel generated
         Generating Tesla code
        104, !$acc loop gang, vector(4) ! blockidx%y threadidx%y
        105, !$acc loop gang, vector(32) ! blockidx%x threadidx%x

Hi hendrun,

Unfortunately, I haven’t seen this before, so I don’t know what’s wrong. If you can, please send a reproducing example to PGI Customer Service (trs@pgroup.com) and we can take a look.

Thanks,
Mat

Hi Mat,

Here is a reproducing example:

PROGRAM REMORG
      
      IMPLICIT NONE
      
      REAL     :: ZXD, ZXN, ZZ
      REAL     :: PGA(100,100,2), PGB(100,100,2)
      REAL     :: PTT(100,100), PUU(100,100)      
      INTEGER  :: ITMP1, ITMP2
      INTEGER  :: IERROR

      WRITE(*,*) 'Start'

      CALL MPI_INIT(IERROR)

      DO itmp2 = 1, 100
         DO itmp1 = 1, 100
            PUU(itmp1,itmp2) = 1.
            PGA(itmp1,itmp2,1) = 1.
            PGA(itmp1,itmp2,2) = 1.
            PGB(itmp1,itmp2,1) = 1.
            PGB(itmp1,itmp2,2) = 1.
         ENDDO
      ENDDO
      
!$acc kernels
      DO itmp2 = 1, 100
         DO itmp1 = 1, 100
            ZZ=SQRT  (PUU(itmp1,itmp2))
            ZXD=PGB(itmp1,itmp2,1)+ZZ*(PGB(itmp1,itmp2,2)+ZZ)
            ZXN=PGA(itmp1,itmp2,1)+ZZ*(PGA(itmp1,itmp2,2))
            PTT(itmp1,itmp2)=ZXN/ZXD
         ENDDO
      ENDDO
!$acc end kernels

      CALL MPI_FINALIZE(IERROR)

      WRITE(*,*) 'Finish'
            
      END PROGRAM REMORG

For compiling:

mpif90 -I /opt/pgi/linux86-64/2016/mpi/openmpi/include -Minfo=accel -acc -Ktrap=fp remorg.f

And this is how I execute the program:

mpirun -np 1 a.out

With PGI_ACC_TIME=0, the program runs.
With PGI_ACC_TIME=1, the program fails.

Thanks,
hendrun

Hi Hendrun,

It looks like the NVIDIA CUPTI profiling library is throwing an FPE (floating point exception), which “-Ktrap=fp” then turns into a fatal signal. You’ll need to remove the “-Ktrap=fp” flag from your compile flags to produce a profile.
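
One way to apply this workaround without editing the regular build flags is a separate profiling build, roughly as sketched below. This is only an illustration: the variable names FFLAGS and PROFILE_FFLAGS and the output name remorg_prof are made up for the example, while the flags, include path, and file name remorg.f are taken from the compile line earlier in this thread. The build and run step is skipped if mpif90 or remorg.f is not present.

```shell
#!/bin/sh
# Flags from the compile line in this thread (remorg.f example).
FFLAGS="-Minfo=accel -acc -Ktrap=fp"

# Build a second flag list for profiling: drop any -Ktrap=... option
# and keep everything else unchanged.
PROFILE_FFLAGS=""
for flag in $FFLAGS; do
    case "$flag" in
        -Ktrap=*) ;;  # skip floating-point trapping for the profiling build
        *) PROFILE_FFLAGS="${PROFILE_FFLAGS:+$PROFILE_FFLAGS }$flag" ;;
    esac
done
echo "profiling flags: $PROFILE_FFLAGS"

# Only compile and run where the MPI compiler wrapper and the source exist.
# remorg_prof is a hypothetical output name chosen for this sketch.
if command -v mpif90 >/dev/null 2>&1 && [ -f remorg.f ]; then
    mpif90 -I /opt/pgi/linux86-64/2016/mpi/openmpi/include \
        $PROFILE_FFLAGS remorg.f -o remorg_prof \
        && PGI_ACC_TIME=1 mpirun -np 1 ./remorg_prof
fi
```

The regular build can keep “-Ktrap=fp” for catching real floating point errors in the application; only the binary used with PGI_ACC_TIME=1 is built without it.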

I added an issue report (TPR#22434) and sent it to engineering for further investigation.

Thanks!
Mat

TPR 22434 - OpenACC: Using “-Ktrap=fp” with PGI_ACC_TIME triggers FPE in libcupti

This is fixed in the current 16.7 release.

thanks,
dave