I tried to compile some code and got a massive block of output that I’m not quite sure what to make of.
pgfortran -Mcuda=cuda8.0 -o CRAFTest CRAFT_CLEAN.f90
pgfortran-Fatal-/software/pgi-el7-x86_64/linux86-64/17.3/bin/pgf902 TERMINATED by signal 11
Arguments to /software/pgi-el7-x86_64/linux86-64/17.3/bin/pgf902
/software/pgi-el7-x86_64/linux86-64/17.3/bin/pgf902 /tmp/pgfortranjuqiBR5H-RmF.ilm -fn CRAFT_CLEAN.f90 -opt 1 -terse 1 -inform warn -x 51 0x20 -x 119 0xa10000 -x 122 0x40 -x 123 0x1000 -x 127 4 -x 127 17 -x 19 0x400000 -x 28 0x40000 -x 120 0x10000000 -x 70 0x8000 -x 122 1 -x 125 0x20000 -quad -x 59 4 -tp haswell -x 120 0x1000 -x 124 0x1400 -y 15 2 -x 57 0x3b0000 -x 58 0x48000000 -x 49 0x100 -x 120 0x200 -astype 0 -x 70 0x40000000 -x 124 1 -x 189 0x8000 -y 163 0xc0000000 -x 189 0x10 -y 189 0x4000000 -x 137 1 -x 121 0x400 -x 180 0x4000000 -x 176 0x100 -cudacap 30 -cudacap 35 -cudacap 50 -cudacap 60 -cudaver 8.0 -cmdline ‘+pgfortran CRAFT_CLEAN.f90 -Mcuda=cuda8.0 -o CRAFTest’ -asm /tmp/pgfortranPuqi7A8Rea1X.s
I assume the signal 11 is what actually killed the compile, and that it means a segmentation fault, but I’m not sure how to find where the fault is. I can provide the full code if that would be helpful, but it’s rather sizable.
The back-end compiler (pgf902) seg faulted so the error is a compiler issue.
Are you able to provide a reproducing example? If so, I can check to see if it’s been fixed already, and if not, can add a problem report.
Since this is the 17.3 compiler, there’s a good chance that it’s been fixed since then, so you might consider trying a more recent version.
I’ve attached the code that triggers it. I’m not sure which part is responsible, so I can do some testing to try to isolate the problem. I’ll also contact the people running the cluster to see whether we have a newer version of the compiler or could install one.
The actual code is CRAFT_CLEAN, the other ones are input files for running it.
Thank you for your assistance.
CRAFT.zip (11.6 KB)
I believe the issue lies somewhere within divdiff. I’m not quite sure where though. I should mention I did not write divdiff, so if a deep dive into it is needed I may be a little slow to respond.
I’ve included DivDiff here
attributes(device) function DIVDIF(F,A,NN,X,MM)
integer NN,N,MM,M,MPLUS,IX,IY,MID,NPTS,IP,L,ISUB,J,I,MMAX, errorNumber
real*8 :: SUM,DIVDIF
real*8 :: A(NN),F(NN),T(20),D(20),X
! TABULAR INTERPOLATION USING SYMMETRICALLY PLACED ARGUMENT POINTS.
! START. FIND SUBSCRIPT IX OF X IN ARRAY A.
IF( (NN.LT.2) .OR. (MM.LT.1) ) GO TO 20
! (SEARCH INCREASING ARGUMENTS.)
IF(X.GE.A(MID)) GO TO 2
GO TO 3
! (IF TRUE.)
3 IF(IY-IX.GT.1) GO TO 1
GO TO 7
! COPY REORDERED INTERPOLATION POINTS INTO (T(I),D(I)), SETTING
! *EXTRA* TO TRUE IF M+2 POINTS TO BE USED.
GO TO 9
IF((1.LE.ISUB).AND.(ISUB.LE.N)) GO TO 10
! (SKIP POINT.)
GO TO 11
! (INSERT POINT.)
11 IF(IP.LT.NPTS) GO TO 8
! REPLACE D BY THE LEADING DIAGONAL OF A DIVIDED-DIFFERENCE TABLE, SUP-
! PLEMENTED BY AN EXTRA LINE IF *EXTRA* IS TRUE.
DO 14 L=1,M
IF(.NOT.EXTRA) GO TO 12
DO 13 J=L,M
! EVALUATE THE NEWTON INTERPOLATION FORMULA AT X, AVERAGING TWO VALUES
! OF LAST DIFFERENCE IF *EXTRA* IS TRUE.
DO 15 L=1,M
20 errorNumber = 1
errorNumber = 2
101 errorNumber = 3
102 errorNumber = 4
END function divdif
Is this the same version of the code you used? I tried compiling but see the error:
% pgf90 -c -Mcuda CRAFT_CLEAN.f90
PGF90-F-0004-Unable to open MODULE file gpufunctions.mod (CRAFT_CLEAN.f90: 7)
PGF90/x86-64 Linux 17.3-0: compilation aborted
The problem is that the “GPUFunctions” module is declared after the main program that uses it. I fixed this, but then ran into a bunch of syntax errors. Also, you still have the procedure pointers in there, which leads me to believe that this may be an earlier version of the code?
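As a rough illustration of the ordering issue only: within a single source file, the module has to appear before the program unit that uses it, so that gpufunctions.mod exists by the time the USE statement is reached. The device function below is a made-up placeholder and is not taken from CRAFT_CLEAN.f90.

```fortran
! Sketch under assumptions: only the module name comes from the error
! message; "twice" is a hypothetical placeholder routine.
module GPUFunctions
  implicit none
contains
  attributes(device) function twice(x) result(y)
    real(8) :: x, y
    y = 2.0d0 * x
  end function twice
end module GPUFunctions

! The main program follows the module, so the USE below can be resolved.
program main
  use GPUFunctions
  implicit none
  print *, 'GPUFunctions module was found at compile time'
end program main
```

Putting the module in its own file and compiling that file first would work as well.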
Oh dear, my bad, that is a very old version. I apologise.
CRAFT.zip (12.2 KB)
Thanks Cattaneo, I’m now able to reproduce the error. It does still occur with our current 20.9 compilers, so I’ve added a problem report, TPR #29228, and sent it to engineering.
The problem is that the front-end compiler does not catch the erroneous use of a COMMON block within device code, which then sends the back-end compiler down a bad code path. Device data cannot be contained in COMMON blocks and should be converted to module variables. I’ve fixed the code (attached), and it now compiles.
CRAFT_CLEAN.zip (10.9 KB)
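For anyone else hitting the same crash, the COMMON-to-module conversion looks roughly like the sketch below. All names here (tabdata_mod, table, lookup, fill) are hypothetical and are not the ones used in CRAFT_CLEAN.f90. The sketch assumes the data is needed only on the device and is filled from the host by assignment.

```fortran
! Before (the pattern that triggered the crash), shown as comments:
!   real*8 :: table(100)
!   common /tabdata/ table      ! COMMON data referenced from device code
!
! After: the same data as a module variable with the device attribute.
module tabdata_mod              ! hypothetical module name
  use cudafor
  implicit none
  real(8), device :: table(100) ! resides in GPU global memory
contains
  attributes(device) function lookup(i) result(v)
    integer, value :: i
    real(8) :: v
    v = table(i)                ! module variable is visible in device code
  end function lookup

  attributes(global) subroutine fill(out, n)
    integer, value :: n
    real(8) :: out(n)
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) out(i) = lookup(i)
  end subroutine fill
end module tabdata_mod

program demo
  use cudafor
  use tabdata_mod
  implicit none
  real(8) :: table_h(100), res_h(100)
  real(8), device :: res_d(100)
  integer :: i
  table_h = [(real(i, 8), i = 1, 100)]
  table = table_h               ! host-to-device copy by assignment
  call fill<<<1, 128>>>(res_d, 100)
  res_h = res_d                 ! device-to-host copy
  print *, res_h(1), res_h(100)
end program demo
```

This should build with something like pgfortran -Mcuda, the same flag used in the original compile line.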
The code doesn’t run because you’re trying to read from unit 3 at the start of the program, but unit 3 hasn’t been opened. I didn’t look into this and presume there are other errors as well, though hopefully getting you past the COMMON block issue will help.
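A minimal sketch of opening the unit before the first read is below. The file name input.dat is a placeholder, since I don’t know which of your input files is meant to be attached to unit 3, and the single list-directed read is only for illustration.

```fortran
program read_input
  implicit none
  integer :: ios
  real(8) :: x

  ! 'input.dat' is a placeholder; substitute the real input file name.
  open(unit=3, file='input.dat', status='old', action='read', iostat=ios)
  if (ios /= 0) then
    print *, 'Could not open input file, iostat = ', ios
    stop 1
  end if

  ! Unit 3 is now connected, so the read has a file to read from.
  read(3, *, iostat=ios) x
  if (ios /= 0) then
    print *, 'Read from unit 3 failed, iostat = ', ios
    stop 1
  end if

  close(3)
  print *, 'First value read: ', x
end program read_input
```

Checking iostat on both the open and the read makes it easier to tell a missing file apart from a malformed one.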
Thank you Mat, that did do it. I cannot stress enough my gratitude.