Old divdif code not working

I return to haunt the halls of the NVIDIA forums.
I’m afraid I have a problem with a FORTRAN recipe from a book that I can’t figure out. A GOTO jump that occurs when running on the CPU refuses to happen on the GPU. I’ve attached a small (150-line) file containing the function in question, the inputs it takes, and the order in which the calls occur (albeit very scaled down). The output is NaN, which is very much not supposed to be the case, and the only difference I can find is that “GOTO 8” doesn’t occur. I would greatly appreciate any help with this.

CRAFT_DIVDIF.f90 (3.2 KB)

I usually compile it like this: nvfortran -cuda -v -o CRAFTisolate CRAFT_DIVDIF.f90. I believe I’m using nvhpc/21.7.

Hi Cattaneo,

Welcome back. The NaNs are due to “T(I)” and “T(ISUB)” being equal values, hence there’s a divide by zero.
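
To illustrate the failure mode in isolation, here is a minimal sketch of a first-order divided difference. It is not the code from CRAFT_DIVDIF.f90; the module name `divdif_sketch` and the functions `slope` and `is_degenerate` are hypothetical and only mirror the roles of T(I) and T(ISUB). When the two nodes coincide the denominator is zero, and if the function values coincide as well (as they do when both indices refer to the same node) the quotient is 0/0, which is NaN under IEEE arithmetic.

```fortran
module divdif_sketch
  use, intrinsic :: iso_fortran_env, only: real64
  use, intrinsic :: ieee_arithmetic, only: ieee_is_nan
  implicit none
contains

  ! First-order divided difference between two nodes.
  ! Hypothetical helper: t_i and t_isub stand in for T(I) and T(ISUB).
  pure function slope(f_i, f_isub, t_i, t_isub) result(d)
    real(real64), intent(in) :: f_i, f_isub, t_i, t_isub
    real(real64) :: d
    ! No guard here on purpose: equal nodes give a zero denominator.
    d = (f_isub - f_i) / (t_isub - t_i)
  end function slope

  ! True when the two nodes are equal, i.e. when slope would divide by zero.
  pure function is_degenerate(t_i, t_isub) result(bad)
    real(real64), intent(in) :: t_i, t_isub
    logical :: bad
    bad = (t_i == t_isub)
  end function is_degenerate

end module divdif_sketch
```

With distinct nodes, `slope` returns the ordinary quotient. With equal nodes and equal function values it returns NaN, which `ieee_is_nan` detects; checking `is_degenerate` first is one way to catch the case before dividing.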

The code does execute the GOTOs, including 8, but it appears that the “L=L+1” isn’t getting generated in the device code. It’s unclear why, but I added an issue report, TPR #32414, and sent it to engineering for review.

Thanks for the report,
Mat

Are you sure? When I run it on my machine with loads of prints, it doesn’t print right under the IF statement either. Or do you mean that the whole section itself isn’t getting translated?

Also, good to see you again Mat.

What I did was add two print statements before and after the IF:

  L=0
  GO TO 9
8 L=-L
  print *, "L1=",L
  IF(L.GE.0) THEN
    L=L+1
  ENDIF
  print *, "L2=",L
9 ISUB=IX+L

The output prints the L1 line, but not L2:

% nvfortran -cuda CRAFT_DIVDIF.f90 -Kieee -V22.7 -gpu=keep,nollvm; a.out
 is this starting?
            3
 starting DSA on block            1 thread            1
            1            1
 L1=            0
            2            1
 L1=            0
            3            1
 L1=            0
            4            1
 SUM 1:                       NaN
 x:                       NaN
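
For reference, the statements at label 8 in the snippet above negate L and then add 1 when the result is non-negative, so successive passes step L through 0, 1, -1, 2, -2, and so on, and ISUB=IX+L probes alternately on either side of IX. The sketch below reproduces that update in isolation, next to a variant with the increment left out. The module and function names are hypothetical and not taken from CRAFT_DIVDIF.f90. Without the increment L never leaves 0, so ISUB keeps pointing at the same node, which would fit the equal T(I) and T(ISUB) values noted earlier, though that link is not confirmed here.

```fortran
module offset_walk
  implicit none
contains

  ! One pass through the statements at label 8:
  ! negate L, then add 1 when the result is non-negative.
  pure function next_offset(l) result(l_next)
    integer, intent(in) :: l
    integer :: l_next
    l_next = -l
    if (l_next >= 0) l_next = l_next + 1
  end function next_offset

  ! The same pass with the increment left out, mimicking what the
  ! device code in this thread appears to do (an assumption).
  pure function next_offset_no_increment(l) result(l_next)
    integer, intent(in) :: l
    integer :: l_next
    l_next = -l
  end function next_offset_no_increment

  ! Collect the first n offsets, starting from L = 0.
  pure function offsets(n) result(seq)
    integer, intent(in) :: n
    integer :: seq(n)
    integer :: i, l
    l = 0
    do i = 1, n
      seq(i) = l
      l = next_offset(l)
    end do
  end function offsets

end module offset_walk
```

Calling `offsets(5)` yields 0, 1, -1, 2, -2, while `next_offset_no_increment(0)` returns 0 no matter how many times it is applied.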

Looking at the generated CUDA code (kept by adding “-gpu=nollvm,keep”), the IF test is there, but the increment of L and the L2 print statement are missing:

y16 = 0;
__pgf90io_print_init(_p_2, 0, _p_1, ((signed char*)(&y16)), &e17);
__pgf90io_sc_ch_ldw((signed char*)("L1="), 14, 3LL, e17);
__pgf90io_sc_i_ldw(n_l, 25, e17);
__pgf90io_ldw_end(e17);
if(((n_l)<0)) goto _BB_16;

/* The increment of L and my added L2 print statement are missing here */

goto _BB_16;
_BB_16: ;
n_isub = (n_ix+n_l);

I am using 22.7, but I just checked 21.7 and see the same behavior. I can’t be sure this is the root issue, so a compiler engineer will need to investigate.

-Mat

Neat. Glad something is weird and I’m not just going crazy. In the meantime I’ll try a different interpolation recipe.
