Putative bug in the 14.7 Fortran compiler

I would like to draw your attention to a putative bug in the current compiler release that produces very subtle effects in my code. The test program is an excerpt from the affected code that reproduces the following error:

The array ‘efdd’ is used iteratively for calculations and shall be nullified at the beginning of each iteration (iloop).
So the write statement after the call to ‘init’ should always print zeroes.
Compiling the code with

pgf90 -O0 test.f -o test
pgf90 -O1 test.f -o test

does so. However

pgf90 -O2 test.f -o test

does not. In this case the values from ‘efdd(j,i)=dble(i*j)’ are displayed.
The observed misbehavior can be suppressed by uncommenting ‘write(6,*)'hello'’ right after ‘efdd(is,iv)=0.d0’ or by using the -Kieee switch at compile time.

Since ‘efdd’ is explicitly filled with values in each iteration, the nullification in ‘init’ must be executed no matter which optimization level is chosen.

Older versions of the compiler (12.5, 13.2) are not affected, and neither is gfortran.

      program test

      implicit double precision(a-h,o-z)

      common/block1/iloop
      common/block2/efdc(35,3),efdd(35,3)
      common/block3/f(35,3)
      common/block4/nsp(6)

      nm=4
      nsp(1)=nm
      nsp(2)=2*nm
      nsp(3)=nm
      nsp(4)=nm

      efdd=0.0d0

      do iloop=1,3

        call init
        write(6,*)efdd

        do i=1,3
          do j=1,10
            efdd(j,i)=dble(i*j)
          end do
        end do
!       write(6,*)efdd     

      end do


      end program test

      subroutine init
      implicit double precision(a-h,o-z)
      common/block1/iloop
      common/block2/efdc(35,3),efdd(35,3)
      common/block3/f(35,3)
      common/block4/nsp(6)


      write(6,*)'iloop= ',iloop
      is=0
      do 10 i1=1,3
        do 20 i2=1,nsp(i1)
          is=is+1
          do 30 iv=1,3
            efdd(is,iv)=0.d0
!           write(6,*)'hello'
            if(iloop.eq.1) then
              efdc(is,iv)=0.d0
              f(is,iv)=0.d0
            endif
 30       continue
 20     continue
 10   continue

      return

      end
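
The check can also be automated so that a wrong result does not have to be spotted by eye. The following is only a sketch and not part of the original test: the program name ‘selfcheck’ and the variable ‘nbad’ are made up for illustration, and ‘init’ is copied unchanged from above. I have not verified that this variant still triggers the problem under 14.7 at -O2; since an extra write statement already hides the effect, the added check might do the same.

```fortran
      program selfcheck
      implicit double precision(a-h,o-z)
      common/block1/iloop
      common/block2/efdc(35,3),efdd(35,3)
      common/block3/f(35,3)
      common/block4/nsp(6)

      nm=4
      nsp(1)=nm
      nsp(2)=2*nm
      nsp(3)=nm
      nsp(4)=nm

      efdd=0.0d0

      do iloop=1,3
        call init
!       init zeroes rows 1..16 of efdd and only rows 1..10 are
!       refilled below, so every element must be zero here
        nbad=count(efdd.ne.0.0d0)
        if(nbad.ne.0) then
          write(6,*)'FAIL: iloop=',iloop,' nonzero entries=',nbad
          stop 1
        endif
        do i=1,3
          do j=1,10
            efdd(j,i)=dble(i*j)
          end do
        end do
      end do

      write(6,*)'PASS'
      end program selfcheck

!     same subroutine as in the original test program
      subroutine init
      implicit double precision(a-h,o-z)
      common/block1/iloop
      common/block2/efdc(35,3),efdd(35,3)
      common/block3/f(35,3)
      common/block4/nsp(6)

      write(6,*)'iloop= ',iloop
      is=0
      do 10 i1=1,3
        do 20 i2=1,nsp(i1)
          is=is+1
          do 30 iv=1,3
            efdd(is,iv)=0.d0
            if(iloop.eq.1) then
              efdc(is,iv)=0.d0
              f(is,iv)=0.d0
            endif
 30       continue
 20     continue
 10   continue

      return
      end
```

With a correctly working compiler this should print the three ‘iloop=’ lines followed by PASS. If the nullification in ‘init’ is skipped, it should stop in the second iteration with the FAIL message.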

This was tested on a system with:

output from pgf90 --version:
pgf90 14.7-0 64-bit target on x86-64 Linux -tp piledriver
The Portland Group - PGI Compilers and Tools

output from uname -srv:
Linux 3.0.101-0.35-default #1 SMP Wed Jul 9 11:43:04 UTC 2014 (c36987d)

AMD Opteron™ Processor 6380


Best regards

BWB

Hi BWB,

Thanks for the report. I was able to recreate the error here and have sent in a report to our engineers (TPR#20810).

Note that the error looks to be specific to Piledriver’s 128-bit SIMD operations. Adding the flag “-Mvect=simd:256” to switch to 256-bit SIMD or disabling vectorization (-Mnovect) works around the issue.

Best Regards,
Mat

Yes, both options ‘-Mvect=simd:256’ and ‘-Mnovect’ avoid the observed effect. Additionally I cross checked the code on a machine with AMD Opteron™ Processor 6176 (istanbul). The code there works as expected.
Unfortunately, the original code from which the snippet above was taken did not crash: it ran as designed and stopped gracefully after reaching the maximum number of iterations, so debugging took quite a while. It is also not easy to guess which compiler options avoid this issue.
To give other users a hint I would like to summarize:

  1. Fortran codes might be affected by the observed issue when compiled with the 14.7 Fortran compiler for, and run on, a Piledriver system.
  2. There is a workaround (at the cost of performance)
  3. This shall be fixed so that no additional options are necessary in future compiler versions.

Do you agree?

Best regards

BWB

Hi BWB,

The problem occurred with some new conditional vectorization code and will be fixed in the upcoming 14.9 release.

  1. Fortran codes might be affected from the observed issue when compiled with the 14.7 fortran compiler for and being run on a piledriver system.

Yes, though the issue is relatively narrow.

  2. There is a workaround (for the cost of performance)

Yes, there is a workaround, but I don’t know the performance cost.

  3. This shall be fixed so that no additional options are necessary in future compiler versions.

Yes, in the next release (14.9)

- Mat

This has been fixed in the current 14.9 release.