error using sum in F90

Bill_Atkinson · November 11, 2005, 5:20pm

Hi,

I’m running into a problem with some code that makes use of the “sum” intrinsic in F90. The original code is quite long, but I’ve cut and pasted a shorter version which reproduces the error. I’ve compiled this code using pgf90 with no optimization on opterons (64 bit) running both SUSE 9.1 and Gentoo, & get the same behaviour. I’ve also compiled with ifort on a pentium and the code works fine.

The problem is one of the “mysterious” kind, where an unrelated command (in this case, a write statement to the screen) changes the final result of sum command.

The test code, shown below, consists of a subroutine which builds an array (it works fine) and the main body where the defective sum occurs. The goal of the code is to calculate and print the array m_inv. For the test case m_inv(:,1,1) should equal m_inv(:,2,2).

module type_defs
  integer, parameter :: float=kind(1.d0)
  complex (float), parameter :: i=(0d0,1d0)
contains

  !***********************************************************************
  subroutine make_h_index(ns,nbrs,n,h_index)
    !***********************************************************************
    implicit none
    integer, intent (in) :: ns,nbrs,n(2)
    integer, intent (out) :: h_index(ns,nbrs)
    integer :: i0,i1,i2

    ! Make the index function which contains
    ! the nearest neighbours of the site i2.
    ! This matrix describes the ns x ns
    ! upper-left block of the total superconducting
    ! matrix.
    do i0 = 1,n(1)
       do i1 = 1,n(2)
          i2 = i0 + (i1-1)*n(1)

          h_index(i2,1) = i2    ! on-site

          if (i0 /= n(1)) then  ! neighbour in +x direction
             h_index(i2,2) = i2 + 1
          else
             h_index(i2,2) = 1 + (i1-1)*n(1)
          endif

          if (i0 /= 1) then  ! neighbour in -x direction
             h_index(i2,3) = i2 - 1
          else
             h_index(i2,3) = n(1) + (i1-1)*n(1)
          endif

          if (i1 /= n(2)) then  ! neighbour in +y direction
             h_index(i2,4) = i2 + n(1)
          else
             h_index(i2,4) = i0 
          endif

          if (i1 /= 1) then  ! neighbour in -x direction
             h_index(i2,5) = i2 - n(1)
          else
             h_index(i2,5) = i0 + (n(2)-1)*n(1)
          endif

       enddo
    enddo

    ! Add next nearest and next-next nearest neighbours
    do i0 = 1,ns
       h_index(i0,6) = h_index(h_index(i0,2),4)
       h_index(i0,7) = h_index(h_index(i0,3),4)
       h_index(i0,8) = h_index(h_index(i0,3),5)
       h_index(i0,9) = h_index(h_index(i0,2),5)
       h_index(i0,10) = h_index(h_index(i0,2),2)
       h_index(i0,11) = h_index(h_index(i0,3),3)
       h_index(i0,12) = h_index(h_index(i0,4),4)
       h_index(i0,13) = h_index(h_index(i0,5),5)
    enddo

  end subroutine make_h_index

end module type_defs


! MAIN PROGRAM


!********************************************************
program find_M_inv
  use type_defs
  implicit none
  integer, parameter :: n=10,ns=n*n,ns2=2*ns,nbrs=13

  integer :: h_index(ns,nbrs),nn(2)
  real (float) :: H(ns,nbrs),pi
  complex (float) :: Z(ns2,ns2)
  complex (float) :: m_inv(ns2,2,2)
  integer :: i0,i1,i2,i3,isn
  complex (float) :: htk(ns,nbrs)

  pi = 4d0*atan(1d0)
  nn = (/ n,n /)

  ! Make the arrays needed to perform the inverse-mass calculation.
  call make_h_index(ns,nbrs,nn,h_index)

  htk = 0d0
  htk(:,6:9) = 0.5d0
  Z = 0d0
  do i1 = 1,ns
     do i2 = 1,ns
        Z(i1,i2) = exp(i*2*pi*real(i2*i1)/real(ns))/sqrt(real(ns))
        Z(i1+ns,i2+ns) = Z(i1,i2)
     end do
  end do


  ! Calculate m_inv
  !
  m_inv = 0d0
  do i1 = 1,ns2
     do i0 = 1,ns2
        i2 = mod(i0-1,ns)+1
        i3 = ns*((i0-1)/ns)
        isn = -(-1)**((i0-1)/ns)

        ! **************
        ! When I uncomment this write statment, I get the correct values for
        ! M_inv(:,2,2).  Obviously this makes no sense.
        !write(6,*) abs(sum(Z(h_index(i2,4:5)+i3,i1)*htk(i2,4:5)))

        ! xx component:
        m_inv(i1,1,1) = m_inv(i1,1,1) + isn*conjg(Z(i0,i1))*( & 
             sum(Z(h_index(i2,2:3)+i3,i1)*htk(i2,2:3))&
             +sum(Z(h_index(i2,6:9)+i3,i1)*htk(i2,6:9))&
             )

        ! yy component:
        m_inv(i1,2,2) = m_inv(i1,2,2) + isn*conjg(Z(i0,i1))*( &
             !
             ! When I comment out the next line, I get the correct answer for m_inv.
             ! It shouldn't make any difference since htk(i2,4:5) = 0
             sum(Z(h_index(i2,4:5)+i3,i1)*htk(i2,4:5))&
             !
             ! Alternatively, If I write the above line explicitly as
             ! Z(h_index(i2,4)+i3,i1)*htk(i2,4) &
             ! + Z(h_index(i2,5)+i3,i1)*htk(i2,5) &
             ! I get the correct answer for m_inv
             !
             +sum(Z(h_index(i2,6:9)+i3,i1)*htk(i2,6:9))&
             )
     end do
     ! These two number should always be equal:
     write(6,*) real(m_inv(i1,1,1)),real(m_inv(i1,2,2))
     ! This code seems to return m_inv(:,1,1) correctly, but not m_inv(:,2,2).
     ! For reference, the last elements of the array should be
     !     m_inv(ns2,1,1)=m_inv(ns2,2,2)=2
     
  end do

end program find_M_inv

MatColgrove · November 12, 2005, 4:16pm

Hi Bill,

While I’m not positive, it looks like a bug related to how we’re using temporay arrays to store the intermediate values. On Monday, I’ll file a techincal problem report (TPR) and have one our compiler engineers look at it.

Thank you for bring it to our attension.

Mat

MatColgrove · November 14, 2005, 7:53pm

Hi Bill,

I’ve assigned this issue to TPR#3673 and listed you as the contact.

Mat

Bill_Atkinson · November 17, 2005, 6:14pm

Hi Mat,

Thanks for the help. I have follow up question. I tend to vectorize things quite a bit in my codes, rather than writing explicit do-loops for everything. Do you have any sense of which F90 intrinsics might be affected by this? Is it possible that, for example, sum(A(:,1)) is safe, while sum(A(1,:)) is unsafe? What about sin(A(:,1)) and sin(A(1,:))?

Thanks,
Bill

MatColgrove · November 22, 2005, 5:36pm

Hi Bill,

The compiler engineer who is looking at this does not have it quite narrowed down yet, but thinks it has to do with an index being off with a compiler generated temporary array. Of course, this still prelimary so things can change.

We don’t know if this is an isolated case or something fundamental. It could cause problems with other intrinsics but without understanding exactly what’s causing the bug, I can’t speculate how it would effect other code.

Once I know more, I’ll post it.

Mat

man74cio · November 25, 2010, 1:11pm

Hi Mat and Bill,
I was having the same problem with sum() on
an AMD Opteron™ Processor 6134.
My program works in simple and double precision.
For me, the error in sum comes out only when the array size is big and ONLY
simple precision.
The work around I uses is to split the sum in two parts.

Marco

MatColgrove · November 29, 2010, 9:39pm

Hi Marco,

TPR#3673 was fixed back in the 6.1-1 release (December 2005) so your issue is most likely unrelated. Are you able to post or send PGI Customer Service (trs@pgroup.com) a reproducing example?

For me, the error in sum comes out only when the array size is big and ONLY
simple precision.

How big is big? If the arrays is over 2GB are you compiling with “-Mlarge_arrays”?

Thanks,
Mat