problems with omp flush and optimization flags

The following code hangs when it is run with more than one thread and compiled with -O2 (or higher) together with the -mp flag.
The same problem occurs in a “real life” Fortran 95 code:
Lpeter

pgf90 -V
pgf90 10.3-0 64-bit target on x86-64 Linux -tp gh-64
Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
Copyright 2000-2010, STMicroelectronics, Inc. All Rights Reserved.

REAL FUNCTION FN1(I)
INTEGER I
FN1 = I * 2.0
RETURN
END FUNCTION FN1

REAL FUNCTION FN2(A, B)
REAL A, B
FN2 = A + B
RETURN
END FUNCTION FN2

PROGRAM A21
use omp_lib !INCLUDE "omp_lib.h" ! or USE OMP_LIB
INTEGER ISYNC(256)
REAL WORK(256)
REAL RESULT(256)
INTEGER IAM, NEIGHBOR
!$OMP PARALLEL PRIVATE(IAM, NEIGHBOR) SHARED(WORK, ISYNC)
IAM = OMP_GET_THREAD_NUM() + 1
ISYNC(IAM) = 0
!$OMP BARRIER
! Do computation into my portion of work array
WORK(IAM) = FN1(IAM)
! Announce that I am done with my work.
! The first flush ensures that my work is made visible before
! synch. The second flush ensures that synch is made visible.

!$OMP FLUSH(WORK,ISYNC)
ISYNC(IAM) = 1
!$OMP FLUSH(ISYNC)

! Wait until neighbor is done. The first flush ensures that
! synch is read from memory, rather than from the temporary
! view of memory. The second flush ensures that work is read
! from memory, and is done so after the while loop exits.
IF (IAM .EQ. 1) THEN
NEIGHBOR = OMP_GET_NUM_THREADS()
ELSE
NEIGHBOR = IAM - 1
ENDIF
DO WHILE (ISYNC(NEIGHBOR) .EQ. 0)
!$OMP FLUSH(ISYNC)
END DO
!$OMP FLUSH(WORK, ISYNC)
RESULT(IAM) = FN2(WORK(NEIGHBOR), WORK(IAM))
write(*,*) result(iam)
!$OMP END PARALLEL
END PROGRAM A21

Hi lpeter,

This is a known issue (TPR#17688). The problem is that “ISYNC” is not being treated as volatile. Hence the compiler performs an optimization in which “ISYNC(NEIGHBOR)” is loaded into a register once and is not reloaded on each iteration of the DO WHILE loop. The waiting thread therefore never sees its neighbor's update, and the loop spins forever.
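To illustrate, here is a rough sketch of what the optimized wait loop effectively becomes. TMP, NSPIN and MAXSPIN are hypothetical names added for illustration, and the iteration cap exists only so that this single-threaded sketch terminates; the code the compiler actually generates keeps the value in a register and has no such cap.

```fortran
! Conceptual sketch only, not actual compiler output.
! TMP, NSPIN and MAXSPIN are hypothetical names for illustration.
PROGRAM HOIST_SKETCH
  IMPLICIT NONE
  INTEGER, PARAMETER :: MAXSPIN = 1000
  INTEGER :: ISYNC(256), TMP, NSPIN

  ISYNC = 0
  NSPIN = 0

  ! The load of ISYNC(1) is done once, before the loop ...
  TMP = ISYNC(1)

  ! ... and the loop then tests only the stale copy, so a store to
  ! ISYNC(1) by another thread could never end the wait.
  DO WHILE (TMP .EQ. 0 .AND. NSPIN .LT. MAXSPIN)
    NSPIN = NSPIN + 1
  END DO

  WRITE(*,*) 'gave up after', NSPIN, 'iterations'
END PROGRAM HOIST_SKETCH
```

Without the MAXSPIN cap the loop above would never exit, which matches the hang seen at -O2.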

Unfortunately, there isn’t a good way to fix this. In later versions of the OpenMP standard, the “FLUSH(list)” directive (i.e. FLUSH with a list) has been deprecated. The standard now specifies that “FLUSH(list)” should be evaluated as just a “FLUSH” directive. Hence, in order to support this, we would either have to make all variables volatile, which would severely limit the optimizations that can be performed, or replace “FLUSH” with “BARRIER”, which would also severely impact performance.

For this code, the best workaround would be to simply remove the ISYNC synchronization code and use a single BARRIER directive before the shared WORK array is read.

For example:

% cat ompt.f90 
REAL FUNCTION FN1(I)
INTEGER I
FN1 = I * 2.0
RETURN
END FUNCTION FN1

REAL FUNCTION FN2(A, B)
REAL A, B
FN2 = A + B
RETURN
END FUNCTION FN2

PROGRAM A21
use omp_lib !INCLUDE "omp_lib.h" ! or USE OMP_LIB
INTEGER ISYNC(256)
REAL WORK(256)
REAL RESULT(256)
INTEGER IAM, NEIGHBOR, TST
WORK=1.0
!$OMP PARALLEL PRIVATE(IAM, NEIGHBOR, TST) SHARED(WORK, ISYNC)
IAM = OMP_GET_THREAD_NUM() + 1
WORK(IAM) = REAL(IAM)
!$OMP BARRIER

IF (IAM .EQ. 1) THEN
NEIGHBOR = OMP_GET_NUM_THREADS()
ELSE
NEIGHBOR = IAM - 1
ENDIF
RESULT(IAM) = FN2(WORK(NEIGHBOR), WORK(IAM))
write(*,*) IAM, '+', NEIGHBOR, '=', result(iam)
!$OMP END PARALLEL

END PROGRAM A21 

% pgf90 -fast -mp ompt.f90 ; a.out
            5 +            4 =    9.000000    
            6 +            5 =    11.00000    
            4 +            3 =    7.000000    
            2 +            1 =    3.000000    
            3 +            2 =    5.000000    
            8 +            7 =    15.00000    
            1 +            8 =    9.000000    
            7 +            6 =    13.00000

Thanks,
Mat

Thanks!
Your code solves the problem. I assume the Portland compiler is doing the right thing. Nevertheless, gfortran and another proprietary compiler did not show the same problem, and one should be able to run the code both with and without optimization flags.
Regards

Lpeter

An update:
It seems that giving the ISYNC variable the VOLATILE attribute fixes the problem without any other change to the code. VOLATILE is part of Fortran 2003, I believe, and many compilers support it.
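A minimal sketch of that change, assuming the rest of the program stays as in the first post. Only the ISYNC declaration differs. The sketch also assumes the team has at most 256 threads, since the arrays are that size. How VOLATILE interacts with OpenMP is compiler dependent, so treat this as a workaround that worked here and not as something the standard guarantees.

```fortran
REAL FUNCTION FN1(I)
  INTEGER I
  FN1 = I * 2.0
  RETURN
END FUNCTION FN1

REAL FUNCTION FN2(A, B)
  REAL A, B
  FN2 = A + B
  RETURN
END FUNCTION FN2

PROGRAM A21
  USE OMP_LIB
  ! The only change: VOLATILE asks the compiler to reload ISYNC from
  ! memory on every reference instead of caching it in a register.
  INTEGER, VOLATILE :: ISYNC(256)
  REAL WORK(256)
  REAL RESULT(256)
  INTEGER IAM, NEIGHBOR
  ! Assumption: no more than 256 threads, to stay inside the arrays.
!$OMP PARALLEL PRIVATE(IAM, NEIGHBOR) SHARED(WORK, ISYNC)
  IAM = OMP_GET_THREAD_NUM() + 1
  ISYNC(IAM) = 0
!$OMP BARRIER
  ! Do computation into my portion of work array
  WORK(IAM) = FN1(IAM)
  ! Announce that I am done with my work.
!$OMP FLUSH(WORK, ISYNC)
  ISYNC(IAM) = 1
!$OMP FLUSH(ISYNC)
  ! Wait until neighbor is done.
  IF (IAM .EQ. 1) THEN
    NEIGHBOR = OMP_GET_NUM_THREADS()
  ELSE
    NEIGHBOR = IAM - 1
  ENDIF
  DO WHILE (ISYNC(NEIGHBOR) .EQ. 0)
!$OMP FLUSH(ISYNC)
  END DO
!$OMP FLUSH(WORK, ISYNC)
  RESULT(IAM) = FN2(WORK(NEIGHBOR), WORK(IAM))
  WRITE(*,*) RESULT(IAM)
!$OMP END PARALLEL
END PROGRAM A21
```

Because only ISYNC is marked VOLATILE, the compiler remains free to optimize accesses to WORK and RESULT as before.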

Regards

Lpeter