Hello. I’m trying to compile a module that uses acc directives. Compilation fails with an internal compiler error.
Here is the module source (I’ve cut everything unrelated to the error, which also appears with the full version of the source):
module FOURIN_acc
use cudafor
real, allocatable, device :: FSPC(:)
integer, private :: NND2C=-1,NXC=-1,NSGNC = 0
contains
SUBROUTINE FOURTR(FIN,FOUT,TEMP,NPOINT,NLOG2)
real, device :: FIN(NPOINT),FOUT(NPOINT),TEMP(NPOINT)
NPP1=NPOINT+1
!$acc region
DO 3 K=1,KLIM
DO 3 J=2,JLIM,2
!$acc do independent
DO I=1,2
TEMP(JLIM*K+J-JLIM-1) = TEMP(JLIM*K+J-JLIM-1) +
* FIN(JLIM*K+J+I+NPOINT-JLIM-2)*FSPC(KLIM*J-2*KLIM+I)
EndDo
3 CONTINUE
!$acc end region
C
!$acc region
DO 35 K=1,KLIM
DO 35 J=2,JLIM,2
!$acc do independent
DO I=1,2
F = FSPC(KLIM*J-2*KLIM+I+NPOINT)
TEMP(JLIM*K+J-JLIM) = TEMP(JLIM*K+J-JLIM) +
* FIN(JLIM*K+J+I+NPOINT-JLIM-2)*F
End Do
35 CONTINUE
!$acc end region
RETURN
END
end module FOURIN_acc
Here is the compilation command and its output:
pgf95 -Mcuda=cuda3.2,cc11 -ta=nvidia,cc11 -tp=amd64 -Minfo -c fourin.for
PGF90-S-0000-Internal compiler error. load of zero symbol 0 (fourin.for: 38)
PGF90-S-0000-Internal compiler error. load of zero symbol 0 (fourin.for: 38)
PGF90-S-0000-Internal compiler error. load of zero symbol 0 (fourin.for: 38)
PGF90-S-0000-Internal compiler error. load of zero symbol 0 (fourin.for: 38)
fourtr:
15, Complex loop carried dependence of 'temp' prevents parallelization
16, Complex loop carried dependence of 'temp' prevents parallelization
18, Loop is parallelizable
Accelerator kernel generated
15, !$acc do seq
16, !$acc do seq
Using register for 'fspc'
18, !$acc do parallel, vector(2) ! blockidx%x threadidx%x
26, Complex loop carried dependence of 'temp' prevents parallelization
27, Complex loop carried dependence of 'temp' prevents parallelization
29, Loop is parallelizable
Accelerator kernel generated
26, !$acc do seq
27, !$acc do seq
Using register for 'fspc'
29, !$acc do parallel, vector(2) ! blockidx%x threadidx%x
0 inform, 0 warnings, 4 severes, 0 fatal for fourtr
I’m using PGI Fortran 11.3.