Errors while compiling CUDA Fortran code with pgi-10.8

Hi,
I am getting the following errors when I try to compile my code. I am using pgi-10.8 because that is the version installed on the cluster I use. On my PC, the same code compiles without errors in emulation mode with pgi-10.9.


[krane@honest4 ~/fourier1]$ make
pgfortran -fast -c fouriercalc1.cuf -o fouriercalc1.o
/tmp/pgcudafor4dKfOpJRBZP8.gpu(70): error: expected an identifier

/tmp/pgcudafor4dKfOpJRBZP8.gpu(72): error: expected an identifier

/tmp/pgcudafor4dKfOpJRBZP8.gpu(72): error: expected an expression

/tmp/pgcudafor4dKfOpJRBZP8.gpu(145): error: expected an expression

/tmp/pgcudafor4dKfOpJRBZP8.gpu(151): error: expected an expression

5 errors detected in the compilation of “/tmp/pgnvd2eKfI3wGGbSz.nv0”.
PGF90-F-0000-Internal compiler error. pgnvd job exited with nonzero status code 0 (fouriercalc1.cuf: 181)
PGF90/x86-64 Linux 10.8-0: compilation aborted
make: *** [fouriercalc1.o] Error 2

The following is the file that produces these errors:


! CUDA Fortran Source File
! Generated by PGI Visual Fortran®
! 10/21/2010 2:39:16 PM
!
module fouriercalc
Use ewald
Use xyz
Use sp
Use realcalc
Use cudafor
!Storing the constant arrays in constant memory of GPUs
implicit none
contains

!!!
!Starting with the Kernel subroutine

attributes(global) subroutine fourier_kernel(KX, KY, KZ, EXPX, EXPY, EXPZ, SUMQEXPV,&
EXPX_T, EXPY_T, EXPZ_T, SUMQEXPV_T, de, iflag )
implicit none
!integer, value :: Nionpm
!integer, value :: Nkvec, kxmax, kymax, kzmax
integer :: iflag
integer :: k, i
real :: dethread
real :: de(Nkvecdev)
!real :: xii(Nionpm), yii(Nionpm), zii(Nionpm)
integer :: KX(Nkvecdev), KY(NKvecdev), KZ(Nkvecdev)
!real :: ax, ay, az
!real :: charge(Nionpm)
!real :: alpha
complex :: EXPX(Nionpmdev, 0:kxmaxdev), EXPY(Nionpmdev, -kymaxdev:kymaxdev), EXPZ(Nionpmdev, -kzmaxdev:kzmaxdev)
complex :: EXPX_T(Nionpmdev, 0:kxmaxdev), EXPY_T(Nionpmdev, -kymaxdev:kymaxdev), EXPZ_T(Nionpmdev, -kzmaxdev:kzmaxdev)
complex :: SUMQEXPV(Nkvecdev)
complex :: SUMQEXPV_T(Nkvecdev)
complex :: expv_diff
integer :: kkx, kky, kkz
real :: b, ksq, CONST

k = threadidx%x

kkx = KX(k)
kky = KY(k)
kkz = KZ(k)
!calculating the constant part for a vector
ksq = axdev*axdev*real(kkx*kkx) + aydev*aydev*real(kky*kky) + azdev*azdev*real(kkz*kkz)
CONST = exp( -ksq / 4.0 / alphadev / alphadev ) / ksq
if(kkx == 0) CONST = 0.5 * CONST
de = 0.0
!SUMQEXPV_T = 0
!Looping over all the atoms in the molecule
do i = 1, Nionpmdev

b = real(kkx) * axdev * Xdevnew(i)
!write(*,*) b
EXPX_T(i,kkx) = cmplx( cos(b), sin(b) )

b = real(kky) * aydev * Ydevnew(i)
!write(*,*) b
EXPY_T(i,kky) = cmplx( cos(b), sin(b) )
EXPY_T(i,-kky) = conjg( EXPY_T(i,kky) )

b = real(kkz) * azdev * Zdevnew(i)
!write(*,*) b
EXPZ_T(i,kkz) = cmplx( cos(b), sin(b) )
EXPZ_T(i,-kkz) = conjg( EXPZ_T(i,kkz) )

expv_diff = EXPX_T(i,kkx) * EXPY_T(i,kky) * EXPZ_T(i,kkz) - &
EXPX(i,kkx) * EXPY(i,kky) * EXPZ(i,kkz)

SUMQEXPV_T(k) = SUMQEXPV_T(k) + expv_diff * chargedev(i)

end do

if( iflag == 0 ) then


dethread = CONST * &
( conjg( SUMQEXPV_T(k) ) * SUMQEXPV_T(k) - &
conjg( SUMQEXPV(k) ) * SUMQEXPV(k) )

else

dethread = CONST * &
( conjg( SUMQEXPV_T(k) ) * SUMQEXPV_T(k) - &
conjg( SUMQEXPV_T(k) ) * SUMQEXPV_T(k) )

end if

call syncthreads()

!call atomicadd( de, dethread )

de(k) = dethread

end subroutine

subroutine fourier_move(mol, icb, de)

Use sp
Use ewald
Use xyz

! This routine calculates the change in the Fourier term of the
! Ewald.

integer :: mol
integer :: icb
real :: de


integer :: i, k
integer :: iflag
integer, device :: iflagdev
integer :: kkx, kky, kkz
real :: xii, yii, zii
real :: b
complex :: expv_diff
!real, device, dimension(Nionpm) :: Xdev, Ydev, Zdev
real, device, allocatable, dimension(:) :: dedev
type(dim3) :: dimGrid, dimBlock


de = 0.0

if( icb < 0 ) then
iflag = 1
icb = -icb
SUMQEXPV_T(icb,:) = SUMQEXPV_T(1,:)
else
iflag = 0
SUMQEXPV_T(icb,:) = SUMQEXPV(:)
end if

EXPX_T(icb,:,0) = (1.0,0.0)
EXPY_T(icb,:,0) = (1.0,0.0)
EXPZ_T(icb,:,0) = (1.0,0.0)

!!!
! Assigning the values for device variables

allocate(dedev(Nkvec))
dedev = 0.0
do i = 1, Nionpm

Xdevnew(i) = Xi(mol,i)
Ydevnew(i) = Yi(mol,i)
Zdevnew(i) = Zi(mol,i)
EXPXdev(i,:) = EXPX(mol,i,:)
EXPYdev(i,:) = EXPY(mol,i,:)
EXPZdev(i,:) = EXPZ(mol,i,:)
end do
SUMQEXPVdev = SUMQEXPV
SUMQEXPV_Tdev(:) = SUMQEXPV_T(icb,:)
iflagdev = iflag
! Allocating the number of threads and invoking the kernel
dimGrid = dim3(1, 1, 1)
dimBlock = dim3(Nkvec, 1, 1)
call fourier_kernel<<<dimGrid,dimBlock>>>(KXdev, KYdev, KZdev, EXPXdev,&
EXPYdev, EXPZdev, SUMQEXPVdev, EXPX_Tdev, EXPY_Tdev, EXPZ_Tdev, SUMQEXPV_Tdev, dedev, iflagdev )

! Assigning the output back to the host variable
SUMQEXPV_T(icb,:) = SUMQEXPV_Tdev(:)
de = sum(dedev(1:Nkvec))

do i = 1, Nionpm
EXPX_T(icb,i,:) = EXPX_Tdev(i,:)
EXPY_T(icb,i,:) = EXPY_Tdev(i,:)
EXPZ_T(icb,i,:) = EXPZ_Tdev(i,:)
end do

de = de * 4.0 * pi / volume_e

return

end subroutine fourier_move

end module fouriercalc

Regards
Kaustubh

Hi Kaustubh,

My best guess is that you are encountering a known issue (TPR#16767), which is triggered when a device array declared in an external module is passed to a device kernel. Specifically, since the EXPXdev, EXPYdev, and EXPZdev arrays are not declared in the host routine, I’m assuming that they are declared in a module. The workaround is to declare these variables locally in fourier_move. Note that this issue is expected to be fixed in either the 11.0 or 11.1 release.
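A minimal sketch of that workaround, assuming a hypothetical kernel scale_kernel and host routine scale_on_device (neither name is from the posted code). The point it illustrates is only that the device array is a local variable of the host routine instead of a variable imported from another module:

```fortran
module scale_mod
  use cudafor
  implicit none
contains

  ! Kernel: each thread scales one element of the device array.
  attributes(global) subroutine scale_kernel(a, n, factor)
    integer, value :: n
    real, value :: factor
    real :: a(n)
    integer :: k
    k = threadidx%x
    if (k <= n) a(k) = factor * a(k)
  end subroutine scale_kernel

  ! Host routine: a_dev is declared here, locally, rather than in a
  ! separate module. Assumes n fits in a single thread block.
  subroutine scale_on_device(a_host, n, factor)
    integer, intent(in) :: n
    real, intent(inout) :: a_host(n)
    real, intent(in) :: factor
    real, device, allocatable :: a_dev(:)

    allocate(a_dev(n))
    a_dev = a_host                                  ! host -> device copy
    call scale_kernel<<<1, n>>>(a_dev, n, factor)
    a_host = a_dev                                  ! device -> host copy
    deallocate(a_dev)
  end subroutine scale_on_device

end module scale_mod
```

In fourier_move the same pattern would mean adding local "real, device" or "complex, device" declarations for the arrays passed to the kernel and copying the host data into them before the launch.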

The other possibility is that you have not declared these variables.

If neither of these is the problem, would you mind posting, or sending to PGI Customer Service (trs@pgroup.com), the source for the “ewald”, “xyz”, “sp” and “realcalc” modules?

Hope this helps,
Mat

Hi Mat,
Yes, it worked. The compile error went away when I renamed my variable CONST to something else. Even then, I was getting all zeros as the output. Declaring the variables in the same module, as you said, solved that problem, and now I am getting the expected values.
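For anyone hitting the same "expected an identifier" errors, here is a minimal sketch of the rename. The names prefactor_mod, prefactor_kernel, compute_prefactors and kconst are hypothetical. The likely reason the rename helps is an assumption, not something confirmed in this thread: Fortran is case-insensitive, so a kernel variable called CONST may end up as const in the generated CUDA C, where it is a reserved word.

```fortran
module prefactor_mod
  use cudafor
  implicit none
contains

  ! Kernel: per-k-vector prefactor exp(-ksq / (4 alpha^2)) / ksq.
  attributes(global) subroutine prefactor_kernel(ksq, alpha, pref, n)
    integer, value :: n
    real, value :: alpha
    real :: ksq(n), pref(n)
    integer :: k
    real :: kconst          ! was CONST; renamed to avoid a C keyword clash

    k = threadidx%x
    if (k <= n) then
      kconst = exp( -ksq(k) / 4.0 / alpha / alpha ) / ksq(k)
      pref(k) = kconst
    end if
  end subroutine prefactor_kernel

  ! Host wrapper with locally declared device arrays.
  ! Assumes n fits in a single thread block.
  subroutine compute_prefactors(ksq_host, alpha, pref_host, n)
    integer, intent(in) :: n
    real, intent(in) :: ksq_host(n), alpha
    real, intent(out) :: pref_host(n)
    real, device, allocatable :: ksq_dev(:), pref_dev(:)

    allocate(ksq_dev(n), pref_dev(n))
    ksq_dev = ksq_host
    call prefactor_kernel<<<1, n>>>(ksq_dev, alpha, pref_dev, n)
    pref_host = pref_dev
    deallocate(ksq_dev, pref_dev)
  end subroutine compute_prefactors

end module prefactor_mod
```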
Thank you very much.
Regards
Kaustubh