Internal compiler error. lili redefinition

Hello,

I have been working on a CUDA Fortran code. The code compiles correctly with the -g debug flag. However, when I switch to -O3, the following error pops up:

PGF90-F-0000-Internal compiler error. lili redefinition.

The offending line is marked in the subroutine below:

attributes(global) subroutine solve_kernel(lx,ly,am,ap,ac,rhs,n,nx,ny)
real(DP), dimension(:), intent(in), device :: lx,ly
complex(DP), dimension(:), intent(in), device :: am,ap
complex(DP), dimension(:,:,:), intent(inout), device :: ac,rhs
integer, value :: n,nx,ny
real(DP) :: r0,r,w
integer :: i,j,k
		
	!**** code 
		
	do k..
		!----------------
		ac(k,j,i)=1.d0  ! compiles if this line is commented out
		!----------------
	end do
		
	!**** code
		
end subroutine

I am compiling with pgf95 18.7-0:

pgf95 -c -O3 -Mcuda=nordc file.F90

Any help is much appreciated, thank you.

Paolo

Hi Paolo,

Unfortunately, I can’t reproduce the error with the code snippet you provided. Can you please post a reproducing example, or send one to PGI Customer Service (trs@pgroup.com)?

I can then file a problem report and hopefully find you a workaround.

Thanks,
Mat

Hi Mat,

Please find a self-contained example below:

PROGRAM main

	use testMod
	
	IMPLICIT NONE
	complex(8), allocatable, dimension(:,:,:), device :: f,ac
	integer :: nx,ny,nz,ierror
	
	nz=10
	nx=10
	ny=10
	
	allocate(f(1:nz,1:nx,1:ny))
	allocate(ac(1:nz,1:nx,1:ny))
	
	call solve(nz,f,ac)
	
END PROGRAM main

!---------------------------------------------------------------------------------!

module testMod

use cudafor

contains

subroutine solve(n,f,ac)
	integer, intent(in) :: n
	complex(8), allocatable, dimension(:,:,:), intent(inout), device :: f,ac
	type(dim3) :: threads,blocks
	integer :: nx,ny,is,ie,js,je,istat

	nx=size(f,2)
	ny=size(f,3)
	is=lbound(f,2)
	ie=ubound(f,2)
	js=lbound(f,3)
	je=ubound(f,3)
		
	threads=dim3(32,32,1)
	blocks=dim3(ceiling(real(nx)/threads%x),ceiling(real(ny)/threads%y),1)
	call solveTriDiag_press_kernel<<<blocks,threads>>>(ac,n,nx,ny)
		
	istat=cudaDeviceSynchronize()
		
end subroutine
	
attributes(global) subroutine solveTriDiag_press_kernel(ac,n,nx,ny)
	complex(8), dimension(:,:,:), intent(inout), device :: ac
	integer, value :: n,nx,ny
	real(8) :: w
	integer :: i,j,k
		
	i = blockDim%x*(blockIdx%x - 1)+threadIdx%x
	j = blockDim%y*(blockIdx%y - 1)+threadIdx%y
		
	if ((i<=nx).AND.(j<=ny)) then

		do k=2,n
			w=1.d0/ac(k-1,i,j)
			ac(k,i,j)=1.d0-w
		end do

	end if

end subroutine

end module testMod

When I try to compile the module with

pgf95 -c -O3 -Mcuda test.f90

the error I mentioned in the previous post shows up. Can you reproduce the same error?

Thanks for the support,

Paolo

Hi Paolo,

Yes, I was able to recreate the error with 18.7. However, I could not reproduce it with PGI 19.1 or 19.4. Also, with 18.7 I was able to compile successfully using the LLVM back-end (-Mllvm).

Are you able to update your compiler version or use the LLVM back-end?

-Mat

Hi Mat,

Apparently I don’t have the LLVM code generator in the installation I am using. I was able to “fix” the problem by assigning k-1 to a dummy integer, km=k-1, in the inner loop over k. Good to know that the problem has been fixed in later versions.
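
For reference, a minimal sketch of what that workaround might look like in the kernel from the example above. The name km comes from the description; using km in place of k-1 as the array index is an assumption, not something confirmed in this thread, and the module wrapper is repeated only to keep the snippet self-contained.

```fortran
module testMod

use cudafor

contains

attributes(global) subroutine solveTriDiag_press_kernel(ac,n,nx,ny)
	complex(8), dimension(:,:,:), intent(inout), device :: ac
	integer, value :: n,nx,ny
	real(8) :: w
	integer :: i,j,k
	integer :: km   ! dummy index introduced for the workaround

	! Global thread indices (1-based in CUDA Fortran)
	i = blockDim%x*(blockIdx%x - 1)+threadIdx%x
	j = blockDim%y*(blockIdx%y - 1)+threadIdx%y

	if ((i<=nx).AND.(j<=ny)) then

		do k=2,n
			km=k-1                  ! assign the offset index explicitly
			w=1.d0/ac(km,i,j)       ! assumption: km replaces k-1 here
			ac(k,i,j)=1.d0-w
		end do

	end if

end subroutine

end module testMod
```

This would be compiled the same way as before (pgf95 -c -O3 -Mcuda test.f90). Whether it avoids the internal compiler error may depend on the exact compiler version.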

Thanks,

Paolo