Could anyone help me to check the problem?

Dear all,

I have the following code:

! Kernel definition
attributes(global) subroutine ksaxpy( n, a, x, y )
real, dimension(*) :: x,y
real, value :: a
integer, value :: n, i
i = (blockidx%x-1) * blockdim%x + threadidx%x
if( i <= n ) y(i) = a * x(i) + y(i)
end subroutine

! Host subroutine
subroutine solve( n, a, x, y )
real, device, dimension(*) :: x, y
real :: a
integer :: n
! call the kernel
call ksaxpy<<<n/64, 64>>>( n, a, x, y )
end subroutine

integer::n,i,j
real::a
real,allocatable::x(:),y(:)
a=2.0
n=1000000
allocate(x(n),y(n))
!$omp parallel do
do i=1,n
x(i)=i
y(i)=i*i
enddo
!$omp end parallel do

call solve(n,a,x,y)
end


I built the program on Windows 7 x64 with VS.NET 2008 and PGI 10.3, and the following error messages were displayed:

pgnvd-Error-NVOPEN64DIR value is not a directory: C:/Program Files (x86)/PGI/win32/10.3/cuda/open64/lib
pgnvd-Error-CUDADIR value is not a directory: C:/Program Files (x86)/PGI/win32/10.3/cuda/bin
E:\solver\GPUtest\axpy\axpy\ConsoleApp.f90(7) : error F0000 : Internal compiler error. pgnvd job exited with nonzero status code 0
PGF90/x86 Windows 10.3-0: compilation aborted
pgfortran-Warning-CUDALIB is not properly set to a directory in the pgfortran rcfiles; -ta=analysis assumed

axpy build failed.




Could anyone tell me how to set the options in the project settings?
I can attach the whole project here if that is allowed.

Thanks,
Zhanghong Tang

I noticed that some other people have had the same problem, but it is said to have been solved in version 10.3.

Thanks,
Zhanghong Tang

Hi Zhanghong Tang,

These errors typically mean that you didn’t install the CUDA packages that accompany the compilers. If this is the case, please reinstall and select ‘yes’ when asked whether you wish to install CUDA.

Note that your program as written will fail. Device subroutines need an explicit interface before they can be called. You will need to add an interface block to “solve” or put “ksaxpy” into a module. Module subroutines get an explicit interface automatically.

- Mat
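
The first option Mat mentions, an interface block inside “solve”, might look like the following. This is a minimal sketch and not code from the thread: it assumes the kernel stays an external subroutine exactly as posted, and that the compiler accepts an interface block for an attributes(global) subroutine in this position.

```fortran
! Host subroutine with an explicit interface for the external kernel
subroutine solve( n, a, x, y )
  implicit none
  real, device, dimension(*) :: x, y
  real :: a
  integer :: n

  ! Interface block: describes the kernel's arguments to the caller.
  ! It must match the actual definition of ksaxpy.
  interface
    attributes(global) subroutine ksaxpy( n, a, x, y )
      real, dimension(*) :: x, y
      real, value :: a
      integer, value :: n
    end subroutine ksaxpy
  end interface

  ! call the kernel
  call ksaxpy<<<n/64, 64>>>( n, a, x, y )
end subroutine solve
```

The module approach avoids writing the argument declarations twice, so the interface cannot drift out of sync with the kernel.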

Dear Mat,

Thank you very much for your kind reply. I installed CUDA 3.0 before installing the PGI compiler. Do you think the PGI compiler can’t recognize a package that was installed earlier?

I have changed the code as follows, but the error is still the same. Could you please take a look at it for me?

module GPUfun
	interface ksaxpy
		attributes(global) subroutine ksaxpy( n, a, x, y )
			real, dimension(*) :: x,y
			real, value :: a
			integer, value :: n 
		end subroutine
	end interface
end module

! Kernel definition
attributes(global) subroutine ksaxpy( n, a, x, y )
real, dimension(*) :: x,y
real, value :: a
integer, value :: n, i
i = (blockidx%x-1) * blockdim%x + threadidx%x
if( i <= n ) y(i) = a * x(i) + y(i)
end subroutine

! Host subroutine
subroutine solve( n, a, x, y )
use GPUfun
real, device, dimension(*) :: x, y
real :: a
integer :: n
! call the kernel
call ksaxpy<<<n/64, 64>>>( n, a, x, y )
end subroutine

integer::n,i,j
real::a
real,allocatable::x(:),y(:)
a=2.0
n=1000000
allocate(x(n),y(n))
!$omp parallel do
do i=1,n
	x(i)=i
	y(i)=i*i
enddo
!$omp end parallel do

call solve(n,a,x,y)
end

Thanks,
Zhanghong Tang

“I installed CUDA 3.0 before installing the PGI compiler. Do you think the PGI compiler can’t recognize a package that was installed earlier?”

For compatibility reasons, we only use the versions of CUDA that ship with the compilers. You must install the CUDA libraries included with the PVF installation in order to use CUDA Fortran.

Putting ksaxpy in a module will give it an explicit interface automatically.

module GPUfun

contains

! Kernel definition
attributes(global) subroutine ksaxpy( n, a, x, y )
real, dimension(*) :: x,y
real, value :: a
integer, value :: n
integer :: i   ! thread-local index, not a dummy argument, so no value attribute
i = (blockidx%x-1) * blockdim%x + threadidx%x
if( i <= n ) y(i) = a * x(i) + y(i)
end subroutine

end module

! Host subroutine
subroutine solve( n, a, x, y )
use GPUfun
real, device, dimension(*) :: x, y
real :: a
integer :: n
! call the kernel
call ksaxpy<<<n/64, 64>>>( n, a, x, y )
end subroutine

integer::n,i,j
real::a
real,allocatable::x(:),y(:)
a=2.0
n=1000000
allocate(x(n),y(n))
!$omp parallel do
do i=1,n
   x(i)=i
   y(i)=real(i)*real(i)   ! i*i would overflow a default 32-bit integer for large i
enddo
!$omp end parallel do

call solve(n,a,x,y)
end

Hope this helps,
Mat
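
One point the listings in this thread leave open: the main program allocates x and y on the host, while “solve” declares its dummy arguments with the device attribute. The following is a minimal sketch, not code from the thread, of one way to keep explicit device copies and launch the kernel directly from the main program. The names x_d, y_d, tpb and nblocks are assumptions introduced for illustration. The block count is rounded up so that values of n that are not a multiple of the block size are still covered.

```fortran
module GPUfun
  use cudafor
contains
  ! Kernel definition: y = a*x + y, one element per thread
  attributes(global) subroutine ksaxpy( n, a, x, y )
    real, dimension(*) :: x, y
    real, value :: a
    integer, value :: n
    integer :: i
    i = (blockidx%x-1) * blockdim%x + threadidx%x
    if( i <= n ) y(i) = a * x(i) + y(i)
  end subroutine ksaxpy
end module GPUfun

program axpy
  use cudafor
  use GPUfun
  implicit none
  integer, parameter :: n = 1000000
  integer, parameter :: tpb = 64            ! threads per block (hypothetical name)
  real, allocatable :: x(:), y(:)           ! host arrays
  real, device, allocatable :: x_d(:), y_d(:)  ! device copies (hypothetical names)
  real :: a
  integer :: i, nblocks

  a = 2.0
  allocate( x(n), y(n) )
  allocate( x_d(n), y_d(n) )

  do i = 1, n
    x(i) = real(i)
    y(i) = real(i) * real(i)   ! real arithmetic avoids integer overflow
  end do

  ! Assignment between host and device arrays copies the data
  x_d = x
  y_d = y

  ! Round the block count up so every element gets a thread
  nblocks = (n + tpb - 1) / tpb
  call ksaxpy<<<nblocks, tpb>>>( n, a, x_d, y_d )

  ! Copy the result back to the host
  y = y_d
  print *, 'y(1) =', y(1)   ! should equal a*x(1) + y(1) = 3.0

  deallocate( x, y, x_d, y_d )
end program axpy
```

The bounds check inside the kernel is what makes rounding up safe: the extra threads in the last block do nothing.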

Dear Mat,

Thank you very much for your kind reply. It works now after I reinstalled PGI together with CUDA.

Thanks,
Zhanghong Tang