I need help again: my CUDA Fortran code no longer runs, although it worked well before. I suspect the cause is a recent system update (Windows 10) or firmware update (Dell Precision 7720). I removed and reinstalled the latest CUDA Toolkit and the PGI Fortran compiler (Community Edition), but that did not solve the problem:
The PGI Fortran compiler itself works: it prints “hello world” correctly, and it compiles and runs the “deviceQuery.cuf” code correctly.
The CUDA Toolkit works: I can run the sample programs in the CUDA Toolkit package correctly.
It seems that code built with the PGI Fortran compiler is not using the GPU (Quadro P4000) correctly. The message when running my code is:
ALLOCATE: copyin Symbol Memcpy FAILED:13(invalid device symbol)
I also tested with the following sample CUDA Fortran code:
module mathOps
contains
  attributes(global) subroutine saxpy(x, y, a)
    implicit none
    real :: x(:), y(:)
    real, value :: a
    integer :: i, n
    n = size(x)
    i = blockDim%x * (blockIdx%x - 1) + threadIdx%x
    if (i <= n) y(i) = y(i) + a*x(i)
  end subroutine saxpy
end module mathOps

program testSaxpy
  use mathOps
  use cudafor
  implicit none
  integer, parameter :: N = 40000
  real :: x(N), y(N), a
  real, device :: x_d(N), y_d(N)
  type(dim3) :: grid, tBlock

  tBlock = dim3(256,1,1)
  grid = dim3(ceiling(real(N)/tBlock%x),1,1)

  x = 1.0; y = 2.0; a = 2.0
  x_d = x
  y_d = y
  call saxpy<<<grid, tBlock>>>(x_d, y_d, a)
  y = y_d
  write(*,*) 'Max error: ', maxval(abs(y-4.0))
end program testSaxpy
The result when running this code is:
PGI$ ./saxpy.exe
Max error: 2.000000
Segmentation fault
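A max error of 2.0 is what I would get if y came back unchanged (it starts at 2.0 and should end at 4.0), so the kernel apparently never ran. To narrow this down, here is a minimal sketch of the same test with the launch checked for errors. The program name checkSaxpy and the variables prop and istat are my own additions for illustration. I am assuming the GPU is device 0 and that cudaGetDeviceProperties, cudaGetLastError, cudaDeviceSynchronize and cudaGetErrorString from the cudafor module behave as documented. I have not confirmed that this reveals the cause.

```fortran
module mathOps
contains
  attributes(global) subroutine saxpy(x, y, a)
    implicit none
    real :: x(:), y(:)
    real, value :: a
    integer :: i, n
    n = size(x)
    i = blockDim%x * (blockIdx%x - 1) + threadIdx%x
    if (i <= n) y(i) = y(i) + a*x(i)
  end subroutine saxpy
end module mathOps

program checkSaxpy   ! hypothetical name, same test as testSaxpy plus error checks
  use mathOps
  use cudafor
  implicit none
  integer, parameter :: N = 40000
  real :: x(N), y(N), a
  real, device :: x_d(N), y_d(N)
  type(dim3) :: grid, tBlock
  type(cudaDeviceProp) :: prop   ! added for diagnostics
  integer :: istat               ! added for diagnostics

  ! Report which device the runtime sees (assumes the GPU is device 0)
  istat = cudaGetDeviceProperties(prop, 0)
  if (istat /= cudaSuccess) then
    write(*,*) 'Device query error: ', trim(cudaGetErrorString(istat))
  else
    write(*,*) 'Device: ', trim(prop%name)
    write(*,*) 'Compute capability: ', prop%major, prop%minor
  end if

  tBlock = dim3(256,1,1)
  grid = dim3(ceiling(real(N)/tBlock%x),1,1)

  x = 1.0; y = 2.0; a = 2.0
  x_d = x
  y_d = y

  call saxpy<<<grid, tBlock>>>(x_d, y_d, a)

  ! Errors detected at launch time (e.g. no usable kernel image for this GPU)
  istat = cudaGetLastError()
  if (istat /= cudaSuccess) &
    write(*,*) 'Launch error: ', trim(cudaGetErrorString(istat))

  ! Errors that only surface once the kernel has actually executed
  istat = cudaDeviceSynchronize()
  if (istat /= cudaSuccess) &
    write(*,*) 'Execution error: ', trim(cudaGetErrorString(istat))

  y = y_d
  write(*,*) 'Max error: ', maxval(abs(y-4.0))
end program checkSaxpy
```

If the launch check prints an error string, that message together with the reported compute capability should show whether the binary was built for a target that the installed driver and this GPU accept.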
What might be wrong with my computer? Thanks a lot!