Hi, I am a complete newbie to nvfortran (pgfortran). I am having trouble compiling a simple program, which I list below, but I don't think the program itself matters: I cannot compile any program. Here is the relevant information:
- The Nvidia driver version is 390xx. I cannot install a newer version on this laptop, where I use an external monitor with Optimus-manager.
- The driver is compatible with Cuda toolkit 9.1 or older.
- Only the latest version, HPC-SDK 21.1, is available on the Nvidia site. Older versions of the PGI Fortran compiler are available on the pgroup website, but only for a fee, so that is not an option for me.
- I direct nvfortran (or pgfortran or pgf95) to use Cuda toolkit 9.1:
nvfortran CUDA_HOME=/home/popsi/Downloads/cuda-9.1 -cuda -cudalibs saxpy.cuf -L/home/popsi/Downloads/cuda-9.1/lib64
and get the following errors:
/usr/bin/ld: cannot find -lcutensor
/usr/bin/ld: cannot find -lnccl
/usr/bin/ld: cannot find -lnvshmem
pgacclnk: child process exit status 1: /usr/bin/ld
- Compiling with the Cuda toolkit that comes with HPC-SDK 21.1 by:
pgfortran CUDA_HOME=/opt/nvidia/hpc_sdk/Linux_x86_64/21.1/cuda/11.2 -cuda saxpy.cuf -L/opt/nvidia/hpc_sdk/Linux_x86_64/21.1/math_libs/11.2/targets/x86_64-linux/lib
is successful, but the executable reports the error:
0: ALLOCATE: 160000 bytes requested; not enough memory: 3(initialization error)
This happens no matter how small the array in the code is.
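For the first attempt (Cuda 9.1), the "cannot find -l..." messages mean that ld found no libcutensor, libnccl or libnvshmem (as .so or .a) on its search path. Here is a small shell sketch for checking which of them are missing from a given directory. The function name missing_libs is made up for illustration, and it only checks a single directory, not the full linker search path.

```shell
# Print the names of libraries that have neither lib<name>.so nor lib<name>.a
# in the given directory. Hypothetical helper, for illustration only.
# Usage: missing_libs DIR NAME...
missing_libs() {
    dir=$1
    shift
    for name in "$@"; do
        # ld resolves -l<name> to lib<name>.so or lib<name>.a
        if [ ! -e "$dir/lib$name.so" ] && [ ! -e "$dir/lib$name.a" ]; then
            echo "$name"
        fi
    done
}

# Example with the directory from the command above (not run here):
#   missing_libs /home/popsi/Downloads/cuda-9.1/lib64 cutensor nccl nvshmem
```

Any name it prints is a library that the -L directory does not provide.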
I believe it is due to an incompatibility between Nvidia driver 390xx and the Cuda Toolkit 11 installed with HPC-SDK 21.1.
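One way to check this suspicion is to compare the installed driver version with the minimum that the toolkit requires. Below is a minimal shell sketch. The helper name version_ge is made up, it assumes a sort that supports -V (GNU coreutils), and the minimum of 450.80.02 in the commented example is my assumption for Cuda 11.x and should be checked against the Cuda release notes.

```shell
# Succeed (exit status 0) if dotted version $1 is >= dotted version $2.
# Hypothetical helper; assumes a sort with -V (version sort), e.g. GNU coreutils.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Example use on the affected machine (not run here):
#   installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   required=450.80.02   # assumed minimum for Cuda 11.x; check the release notes
#   version_ge "$installed" "$required" || echo "driver $installed is too old"
```

If the check fails, the runtime initialization error above would be consistent with a driver that is too old for the toolkit.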
So, my question is: how can I compile the program?
The program is the first simple example from the tutorial (saxpy.cuf):
module mathOps
contains
  attributes(global) subroutine saxpy(x, y, a)
    implicit none
    real :: x(:), y(:)
    real, value :: a
    integer :: i, n
    n = size(x)
    ! global 1-based index of this thread
    i = blockDim%x * (blockIdx%x - 1) + threadIdx%x
    ! skip threads that fall past the end of the array
    if (i <= n) y(i) = y(i) + a*x(i)
  end subroutine saxpy
end module mathOps
program testSaxpy
  use mathOps
  use cudafor
  implicit none
  integer, parameter :: N = 40000
  real :: x(N), y(N), a
  real, device :: x_d(N), y_d(N)
  type(dim3) :: grid, tBlock
  tBlock = dim3(256,1,1)
  ! enough blocks of 256 threads to cover all N elements
  grid = dim3(ceiling(real(N)/tBlock%x),1,1)
  x = 1.0; y = 2.0; a = 2.0
  ! copy the host arrays to the device
  x_d = x
  y_d = y
  call saxpy<<<grid, tBlock>>>(x_d, y_d, a)
  ! copy the result back to the host
  y = y_d
  write(*,*) 'Max error: ', maxval(abs(y-4.0))
end program testSaxpy