Hi,
I am testing out Windows 10 WSL Ubuntu 20.04.
I am able to install the CUDA toolkit and run a sample code per the instructions, and I am able to run nvidia-smi:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.89.02    Driver Version: 528.49       CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...   On  | 00000000:01:00.0  On |                  N/A |
| 30%   47C    P0    45W / 200W |    404MiB /  8192MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A        23      G   /Xwayland                            N/A |
+-----------------------------------------------------------------------------+
I am also able to install the NVIDIA HPC SDK compiler.
However, when I try to compile my Fortran MPI+OpenACC+DC code, I get:
mpif90 -O3 -march=native -acc=gpu -stdpar=gpu -gpu=cc86,cc61,nomanaged -Minfo=accel -c pchip_module_v1.0.0.f90 -o pchip_module.o
nvfortran-Error-CUDA 11.1 or later required
nvfortran-Error-A CUDA toolkit matching the current driver version (0) or a supported older version (11.0) was not installed with this HPC SDK.
make: *** [Makefile:62: pchip_module.o] Error 1
Since no Linux driver is installed (WSL uses the Windows driver), it seems NVHPC is detecting driver version '0'.
However, if I explicitly tell the compiler to use CUDA 12.0, the compilation DOES work:
mpif90 -O3 -march=native -acc=gpu -stdpar=gpu -gpu=cc86,cuda12.0,nomanaged -Minfo=accel -I/opt/psi/nv/ext_deps/deps/hdf4/include -I/opt/psi/nv/ext_deps/deps/hdf5/include -c mas_sed_expmac.f -o mas.o
... ... ...
mpif90 -O3 -march=native -acc=gpu -stdpar=gpu -gpu=cc86,cuda12.0,nomanaged -Minfo=accel -I/opt/psi/nv/ext_deps/deps/hdf4/include -I/opt/psi/nv/ext_deps/deps/hdf5/include pchip_module.o mas.o -L/opt/psi/nv/ext_deps/deps/hdf4/lib -lmfhdf -ldf -L/opt/psi/nv/ext_deps/deps/hdf5/lib -lhdf5_fortran -lhdf5hl_fortran -lhdf5 -lhdf5_hl -L/opt/psi/nv/ext_deps/deps/jpeg/lib -ljpeg -L/opt/psi/nv/ext_deps/deps/zlib/lib -lz -o mas
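As an aside, instead of adding cuda12.0 to every -gpu= flag, the HPC SDK documents an NVHPC_CUDA_HOME environment variable for pointing the compilers at a specific CUDA toolkit. A minimal sketch, assuming a default install layout (the 23.1 path below is an assumption; adjust it to your actual install):

```shell
# Sketch: select the bundled CUDA toolkit once via the environment instead
# of adding cuda12.0 to every -gpu= flag. The install path below is an
# assumption; adjust it to wherever your HPC SDK is installed.
export NVHPC_CUDA_HOME=/opt/nvidia/hpc_sdk/Linux_x86_64/23.1/cuda/12.0
echo "Compilers will use CUDA from: $NVHPC_CUDA_HOME"
```

With this set in the shell (or in the Makefile environment), subsequent mpif90/nvfortran invocations should resolve the CUDA toolkit without the per-command suffix.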
However, when I try to run my code, it silently fails at the first use of a GPU kernel:
mpiexec -np 1 ../../../../branches/mas_acc/mas mas
... ... ...
Current file: /opt/psi/nv/mas/branches/mas_acc/mas_sed_expmac.f
function: zero_avec
line: 25732
This file was compiled: -acc=gpu -gpu=cc80 -gpu=cc86
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[37099,1],0]
Exit code: 1
--------------------------------------------------------------------------
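For what it's worth, one diagnostic I can try (a sketch, not a fix): the NVHPC OpenACC runtime traces GPU activity when NVCOMPILER_ACC_NOTIFY is set, which should at least show whether the zero_avec kernel is ever launched under WSL before the job dies.

```shell
# Sketch: ask the OpenACC runtime to trace GPU activity. Level 3 prints
# both kernel launches and data transfers at runtime.
export NVCOMPILER_ACC_NOTIFY=3
echo "OpenACC notify level: $NVCOMPILER_ACC_NOTIFY"
# Then rerun the job, e.g.: mpiexec -np 1 ../../../../branches/mas_acc/mas mas
```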
Is there a way to make NVHPC compile GPU code that works on WSL?
- Ron