Using cudaGetDeviceProperties in nvfortran

I am using nvfortran 23.11 and CUDA 12.3 - I just updated both. Previously, I was able to use cudaGetDeviceProperties as in:
istat = cudaGetDeviceProperties(prop, 0)
if(istat /= cudaSuccess) then
   write(*,*) 'GetDevice kernel error: ', cudaGetErrorString(istat)
   stop
endif
write(*,"('Device Name: ',a)") trim(prop%name)
write(*,"('Max GridSize: ',2(i0,' x '),i0)") prop%maxGridSize
write(*,"('MaxThreadsPerBlock: ',i0)") prop%maxThreadsPerBlock
Since upgrading, I get the error message:
/home/malcolm/copied/apps/Jaya_dble/compile_info.f90:24: undefined reference to `cudagetdeviceproperties_'
/usr/bin/ld: /home/malcolm/copied/apps/Jaya_dble/compile_info.f90:26: undefined reference to `__pgf90_getcudaerrorstring'
/usr/bin/ld: main4jaya.o:(.init_array+0x10): undefined reference to `Mcuda_compiled'
make: *** [Makefile:95: test-rand1] Error 2
Your help with this would be appreciated.

In the original post, I should have included:

integer :: istat
type(cudaDeviceProp) :: prop
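
For reference, a minimal self-contained version of the query program would look like this (a sketch; the program name devquery is just an example, and it assumes compilation as CUDA Fortran, e.g. nvfortran -cuda devquery.f90):

program devquery
   use cudafor
   implicit none
   integer :: istat
   type(cudaDeviceProp) :: prop

   istat = cudaGetDeviceProperties(prop, 0)
   if (istat /= cudaSuccess) then
      write(*,*) 'GetDevice kernel error: ', cudaGetErrorString(istat)
      stop
   endif
   write(*,"('Device Name: ',a)") trim(prop%name)
   write(*,"('Max GridSize: ',2(i0,' x '),i0)") prop%maxGridSize
   write(*,"('MaxThreadsPerBlock: ',i0)") prop%maxThreadsPerBlock
end program devquery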

Hi MMB,

How are you linking? This looks like you’re missing the “-cuda” flag on your link line.
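
For example, a link line with the flag would look something like this (placeholder program and object names):

nvfortran -cuda -o myprog main.o other_modules.o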

-Mat

Hi Mat, below is from my Makefile:

CC = nvc
CXX = nvc++
F90 = nvfortran
  F90FLAGS = -O3 -cuda -cudalibs -gpu=cc60

# F90FLAGS = -Ofast -ffree-form -ffree-line-length-132 -gpu=cc60
# F90FLAGS = -Mipa=fast,inline -Mfree -Mextend

%.o : %.f90
        @echo ' '
        @echo 'Today is ' | tr -d '\012'; date
        @echo 'F90 = ' $(F90)
        @echo 'Now compiling'
        $(F90) $(F90FLAGS) -o $@ -c $<

$(NAME): $(fortobjs)
        @echo ' '
        @echo 'Now linking'
        $(F90) -o $@ $(fortobjs)
        @echo ' '

.SUFFIXES: .f90 .o

All of the files compile; they are all .f90 files, listed next:

   NVDIR = /opt/nvidia/hpc_sdk/Linux_x86_64

   NVCOMPDIR = $(NVDIR)/23.11/compilers/bin

   NAME = test-rand1

#  FSOURCES = compile_info Constants_dble_m time_report get_next_IO_unit Series_m Distributions_m obj_cubature1_m main4jaya
   FSOURCES = Constants_dble_m time_report get_next_IO_unit Series_m Distributions_m obj_cubature1_m main4jaya

If you need to see any of them, I’ll be happy to send them to you.

Malcolm

Mat, the copy didn’t include the # sign in front of the first FSOURCES line, so compile_info.f90 is not used. M.

Mat, why the # sign is missing, I don’t know. But the two heavy-font F90FLAGS lines are also preceded by #. M.

I fixed it. You just need to include the text in a code block.

The link line looks like it doesn’t include any flags. Can you try adding “$(F90FLAGS)”?

$(NAME): $(fortobjs)
        @echo ' '
        @echo 'Now linking'
        $(F90) -o $@ $(F90FLAGS) $(fortobjs)
        @echo ' '

-Mat

Mat, I incorporated your suggestion, and get the following:

Now linking
nvfortran -o test-rand1 -O3 -cuda -cudalibs -gpu=cc60 Constants_dble_m.o time_report.o get_next_IO_unit.o Series_m.o Distributions_m.o obj_cubature1_m.o main4jaya.o
/usr/bin/ld: cannot find -lcusolverMp
/usr/bin/ld: cannot find -lcal
/usr/bin/ld: cannot find -lcutensor
/usr/bin/ld: cannot find -lcutensorMg
/usr/bin/ld: cannot find -lnccl
/usr/bin/ld: cannot find -lnvshmem_device
/usr/bin/ld: cannot find -lnvshmem_host
pgacclnk: child process exit status 1: /usr/bin/ld
make: *** [Makefile:95: test-rand1] Error 2

Again I need your help! Malcolm

The “-cudalibs” flag will implicitly include all the auxiliary CUDA libraries. Why the linker can’t find them, I’m not sure. Though if you aren’t using them, then remove the “-cudalibs” flag. If you’re using a specific library, like cuBLAS, then use “-cudalib=cublas”.

If you do need all the libraries, add the flag “-v” (verbose) to the link and you can see the linker command (i.e. the “ld” command line). This will show the paths being used for the location of the libraries and check that they exist. Feel free to post the verbose output from “ld” if you need help.
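
For example, based on your link line above, linking against just cuBLAS with verbose output would look something like this (a sketch):

nvfortran -v -O3 -cuda -gpu=cc60 -cudalib=cublas -o test-rand1 Constants_dble_m.o time_report.o get_next_IO_unit.o Series_m.o Distributions_m.o obj_cubature1_m.o main4jaya.o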

Mat, I removed the -cudalibs flag, and this is what I got:

Now linking
nvfortran -v -o test-rand1 -O3 -cuda -gpu=cc60 Constants_dble_m.o time_report.o get_next_IO_unit.o Series_m.o Distributions_m.o obj_cubature1_m.o main4jaya.o
Export PGI_CURR_CUDA_HOME=/usr/local/cuda-12.3
Export NVHPC_CURRENT_CUDA_HOME=/usr/local/cuda-12.3
Export NVHPC_CURRENT_CUDA_VERSION=12.3.101
Export NVCOMPILER=/opt/nvidia/hpc_sdk/Linux_x86_64/23.11
Export PGI=/opt/nvidia/hpc_sdk

/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/bin/tools/acclnk -nvidia /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/bin/tools/nvdd -cuda12030 -cudaroot /usr/local/cuda-12.3 -cudalink -computecap=60 -nvvm70 /usr/bin/ld /usr/lib64/crt1.o /usr/lib64/crti.o /usr/lib/gcc/x86_64-redhat-linux/11//crtbegin.o /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/lib/f90main.o --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -T /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/lib/nvhpc.ld -L/usr/local/cuda-12.3/lib64 -L/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/lib -L/usr/lib64 -L/usr/lib/gcc/x86_64-redhat-linux/11/ Constants_dble_m.o time_report.o get_next_IO_unit.o Series_m.o Distributions_m.o obj_cubature1_m.o main4jaya.o -rpath /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/lib -rpath /usr/local/cuda-12.3/lib64 -o test-rand1 -L/usr/lib/gcc/x86_64-redhat-linux/11//../../../../lib64 -lcudafor_120 -lcudafor /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/lib/cuda_init_register_end.o -lcudadevrt -lcudart -lcudadevice -lcudafor2 -lnvf -lnvomp -ldl --as-needed -lnvhpcatm -latomic --no-as-needed -lpthread -lnvcpumath -lnsnvc -lnvc -lrt -lpthread -lgcc -lc -lgcc_s -lm /usr/lib/gcc/x86_64-redhat-linux/11//crtend.o /usr/lib64/crtn.o
Unlinking directory /tmp/nvfortranZYEmzJh2LYGB.ext
Unlinking directory /tmp/nvfortranlYEmH_JSsO9U.il

I will need the cudalibs later in my program development, but let’s solve this problem first! Malcolm

Mat, I changed -O3 to -Ofast and the linker phase now runs! Here is the output:

Now linking
nvfortran -v -o test-rand1 -Ofast -cuda -gpu=cc60 Constants_dble_m.o time_report.o get_next_IO_unit.o Series_m.o Distributions_m.o obj_cubature1_m.o main4jaya.o
Export PGI_CURR_CUDA_HOME=/usr/local/cuda-12.3
Export NVHPC_CURRENT_CUDA_HOME=/usr/local/cuda-12.3
Export NVHPC_CURRENT_CUDA_VERSION=12.3.101
Export NVCOMPILER=/opt/nvidia/hpc_sdk/Linux_x86_64/23.11
Export PGI=/opt/nvidia/hpc_sdk

/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/bin/tools/acclnk -nvidia /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/bin/tools/nvdd -cuda12030 -cudaroot /usr/local/cuda-12.3 -cudalink -computecap=60 -nvvm70 /usr/bin/ld /usr/lib64/crt1.o /usr/lib64/crti.o /usr/lib/gcc/x86_64-redhat-linux/11//crtbegin.o /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/lib/f90main.o --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -T /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/lib/nvhpc.ld -L/usr/local/cuda-12.3/lib64 -L/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/lib -L/usr/lib64 -L/usr/lib/gcc/x86_64-redhat-linux/11/ Constants_dble_m.o time_report.o get_next_IO_unit.o Series_m.o Distributions_m.o obj_cubature1_m.o main4jaya.o -rpath /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/lib -rpath /usr/local/cuda-12.3/lib64 -o test-rand1 -L/usr/lib/gcc/x86_64-redhat-linux/11//../../../../lib64 -lcudafor_120 -lcudafor /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/lib/cuda_init_register_end.o -lcudadevrt -lcudart -lcudadevice -lcudafor2 -lnvf -lnvomp -ldl --as-needed -lnvhpcatm -latomic --no-as-needed -lpthread -lnvcpumath -lnsnvc -lnvc -lrt -lpthread -lgcc -lc -lgcc_s -lm /usr/lib/gcc/x86_64-redhat-linux/11//crtend.o /usr/lib64/crtn.o
Unlinking directory /tmp/nvfortranmxhuKTq3EHK-.ext
Unlinking directory /tmp/nvfortranSxhueAwhhQv2.il

Mat, I still cannot add the -cudalibs to the F90FLAGS set. Below is the linker output:

Now linking
nvfortran -v -o test-rand1 -Ofast -cuda -cudalibs -gpu=cc60 Constants_dble_m.o time_report.o get_next_IO_unit.o Series_m.o Distributions_m.o obj_cubature1_m.o main4jaya.o
Export PGI_CURR_CUDA_HOME=/usr/local/cuda-12.3
Export NVHPC_CURRENT_CUDA_HOME=/usr/local/cuda-12.3
Export NVHPC_CURRENT_CUDA_VERSION=12.3.101
Export NVCOMPILER=/opt/nvidia/hpc_sdk/Linux_x86_64/23.11
Export PGI=/opt/nvidia/hpc_sdk

/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/bin/tools/acclnk -nvidia /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/bin/tools/nvdd -cuda12030 -cudaroot /usr/local/cuda-12.3 -cudalink -computecap=60 -nvvm70 /usr/bin/ld /usr/lib64/crt1.o /usr/lib64/crti.o /usr/lib/gcc/x86_64-redhat-linux/11//crtbegin.o /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/lib/f90main.o --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -T /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/lib/nvhpc.ld -L/usr/local/cuda-12.3/lib64 -L/usr/local/cuda-12.3/nccl/lib -L/usr/local/cuda-12.3/nvshmem/lib -L/usr/local/cuda-12.3/lib64 -L/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/lib -L/usr/lib64 -L/usr/lib/gcc/x86_64-redhat-linux/11/ Constants_dble_m.o time_report.o get_next_IO_unit.o Series_m.o Distributions_m.o obj_cubature1_m.o main4jaya.o -rpath /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/lib -rpath /usr/local/cuda-12.3/lib64 -rpath /usr/local/cuda-12.3/lib64 -rpath /usr/local/cuda-12.3/nccl/lib -rpath /usr/local/cuda-12.3/nvshmem/lib -rpath /usr/local/cuda-12.3/hpcx/latest/ucc/lib -rpath /usr/local/cuda-12.3/hpcx/latest/ucx/lib -o test-rand1 -L/usr/lib/gcc/x86_64-redhat-linux/11//../../../../lib64 --as-needed -lnvhpcwrapcufft -lcufft -lcufftw -lcudaforwraprand -lcurand -lcusolver -lcusolverMp -lcal -lcudaforwrapsparse12 -lcusparse -lcudaforwraptensor -lcudaforwraptensor_118 -lcutensor -lcutensorMg -lnvblas -lcudaforwrapnccl -lnccl -lnvhpcwrapshmem -lnvshmem_device -lnvshmem_host -L/usr/local/cuda-12.3/lib64/stubs -lnvidia-ml -lnvlamath -lblas -llapack -lnvhpcwrapnvtx -lcublas -lcublasLt -lcudaforwrapblas -lcudaforwrapblas117 -lcudart --no-as-needed -lcudafor_120 -lcudafor /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/lib/cuda_init_register_end.o -lcuda -lcudadevrt -lcudart -lcudadevice -lcudafor2 -lnvf -lnvomp -ldl --as-needed -lnvhpcatm -latomic --no-as-needed -lpthread -lnvcpumath -lnsnvc -lnvc -lrt -lpthread -lgcc -lc -lgcc_s -lm -lstdc++ /usr/lib/gcc/x86_64-redhat-linux/11//crtend.o /usr/lib64/crtn.o
/usr/bin/ld: cannot find -lcusolverMp
/usr/bin/ld: cannot find -lcal
/usr/bin/ld: cannot find -lcutensor
/usr/bin/ld: cannot find -lcutensorMg
/usr/bin/ld: cannot find -lnccl
/usr/bin/ld: cannot find -lnvshmem_device
/usr/bin/ld: cannot find -lnvshmem_host
pgacclnk: child process exit status 1: /usr/bin/ld
nvfortran-Fatal-linker completed with exit code 1

Unlinking directory /tmp/nvfortranT-jvhh8ZsCvm.ext
Unlinking directory /tmp/nvfortranv-jv--89Vqw4.il
make: *** [Makefile:96: test-rand1] Error 2

Any recommendations? I really want to use several CUDA libraries.

It looks like you’re setting the environment variable “NVHPC_CUDA_HOME” to a local CUDA install, “/usr/local/cuda-12.3/”, so the linker is trying to find these libraries there. I’m assuming that you didn’t install the libraries, hence they can’t be found.

The NVHPC SDK packages multiple versions of CUDA (the exact versions depend on the release) as well as all the extra CUDA libraries. Hence I’d recommend you not set NVHPC_CUDA_HOME. The only reason to do so is if you need an older or newer version of CUDA not packaged with the NVHPC SDK. Since CUDA 12.3 does ship with NVHPC 23.11, there shouldn’t be a reason to use a local CUDA 12.3 install.
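
As a quick check (a sketch, assuming the install prefix shown earlier in this thread), you can list the CUDA versions bundled with the SDK:

ls /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/cuda/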

Mat, I still cannot compile with -cudalibs. The error message is:
malcolm12 nvfortran tred.f90
nvfortran-Fatal-The value of CUDAROOT is not a directory: /usr/local/cuda-12.3/bin:/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/bin/Linux_x86_64/23.11/compilers/bin:/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/bin:/home/malcolm/.local/bin:/home/malcolm/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/home/malcolm/hipreclibs/MP/tmp/mpfun20-fort-v10/fortran-var1:/home/malcolm/Python-3.6.8/Tools/msi/pip:/usr/local/lib64/python3.6/site-packages:/dev/trunk/tmp/gcc-11.1.0-source/gcc-11.1.0/mpc-1.0.3:/dev/trunk/tmp/gcc-11.1.0-source/gcc-11.1.0/mpfr-3.1.6:/dev/trunk/tmp/gcc-11.1.0-source/gcc-11.1.0/gmp-6.1.0:/dev/trunk/tmp/gcc-11.1.0-source/gcc-11.1.0:.

I have modified, and sourced, my .bashrc as follows:

 .bashrc_nv

echo 'Entered .bashrc_nv'

# alias nvidia-uvm='nvidia-uvm-440.95.01'
alias nvidia-uvm='nvidia-uvm-545.23.08'

NVARCH=`uname -s`_`uname -m`; export NVARCH
NVCOMPILERS=/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/bin; export NVCOMPILERS

# PATH=$NVCOMPILERS/$NVARCH/22.1/compilers/bin:$PATH; export PATH
# PATH=/usr/local/cuda-11.4/bin:$PATH; export PATH
PATH=$NVCOMPILERS/$NVARCH/23.11/compilers/bin:$PATH; export PATH

NVHPC_CUDA_HOME=/usr/local/cuda-12.3/bin:$PATH; export PATH
CUDAROOT=/usr/local/cuda-12.3/bin:$PATH; export PATH


# LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH
LD_LIBRARY_PATH=/usr/local/cuda-12.3/lib64:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH
LD_LIBRARY_PATH=/usr/lib64/libmpfr.so.4.1.6:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH
LD_LIBRARY_PATH=/usr/lib64/libmpc.so.3.1.0:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH
LD_LIBRARY_PATH=/usr/lib64/libgmp.so.10.3.2:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH

export PATH=$NVCOMPILERS/$NVARCH/23.11/comm_libs/mpi/bin:$PATH
export MANPATH=$MANPATH:$NVCOMPILERS/$NVARCH/23.11/comm_libs/mpi/man

export MODULEPATH=/usr/local/Modules/modulefiles

export NCPUS=4
export OMP_NUM_THREADS=4
# export OMP_STACKSIZE=16M

echo 'Leaving .bashrc_nv'

This produced the error message above. Any more ideas?

Mat, in my preceding message, the heavy font lines are actually commented out! M.

Hi Malcolm,

You can fix the formatting by enclosing the text using the “preformatted text” button (the “</>” symbol).

This is your issue. NVHPC_CUDA_HOME overrides the default location of the CUDA SDK. You’re basically telling the compiler to use your own install of the CUDA SDK instead of the ones that ship with the NVHPC SDK.

You’re welcome to continue using it, but it does mean that you’ll also need to install the extra CUDA math and communication libraries and add them to the link line directly rather than using the “-cudalibs” flag. But since the NVHPC SDK 23.11 already packages them for you, along with CUDA 12.3, and gives you the convenience of the “-cudalibs” flag, I’d recommend you simply remove this setting from your environment.
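
Concretely, referring back to the .bashrc_nv you posted, that means deleting or commenting out these two lines:

# NVHPC_CUDA_HOME=/usr/local/cuda-12.3/bin:$PATH; export PATH
# CUDAROOT=/usr/local/cuda-12.3/bin:$PATH; export PATH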

-Mat

Mat, I cleaned up the .bashrc file by deleting the offending lines. I don’t understand what you mean when you refer to the “preformatted text” button (the “</>” symbol).

The cleaned up .bashrc is below, but I still get the following error message:

nvfortran tred.f90
nvfortran-Fatal-The value of CUDAROOT is not a directory: /usr/local/cuda-12.3/bin:/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/bin/Linux_x86_64/23.11/compilers/bin:/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/bin:/home/malcolm/.local/bin:/home/malcolm/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/home/malcolm/hipreclibs/MP/tmp/mpfun20-fort-v10/fortran-var1:/home/malcolm/Python-3.6.8/Tools/msi/pip:/usr/local/lib64/python3.6/site-packages:/dev/trunk/tmp/gcc-11.1.0-source/gcc-11.1.0/mpc-1.0.3:/dev/trunk/tmp/gcc-11.1.0-source/gcc-11.1.0/mpfr-3.1.6:/dev/trunk/tmp/gcc-11.1.0-source/gcc-11.1.0/gmp-6.1.0:/dev/trunk/tmp/gcc-11.1.0-source/gcc-11.1.0:.

This occurs with, or without, the CUDAROOT statement present!

Your thoughts please. M.

When you post, there are 11 icons just above the area where you are typing. They let you quote items, make them bold or italic, add links, use bullet lists, and so forth. Basically things you can use to control how the output looks.

The 6th icon is “</>”. Select the specific region of text that needs to appear as-is and then click this icon; the forum won’t apply any of its mark-up symbols, like “#”, to the pre-formatted text.

I went ahead and edited your post to include the pre-formatted text.

“CUDAROOT” is an internal variable in our compiler configuration (rc) files. From what I can tell, you shouldn’t be able to affect it by setting it in the environment. I tried using your bash environment and couldn’t get it to fail. My best guess is that you’ve modified your “localrc” file to include it, since I can only reproduce this error when I add it there. If you did do this, I’d suggest removing that setting.

I do see one issue with the compilers where the list of CUDA libraries isn’t accounting for the different CUDA versions. Given cusolverMp didn’t ship with CUDA 12.3, this can cause it to not be found. I’ve reported this.

My suggestion is to not use the “-cudalibs” flag, but instead use “-cudalib=cublas,cusolver,…,” with the specific libraries you need.
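
In your Makefile that could look something like this (a sketch; pick the libraries you actually use):

F90FLAGS = -Ofast -cuda -gpu=cc60 -cudalib=cublas,cusolver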

On a side note, in your bashrc file, you should be setting “NVCOMPILERS” to the base path of the NVHPC install; otherwise your PATH is being set incorrectly.

NVCOMPILERS=/opt/nvidia/hpc_sdk; export NVCOMPILERS
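
With that change, the PATH line from your .bashrc_nv resolves correctly, e.g.:

NVCOMPILERS=/opt/nvidia/hpc_sdk; export NVCOMPILERS
PATH=$NVCOMPILERS/$NVARCH/23.11/compilers/bin:$PATH; export PATH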

Mat, I think we are overthinking this problem. The program, named tred.f90, is something I copied from the NVIDIA website. Here it is:

program multidimred
   use cudafor
   ! real(8), managed :: a(5,5,5,5,5)
   ! real(8), managed :: b(5,5,5,5)
   real(8) :: a(5,5,5,5,5)
   real(8) :: b(5,5,5,5)
   real(8) :: c
   call random_number(a)
   do idim = 1, 5
      b = sum(a, dim=idim)
      c = max(maxval(b), c)
   end do
   print *, "Max along any dimension", c
end program

When I run nvfortran tred.f90, I get:
nvfortran tred.f90
nvfortran-Fatal-The value of CUDAROOT is not a directory: /usr/local/cuda-12.3/bin:/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/bin/Linux_x86_64/23.11/compilers/bin:/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/bin:/home/malcolm/.local/bin:/home/malcolm/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/home/malcolm/hipreclibs/MP/tmp/mpfun20-fort-v10/fortran-var1:/home/malcolm/Python-3.6.8/Tools/msi/pip:/usr/local/lib64/python3.6/site-packages:/dev/trunk/tmp/gcc-11.1.0-source/gcc-11.1.0/mpc-1.0.3:/dev/trunk/tmp/gcc-11.1.0-source/gcc-11.1.0/mpfr-3.1.6:/dev/trunk/tmp/gcc-11.1.0-source/gcc-11.1.0/gmp-6.1.0:/dev/trunk/tmp/gcc-11.1.0-source/gcc-11.1.0:.

The output is the same whether the “use curand” is included or not. When I use:

nvfortran -cuda tred.f90 then I get:
malcolm34 nvfortran -cuda tred.f90
nvfortran-Fatal-The value of CUDAROOT is not a directory: /usr/local/cuda-12.3/bin:/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/bin/Linux_x86_64/23.11/compilers/bin:/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/bin:/home/malcolm/.local/bin:/home/malcolm/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/home/malcolm/hipreclibs/MP/tmp/mpfun20-fort-v10/fortran-var1:/home/malcolm/Python-3.6.8/Tools/msi/pip:/usr/local/lib64/python3.6/site-packages:/dev/trunk/tmp/gcc-11.1.0-source/gcc-11.1.0/mpc-1.0.3:/dev/trunk/tmp/gcc-11.1.0-source/gcc-11.1.0/mpfr-3.1.6:/dev/trunk/tmp/gcc-11.1.0-source/gcc-11.1.0/gmp-6.1.0:/dev/trunk/tmp/gcc-11.1.0-source/gcc-11.1.0:.
nvfortran-Error-A CUDA toolkit matching the current driver version (0) or a supported older version (12.3) was not installed with this HPC SDK.

I have not knowingly touched the localrc file. I did correct the path statement to NVCOMPILERS.

Thanks, Malcolm

Ok, though this does seem to be the only plausible way for this to occur.

Nonetheless, the reason it worked before is that the NVHPC_CUDA_HOME setting overrides CUDAROOT. Hence, why don’t you add back NVHPC_CUDA_HOME, but set it to the CUDA that ships with NVHPC? Something like:

NVHPC_CUDA_HOME=$NVCOMPILERS/$NVARCH/23.11/cuda/12.3; export NVHPC_CUDA_HOME

However, this will bring us back to the beginning since the “-cudalibs” flag won’t work when overriding this setting.

Hence you’ll need to manually add the CUDA libraries to the link:

nvfortran -V23.11 -cuda tred.f90 -L/opt/nvidia/hpc_sdk//Linux_x86_64/23.11/math_libs/12.3/lib64/ -lcurand

If you do want to try to determine the cause of the CUDAROOT error, let’s try creating a new localrc via the command:

makelocalrc -d . -x

This creates a new localrc in the current directory. You can move or rename this file as you need. Then set the environment variable:

export NVLOCALRC=/full/path/to/the/new/localrc

changing the path to the actual full path of the new localrc.
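
Putting those steps together (a sketch; the localrc location is just an example):

# create a new localrc in the current directory
makelocalrc -d . -x
# point the compiler at it via NVLOCALRC (adjust the path as needed)
export NVLOCALRC=$PWD/localrc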