"No CUDA device code available" error

A .so library encapsulating CFD simulation procedures has been compiled and linked without error. The library consists of multiple Fortran modules; data and subroutines are shared across files using Fortran modules and the ‘use’ statement.

When the host program tries to load the library via the function ::LoadLibrary(…), it exits with the error message “No CUDA device code available”.

Note that an executable built from the same procedures runs perfectly, without any problem.

Some general information is provided below:

OS: CentOS 8
CUDA toolkit: 11.1
HPC SDK: 20.9
compile flags: -cuda -shared -fPIC -O4 -fast -gpu=cc80 -Wall -Wextra
linker flags: -lcufft -lcublas -lnvToolsExt -lpthread -lstdc++ -ldl
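Put together, the build looks roughly like the following two steps (module and library names here are placeholders, not the real sources; warning flags omitted):

nvfortran -cuda -gpu=cc80 -fPIC -O4 -fast -c mod_fields.f90 mod_solver.f90
nvfortran -cuda -gpu=cc80 -shared -o libcfd.so mod_fields.o mod_solver.o -lcufft -lcublas -lnvToolsExt -lpthread -lstdc++ -ldl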

[***@fdgs00-ws ]$ pgaccelinfo
CUDA Driver Version: 11020
NVRM version: NVIDIA UNIX x86_64 Kernel Module 460.67 Thu Mar 11 00:11:45 UTC 2021
Device Number: 0
Device Name: GeForce RTX 3090
Device Revision Number: 8.6
Global Memory Size: 25443893248
Number of Multiprocessors: 82
Concurrent Copy and Execution: Yes
Total Constant Memory: 65536
Total Shared Memory per Block: 49152
Registers per Block: 65536
Warp Size: 32
Maximum Threads per Block: 1024
Maximum Block Dimensions: 1024, 1024, 64
Maximum Grid Dimensions: 2147483647 x 65535 x 65535
Maximum Memory Pitch: 2147483647B
Texture Alignment: 512B
Clock Rate: 1695 MHz
Execution Timeout: Yes
Integrated Device: No
Can Map Host Memory: Yes
Compute Mode: default
Concurrent Kernels: Yes
ECC Enabled: No
Memory Clock Rate: 9751 MHz
Memory Bus Width: 384 bits
L2 Cache Size: 6291456 bytes
Max Threads Per SMP: 1536
Async Engines: 2
Unified Addressing: Yes
Managed Memory: Yes
Concurrent Managed Memory: Yes
Preemption Supported: Yes
Cooperative Launch: Yes
Multi-Device: Yes
Default Target: cc80

Further info: the .so library was compiled and linked with nvfortran, while the executable that loads the .so library was compiled and linked with g++.

Update:
Following the thread “Missing cuda device code when trying to link nvc object file with gcc”, I disabled RDC via the “-gpu=nordc” flag when compiling the CUDA Fortran code, and I am now able to load the .so library from the executable. But a new problem has come up: I now receive the error “0: copyin Memcpy (dev=0x7f2b6dea6400, host=0x29d98800, size=24) FAILED: 700(an illegal memory access was encountered)”, which did not occur before disabling RDC. Note that the error does not come from the very first memcpy call in the code.

I highly suggest you update your compiler to our latest version. I’ve forgotten the exact release in which it was added, but since 20.9 we’ve added support for RDC in Fortran and C shared objects with OpenACC and CUDA Fortran.

You can find our latest release here: NVIDIA HPC SDK Current Release Downloads | NVIDIA Developer

But a new problem has come up: I now receive the error “0: copyin Memcpy (dev=0x7f2b6dea6400, host=0x29d98800, size=24) FAILED: 700(an illegal memory access was encountered)”, which did not occur before disabling RDC.

Without a reproducing example it’s difficult to say the exact cause, though the error is likely coming from the kernel that precedes the memcpy. Unless you explicitly added error handling, kernel errors often show up in the next device operation after the kernel.

One possibility is that RDC must be enabled in order to directly access device module variables. Otherwise the global device variables aren’t linked, and you can get this error when accessing them. If this is the case, your only options are to update the compiler to get RDC support, or to pass module variables to kernels only as arguments.
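As a rough sketch of the first option: with a release that supports RDC in shared objects, you would rebuild without the nordc sub-option (RDC is the default), e.g. with placeholder names:

nvfortran -cuda -gpu=cc80 -fPIC -c mod_solver.f90
nvfortran -cuda -shared -o libcfd.so mod_solver.o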

-Mat

Thanks for the reply. To try the latest compiler version, I have installed HPC SDK 23.5 (nvhpc_2023_235_Linux_x86_64_cuda_12.1) and updated my CUDA toolkit to 12.1. Compiling is OK, but I got the following errors during the linking stage:
/usr/bin/ld: cannot find -lcufft
/usr/bin/ld: cannot find -lcublas

I added the following to my .bashrc:
#----------------------------------------------------------------------------------------------------------------
NVARCH=`uname -s`_`uname -m`; export NVARCH
export NVCOMPILERS=/scratch/OpenFOAM_gpu_lux/nvhpc_2023_235_Linux_x86_64_cuda_12.1

export MANPATH="$NVCOMPILERS/$NVARCH/23.5/compilers/man:$MANPATH"
export PATH="$NVCOMPILERS/$NVARCH/23.5/compilers/bin:$PATH"
export LD_LIBRARY_PATH="$NVCOMPILERS/$NVARCH/23.5/compilers/lib:$LD_LIBRARY_PATH"
export MANPATH="$MANPATH:$NVCOMPILERS/$NVARCH/23.5/comm_libs/mpi/man"
export PATH="$NVCOMPILERS/$NVARCH/23.5/comm_libs/mpi/bin:$PATH"
export LD_LIBRARY_PATH="$NVCOMPILERS/$NVARCH/23.5/comm_libs/mpi/lib:$LD_LIBRARY_PATH"

CUDA_TOOLKIT=/usr/local/cuda-12.1
export CPATH=$CUDA_TOOLKIT/targets/x86_64-linux/include:$CPATH
export PATH=$CUDA_TOOLKIT/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=$CUDA_TOOLKIT/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
#----------------------------------------------------------------------------------------------------------------

Are you trying to link against the cuFFT and cuBLAS libraries that ship with the CUDA 12.1 SDK, or the ones we ship with the HPC SDK?

For the CUDA SDK, are you setting the “-L<path/to/lib>” flag to point to the correct directory which contains the libraries?

For the HPC SDK, are you using the flag “-cudalib=cufft,cublas” so the compiler will add them to the link?
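For example, with the HPC SDK route the link line would look roughly like this (placeholder names):

nvfortran -cuda -shared -o libcfd.so *.o -cudalib=cufft,cublas -lnvToolsExt -lpthread -lstdc++ -ldl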

Thanks, the problem was solved by using the flag “-cudalib=cufft,cublas”.

For the CUDA SDK, I’ve exported the following in my env:

CUDA_TOOLKIT=/usr/local/cuda-12.1
export CPATH=$CUDA_TOOLKIT/targets/x86_64-linux/include:$CPATH
export PATH=$CUDA_TOOLKIT/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=$CUDA_TOOLKIT/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

where “/usr/local/cuda-12.1” is the path where I installed CUDA SDK 12.1. With these exports, the linker still could not find the libraries. If I use the “-L/usr/local/cuda-12.1” flag, the libraries are found, but the linker complains about “undefined reference” to some cuBLAS functions.

Also, I wonder what the difference is between linking against the CUDA SDK and the HPC SDK? For the HPC SDK, why can’t I find the cuBLAS and cuFFT libraries in its cuda folder (e.g. nvhpc_2023_235_Linux_x86_64_cuda_12.1/Linux_x86_64/23.5/cuda/12.1)?

They are the exact same libraries. We ship them as a convenience with the HPC SDK so the CUDA SDK isn’t a requirement. However, we do move them to the same directory as the rest of our math libraries, i.e. under “23.5/math_libs/12.1”.
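So if you do want to point the linker at them explicitly instead of using -cudalib, the link would look roughly as follows (adjust the install prefix to your installation; lib64 is the usual layout):

NVHPC=/scratch/OpenFOAM_gpu_lux/nvhpc_2023_235_Linux_x86_64_cuda_12.1/Linux_x86_64/23.5
nvfortran -cuda -shared -o libcfd.so *.o -L$NVHPC/math_libs/12.1/lib64 -lcufft -lcublas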