nvlink fatal

Hi,

I’m getting

nvlink fatal : Input file ‘/global/common/cori/software/cuda/10.0/lib64/libcudadevrt.a:cuda_device_runtime.o’ newer than toolkit (100 vs 91)

This happens no matter which versions of CUDA and PGI are used.

It even occurs when a different version of the CUDA library is explicitly specified on the link line:

mpif90 -Mcuda -lcudaforblas /global/common/cori/software/cuda/9.1/lib64/libcudadevrt.a -o mycuda.x
nvlink fatal : Input file ‘/global/common/cori/software/cuda/10.0/lib64/libcudadevrt.a:cuda_device_runtime.o’ newer than toolkit (100 vs 91)

Do you know what happened here?

Thanks,

– Jing

Hi Jing,

Do you know what happened here?

Not sure, but my guess is that the CUDA runtime you’re linking against doesn’t match the nvlink version. Try adding the verbose flag (-v) to your link line and post the output. We should then be able to see where the mismatch is.

Also, try compiling and linking with “-Mcuda=cuda10.0” to explicitly set the CUDA version, and link using “-Mcudalib=cublas” instead of explicitly adding “-lcudaforblas” so the correct cuBLAS library version is brought in.

Are you setting “CUDA_HOME” in your environment? By default we’ll use the CUDA components we ship but users can use their own CUDA installation when CUDA_HOME is set. Given the error references a library in your CUDA install, I’m guessing this is set. Using “-Mcuda=cuda10.0” will override this.
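As a rough sketch of the suggestions above, something like the following checks CUDA_HOME and assembles a link line with the explicit flags. The object file name mycuda.o is a placeholder for illustration only; substitute whatever objects you actually link.

```shell
# Report which CUDA installation the environment points at, if any.
echo "CUDA_HOME=${CUDA_HOME:-unset}"

# Assemble the link line with the CUDA version pinned explicitly.
# mycuda.o is a hypothetical object file name; replace it with your own.
cuda_version="cuda10.0"
link_cmd="mpif90 -v -Mcuda=${cuda_version} -Mcudalib=cublas mycuda.o -o mycuda.x"

# Print the command for review; to run it, use: eval "$link_cmd"
echo "$link_cmd"
```

The -v output should show which CUDA directory the driver actually picks up, which is the quickest way to spot a mismatch.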

-Mat

Hi Mat,

“-v” clearly shows that CUDA_HOME was set to point to “9.1”, and this was the reason for the linking error.

Then “-Mcuda=cuda10.0” was used to explicitly set it to “10.0” and the error was gone.

Thanks a lot and have a good day.

– Jin

Hi Mat,

One more thing. What’s the syntax for using multiple CUDA libraries, such as cuBLAS, cuFFT, cuRAND, cuSPARSE, and cuSOLVER?

Is “-Mcudalib=cublas,cufft, curand,cusolver” right?

Is “-Mcudalib=cublas,cufft, curand,cusolver” right?

Correct, it takes a comma-delimited list. However, there must be no space after any comma, so remove the one between “,” and “curand”.
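If you build the flag in a script, a small sketch like this avoids stray spaces. It assumes your compiler version accepts all five sub-options; cusparse in particular was not in your flag, so treat it as an assumption and check your PGI documentation.

```shell
# Space-separated list of CUDA libraries to link.
# Assumption: all of these names are accepted by -Mcudalib in your PGI release.
libs="cublas cufft curand cusparse cusolver"

# Join with commas and no spaces, as -Mcudalib expects.
cudalib_flag="-Mcudalib=$(printf '%s' "$libs" | tr ' ' ',')"

echo "$cudalib_flag"
```

The resulting string can then be added to the compile and link lines in place of a hand-typed list.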

-Mat

Thanks.

– Jin