nvlink error related to cuRAND in CUDA Fortran

Hi,

I am trying to use the cuRAND device API in my CUDA Fortran code. There are several subroutines, so I have to compile each one separately and link them all together. The compilation passes but the linking fails:

nvlink error   : Undefined reference to '__pgicudalib_curandInitXORWOW' in 'disk.o'
nvlink error   : Undefined reference to '__pgicudalib_curandUniformXORWOW' in 'disk.o'
pgacclnk: child process exit status 2: /software/gp/opt/pgi/2016/linux86-64/16.7/bin/pgnvd

The file “disk.f90” contains a kernel that uses the module “curand_device” and calls “curand_init” and “curand_uniform”; the kernel is called by the subroutine in disk.f90. The subroutines were compiled with

pgf90 -Mcuda=nollvm -Mcudalib=curand -c *.f90
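For readers following along, a kernel of the kind described above might look roughly like the sketch below. This is not the poster's actual disk.f90: the module name `disk_rand`, the kernel name `fill_uniform`, the array names, and the seed value are all hypothetical. Only `curand_device`, `curand_init`, and `curand_uniform` come from the post, and the XORWOW state type is inferred from the symbol names in the linker errors.

```fortran
! Hypothetical stand-in for disk.f90: one uniform random number per thread.
module disk_rand
  use curand_device              ! device-side cuRAND interfaces shipped with PGI
  implicit none
contains
  attributes(global) subroutine fill_uniform(a, n)
    integer, value :: n
    real, device :: a(n)
    type(curandStateXORWOW) :: state   ! per-thread generator state
    integer(8) :: seed, seq, offset
    integer :: i

    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) then
      seed   = 12345_8           ! arbitrary seed, chosen for illustration
      seq    = int(i, 8)         ! a distinct sequence per thread
      offset = 0_8
      call curand_init(seed, seq, offset, state)
      a(i) = curand_uniform(state)
    end if
  end subroutine fill_uniform
end module disk_rand

program test_disk_rand
  use cudafor
  use disk_rand
  implicit none
  integer, parameter :: n = 1024
  real, device :: a_d(n)
  real :: a(n)

  call fill_uniform<<<(n + 255) / 256, 256>>>(a_d, n)
  a = a_d                        ! copy results back to the host
  print *, 'min/max:', minval(a), maxval(a)
end program test_disk_rand
```

The calls to `curand_init` and `curand_uniform` in device code are what produce the references to `__pgicudalib_curandInitXORWOW` and `__pgicudalib_curandUniformXORWOW` that nvlink has to resolve.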

I followed the procedure shown in the example distributed with PGI Fortran (CUDA-Libraries > cuRAND > test_rand_cuf), which I can compile and run without problems.

I guess the symptom is similar to this topic but I did not understand how that issue was solved.

I would greatly appreciate it if someone could suggest a possible cure for this symptom…

Thanks,

Jimmy

Hi Jimmy,

I think this might be an object ordering issue on your link line. nvlink needs the object that contains the definition to come before the object that references it on the link line.

Here, it looks like you have a Fortran module called “pgicudalib” with an interface or call to cuRAND’s device routines. If I’m correct, then you should be able to put this object before “disk.o” and have nvlink find the reference.

Note that we do ship interface modules for cuRAND, both for the host, “use curand”, and for the device, “use curand_device”, so writing your own may not be necessary.

  • Mat

Hi Mat,

Thank you so much for your quick reply!

I tried reordering the link sequence by putting the “disk” subroutine (and even the main routine) last, but it still failed.

Then I checked whether it works with “use curand” in the host subroutine and “use curand_device” in the device subroutine, and it also failed.

In desperation, I tried different pgf90 compilation flags. What I forgot to mention was that I was also using the “-g” option to collect debug information, and it turned out that this flag somehow broke the linkage between the host and device code. The linking went smoothly once I removed the option.

Could you please comment on whether this is expected behavior, or whether the “-g” option should work together with cuRAND?

Thanks,

Jimmy

Debug code, “-g”, does require LLVM, so this may not have anything to do with cuRAND but rather with the mismatch of trying to compile with “-Mcuda=nollvm -g”.

Does it work with “-Mcuda -g”? Is there a reason why you’ve disabled LLVM?

  • Mat

Hi Mat,

Sorry for my late reply… Thanks for letting me know that debugging needs LLVM.

I disabled LLVM because the example code distributed with PGI CUDA Fortran (in PGI_Examples/CUDA-Libraries/cuRAND/test_rand_cuf/) states that LLVM needs to be disabled when using the cuRAND device routines.

On the other hand, I could not link the files even with LLVM enabled… Adding the debug flag “-g” causes the linker to fail with exactly the same error messages that I posted at the beginning of the thread.

Have there been other reports of CUDA Fortran failing to link when debugging information is enabled?

Thanks,

Jimmy

Hi Jimmy,

Thanks for the explanation. The “trand2” test calls cuRAND from device code. Since cuRAND’s device-side routines are actually inlined from CUDA header files, this requires us to generate CUDA code rather than LLVM. Hence, in this particular case the code needs to be compiled with -Mcuda=nollvm.

If you need to compile the host code with debugging enabled, you can compile the code with “-g -Mcuda=nollvm,nodebug”.

  • Mat
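For later readers, the suggestion above could be applied to a multi-file build along the following lines. This is only a sketch: the flags are the ones quoted in this thread, but the file name main.f90 and the executable name disk_test are hypothetical, and passing the same flags at the link step is an assumption rather than something confirmed here.

```
# Compile each source with host debugging on and device debugging off.
# disk.f90 is compiled first on the assumption that main.f90 uses its module.
pgf90 -g -Mcuda=nollvm,nodebug -Mcudalib=curand -c disk.f90
pgf90 -g -Mcuda=nollvm,nodebug -Mcudalib=curand -c main.f90

# Link with the same CUDA and cuRAND options.
pgf90 -g -Mcuda=nollvm,nodebug -Mcudalib=curand -o disk_test main.o disk.o
```

With “nodebug” in the -Mcuda sub-options, “-g” applies only to the host code, so the device code can still be generated as CUDA C as the cuRAND device routines require.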