Linking Device functions from static libraries with CMake

code.zip (25.0 KB)
I’m trying to keep some device functions in a separate directory/project and use them in multiple libraries within a larger project.

MinRep File Structure:

  • CMakeLists.txt
  • fx
    • CMakeLists.txt
    • fx.cu
    • fx.cuh
  • kernel
    • CMakeLists.txt
    • kernel.cu
    • kernel.cuh
    • main.cpp

When I try to build in a flat directory (no subdirs) I get:

nvlink error   : Undefined reference to '_Z4axbyiiii' in 'CMakeFiles/exe.dir/kernel.o'
make[2]: *** [kernel/CMakeFiles/exe.dir/build.make:117: kernel/CMakeFiles/exe.dir/cmake_device_link.o] Error 255

This happens when I try to use a device function from the fx library inside the global function defined in kernel.cu. I can fix it by adding fx.cu to the list of sources for the kernel, but once I split things into separate directories, adding source files from a different directory becomes annoying.

Am I missing a CUDA related trick or some general CMake knowledge?

Keeping device functions in a “separate project”, i.e. a separate compilation unit, and then linking them into another project (compilation unit) requires compiling with relocatable device code and performing a device link step. I’m not a CMake expert, and CMake is not an NVIDIA product. However, this is a common question and you can find various write-ups about it.

Failure to specify proper device linking would give exactly the error you indicate. kernel.cu is referencing an axby function in another file/compilation unit, and either you haven’t linked that compilation unit at all, or you haven’t properly specified -rdc=true or whatever the equivalent sequence would be via CMake.
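For illustration, here is a minimal sketch of how the layout above might be wired up in CMake so that both compilation units are built with relocatable device code and device-linked together. It relies on the CUDA_SEPARABLE_COMPILATION target property, which is the documented CMake way of requesting relocatable device code. The target name exe is taken from the error message; the library target name fx, the project name, and the minimum CMake version are assumptions, not taken from the attached code.

```cmake
# ---- CMakeLists.txt (top level) ----
# Assumption: 3.18 is just a reasonably recent version; the project name is hypothetical.
cmake_minimum_required(VERSION 3.18)
project(minrep LANGUAGES CXX CUDA)

add_subdirectory(fx)
add_subdirectory(kernel)

# ---- fx/CMakeLists.txt ----
# Assumption: the device functions are built as a static library target named "fx".
add_library(fx STATIC fx.cu)

# Compile fx.cu as relocatable device code so that axby can be resolved
# later by the device link step of whatever consumes this library.
set_target_properties(fx PROPERTIES CUDA_SEPARABLE_COMPILATION ON)

# Let consumers find fx.cuh without hard-coding relative paths.
target_include_directories(fx PUBLIC ${CMAKE_CURRENT_SOURCE_DIR})

# ---- kernel/CMakeLists.txt ----
# "exe" is the target name that appears in the nvlink error message.
add_executable(exe main.cpp kernel.cu)

# kernel.cu calls a device function defined in another compilation unit,
# so it also has to be compiled as relocatable device code.
set_target_properties(exe PROPERTIES CUDA_SEPARABLE_COMPILATION ON)

# Linking against fx makes its device code available to the device link
# step that CMake runs for the executable.
target_link_libraries(exe PRIVATE fx)
```

With something like this in place, kernel.cu can include fx.cuh and call the device function without fx.cu being listed among the sources of exe. If the undefined reference persists, it is worth checking that the property is set on both targets and that the library really is on the link line of exe.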

The closest thing I’ve found is this SO post. It is pretty close to what I am looking for but still just one guy messing around on SO.

Is there a recommended source for reading about relocatable device code vs. separable compilation in CUDA? These are two different concepts that come up frequently with respect to this topic, and I don’t understand them.

I frequently have trouble finding up-to-date and complete information on how to use CUDA with CMake, because the CMake doc pages don’t seem to have an all-in-one language reference for CUDA, which makes finding info on the various properties a matter of chasing down forum/SO posts.

NVIDIA doesn’t control CMake, but with CMake being the de facto cross-platform C++ build tool, it is a shame that CUDA support isn’t documented anywhere near as thoroughly as the CUDA internals.

The first one that comes to mind for me is the nvcc manual.

There is also a contributed CMake blog here. There is a corresponding GTC session, also.

With a bit of searching you can also find other tutorials such as this one. And as you mention, there are numerous forum articles.