How to split the code into multiple files?

Hello,

I am writing a fairly complex program that has now reached several thousand lines. I need to put my global functions in separate files, perhaps similar to how LAMMPS is organised, with one cpp file for each function. The problem is that all my global functions call one device function. If I have several files containing the same device function, I get a compile error because it is already declared. This is also undesirable because I have to change it from time to time. I tried to declare the device function as extern, but this does not work either; only host functions can be declared extern.

How can I solve my problem?

In short: I have a bunch of global functions, all of which call one device function. I want to put each global function in a separate file, but I do not want to have this device function in each of them, just in one place.

Cristian

This requires separate compilation and linking, also called relocatable device code (rdc):

http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#using-separate-compilation-in-cuda

Define the device function in only one file (as you normally would), and declare its prototype in every other file that calls it, for example via a shared header.

If you are using a single nvcc command to do all your compilation and linking, just compile your code with:

-rdc=true

If you are compiling and linking separately, or in multiple steps, it may be slightly more complicated (not much) and will depend on your exact compile and link steps. However, if you search the web for CUDA relocatable device code examples, you will find many.
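
As a rough illustration of the layout described above, here is a minimal sketch. The file names, the function scale_value, the kernel kernel_a, and the host wrapper launch_kernel_a are all made up for this example and are not taken from the original program. The four files are shown in one listing, separated by comment banners, and the nvcc commands in the comments are one possible way to build them.

```cuda
// ===== device_funcs.cuh (hypothetical file name) =====
// Declaration only. Every .cu file that calls the function includes this.
#pragma once
__device__ float scale_value(float x);

// ===== device_funcs.cu =====
// The single definition of the device function; edit it here only.
#include "device_funcs.cuh"
__device__ float scale_value(float x) { return 2.0f * x; }

// ===== kernel_a.cu =====
// One __global__ function per file, calling the shared device function.
#include "device_funcs.cuh"
__global__ void kernel_a(float* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = scale_value(data[i]);
}
// Host wrapper so other files do not need to see the kernel itself.
void launch_kernel_a(float* d_data, int n)
{
    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    kernel_a<<<blocks, threads>>>(d_data, n);
}

// ===== main.cu =====
#include <cstdio>
#include <cuda_runtime.h>
void launch_kernel_a(float* d_data, int n);  // defined in kernel_a.cu

int main()
{
    const int n = 4;
    float h[n] = {1.0f, 2.0f, 3.0f, 4.0f};
    float* d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);
    launch_kernel_a(d, n);
    cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d);
    for (int i = 0; i < n; ++i) printf("%f\n", h[i]);  // each input doubled
    return 0;
}

// Build, single command:
//   nvcc -rdc=true device_funcs.cu kernel_a.cu main.cu -o app
// Build, separate compile and link steps (nvcc does the device link):
//   nvcc -rdc=true -c device_funcs.cu -o device_funcs.o
//   nvcc -rdc=true -c kernel_a.cu -o kernel_a.o
//   nvcc -rdc=true -c main.cu -o main.o
//   nvcc device_funcs.o kernel_a.o main.o -o app
```

If the final link is done with the host compiler instead of nvcc, an extra device link step (nvcc -dlink) is needed before it; the nvcc documentation linked above covers that case.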

Be advised that using rdc often involves a noticeable performance hit. The compiler can’t do the same level of optimization in all cases. I have seen some folks use inventive mechanisms to include files in a way that preserves a single compilation unit, to avoid the use of rdc.
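
One way to keep everything visible to a single compilation, sketched here under the same made-up names as an assumption rather than a description of any specific project, is to put the device function definition in a header marked inline and include that header from each kernel file. The function still lives in exactly one place, and no -rdc flag is needed.

```cuda
// ===== device_funcs.cuh (hypothetical file name) =====
#pragma once
// Defined in the header. 'inline' allows each .cu file that includes the
// header to carry its own copy without a multiple-definition error.
__device__ inline float scale_value(float x) { return 2.0f * x; }

// ===== kernel_a.cu =====
#include "device_funcs.cuh"
__global__ void kernel_a(float* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = scale_value(data[i]);
}

// ===== kernel_b.cu =====
#include "device_funcs.cuh"
__global__ void kernel_b(float* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = scale_value(data[i]) + 1.0f;
}

// Build without relocatable device code, e.g. together with a main.cu:
//   nvcc kernel_a.cu kernel_b.cu main.cu -o app
//
// A related variant is a "unity" file (say all_kernels.cu) that contains
//   #include "kernel_a.cu"
//   #include "kernel_b.cu"
// and is the only file passed to nvcc, so the compiler sees one unit.
```

The trade-off is that a change to the header recompiles every file that includes it, whereas with rdc only the file holding the definition is recompiled.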

Thanks. I will try this. It looks like what I need. If it works, it will make my life much easier.

I use only one line for compiling. At the moment I have a bunch of .cu and .cpp files doing memory allocation and setting up the initial configuration, and one very long file with the main work (the velocity Verlet algorithm). The sorting part of the algorithm is already separated. The last file I have to split has a dozen or more functions and more than 3000 lines, and I will try to split it as well to make further development easier.

Thanks, this is helpful :)