Separate compilation of CUDA code into library, for use with existing code base

Olumide · June 14, 2017, 10:36am

I plan to write a new CUDA module for use in an existing code base. Because I don’t want to mess with my existing build system, I don’t want my existing code to have an explicit CUDA dependency. Therefore I plan to build the CUDA modules as a library with C linkage (as NVCC depends on an older version of GCC than my existing code base uses), as follows:

[main.cpp – representative of existing code]

extern "C" void runKernel();

int main(int argc, char **argv)
{
	runKernel();
}

[MWE.cu – representative of CUDA modules]

__global__ void testKernel()
{
}

extern "C" void runKernel()
{
	testKernel<<<1,1>>>();
}

Following the tutorial Separate Compilation and Linking of CUDA C++ Device Code, I’ve added the -dlink flag (as I understand it, in order to generate device code) and the -lcudart flag in order to link using GCC (recall that I would like to link the CUDA library to my existing code without introducing CUDA dependencies, i.e. NVCC).

Build steps

nvcc -ccbin g++ -m64 -gencode arch=compute_30,code=sm_30 -dlink -dc -o MWE.o -c MWE.cu
nvcc -ccbin g++ -m64 -gencode arch=compute_30,code=sm_30 -dlink -o MWE.a MWE.o
g++ -m64 -o main.o -c main.cpp
g++ -m64 -lcudart MWE.a main.o -o test

Build errors

MWE.a: In function `__cudaUnregisterBinaryUtil':
link.stub:(.text+0xf): undefined reference to `__cudaUnregisterFatBinary'
MWE.a: In function `__cudaRegisterLinkedBinary_38_tmpxft_000019d4_00000000_7_MWE_cpp1_ii_d6d5cc43':
link.stub:(.text+0x5a): undefined reference to `__fatbinwrap_38_tmpxft_000019d4_00000000_7_MWE_cpp1_ii_d6d5cc43'
MWE.a: In function `__cudaRegisterLinkedBinary(__fatBinC_Wrapper_t const*, void (*)(void**), void*)':
link.stub:(.text+0x102): undefined reference to `__cudaRegisterFatBinary'
main.o: In function `main':
main.cpp:(.text+0x10): undefined reference to `runKernel'
collect2: error: ld returned 1 exit status

BulatZiganshin · June 14, 2017, 12:07pm

If I understood you correctly, you are trying to link together object files built by different, incompatible compilers. AFAIK, that’s impossible: an executable should contain only one run-time library. The alternative is to build a DLL file with one compiler and then call it from an executable built by another compiler; each DLL file has its own run-time library. I’m a Windows user, though, so the situation may be different on Linux.
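
On Linux, the closest analogue of the DLL approach would be a shared library loaded at run time. The following is only a minimal sketch under assumptions not stated in the thread: the library name (for example a hypothetical libMWE.so) and the helper callRunKernel are made up for illustration, and only the exported symbol name runKernel comes from the original post.

```cpp
#include <dlfcn.h>   // dlopen, dlsym, dlclose, dlerror
#include <cstdio>

// Signature of the C-linkage entry point exported by the CUDA library.
typedef void (*RunKernelFn)(void);

// Hypothetical helper: loads the shared library at `path`, calls its
// runKernel(), and unloads it again.
// Returns true if the call was made, false if the library or symbol is missing.
bool callRunKernel(const char *path)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        const char *err = dlerror();
        std::fprintf(stderr, "dlopen failed: %s\n", err ? err : "unknown error");
        return false;
    }

    // dlsym returns void*; the cast to a function pointer is conditionally
    // supported in C++ and works with GCC on Linux.
    void *sym = dlsym(handle, "runKernel");
    if (!sym) {
        const char *err = dlerror();
        std::fprintf(stderr, "dlsym failed: %s\n", err ? err : "symbol is null");
        dlclose(handle);
        return false;
    }

    RunKernelFn fn = reinterpret_cast<RunKernelFn>(sym);
    fn();
    dlclose(handle);
    return true;
}
```

With this arrangement the existing code base needs neither CUDA headers nor libcudart at link time. On older glibc versions the host program may have to be linked with -ldl.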

cbuchner1 · June 14, 2017, 1:07pm

Is the -ccbin g++ necessary? Usually nvcc picks the g++ version that it officially supports in the respective CUDA toolkit. By overriding this with -ccbin you’re making it use a possibly incompatible version (the system default compiler).

BulatZiganshin · June 14, 2017, 1:15pm

He said “NVCC depends on an older version of GCC than my existing code base uses”, so it seems that his code is incompatible with the nvcc-compatible version of gcc.

cbuchner1 · June 14, 2017, 1:21pm

My recommendation would still be to try this without the -ccbin g++

You’ll mix object code generated by different compiler versions. But there is a chance it will work, as the interface between both modules only uses plain C syntax.
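
A plain C boundary of this kind might look like the sketch below. Note the assumptions: the thread’s runKernel() returns void, whereas this sketch uses a hypothetical runKernelStatus() that returns an int, so that only C types cross between the two compilers. The body is a host-only stand-in so the example compiles without nvcc.

```cpp
// Interface part: the only declaration the existing code base would see.
// The guard lets the same text be included from both C and C++ sources.
#ifdef __cplusplus
extern "C" {
#endif

// Hypothetical entry point. Assumed convention: 0 on success, non-zero on failure.
int runKernelStatus(void);

#ifdef __cplusplus
}
#endif

// Stand-in definition so the sketch builds with a plain host compiler.
// In the real library this body would live in the .cu file, launch the
// kernel, and translate any CUDA error into the integer result.
extern "C" int runKernelStatus(void)
{
    return 0;
}
```

Keeping C++ types and exceptions out of the exported signatures is what gives object code from two different g++ versions a chance of linking cleanly.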

Olumide · June 14, 2017, 1:27pm

Thanks for your feedback. My understanding of the error messages is that they suggest undefined functions rather than incompatible ones.

cbuchner1 · June 14, 2017, 1:51pm

Is the purpose of this step to create a static library out of one (or multiple) .o files?

nvcc -ccbin g++ -m64 -gencode arch=compute_30,code=sm_30 -dlink -o MWE.a MWE.o

Have you tried to use the ar tool instead?

ar rcs MWE.a MWE.o

Regarding the ar options: r means to insert with replacement, c means to create a new archive, and s means to write an index.

MutantJohn · June 14, 2017, 3:41pm

It almost looks like you might need to link to the CUDA runtime libraries, cudadevrt and cudart.

Robert_Crovella · June 14, 2017, 4:39pm

Leaving aside the question of mixing gnu toolchains, there are several changes that need to be made to the build sequence to get it to work:

#ordinary rdc compilation of CUDA source
nvcc -ccbin g++ -m64 -arch=sm_30 -dc -o MWE.o -c MWE.cu
#separate device link step, necessary for rdc flow
nvcc -ccbin g++ -m64 -arch=sm_30 -dlink -o MWE.dlink.o MWE.o
#creation of library - note we need ordinary linkable object and device-link object!
nvcc -ccbin g++ -m64 -arch=sm_30 -lib -o MWE.a MWE.o MWE.dlink.o
#host code compilation
g++ -m64 -o main.o -c main.cpp
#host (final) link phase - the order of entries on this line is important!!
g++ -m64 main.o MWE.a -o test -L/usr/local/cuda/lib64 -lcudart

Olumide · June 15, 2017, 12:03am

Thanks txbob!
