Error linking a C++ shared library (.so) with CUDA

Hi,

I have a large shared library written in C++, and I now want to port part of the code to CUDA. The situation is:

I have four cpp files and one cu file. To link them, I have the line extern "C" void launch_kernels() in one cpp file and in the CUDA file.
The .cu file compiles to a .o file with nvcc without problems, and the cpp files compile separately with these arguments:

g++ -std=c++14 -m64 -pthread -DWM_DP -DWM_LABEL_SIZE=32 -Wall -Wextra -Wold-style-cast -Wnon-virtual-dtor -Wno-unused-parameter -Wno-invalid-offsetof -Wno-attributes -Wno-unknown-pragmas -O3 -DNoRepository -ftemplate-depth-100 -fPIC -Wno-old-style-cast -Wno-unused-local-typedefs -Wno-array-bounds -Wno-deprecated-declarations -fpermissive -I/usr/include -fPIC -c file1.c -o file1.o

Then I build a shared library (.so) from these four .o files without problems:

g++ -std=c++14 -m64 -pthread -DWM_DP -DWM_LABEL_SIZE=32 -Wall -Wextra -Wold-style-cast -Wnon-virtual-dtor -Wno-unused-parameter -Wno-invalid-offsetof -Wno-attributes -Wno-unknown-pragmas -O3 -DNoRepository -ftemplate-depth-100 -fPIC -Wno-old-style-cast -Wno-unused-local-typedefs -Wno-array-bounds -Wno-deprecated-declarations -fpermissive -I/usr/include -fPIC -shared -Xlinker --add-needed -Xlinker --no-as-needed file1.o file2.o file3.o file4.o -o finallib.so

The problem is in the last step, when I want to link my cuda file with this finallib.so. I tried with nvcc:

nvcc --shared -Xcompiler -fPIC finallib.so filecuda.cu -o finallibcuda.so --expt-relaxed-constexpr --extended-lambda

But I only get a very small file that does not work. Then I tried with g++:

g++ -std=c++14 -m64 -pthread -DWM_DP -DWM_LABEL_SIZE=32 -Wall -Wextra -Wold-style-cast -Wnon-virtual-dtor -Wno-unused-parameter -Wno-invalid-offsetof -Wno-attributes -Wno-unknown-pragmas -O3 -DNoRepository -ftemplate-depth-100 -fPIC -Wno-old-style-cast -Wno-unused-local-typedefs -Wno-array-bounds -Wno-deprecated-declarations -fpermissive -fPIC -shared -Xlinker --add-needed -Xlinker --no-as-needed finallib.so filecuda.o -L/usr/local/cuda/lib64 -lcudart -o finallibcuda.so

And I get this error:

/usr/bin/ld: filecuda.o: warning: relocation against `_Z4Findv' in read-only section `.text.startup'
/usr/bin/ld: filecuda.o: relocation R_X86_64_PC32 against symbol `_Z4Findv' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: bad value

I also tried linking the .o files and the .cu file before creating the final .so. That linked fine, but I get the same error when I try to create the final .so.

Any ideas?
Thanks in advance.

I was able to reduce the problem a lot. With only three files (one .cpp, one .cu and one header), the same error as in the original message appears when I try to link them.

The code of the example files is very simple:

example.cpp

#include "example.h"

extern "C" void launch_kernels();

void example::launch()
{

    launch_kernels();

}

example_kernel.cu

#include <iostream>

__global__ void Find()
{}

extern "C" void launch_kernels() {

    std::cout << "Hello\n" << std::endl;

}

example.h

class example {

public:
    void launch();
};

Compiling…

 g++ -shared -fPIC -Wall -O3 -c example.cpp -o example.o
nvcc -shared -c -O3 example_kernel.cu -o example_kernel.o --expt-relaxed-constexpr --extended-lambda

And I get two .o files without problems. But, in the last step:

nvcc -Xcompiler -fPIC -shared example_kernel.o example.o -o example.a

I get the same fatal error as before:

/usr/bin/ld: example_kernel.o: warning: relocation against `_ZNKSt5ctypeIcE8do_widenEc' in read-only section `.text'
/usr/bin/ld: example_kernel.o: relocation R_X86_64_PC32 against symbol `_Z4Findv' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status

A simple test case seemed to work for me:

# cat test.cpp
#include <iostream>
void foo(){std::cout << "Foo called" << std::endl;}
# cat cudatest.cu
#include <cstdio>
void foo();

__global__ void k(){
  printf("bar called\n");}

void bar(){
  foo();
  k<<<1,1>>>();
  cudaDeviceSynchronize();
}
# cat main.cpp
void bar();

int main(){
  bar();}
# g++ -fPIC -c test.cpp
# g++ -fPIC -shared test.o -o libfinal.so
# nvcc -shared -Xcompiler -fPIC libfinal.so cudatest.cu -o libfinalcuda.so
# nvcc  libfinalcuda.so  main.cpp  -o test
# ./test
Foo called
bar called
#

CUDA 12.2, g++ 11.4

As suggested in your cross posting, the nvcc compile command in your updated example:

nvcc -shared -c -O3 example_kernel.cu -o example_kernel.o --expt-relaxed-constexpr --extended-lambda

omits the -Xcompiler -fPIC switch, which is generally necessary when creating shared objects.
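For reference, here is a minimal sketch of how the reduced example might be built with that switch in place. It recreates the three files from the thread in a scratch directory and compiles the .cu file with -Xcompiler -fPIC, so that the host code nvcc hands to g++ is position-independent; the missing switch is what the R_X86_64_PC32 relocation error is complaining about. The directory name fpic_demo and the output name libexample.so are my own choices for illustration (the thread wrote the result to example.a), and the script assumes nvcc and g++ are on the PATH. I have not run it against the full original library.

```shell
#!/bin/sh
set -e

# Skip cleanly when the CUDA toolchain is not installed.
command -v nvcc >/dev/null 2>&1 || { echo "nvcc not found; nothing to build" >&2; exit 0; }

# Scratch directory (hypothetical name) so nothing in the current tree is overwritten.
mkdir -p fpic_demo
cd fpic_demo

# The three files from the reduced example in the thread.
cat > example.h <<'EOF'
class example {
public:
    void launch();
};
EOF

cat > example.cpp <<'EOF'
#include "example.h"

extern "C" void launch_kernels();

void example::launch()
{
    launch_kernels();
}
EOF

cat > example_kernel.cu <<'EOF'
#include <iostream>

__global__ void Find()
{}

extern "C" void launch_kernels() {
    std::cout << "Hello\n" << std::endl;
}
EOF

# Host-only object: -fPIC goes straight to g++.
g++ -fPIC -Wall -O3 -c example.cpp -o example.o

# CUDA object: nvcc forwards -fPIC to the host compiler only via -Xcompiler.
nvcc -Xcompiler -fPIC -O3 -c example_kernel.cu -o example_kernel.o \
    --expt-relaxed-constexpr --extended-lambda

# Link both position-independent objects into one shared library.
nvcc -shared -Xcompiler -fPIC example_kernel.o example.o -o libexample.so
```

The -shared option is left out of the two -c steps because those steps only produce object files; it matters only on the final link line.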
