How to share device code between a.cu and b.cu in my test project

Project address: https://github.com/xuesongshu/learn_cuda

The project mainly consists of a.cu, b.cu, common.cu and CMakeLists.txt. If the line set(CMAKE_CUDA_FLAGS "-rdc=true -dlink") is commented out, linking always fails with an unresolved external symbol: __cudaRegisterLinkedBinary_348e6aeb_4_a_cu_ab145366. If that line is not commented out, compilation always fails with: ptxas fatal : Unresolved extern function '_Z12common_helloi'.

How can I share the common device function common_hello between a.cu and b.cu?

Thanks.

The project has been uploaded to GitHub; the address is on the first line of this post.

Can someone help me?

I’m not sure why you are using extern "C". It shouldn’t be necessary for what you have shown here, and it is causing you trouble. The fact that the linker is having trouble with '_Z12common_helloi' indicates that it is looking for a usable device function with C++ style linkage, named common_hello(). But the only definition of common_hello() you have given declares itself with C style linkage: the definition is preceded by extern "C".

If it were me, I would start by getting rid of every instance of extern "C" in your project.
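
As a rough illustration of that suggestion, here is a minimal single-file sketch. It is not the repository's actual code. The int parameter is taken from the mangled name _Z12common_helloi; the int return type, the kernel names kernel_a and kernel_b, the header name common.h, and the host helper run_both_kernels are all assumptions made up for this example. The comments mark where each piece would live once the code is split across files.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// --- would live in a shared header (hypothetical name: common.h) ---
// Plain C++ linkage: callers will reference the mangled name _Z12common_helloi.
__device__ int common_hello(int id);

// --- would live in a.cu (kernel name is an assumption) ---
__global__ void kernel_a(int *out) { out[0] = common_hello(1); }

// --- would live in b.cu (kernel name is an assumption) ---
__global__ void kernel_b(int *out) { out[1] = common_hello(2); }

// --- would live in common.cu ---
// The definition uses the same linkage as the declaration (no extern "C"),
// so it provides exactly the symbol the callers are looking for.
__device__ int common_hello(int id) {
    printf("hello from common_hello(%d)\n", id);
    return id * 10;
}

// Hypothetical host helper: launches both kernels and returns the sum of
// their results, or -1 if any CUDA call fails.
int run_both_kernels() {
    int *d_out = nullptr;
    int h_out[2] = {0, 0};
    if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) return -1;

    kernel_a<<<1, 1>>>(d_out);
    kernel_b<<<1, 1>>>(d_out);
    cudaError_t launch_err = cudaGetLastError();

    // cudaMemcpy waits for the kernels to finish before copying.
    cudaError_t copy_err =
        cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_out);

    if (launch_err != cudaSuccess || copy_err != cudaSuccess) return -1;
    return h_out[0] + h_out[1];
}

int main() {
    int sum = run_both_kernels();
    printf("sum = %d\n", sum);
    return sum == 30 ? 0 : 1;  // 1*10 + 2*10
}
```

Once the pieces are split into separate .cu files, a.cu and b.cu call a device function defined in another translation unit, so the build needs relocatable device code (for example nvcc with -rdc=true). If the definition alone were wrapped in extern "C", the callers would still ask for _Z12common_helloi and the symbol would stay unresolved.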

I removed extern "C", but the compilation problem came back. Please check my latest commit.

Please post text as text, not as images.

OK. I will paste the text this evening, Beijing time. I can't get to my computer right now.

I don’t have any trouble compiling your code now that you have removed the extern "C" decorators.

Here is what I did:

$ git clone https://github.com/xuesongshu/learn_cuda
Cloning into 'learn_cuda'...
remote: Enumerating objects: 52, done.
remote: Counting objects: 100% (52/52), done.
remote: Compressing objects: 100% (34/34), done.
remote: Total 52 (delta 22), reused 44 (delta 15), pack-reused 0
Unpacking objects: 100% (52/52), done.
$ cd learn_cuda/yml
$ nvcc -I. -o test a.cu b.cu common.cu main.cpp  -rdc=true
$

If you’re having trouble compiling the things in the yml directory, my guess is that you are having a problem with CMake. I won’t be able to help with that, although others may. CMake is not an NVIDIA product.

Thanks. I’ll find out what trouble CMake is introducing.
