Can __device__ functions be externally defined functions?

yezhipiaoyao · June 27, 2023, 8:44am

My __device__ functions have only a declaration, not an implementation. When I run the device-link step
$(EXEC) $(NVCC) $(GENCODE_FLAGS) -Xcompiler -fPIC --device-link ./tmp1/*.o --output-file link.o
I get the error: Undefined reference to…

But a __global__ function with only a declaration and no implementation is OK. Using the nm command, the __global__ function shows up as a U-type symbol, meaning an externally defined function.

// compiles OK
__global__ void kernelA();
void proc()
{
    kernelA<<<1,1>>>();
}

// compile ERROR: Undefined reference to deviceA
__device__ void deviceA();
__global__ void kernelA()
{
    deviceA();
}
void proc()
{
    kernelA<<<1,1>>>();
}

njuffa · June 27, 2023, 9:56am

I am not quite sure what you are asking, but I think you want to look into separate compilation and linking of CUDA device code in more detail. A good starting point may be the following blog post:

Note that __global__ functions are called from host code, therefore tools for handling host code, like nm, can be used to inspect the reference. But nm knows nothing about device code.

yezhipiaoyao · June 28, 2023, 1:12am

I read the blog, but my question is this: if a __device__ function has only a declaration and no implementation, compilation fails with the error "Undefined reference to". I want to solve this problem.

njuffa · June 28, 2023, 2:43am

device function only declaration not implementation, when compile it prompt error

In your example, kernelA() calls deviceA() but deviceA() is not defined in the same compilation unit. Unless you have another separate compilation unit that defines deviceA(), and link the object file generated from that, the program is obviously incomplete, and an error will result. Here is a simple example of working with more than one compilation unit:

A header file my_device_funcs.h to export deviceA():

#ifndef MY_DEVICE_FUNCS_H_
#define MY_DEVICE_FUNCS_H_

__device__ int deviceA (int x);

#endif // MY_DEVICE_FUNCS_H_

Define the function deviceA() in a file my_device_funcs.cu:

#include "my_device_funcs.h"

__device__ int deviceA (int x)
{
    return x * x;
}

Here is the main program, in a file my_main.cu:

#include <stdio.h>
#include <stdlib.h>
#include "my_device_funcs.h"

__global__ void kernel (int x)
{
    printf ("GPU: %d\n", deviceA (x));
}

int main (void)
{
    kernel<<<1,1>>>(5);
    cudaDeviceSynchronize(); // wait for the kernel so its printf output is flushed
    return EXIT_SUCCESS;
}

Compile the device function into an object file:
nvcc -rdc=true -c -o my_device_funcs.obj my_device_funcs.cu

Compile the main program into an executable, linking the previously generated object file:
nvcc -rdc=true -o my_main.exe my_main.cu my_device_funcs.obj

We now have an executable my_main.exe that when invoked prints:
GPU: 25

Note that, for the purpose of linking, multiple object files can also be combined into a static or dynamic library.

yezhipiaoyao · June 28, 2023, 3:25am

Yes, the implementation of the __device__ function is in another library, and this library only has the declaration.
A __global__ function with only a declaration and no implementation compiles OK; a __device__ function with only a declaration and no implementation gives an error.

njuffa · June 28, 2023, 3:36am

If your code calls a function (it does not matter whether in host or device code) whose object code is not accessible to the linker, that is an error and an error message should result as it is not possible to complete the building of the executable. That is no different from pure host code. If you try to build an executable from

int main (void)
{
    foo (4);
    return 0;
}

an error message will be emitted, for example:

error LNK2019: unresolved external symbol foo referenced in function main
fatal error LNK1120: 1 unresolved externals

__global__ functions behave no differently in that respect. Example:

__global__ void kernelA(int);
int main (void)
{
    kernelA<<<1,1>>>(5);
    return 0;
}

Trying to compile the above results in an error message:

tmpxft_000023f4_00000000-18_foobar.obj : error LNK2019: unresolved external symbol "void __cdecl kernelA(int)" (?kernelA@@YAXH@Z) referenced in function main
fatal error LNK1120: 1 unresolved externals

References can remain unresolved when an object file is created, but they must be resolved when linking together the executable. For example, in my example above, I could also build the code like this:

nvcc -rdc=true -c -o my_device_funcs.obj my_device_funcs.cu
nvcc -rdc=true -c -o my_main.obj my_main.cu
nvcc -rdc=true -o my_main.exe my_main.obj my_device_funcs.obj

At the second step, deviceA() is an (as of yet) undefined external symbol. This is resolved in the third step which performs linking to create an executable by including my_device_funcs.obj which defines deviceA().

yezhipiaoyao · June 28, 2023, 6:29am

nvcc --device-c a.cu b.cu
nvcc --device-link a.o b.o --output-file link.o   # there was an error in this step
ar -r ./tmp1/abc.so a.o b.o link.o

I need to compile it into a shared library.

njuffa · June 28, 2023, 6:52am

Do you need a static library or a dynamic library? The use of the archiver ar suggests a static library, but I am confused by abc.so, since the suffix .so suggests a dynamic library to me. The use of ar with a dynamic library seems incorrect to me; gcc -shared or something of that sort should be used.

For a worked example of how to build a static library with CUDA on Linux, see my forum post here

If you need to build a dynamic library, I cannot help. It has probably been ten years or more since I last needed to build a dynamic library on Linux, and I do not have a Linux system at hand right now to refresh my memory. I suspect the title of this thread will not entice many readers to follow it to the end, so consider asking a new question on how to build a dynamic library with CUDA on Linux, as that is the task you are actually trying to accomplish from what I understand now.
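
For readers who arrive here with the same goal, below is a minimal, untested sketch of one way a shared library containing relocatable device code is commonly built on Linux. The file name abc_kernels.cu, the wrapper function proc(), the library name libabc.so, and the CUDA install path /usr/local/cuda are assumptions made for illustration; my_device_funcs.h and my_device_funcs.cu are the files from the example earlier in this thread. The point to note is that every __device__ function called by the kernels has to be defined in one of the object files (or static libraries) passed to the device-link step, since the device linker does not resolve device symbols from other shared libraries.

```cuda
/* abc_kernels.cu (hypothetical file name)

   Possible build steps on Linux (assumed names and paths, adjust as needed):

     nvcc -rdc=true -Xcompiler -fPIC -c -o my_device_funcs.o my_device_funcs.cu
     nvcc -rdc=true -Xcompiler -fPIC -c -o abc_kernels.o abc_kernels.cu
     nvcc -Xcompiler -fPIC -dlink -o link.o abc_kernels.o my_device_funcs.o
     g++ -shared -o libabc.so abc_kernels.o my_device_funcs.o link.o -L/usr/local/cuda/lib64 -lcudart

   The -dlink step can resolve deviceA() only because my_device_funcs.o,
   which defines it, is among the inputs of that step.
*/

#include <stdio.h>
#include "my_device_funcs.h"   // declares: __device__ int deviceA (int x);

__global__ void kernelA (int x)
{
    // deviceA() is defined in a different compilation unit
    printf ("GPU: %d\n", deviceA (x));
}

// Host entry point exported from the shared library (hypothetical name).
// Returns 0 on success and 1 if the launch or the kernel execution failed.
extern "C" int proc (int x)
{
    kernelA<<<1,1>>>(x);

    // First check for launch errors, then wait for the kernel to finish.
    // The synchronization also flushes the device-side printf buffer.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }
    if (err != cudaSuccess) {
        fprintf (stderr, "CUDA error: %s\n", cudaGetErrorString (err));
        return 1;
    }
    return 0;
}
```

A host program would link against libabc.so and call proc() like any other C function. If the definition of deviceA() lives only in a different shared library, the device-link step has nothing to resolve it against, which matches the "Undefined reference" error reported above.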
