Should I list kernels in CUDA unit header files?

dscerutti · July 26, 2022, 3:59pm

As I develop my code base, I’ve tried to maintain high standards for documentation: the purpose of each function, along with its formal arguments, is documented in a header file. Each CUDA unit (foo.cu) has an associated CUDA header file (foo.cuh). But I’m conflicted as to whether to document the __global__ functions of those CUDA units in the respective header files, because I’m not sure I can call __global__ kernels from another CUDA unit without relocatable device code. Is this correct?

My inclination is to stop listing the __global__ functions in the header and stick to the C++-accessible functions that launch them, e.g. extern launch_foo. Can anyone suggest a standard practice and a reason to adopt it?

Robert_Crovella · July 26, 2022, 10:21pm

You can call a kernel from a different compilation unit without specifying relocatable device code during the compile/link.

$ cat t2a.cu
#include <cstdio>
__global__ void k();

int main(){

  k<<<1,1>>>();
  cudaDeviceSynchronize();
}
$ cat t2b.cu
#include <cstdio>

__global__ void k(){

  printf("*\n");
}
$ nvcc -o test t2a.cu t2b.cu
$ compute-sanitizer ./test
========= COMPUTE-SANITIZER
*
========= ERROR SUMMARY: 0 errors
$

CUDA 11.4

dscerutti · July 26, 2022, 11:46pm

This is important and looks like a solution. If the __global__ functions can be launched from other compilation units, I’ll make separate headers for them all and put the usage documentation there.
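For illustration, here is a minimal sketch of what such a documented header could look like, folded into a single file so it can be compiled on its own. All names in it (scale.cuh, scale.cu, main.cu, kScale, launchScale) are hypothetical and not from my code base; the comments mark which part would live in which file.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// ---- Would live in scale.cuh (hypothetical header) ----------------------

/// Multiply the first n elements of an array by a constant factor.
///
/// Arguments:
///   data:    device pointer to the array, modified in place
///   n:       number of elements in data
///   factor:  multiplier applied to every element
__global__ void kScale(float* data, int n, float factor);

/// Host-side launcher for kScale(). Chooses the grid and block sizes and
/// launches on the default stream. Arguments are as described for kScale().
void launchScale(float* data, int n, float factor);

// ---- Would live in scale.cu ----------------------------------------------

__global__ void kScale(float* data, int n, float factor) {
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    data[i] *= factor;
  }
}

void launchScale(float* data, int n, float factor) {
  const int threads = 256;
  const int blocks = (n + threads - 1) / threads;
  kScale<<<blocks, threads>>>(data, n, factor);
}

// ---- Would live in main.cu, which only includes scale.cuh ----------------

int main() {
  const int n = 8;
  float host[n];
  for (int i = 0; i < n; i++) {
    host[i] = static_cast<float>(i);
  }
  float* dev = nullptr;
  cudaMalloc(&dev, n * sizeof(float));
  cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);

  // Launch through the documented host wrapper...
  launchScale(dev, n, 2.0f);
  // ...or launch the kernel directly, using only its declaration.
  kScale<<<1, n>>>(dev, n, 0.5f);

  // This blocking copy waits for both kernels on the default stream.
  cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
  cudaFree(dev);
  for (int i = 0; i < n; i++) {
    printf("%.1f\n", host[i]);
  }
  return 0;
}
```

Split into the three files, this should build the same way as the example above (nvcc -o test main.cu scale.cu), since the kernel is only launched from host code in the other unit. As far as I understand, relocatable device code becomes necessary when device code in one unit calls a __device__ function defined in another.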

