How do I compile __global__ kernels with a class?

trf86 · September 23, 2015, 12:54pm

I am working on a class that uses CUDA and I need a couple of custom kernels (i.e. __global__ void functions), but I can’t figure out where to place these functions.

All of the examples I have seen online (e.g. http://devblogs.nvidia.com/parallelforall/separate-compilation-linking-cuda-device-code/) put these kernels in the main.cpp file, but I cannot do this with my project.

How can I include them alongside my class and compile it all into a .o file that I can then link into the main executable?

EDIT: to clarify a bit, my program looks like the following:
CUDA class (used in the main program), compiled to .o with nvcc
main program, compiled with icc and linked against libraries

I need the CUDA class to be able to call kernels (i.e. __global__ void functions), but I don’t know where to place these… I cannot place them in the main program.

episteme · September 25, 2015, 12:13am

I’m afraid the following code may not help you, but…

// ----- foo.h
class foo {
  unsigned int size_;
public:
  foo(unsigned int size) : size_(size) {}
  void add(int* c, const int* a, const int* b);
};

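// ----- foo.cu
#include "foo.h"  // class declaration, needed to define foo::add
#include <cuda_runtime.h>
#include <device_launch_parameters.h>

// Defined in addKernel.cu; splitting device code across files may require
// nvcc's relocatable device code mode (-rdc=true).
__global__ void addKernel(int *c, const int *a, const int *b);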

void foo::add(int* c, const int* a, const int* b) {
  addKernel<<<1,size_>>>(c, a, b);
}

// ----- addKernel.cu
#include <cuda_runtime.h>
#include <device_launch_parameters.h>

__global__ void addKernel(int *c, const int *a, const int *b) {
  int i = threadIdx.x;
  c[i] = a[i] + b[i];
}

// ----- main.cpp
#include "foo.h"
int main() {
   const unsigned int N = 10;
   int* dev_a;
   int* dev_b;
   int* dev_c;
   ....
   foo aFoo(N);
   aFoo.add(dev_c, dev_a, dev_b);
   ...
}

episteme · September 25, 2015, 2:57am

Do you mean you want to call the kernel function from a .cpp file?
If so:

// ----- foo.cpp
...

void foo::add(int* c, const int* a, const int* b) {
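  // The following is equivalent to:  addKernel<<<1,size_>>>(c, a, b);
  // args holds the address of each kernel argument; the explicit cast and the
  // trailing shared-memory size and stream arguments suit stricter host compilers.
  void* args[] = { &c, &a, &b };
  cudaLaunchKernel((const void*)addKernel, dim3(1), dim3(size_), args, 0, 0);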
}
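
A note on the args array: cudaLaunchKernel expects an array holding the address of each kernel argument, not the argument values themselves. The host-only sketch below imitates that convention so the packing can be checked without a GPU. fakeLaunch, addOnHost and addViaArgs are hypothetical names for illustration, not CUDA API, and the element count n is an added assumption.

```cpp
#include <cassert>

// Hypothetical CPU stand-in for the kernel body, run over n elements.
static void addOnHost(int* c, const int* a, const int* b, unsigned int n) {
  for (unsigned int i = 0; i < n; ++i) c[i] = a[i] + b[i];
}

// Hypothetical stand-in for a launcher that receives its arguments the way
// cudaLaunchKernel does: args[k] points at the k-th argument.
static void fakeLaunch(void** args, unsigned int n) {
  int* c = *static_cast<int**>(args[0]);
  const int* a = *static_cast<const int**>(args[1]);
  const int* b = *static_cast<const int**>(args[2]);
  addOnHost(c, a, b, n);
}

// Packs the arguments like the post above: the addresses of c, a and b.
void addViaArgs(int* c, const int* a, const int* b, unsigned int n) {
  assert(c != nullptr && a != nullptr && b != nullptr);
  void* args[] = { &c, &a, &b };
  fakeLaunch(args, n);
}
```

Writing { c, a, b } instead of { &c, &a, &b } would compile for c but hand the launcher the wrong level of indirection, which is an easy mistake to make with this interface.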

trf86 · September 28, 2015, 12:04pm

Thanks for the posts. I don’t know what happened to my posts (they were disappearing or hidden, but I think that’s fixed now), but I did find what I needed in the CppIntegration example inside the CUDA package.

I accomplished it by doing essentially what you posted in your first post. I added an extern "C" void wrapper function that calls the kernel, included its header in the class template, and then included the class template header in the main program. It compiles and works fine. Thanks again.
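
A minimal sketch of that wrapper arrangement, written in the spirit of the sample but not copied from it. The names launch_add, add_wrapper.h and add_wrapper.cu and the parameter n are assumptions for illustration; foo and addKernel follow the earlier post. In a real project the part marked add_wrapper.cu would be compiled by nvcc and everything else by the host compiler. Here it is one listing, and the __CUDACC__ guard substitutes a CPU loop when no CUDA compiler is used, so the structure can be tried without a GPU.

```cpp
// ----- add_wrapper.h (hypothetical name): plain C++, safe for icc to include
#include <cassert>

extern "C" void launch_add(int* c, const int* a, const int* b, unsigned int n);

// ----- foo.h: the class sees only the wrapper declaration, never <<< >>>
class foo {
  unsigned int size_;
public:
  explicit foo(unsigned int size) : size_(size) {}
  void add(int* c, const int* a, const int* b) { launch_add(c, a, b, size_); }
};

// ----- add_wrapper.cu (hypothetical name): this part is meant for nvcc
#ifdef __CUDACC__
#include <cuda_runtime.h>

__global__ void addKernel(int* c, const int* a, const int* b) {
  int i = threadIdx.x;
  c[i] = a[i] + b[i];
}

// Under nvcc, c, a and b are assumed to be device pointers.
extern "C" void launch_add(int* c, const int* a, const int* b, unsigned int n) {
  assert(n >= 1 && n <= 1024);  // single block, assuming a 1024-thread limit
  addKernel<<<1, n>>>(c, a, b);
}
#else
// CPU stand-in so the sketch builds without CUDA; host pointers are used here.
extern "C" void launch_add(int* c, const int* a, const int* b, unsigned int n) {
  assert(c != nullptr && a != nullptr && b != nullptr);
  for (unsigned int i = 0; i < n; ++i) c[i] = a[i] + b[i];
}
#endif
```

Because the kernel and its wrapper live in the same .cu file, the object file from nvcc -c can be handed to the host linker along with the CUDA runtime library (typically -lcudart, with a library path that depends on the installation). On the CUDA path the caller is responsible for passing pointers obtained from cudaMalloc.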
