Best way to configure #defines for a kernel

My kernel is stored in a .h file, which I include into a .cu file, where the kernel is actually invoked with triple chevrons. The .h file expects a number of #defines that I would like to set programmatically for different instances of the kernel. This kernel is a port from OpenCL, where I could set the #defines when I built the kernel from source.

Here is what I would like to be able to do:

#include "my_kernel.h"

#define FOO 1
#define BAR 2

my_kernel<<<...>>>(...);

#define FOO 3
#define BAR 4

my_kernel<<<...>>>(...);

Is this possible, and if so, what is the best approach?

I think the C++ way to do this is probably templating.

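template <int FOO, int BAR>
__global__ void my_kernel(int *d){
   *d = FOO+BAR;
}

int main(){
  int *d = NULL;
  cudaMalloc(&d, sizeof(int));  // the kernel writes through d, so it must point to device memory
  my_kernel<1,2><<<1,1>>>(d);
  my_kernel<3,4><<<1,1>>>(d);
  cudaDeviceSynchronize();      // wait for both launches before exiting
  cudaFree(d);
}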

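If the kernel code itself depends on FOO/BAR, your suggestion wouldn’t work anyway (even if you used #undef): the preprocessor expands the macros once, at the point where the header is included, so redefining them afterwards does not change the kernel that has already been compiled.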
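To make the preprocessor point concrete, here is a minimal single-file sketch. It assumes the header's contents are pasted inline instead of being pulled in with #include, and `run_my_kernel` is a hypothetical host helper added only for illustration. Error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>

// Stand-in for the contents of my_kernel.h, pasted inline so the
// sketch fits in one file (assumption, not the poster's real header).
#define FOO 1
#define BAR 2

__global__ void my_kernel(int *d) {
    *d = FOO + BAR;  // the preprocessor rewrites this to 1 + 2 right here
}

// Redefining the macros after the kernel body has been seen
// has no effect on the kernel above.
#undef FOO
#undef BAR
#define FOO 3
#define BAR 4

// Hypothetical host helper: launches my_kernel once and returns
// the value it wrote. CUDA return codes are ignored for brevity.
int run_my_kernel() {
    int *d = nullptr;
    int h = -1;
    cudaMalloc(&d, sizeof(int));
    my_kernel<<<1, 1>>>(d);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(&h, d, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d);
    return h;
}
```

On a machine with a working CUDA device, `run_my_kernel()` should return 3 no matter how often FOO and BAR are redefined afterwards, while host code written after the redefinition sees `FOO + BAR` as 7.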


Thanks, @Robert_Crovella, I did try templating and it worked. For some #defines, I can’t get away with that, as they determine which types are used in the kernel. So, in these cases, I will need separate .cu files, and will put the #defines before the header inclusion.

Templating can also control the types used by a kernel:

template <typename T>
__global__ void k(T *d){ }

Templating is a C++ concept, not unique or specific to CUDA.
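As a rough sketch of how a type-selecting #define and a value #define could both become template parameters, something like the following might work. The names `scale_kernel` and `run_scale` are made up for illustration and are not from the original kernel. Error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>

// T stands in for a type-selecting #define, SCALE for a value #define.
template <typename T, int SCALE>
__global__ void scale_kernel(T *d) {
    *d = *d * static_cast<T>(SCALE);
}

// Hypothetical host helper: copies one value to the device, runs the
// kernel instantiated for <T, SCALE>, and copies the result back.
template <typename T, int SCALE>
T run_scale(T value) {
    T *d = nullptr;
    cudaMalloc(&d, sizeof(T));
    cudaMemcpy(d, &value, sizeof(T), cudaMemcpyHostToDevice);
    scale_kernel<T, SCALE><<<1, 1>>>(d);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(&value, d, sizeof(T), cudaMemcpyDeviceToHost);
    cudaFree(d);
    return value;
}
```

Each distinct combination, for example `run_scale<float, 2>(1.5f)` and `run_scale<int, 4>(3)`, makes the compiler generate its own kernel, much like building the OpenCL source once per set of #defines, but within a single .cu file.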


Thanks, that’s true, hadn’t thought of that.
