Template function calling a kernel with separated files architecture

The normal function works, the...

Hello,

I can't manage to turn a function into a template function. The function is in a .cu file and is called by the main function, which is in a .cpp file. The function calls a kernel from another .cu file.

Without the template the code runs correctly, but with it the kernel is no longer recognized. I get these errors:

NOTE: I’m “completing” the NPP library for our application needs by using modified UtilNPP files.

Here is simplified code that reproduces the error (image1 and image2 can each be either ImageCPU or ImageNPP objects):

main.cpp

#include "filter.cuh"

int main(int argc, char* argv[])
{
  ImageCPU_8u_C3 image1;
  ImageNPP_8u_C3 image2;
  ...
  filter( image1, image2 );
  ...
}

filter.cuh

template < class source_T, class dest_T >
void filter( source_T & , dest_T & );

#include "filter.cu" // (filter.tcu) template CUDA file

filter.cu

template < class source_T, class dest_T >
void filter( source_T & image1, dest_T & image2 )
{
  ...
  filter_kernel<<< dimGrid, dimBlock >>>( image1.data(), image2.data() );
  ...
}

Thanks for any help,

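Your main.cpp needs to be renamed main.cu and compiled with nvcc, since it launches a CUDA kernel through an #included header file. A template is instantiated in the file that calls it, so the <<< >>> launch inside filter() ends up being compiled as part of main, and a plain C++ compiler does not understand that syntax.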
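To illustrate the point, here is a minimal sketch of a layout that nvcc can compile as one .cu translation unit: the kernel is declared before the template that launches it, and the template is fully visible wherever it is called. DeviceBuffer, the body of filter_kernel, the cudaError_t return type and filter_demo_ok() are all hypothetical stand-ins, not the image classes or the kernel from the original post, which are not shown in the thread.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the thread's image classes:
// anything that exposes data() and size() works with filter() below.
struct DeviceBuffer {
    unsigned char* ptr = nullptr;
    std::size_t n = 0;
    explicit DeviceBuffer(std::size_t count) : n(count) { cudaMalloc(&ptr, n); }
    ~DeviceBuffer() { cudaFree(ptr); }
    DeviceBuffer(const DeviceBuffer&) = delete;
    DeviceBuffer& operator=(const DeviceBuffer&) = delete;
    unsigned char* data() { return ptr; }
    std::size_t size() const { return n; }
};

// Assumed kernel body (inverts each byte). It is declared before the
// template so the name is known at the point of the <<< >>> launch.
__global__ void filter_kernel(const unsigned char* src, unsigned char* dst, std::size_t n)
{
    std::size_t i = static_cast<std::size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
    if (i < n) dst[i] = 255 - src[i];
}

// Template host wrapper. Because it is a template, this body is compiled
// in whichever file instantiates it, so that file must be built by nvcc.
template <class source_T, class dest_T>
cudaError_t filter(source_T& image1, dest_T& image2)
{
    const std::size_t n = image1.size();
    const unsigned int block = 256;
    const unsigned int grid = static_cast<unsigned int>((n + block - 1) / block);
    filter_kernel<<<grid, block>>>(image1.data(), image2.data(), n);
    cudaError_t err = cudaGetLastError();   // reports launch errors
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();         // reports execution errors
}

// Hypothetical self-check: runs the filter on 1000 bytes and compares
// the result with the same operation done on the host.
bool filter_demo_ok()
{
    const std::size_t n = 1000;
    std::vector<unsigned char> host_in(n), host_out(n, 0);
    for (std::size_t i = 0; i < n; ++i) host_in[i] = static_cast<unsigned char>(i % 256);

    DeviceBuffer src(n), dst(n);
    if (src.data() == nullptr || dst.data() == nullptr) return false;
    if (cudaMemcpy(src.data(), host_in.data(), n, cudaMemcpyHostToDevice) != cudaSuccess) return false;
    if (filter(src, dst) != cudaSuccess) return false;
    if (cudaMemcpy(host_out.data(), dst.data(), n, cudaMemcpyDeviceToHost) != cudaSuccess) return false;

    for (std::size_t i = 0; i < n; ++i)
        if (host_out[i] != 255 - host_in[i]) return false;
    return true;
}
```

Calling filter_demo_ok() from a main() in the same .cu file (or in any other file compiled with nvcc) should exercise the template and the kernel together. If main really has to stay in a .cpp file, one option that may work is to keep only the template declaration in the header and add explicit instantiations such as `template cudaError_t filter<DeviceBuffer, DeviceBuffer>(DeviceBuffer&, DeviceBuffer&);` in the .cu file, at the cost of listing every type combination you need.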

OK (it seems logical once you know it), the code now works!

Thank you!