Calling a template kernel from a template function


I’ve been using CUDA on Linux for a while and was happy to see that I should now be able to run it on my MacBook Pro. Everything installed fine, and all the SDK examples I have tested run fine.

However, I am having problems compiling some code that I have successfully compiled on Linux and Windows. The situation is as follows:

In a .cu file I have a template function and a template kernel which is called from the template function:

template <class T> __global__ void my_kernel(cuFloatComplex* data_in, cuFloatComplex* data_out, T dim)


The function which calls it looks something like this:

template <class T> void
my_function(cuFloatComplex* data_in, cuFloatComplex* data_out, T dim)
{
my_kernel<<< gridDim, blockDim >>>(data_in, data_out, dim);
}
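
For reference, here is a minimal self-contained sketch of this pattern: a template kernel defined ahead of the template host function that launches it. The kernel body (a plain copy), the launch configuration, treating dim as an element count, and the copy_roundtrip_ok helper are all assumptions made up for illustration, not the original code.

```cuda
#include <cuComplex.h>
#include <cuda_runtime.h>
#include <vector>

// Template kernel, defined before the host function that launches it.
// Placeholder body (assumption): copy data_in to data_out, with dim as element count.
template <class T>
__global__ void my_kernel(cuFloatComplex* data_in, cuFloatComplex* data_out, T dim)
{
    unsigned int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < static_cast<unsigned int>(dim)) {
        data_out[idx] = data_in[idx];
    }
}

// Template host function that launches the kernel for the same T.
// The block size of 256 is an arbitrary choice; assumes dim > 0.
template <class T>
void my_function(cuFloatComplex* data_in, cuFloatComplex* data_out, T dim)
{
    dim3 block(256);
    dim3 grid((static_cast<unsigned int>(dim) + block.x - 1) / block.x);
    my_kernel<T><<<grid, block>>>(data_in, data_out, dim);
}

// Hypothetical helper: copies n values through the kernel and checks the result.
bool copy_roundtrip_ok(int n)
{
    std::vector<cuFloatComplex> host_in(n), host_out(n);
    for (int i = 0; i < n; ++i) {
        host_in[i] = make_cuFloatComplex(float(i), -float(i));
        host_out[i] = make_cuFloatComplex(0.0f, 0.0f);
    }
    size_t bytes = n * sizeof(cuFloatComplex);
    cuFloatComplex *dev_in = 0, *dev_out = 0;
    bool ok = cudaMalloc((void**)&dev_in, bytes) == cudaSuccess
           && cudaMalloc((void**)&dev_out, bytes) == cudaSuccess;
    if (ok) ok = cudaMemcpy(dev_in, host_in.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        my_function<int>(dev_in, dev_out, n);
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(host_out.data(), dev_out, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev_in);   // cudaFree on a null pointer is a no-op
    cudaFree(dev_out);
    for (int i = 0; ok && i < n; ++i) {
        ok = cuCrealf(host_out[i]) == cuCrealf(host_in[i])
          && cuCimagf(host_out[i]) == cuCimagf(host_in[i]);
    }
    return ok;
}
```

Calling copy_roundtrip_ok(1000) from main on a machine with a CUDA device should exercise both templates. The explicit my_kernel<T> at the launch is optional here, since T can also be deduced from dim.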


I can compile this on Linux and Windows, and it runs. When I try to compile on the Mac, I get this error message:

In function ‘void my_function(cuFloatComplex*, cuFloatComplex*, T)’:
error: ‘my_kernel’ was not declared in this scope
make: *** [obj/release/FFT.cu_o] Error 255

If I explicitly declare overloaded my_functions for all the types I need to use, then it compiles fine, but that is a bit messy, and it seems strange that it should be necessary.
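
To make the overload workaround concrete, here is a rough sketch of what it might look like, assuming the needed types are int and unsigned int (hypothetical; the real set of types is not given here). The kernel body, the kThreads block size, and the grid_for, device_matches and overloads_copy_ok helpers are likewise made up for illustration.

```cuda
#include <cuComplex.h>
#include <cuda_runtime.h>
#include <vector>

// Same template kernel as before; placeholder body that copies dim elements.
template <class T>
__global__ void my_kernel(cuFloatComplex* data_in, cuFloatComplex* data_out, T dim)
{
    unsigned int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < static_cast<unsigned int>(dim)) {
        data_out[idx] = data_in[idx];
    }
}

static const unsigned int kThreads = 256;  // arbitrary block size

// Number of blocks needed to cover n elements (assumes n > 0).
static dim3 grid_for(unsigned int n)
{
    return dim3((n + kThreads - 1) / kThreads);
}

// One non-template overload per type, each naming its kernel instantiation.
void my_function(cuFloatComplex* data_in, cuFloatComplex* data_out, int dim)
{
    my_kernel<int><<<grid_for(static_cast<unsigned int>(dim)), kThreads>>>(data_in, data_out, dim);
}

void my_function(cuFloatComplex* data_in, cuFloatComplex* data_out, unsigned int dim)
{
    my_kernel<unsigned int><<<grid_for(dim), kThreads>>>(data_in, data_out, dim);
}

// Hypothetical helper: copy a device buffer back and compare with expected values.
static bool device_matches(const std::vector<cuFloatComplex>& expected, const cuFloatComplex* dev)
{
    std::vector<cuFloatComplex> got(expected.size());
    size_t bytes = expected.size() * sizeof(cuFloatComplex);
    if (cudaMemcpy(got.data(), dev, bytes, cudaMemcpyDeviceToHost) != cudaSuccess) return false;
    for (size_t i = 0; i < expected.size(); ++i) {
        if (cuCrealf(got[i]) != cuCrealf(expected[i]) || cuCimagf(got[i]) != cuCimagf(expected[i])) return false;
    }
    return true;
}

// Hypothetical helper: run both overloads on the same input and check both outputs.
bool overloads_copy_ok(unsigned int n)
{
    std::vector<cuFloatComplex> host_in(n);
    for (unsigned int i = 0; i < n; ++i) host_in[i] = make_cuFloatComplex(float(i), 1.0f);
    size_t bytes = n * sizeof(cuFloatComplex);
    cuFloatComplex *dev_in = 0, *out_a = 0, *out_b = 0;
    bool ok = cudaMalloc((void**)&dev_in, bytes) == cudaSuccess
           && cudaMalloc((void**)&out_a, bytes) == cudaSuccess
           && cudaMalloc((void**)&out_b, bytes) == cudaSuccess;
    if (ok) ok = cudaMemcpy(dev_in, host_in.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        my_function(dev_in, out_a, static_cast<int>(n));  // picks the int overload
        my_function(dev_in, out_b, n);                    // picks the unsigned int overload
        ok = cudaGetLastError() == cudaSuccess
          && device_matches(host_in, out_a) && device_matches(host_in, out_b);
    }
    cudaFree(dev_in);
    cudaFree(out_a);
    cudaFree(out_b);
    return ok;
}
```

Each additional type needs another hand-written overload, which is where the messiness comes from.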

Any hints, comments, etc would be most welcome.


Try adding this line at the end of your makefile:

NVCCFLAGS += --host-compilation 'C'

Thanks! That did the trick. What exactly does that flag do?