Since I am using templates extensively in the described way on device and kernel functions, I can only say that it does work.
Templates are a C++ feature, so extern "C" cannot work on them.
So I assume your code is being compiled as plain C for some reason (check the compiler settings).
However, I always keep all the code needed for a kernel and its C++ interface in one single file, since linking device code was (and still is?!) not possible. So I put such functions into *.cuh files and #include them in the .cu file where they are needed.
If you have all your code under control and your functions are not called by third parties, there is no reason not to compile C code as C++, except that you might need to add some function prototypes to header files, since C is not very strict about declaring functions before use. I do this whenever feasible, since it gives you all the advantages of strict typing and you are not forced to use any C++ features (although using them is preferable, IMHO).
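To illustrate the point about prototypes, here is a minimal sketch of C-style code built as C++. The function names sum_ints and sum_first_n are made up for the example. The cast on malloc is a second thing C++ insists on that plain C does not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Prototype: C++ requires a declaration before the first call.
// Older C dialects would accept the call below without it.
int sum_ints(const int* values, int count);

// Hypothetical C-style routine, compiled as C++.
// Returns 1 + 2 + ... + n, or -1 if the allocation fails.
int sum_first_n(int n) {
    assert(n > 0);
    // C++ does not convert void* implicitly, so the malloc result is cast.
    int* buf = static_cast<int*>(
        std::malloc(static_cast<std::size_t>(n) * sizeof(int)));
    if (buf == nullptr) {
        return -1;  // allocation failed
    }
    for (int i = 0; i < n; ++i) {
        buf[i] = i + 1;
    }
    int total = sum_ints(buf, n);
    std::free(buf);
    return total;
}

// Definition placed after its first use; the prototype above makes that legal.
int sum_ints(const int* values, int count) {
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
```

If you remove the prototype, a C++ compiler rejects the call to sum_ints inside sum_first_n, which is exactly the kind of strictness I mean.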
Can you just declare your function as extern rather than extern "C"?
The only purpose of extern "C" is to avoid differing name-mangling schemes for the same function. Declaring a function extern "C" tells the compiler to give it C linkage rather than C++ linkage, so its symbol name is not mangled the C++ way. That isn't necessary in your case, right?
You are free to put your __global__ function inside an extern "C" block, but keep your template __device__ functions outside the block. The extern "C" declaration does not prevent you from using C++ inside your __global__ function; it only prevents C++ name mangling of the functions you declare inside the block.
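A minimal host-only sketch of that layout, written in plain C++ so it builds without nvcc. The names scale_value and scale_floats are made up for the example. In a .cu file the template would carry __device__ and the entry point __global__, but the placement relative to the extern "C" block is the same.

```cpp
#include <cassert>
#include <cstddef>

// Template helper: it has to live outside any extern "C" block,
// because a template cannot have C linkage.
template <typename T>
T scale_value(T x, T factor) {
    return x * factor;
}

extern "C" {

// Entry point with C linkage (hypothetical name). Its symbol is not
// mangled, but its body is still compiled as C++ and can instantiate
// the template above.
void scale_floats(float* data, std::size_t n, float factor) {
    assert(data != nullptr || n == 0);
    for (std::size_t i = 0; i < n; ++i) {
        data[i] = scale_value<float>(data[i], factor);
    }
}

}  // extern "C"
```

Moving the scale_value template inside the extern "C" block makes the compiler reject it, which matches the error described in the original question.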