C++ templates with driver API

Is it possible to use C++ template device functions with the driver API?
I’m not sure what string to pass to cuModuleGetFunction() if this is possible.

It appears that if I use the long mangled symbol name for the function, it works (as in, cuModuleGetFunction() doesn’t fail; I haven’t invoked the function yet). Quite inelegant, though.

It certainly is possible, though, as you said, the name mangling scheme is a bit of a posterior pain to deal with.

Here is how I do it. Say I have the following kernel template:

[codebox]template <int steps>
__global__ void kernelName(…)
{
}[/codebox]

And I use the kernel with steps == 1 and steps == 32.

In a .cu file, I have:

[codebox]// Never called; it only forces the compiler to instantiate the kernel.
void DummyFunction(…)
{
    kernelName<1><<<1,1>>>(…);
    kernelName<32><<<1,1>>>(…);
}[/codebox]

The “DummyFunction” never gets linked into any sort of code, but it tells the CUDA compiler for which values of “steps” to compile the kernel and include that in the .ptx or .cubin output. Getting the mangled names is a bit of trial and error, but they usually remain constant, so it’s not that bad.
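To take some of the trial and error out of finding the mangled names, one option is to read them straight from the generated .ptx, where each kernel shows up after an `.entry` directive. Below is a minimal sketch in plain C++ with no CUDA dependency. The helper name `listEntryNames` is made up for illustration, and it does a simple text scan rather than parsing PTX properly, so an `.entry` appearing inside a comment would also be picked up.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collect the symbol that follows each ".entry"
// directive in a PTX text. This is a plain text scan, not a PTX parser.
std::vector<std::string> listEntryNames(const std::string& ptx) {
    std::vector<std::string> names;
    const std::string key = ".entry";
    std::size_t pos = 0;
    while ((pos = ptx.find(key, pos)) != std::string::npos) {
        pos += key.size();
        // Skip whitespace between the directive and the symbol name.
        while (pos < ptx.size() &&
               std::isspace(static_cast<unsigned char>(ptx[pos]))) {
            ++pos;
        }
        // PTX identifiers are made of letters, digits, '_' and '$'.
        std::size_t end = pos;
        while (end < ptx.size() &&
               (std::isalnum(static_cast<unsigned char>(ptx[end])) ||
                ptx[end] == '_' || ptx[end] == '$')) {
            ++end;
        }
        if (end > pos) {
            names.push_back(ptx.substr(pos, end - pos));
        }
        pos = end;
    }
    return names;
}
```

Read the .ptx file into a string, pass it to this function, and each returned entry is a candidate for the name argument of cuModuleGetFunction(). With several instantiations of the same template you still have to tell them apart by the template argument encoded in the name.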

Actually, outside of any function you can also explicitly instantiate the kernel template:

[codebox]
template __global__ void kernelName<1>(…);
template __global__ void kernelName<32>(…);
[/codebox]
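Once the set of instantiations is fixed like this, the names can also be predicted instead of looked up. This rough sketch assumes the toolchain emits Itanium C++ ABI style names (the `_Z...` form) and that the kernel has a single `int` template parameter and returns `void`. The helper name `mangleIntTemplateKernel` is hypothetical, and the parameter list has to be supplied already encoded (for example `"Pf"` for a single `float*` argument), so check the result against the actual .ptx before relying on it.

```cpp
#include <string>

// Hypothetical helper: build the Itanium-ABI mangled name for
//   template <int steps> __global__ void name(<params>)
// encodedParams is the already-mangled parameter list, e.g. "Pf" for
// (float*) or "v" for an empty parameter list.
std::string mangleIntTemplateKernel(const std::string& name,
                                    int steps,
                                    const std::string& encodedParams) {
    // Negative integer literals are written with an 'n' prefix.
    const long long v = steps;
    const std::string value =
        v < 0 ? "n" + std::to_string(-v) : std::to_string(v);
    return "_Z" + std::to_string(name.size()) + name  // length-prefixed name
           + "ILi" + value + "EE"                     // template args: <(int)value>
           + "v"                                      // return type: void
           + encodedParams;                           // function parameters
}
```

For instance, with a made-up `float*` parameter, `mangleIntTemplateKernel("kernelName", 32, "Pf")` produces `_Z10kernelNameILi32EEvPf`, which would then go to cuModuleGetFunction() as the name string.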

Thanks for verifying that the code won’t go kablooey when I finish porting all the other runtime API calls to the driver API & actually run the kernel.