(Template kernel) VERSUS (Driver API)

Hi everyone, I have a problem using templates with the driver API. How should I call a kernel if it is a template? And where exactly should I put the CUdeviceptr (kernel, host, or both)?

template <class T, CUDPPOperator op>
__global__ void compactData(T *oData,
const unsigned int *iValidFlags,
const T *iData)
{

}

void runTest()
{
    CU_SAFE_CALL( cuFuncSetBlockShape( transform, numThreads, 1, 1 ) );
    CU_SAFE_CALL( cuFuncSetSharedSize( transform, sizeSharedMemory ) );

    // Pass the three kernel arguments in order, tracking the byte offset of each.
    int offset = 0;
    CU_SAFE_CALL( cuParamSeti( transform, offset, outArray ) );   offset += sizeof(outArray);
    CU_SAFE_CALL( cuParamSeti( transform, offset, isValid ) );    offset += sizeof(isValid);
    CU_SAFE_CALL( cuParamSeti( transform, offset, inArray ) );    offset += sizeof(inArray);
    CU_SAFE_CALL( cuParamSetSize( transform, offset ) );
    CU_SAFE_CALL( cuLaunchGrid( transform, numBlocks, 1 ) );
}

Any help is appreciated. Thanks.

You need to instantiate the template first, in any function, so that the compiler generates code for each instance you want to launch.

int main()
{
// instantiate template functions
max_r<512> <<< 0, 0, 0>>>(0, 0, 0);
max_r<256> <<< 0, 0, 0>>>(0, 0, 0);
max_r<128> <<< 0, 0, 0>>>(0, 0, 0);
max_r<64> <<< 0, 0, 0>>>(0, 0, 0);
max_r<32> <<< 0, 0, 0>>>(0, 0, 0);

return 0;
}

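Paste this code into your *.cu file, then look in the resulting cubin or linkinfo file to get the mangled names of the functions. The driver API looks a kernel up by that mangled name, for example through cuModuleGetFunction.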

And your CUdeviceptr goes to the context creation function, if I understood your question correctly.
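
For anyone who wants to see the whole round trip, here is a minimal sketch under a few assumptions, not a definitive implementation. Instead of reading mangled names out of the cubin, it wraps one template instance in an extern "C" kernel so the name stays unmangled; that is a swapped-in alternative to the approach described above. It also launches with cuLaunchKernel, which newer toolkits provide, rather than the cuParamSet*/cuLaunchGrid calls from the question. In this sketch the CUdeviceptr values live on the host side only: they come from cuMemAlloc and are passed as kernel arguments, while the kernel itself just sees ordinary pointers. The names maskCopy, maskCopyFloat, runMaskCopy, CU_CHECK, demo.cu and demo.ptx are all hypothetical.

```cuda
// demo.cu (hypothetical file name).
// Device code for the driver API, assuming nvcc is available:
//   nvcc -ptx demo.cu -o demo.ptx
#include <cuda.h>
#include <cstdio>
#include <vector>

// Template body: copies iData[i] where the flag is non-zero, writes 0 elsewhere.
template <class T>
__device__ void maskCopy(T *oData, const unsigned int *iValidFlags,
                         const T *iData, unsigned int n)
{
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) oData[i] = iValidFlags[i] ? iData[i] : T(0);
}

// extern "C" wrapper: instantiates maskCopy<float> and keeps the entry name
// unmangled, so the host can ask for "maskCopyFloat" directly.
extern "C" __global__ void maskCopyFloat(float *oData,
                                         const unsigned int *iValidFlags,
                                         const float *iData, unsigned int n)
{
    maskCopy<float>(oData, iValidFlags, iData, n);
}

// Hypothetical helper macro; error paths skip cleanup for brevity.
#define CU_CHECK(call)                                                   \
    do {                                                                 \
        CUresult r_ = (call);                                            \
        if (r_ != CUDA_SUCCESS) {                                        \
            const char *name_ = nullptr;                                 \
            cuGetErrorName(r_, &name_);                                  \
            std::fprintf(stderr, "%s failed: %s\n", #call,               \
                         name_ ? name_ : "unknown");                     \
            return -1;                                                   \
        }                                                                \
    } while (0)

// Returns the number of mismatching elements, or -1 on a driver error.
// n must be greater than 0.
int runMaskCopy(const char *ptxPath, unsigned int n)
{
    std::vector<float> in(n), out(n, -1.0f);
    std::vector<unsigned int> flags(n);
    for (unsigned int i = 0; i < n; ++i) {
        in[i] = float(i) + 1.0f;
        flags[i] = i % 2;               // every second element is "valid"
    }

    CUdevice dev; CUcontext ctx; CUmodule mod; CUfunction fn;
    CU_CHECK(cuInit(0));
    CU_CHECK(cuDeviceGet(&dev, 0));
    CU_CHECK(cuDevicePrimaryCtxRetain(&ctx, dev));
    CU_CHECK(cuCtxSetCurrent(ctx));
    CU_CHECK(cuModuleLoad(&mod, ptxPath));
    CU_CHECK(cuModuleGetFunction(&fn, mod, "maskCopyFloat"));

    // CUdeviceptr values are host-side handles to device memory.
    CUdeviceptr dOut, dFlags, dIn;
    CU_CHECK(cuMemAlloc(&dOut, n * sizeof(float)));
    CU_CHECK(cuMemAlloc(&dFlags, n * sizeof(unsigned int)));
    CU_CHECK(cuMemAlloc(&dIn, n * sizeof(float)));
    CU_CHECK(cuMemcpyHtoD(dFlags, flags.data(), n * sizeof(unsigned int)));
    CU_CHECK(cuMemcpyHtoD(dIn, in.data(), n * sizeof(float)));

    // One pointer per kernel parameter, in declaration order.
    void *args[] = { &dOut, &dFlags, &dIn, &n };
    unsigned int threads = 128;
    unsigned int blocks = (n + threads - 1) / threads;
    CU_CHECK(cuLaunchKernel(fn, blocks, 1, 1, threads, 1, 1,
                            0, nullptr, args, nullptr));
    CU_CHECK(cuCtxSynchronize());
    CU_CHECK(cuMemcpyDtoH(out.data(), dOut, n * sizeof(float)));

    int mismatches = 0;
    for (unsigned int i = 0; i < n; ++i) {
        float expected = flags[i] ? in[i] : 0.0f;
        if (out[i] != expected) ++mismatches;
    }

    CU_CHECK(cuMemFree(dOut));
    CU_CHECK(cuMemFree(dFlags));
    CU_CHECK(cuMemFree(dIn));
    CU_CHECK(cuModuleUnload(mod));
    CU_CHECK(cuDevicePrimaryCtxRelease(dev));
    return mismatches;
}
```

To try it, add a main() that calls runMaskCopy("demo.ptx", 1000), build the host program with nvcc and link it against the driver library (-lcuda). A return value of 0 means every element matched. If you need several template instances, one wrapper per instance keeps each name predictable; otherwise the mangled-name lookup described above works just as well.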