CUDA in a template header

Hello,

Just a simple question: is it possible to call a CUDA kernel inside a template member function of a class?

template <typename Tin, typename Tout>
void apply(Tin TImIn, Tout TImOut, int w, int h)
{
    dim3 blocks(ceil((float)iwidth_ / (BLOCK_SIZE_X)), ceil((float)iheight_ / (BLOCK_SIZE_Y)));
    dim3 threads(BLOCK_SIZE_X, BLOCK_SIZE_Y);

    if (iwidth_ == 1 && iheight_ != 1)
    {
        int isharedMemSize = ((BLOCK_SIZE_X) * (BLOCK_SIZE_Y + iheight_) + iheight_) * sizeof(float);
        // The kernel launch below is the line that fails to compile (ERROR):
        // SepRowkernelK<<<blocks,threads,isharedMemSize>>>(TImIn,TImOut,w,h,fKernelTab_Device_,iheight_);
        checkCudaErrors(cudaGetLastError());
    }
}

The code above doesn't seem to work properly.

It should be possible.

In fact, the kernel launch doesn't compile. It gives me this error:

SepRowkernelK<<<u_int8_t,float>>><<<blocks,threads,isharedMemSize>>>(TImIn,TImOut,w,h,fKernelTab_Device_,iheight_);
                                ^
../cuda/KernelFilter.h:79:41: error: expected primary-expression before ‘,’ token
                 SepRowkernelK<<<u_int8_t,float>>><<<blocks,threads,isharedMemSize>>>(TImIn,TImOut,w,h,fKernelTab_Device_,iheight_);
                                         ^
../cuda/KernelFilter.h:79:42: error: expected primary-expression before ‘float’
                 SepRowkernelK<<<u_int8_t,float>>><<<blocks,threads,isharedMemSize>>>(TImIn,TImOut,w,h,fKernelTab_Device_,iheight_);

Are you compiling this code in a .cu file using nvcc?

You should provide a short, complete example, if you want help.

Templates should live in headers, those headers should be included in source files, and those source files (.cu) should be compiled with nvcc, which is able to build code arranged that way. Keep in mind that a template only gives the compiler the information it needs to generate code later. Nothing actually gets "built" until you instantiate it, I guess (I'm butchering this, I know; baby dev still!).
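To illustrate that arrangement, here is a minimal sketch, assuming everything below is compiled by nvcc as part of a .cu file. The names `convertKernel`, `Filter`, and `runDemo` are hypothetical stand-ins and not the poster's actual code. The point is only that a template member function can launch a templated kernel, with a single `<...>` for the template arguments followed by `<<<...>>>` for the launch configuration.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel (not SepRowkernelK): converts each pixel from Tin to Tout.
template <typename Tin, typename Tout>
__global__ void convertKernel(const Tin* in, Tout* out, int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < w && y < h)
        out[y * w + x] = static_cast<Tout>(in[y * w + x]);
}

// Hypothetical class with a template member function that launches the kernel.
struct Filter {
    template <typename Tin, typename Tout>
    cudaError_t apply(const Tin* in, Tout* out, int w, int h)
    {
        dim3 threads(16, 16);
        dim3 blocks((w + threads.x - 1) / threads.x,
                    (h + threads.y - 1) / threads.y);
        // Template arguments use one pair of angle brackets,
        // the launch configuration uses the triple brackets.
        convertKernel<Tin, Tout><<<blocks, threads>>>(in, out, w, h);
        return cudaGetLastError();
    }
};

// Small host-side check: copy data to the device, run apply, compare results.
bool runDemo()
{
    const int w = 20, h = 10, n = w * h;
    unsigned char hostIn[n];
    float hostOut[n];
    for (int i = 0; i < n; ++i) hostIn[i] = static_cast<unsigned char>(i);

    unsigned char* devIn = nullptr;
    float* devOut = nullptr;
    if (cudaMalloc(&devIn, n * sizeof(unsigned char)) != cudaSuccess) return false;
    if (cudaMalloc(&devOut, n * sizeof(float)) != cudaSuccess) {
        cudaFree(devIn);
        return false;
    }
    cudaMemcpy(devIn, hostIn, n * sizeof(unsigned char), cudaMemcpyHostToDevice);

    Filter f;
    bool ok = f.apply(devIn, devOut, w, h) == cudaSuccess;
    ok = ok && cudaMemcpy(hostOut, devOut, n * sizeof(float),
                          cudaMemcpyDeviceToHost) == cudaSuccess;
    for (int i = 0; ok && i < n; ++i)
        ok = (hostOut[i] == static_cast<float>(hostIn[i]));

    cudaFree(devIn);
    cudaFree(devOut);
    return ok;
}
```

If the same header were included from a .cpp file built by the host compiler alone, the `<<<...>>>` line would not parse, which would be consistent with the g++-style "expected primary-expression" messages in this thread.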

SepRowkernelK<<<u_int8_t,float>>><<<blocks,threads,isharedMemSize>>>

It looks like you have waaaaay too many < and >. Template arguments only need a single < and >. Kernel launch configurations require three (<<< and >>>).

Sorry about the extra <<< and >>>, that was a mistake. Here is the real error:

/KernelFilter.h:80:48: error: expected primary-expression before ‘<’ token
                 SepRowkernelK<u_int8_t,float><<<blocks,threads,isharedMemSize>>>(TImIn,TImOut,w,h,fKernelTab_Device_,iheight_);

I decided to move the function into the .cu file. Now the KernelFilter class with the template compiles properly, but when I use the class I get this:

undefined reference to « void gpu::KernelFilter::apply<unsigned char*, float*>(unsigned char*, float*, int, int) »

for this line:

KGaussianX->apply(ptSrc_Device,f_ptImageTmp1_Device_,iwidth_,iheight_);
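A likely cause, assuming the definition of `apply` now lives only in the .cu file while the caller sees just the declaration, is that no instantiation for `<unsigned char*, float*>` is ever emitted, so the linker has nothing to resolve the call against. One common remedy is an explicit instantiation in the .cu file. The sketch below shows the idea in a single file, with comments marking where each part would go. `copyKernel` and `callApply` are hypothetical, and `KernelFilter` here is a simplified stand-in for the real class.

```cuda
#include <cuda_runtime.h>

// ---- KernelFilter.h: declaration only, so a plain .cpp file can include it ----
namespace gpu {
class KernelFilter {  // simplified stand-in for the class in this thread
public:
    template <typename Tin, typename Tout>
    void apply(Tin in, Tout out, int w, int h);
};
}  // namespace gpu

// ---- KernelFilter.cu: definition, compiled by nvcc ----
namespace gpu {

// Hypothetical kernel standing in for SepRowkernelK: element-wise copy.
template <typename Tin, typename Tout>
__global__ void copyKernel(Tin in, Tout out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];
}

template <typename Tin, typename Tout>
void KernelFilter::apply(Tin in, Tout out, int w, int h)
{
    int n = w * h;
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    copyKernel<<<blocks, threads>>>(in, out, n);
}

// Explicit instantiation: emits the exact symbol named in the linker error.
template void KernelFilter::apply<unsigned char*, float*>(unsigned char*, float*, int, int);

}  // namespace gpu

// ---- caller: would normally sit in another file that includes only the header ----
bool callApply()
{
    const int w = 8, h = 4, n = w * h;
    unsigned char hostIn[n];
    float hostOut[n];
    for (int i = 0; i < n; ++i) hostIn[i] = static_cast<unsigned char>(i);

    unsigned char* devIn = nullptr;
    float* devOut = nullptr;
    if (cudaMalloc(&devIn, n * sizeof(unsigned char)) != cudaSuccess) return false;
    if (cudaMalloc(&devOut, n * sizeof(float)) != cudaSuccess) {
        cudaFree(devIn);
        return false;
    }
    cudaMemcpy(devIn, hostIn, n * sizeof(unsigned char), cudaMemcpyHostToDevice);

    gpu::KernelFilter filter;
    filter.apply(devIn, devOut, w, h);  // deduces <unsigned char*, float*>
    bool ok = cudaGetLastError() == cudaSuccess;
    ok = ok && cudaMemcpy(hostOut, devOut, n * sizeof(float),
                          cudaMemcpyDeviceToHost) == cudaSuccess;
    for (int i = 0; ok && i < n; ++i)
        ok = (hostOut[i] == static_cast<float>(hostIn[i]));

    cudaFree(devIn);
    cudaFree(devOut);
    return ok;
}
```

The trade-off is that every type combination used by callers needs its own instantiation line in the .cu file. The alternative is to keep the definition in the header and make sure every file that includes it is compiled by nvcc.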