Compile fails with CUDA 2.2? Possible problem

I just upgraded to CUDA 2.2 today (I know, it's kind of late), and my code, which compiled with CUDA 2.1 with no problems, no longer compiles with the new CUDA 2.2.

I get strange errors in the cudafe1.stub.c file. They appear at the places where I use a templated kernel with template specializations for specific cases.


Host code

dim3 grid (N/32, N/32);

dim3 threads (16,16,1);

if (n == 3)

 my_kernel <3> <<<grid, threads>>> (d_odata, d_idata);

else if (n == 4)

 my_kernel <4> <<<grid, threads>>> (d_odata, d_idata);



Device code

template <int> __global__ void my_kernel ( float *odata, float *idata);

template<> __global__ void my_kernel<3>( float *odata, float *idata)



template<> __global__ void my_kernel<4>( float *odata, float *idata)




The error:


/tmp/tmpxft_00002821_00000000-1_my_kernel.cudafe1.stub.c:17: error: expected initializer before ‘<’ token

/tmp/tmpxft_00002821_00000000-1_my_kernel.cudafe1.stub.c:29: error: expected initializer before ‘<’ token


I have also created a simple test case that reproduces this error (files attached).

The files compile without any issues with nvcc 2.1, but produce the error described above with nvcc 2.2.

Can anyone please help me figure out why this is happening? Is it a problem with my code (not likely, since it compiled cleanly with 2.1) or a problem in CUDA 2.2?

Thanks in advance.

P.S.: Please rename the attached *.txt files to *.cu; the uploader didn't allow me to upload *.cu files.
template_test.txt (1.92 KB)
my_kernel.txt (505 Bytes)

Could this be a bug in CUDA 2.2, or am I wrong somewhere in my code?

Any insights from anybody?

This is a known bug in 2.2 that is fixed in the 2.3 release.
Until then, giving the primary template declaration (the prototype) an
(empty) implementation will make it compile correctly again.
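
For anyone hitting the same error, here is a minimal sketch of that workaround. The kernel bodies, the template parameter name N, the array size, and the launch configuration are made up for illustration (the attached files are not reproduced here); only the names my_kernel, odata, idata, d_odata and d_idata come from the post above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Primary template: given an empty body instead of being only declared.
// This is the workaround suggested for nvcc 2.2.
template <int N>
__global__ void my_kernel(float *odata, float *idata)
{
}

// Specialization for N == 3 (hypothetical body: scale the input by 3).
template <>
__global__ void my_kernel<3>(float *odata, float *idata)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    odata[i] = 3.0f * idata[i];
}

// Specialization for N == 4 (hypothetical body: scale the input by 4).
template <>
__global__ void my_kernel<4>(float *odata, float *idata)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    odata[i] = 4.0f * idata[i];
}

int main()
{
    const int count = 256;                 // assumed size, multiple of 64
    const size_t bytes = count * sizeof(float);
    float h_idata[count], h_odata[count];
    for (int i = 0; i < count; ++i)
        h_idata[i] = 1.0f;

    float *d_idata = 0, *d_odata = 0;
    cudaMalloc((void **)&d_idata, bytes);
    cudaMalloc((void **)&d_odata, bytes);
    cudaMemcpy(d_idata, h_idata, bytes, cudaMemcpyHostToDevice);

    // Runtime value selects which compile-time specialization to launch.
    int n = 3;
    if (n == 3)
        my_kernel<3><<<count / 64, 64>>>(d_odata, d_idata);
    else if (n == 4)
        my_kernel<4><<<count / 64, 64>>>(d_odata, d_idata);

    cudaMemcpy(h_odata, d_odata, bytes, cudaMemcpyDeviceToHost);
    printf("odata[0] = %f\n", h_odata[0]);

    cudaFree(d_idata);
    cudaFree(d_odata);
    return 0;
}
```

The only change relative to the failing layout is the `{ }` on the primary template; the specializations and the host-side dispatch stay as they were.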

Yup, it works :)

Thanks a ton.

So we have to wait until 2.3 for this to be fixed?