CUDA and partial template specialization

Testi56 · March 22, 2012, 3:25pm

Hi,

I’m having trouble compiling a device function that uses partial template specialization.

Here is the code:

template<typename T,int shared_mem_size,int shared_mem_size_it>
struct warperMax2;

template<typename T,int shared_mem_size>
struct warperMax2<T,shared_mem_size,0> {
    static __device__ void warp_reduce() {
        return;
    }
};

template<typename T,int shared_mem_size,int shared_mem_size_it>
struct warperMax2 {
    static __device__ void warp_reduce(T smem[shared_mem_size]) {
        smem[threadIdx.x] = smem[threadIdx.x+shared_mem_size_it/2] > smem[threadIdx.x] ?
                    smem[threadIdx.x+shared_mem_size/2] : smem[threadIdx.x];
        __syncthreads();
        warperMax2<T,shared_mem_size,(shared_mem_size_it/2)>::warp_reduce(smem);
    }
};

template<typename T,int shared_mem_size,int shared_mem_size_it>
__device__ void warp_reduce_max3( T smem[shared_mem_size]){
    warperMax2<T,shared_mem_size,shared_mem_size_it>::warp_reduce(smem);
}

The compiler shows the following error when I use warp_reduce_max3 inside a kernel:

error : too many arguments

for the line:

warperMax2<T,shared_mem_size,(shared_mem_size_it/2)>::warp_reduce(smem);

I ran some tests with similar host functions instead of device functions, and they worked.

Does anyone have an idea how to resolve the problem?

Testi
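
A likely cause, judging only from the code posted above: the terminating specialization `warperMax2<T,shared_mem_size,0>` declares `warp_reduce()` with no parameters, while the recursive call always passes `smem`. Once `shared_mem_size_it/2` reaches 0, that call resolves to the zero-parameter version, which would explain "too many arguments". Below is a minimal host-side C++ sketch of the same recursion with a matching base-case signature. The names `WarperMaxSketch` and `warp_reduce_max_sketch` are hypothetical, and the inner loop over `tid` is only a stand-in for the threads so the sketch can be built without a GPU.

```cpp
#include <cassert>

// Primary template, declared first so the specialization below can refer to it.
template <typename T, int size, int step>
struct WarperMaxSketch;

// Terminating case (step == 0). It takes the SAME parameter list as the
// general case, so the recursive call warp_reduce(smem) still matches.
template <typename T, int size>
struct WarperMaxSketch<T, size, 0> {
    static void warp_reduce(T* /*smem*/) {}
};

// General case: one tree-reduction step with stride step/2, then recurse.
template <typename T, int size, int step>
struct WarperMaxSketch {
    static void warp_reduce(T* smem) {
        static_assert(step <= size, "step must not exceed the buffer size");
        constexpr int half = step / 2;
        // Host stand-in for the threads (assumption): in device code, tid
        // would be threadIdx.x and a __syncthreads() would follow this step.
        for (int tid = 0; tid < half; ++tid) {
            if (smem[tid + half] > smem[tid]) {
                smem[tid] = smem[tid + half];
            }
        }
        WarperMaxSketch<T, size, half>::warp_reduce(smem);
    }
};

// Entry point mirroring warp_reduce_max3 from the question.
template <typename T, int size, int step>
void warp_reduce_max_sketch(T* smem) {
    assert(smem != nullptr);
    WarperMaxSketch<T, size, step>::warp_reduce(smem);
}
```

With a power-of-two `step`, the maximum of the first `step` elements ends up in `smem[0]`. In the device version, the equivalent change would be to give the specialization the signature `static __device__ void warp_reduce(T smem[shared_mem_size])` with an empty body. Separately, the posted code compares against `smem[threadIdx.x+shared_mem_size_it/2]` but selects `smem[threadIdx.x+shared_mem_size/2]`; if that is not intentional, both offsets probably should use `shared_mem_size_it/2`.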
