Access to CUDA library functions inside specialized instantiations of __device__ function templates

I have the following template device function:

template<typename T>
__device__ void MyatomicAdd(T *address, T val){
    atomicAdd(address, val);
}

that compiles just fine, but obviously there is no atomicAdd() for doubles, so I want to specialize MyatomicAdd() to allow a hand-coded implementation in double precision. Ignoring the double-precision specialization for now, the template and the single-precision specialization look like this:

template<typename T>
__device__ void MyatomicAdd(T *address, T val){
}

template<>
__device__ void MyatomicAdd<float>(float *address, float val){
    atomicAdd(address, val);
}

Now the compiler complains that atomicAdd() is undefined in my specialization. Any help? Thanks.
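
For reference, the hand-coded double-precision specialization I'm aiming for would be roughly the CAS-loop approach from the CUDA C Programming Guide:

template<>
__device__ void MyatomicAdd<double>(double *address, double val){
    // Reinterpret the double as a 64-bit integer and retry with atomicCAS
    // until no other thread has changed the value in between.
    unsigned long long int *address_as_ull = (unsigned long long int *)address;
    unsigned long long int old = *address_as_ull, assumed;
    do {
        assumed = old;
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val + __longlong_as_double(assumed)));
    } while (assumed != old);
}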

You are probably compiling for compute capability 1.x, where atomicAdd() cannot operate on float variables.

This is likely, as compute capability 1.0 is the default. You also see the error if you compile for multiple compute capabilities and one of them is 1.x.
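
For example, building explicitly for a 2.x (or newer) target keeps the 1.x code path out of the build entirely (file names here are just placeholders):

nvcc -arch=sm_20 -o myprog myprog.cu

If you build a fat binary with several -gencode options, make sure none of them names a compute_1x architecture.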

Sadly not; I’m compiling for compute capability 3.0. The first template compiles and runs just fine when it is instantiated with T as a float. It’s only when I specialize it (as in the second example) that the compiler complains that atomicAdd() is undefined.

I’m still not having any luck with this. I put the above code into a small test program and it does compile and run, so the issue must be somewhere else in my program. In case anyone has dealt with something similar before: my .cu file contains a template kernel and a couple of template device functions. These are then #include-d in a header file containing the definitions of several C++ template functions (one of which calls the kernel in the .cu file) belonging to a template class. Without any specialization of the templates in the .cu file, the code compiles and runs correctly.
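
To make the layout concrete, it is roughly like this (names invented, just to show the structure, with the float specialization in place):

// kernels.cu -- template kernel plus the MyatomicAdd template and its float specialization
template<typename T>
__device__ void MyatomicAdd(T *address, T val){
}

template<>
__device__ void MyatomicAdd<float>(float *address, float val){
    atomicAdd(address, val);
}

template<typename T>
__global__ void accumulateKernel(T *sum, const T *data, int n){
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) MyatomicAdd(sum, data[i]);
}

// myclass.h -- template class whose member functions launch the kernel
#include "kernels.cu"

template<typename T>
class MyClass {
public:
    void accumulate(T *d_sum, const T *d_data, int n){
        accumulateKernel<<<(n + 255) / 256, 256>>>(d_sum, d_data, n);
    }
};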