Hi,
Does anyone have a suggestion for getting floating point literals to be the right type in code that is templated based on the floating point precision? E.g.:
template <typename T>
T add2(T x)
{
return x + 2.; // but with 2. for T=double, 2.f for T=float
}
Is there any way to accomplish this without writing the code twice? Since precision has such important performance implications, it would be nice to have some sort of nvcc extension, e.g. 2._T or something.
Note: The PTX guide says that literals are always represented in 64-bit double-precision format, but for the operation (type T)*s, where s is the literal shown, I get:
[list=i]
[*]For T=float, s=3.3333333333333333f
mov.f32 %f7, 0f40555555; // 3.33333
[*]For T=double, s=3.3333333333333333f
mov.f64 %fd3, 0d400aaaaaa0000000; // 3.33333
[*]For T=float, s=3.3333333333333333
mov.f64 %fd2, 0d400aaaaaaaaaaaab; // 3.33333
[*]For T=double, s=3.3333333333333333
mov.f64 %fd3, 0d400aaaaaaaaaaaab; // 3.33333
[/list]
So it seems I only get pure 32-bit operations with T=float and the ‘f’ suffix, and accurate 64-bit operations with T=double and no suffix. :(
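For reference, one common workaround is to write each literal in double precision and cast it to T explicitly, so the arithmetic runs in T. Below is a minimal sketch of that idea, not an nvcc feature. The helper name lit and the function scale are hypothetical and only for illustration. It is written as plain host C++; under nvcc the functions would presumably also need __host__ __device__ qualifiers. A cast on a literal is normally constant-folded, but that is an assumption worth confirming in the generated PTX.

```cpp
#include <type_traits>

// Hypothetical helper (not part of CUDA or the standard library):
// takes a double-precision literal and converts it to T.
template <typename T>
constexpr T lit(double v)
{
    static_assert(std::is_floating_point<T>::value,
                  "T must be a floating-point type");
    return static_cast<T>(v);
}

template <typename T>
T add2(T x)
{
    // Both operands have type T, so no promotion to double when T=float.
    return x + lit<T>(2.0);
}

// Hypothetical counterpart of the (type T)*s experiment above.
template <typename T>
T scale(T x)
{
    // The literal keeps full double precision when T=double
    // and is rounded to float when T=float.
    return x * lit<T>(3.3333333333333333);
}
```

With this, add2(1.0f) is evaluated entirely in float and add2(1.0) entirely in double, from a single definition. Writing static_cast<T>(2.0) or T(2.0) inline does the same thing without the helper.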