On tackling floating-point precision issues in CUDA

I am not aware of an fmad() intrinsic in CUDA. I would suggest using the C++ standard functions fma() and fmaf() as needed, as such code should be portable between host and device. Occasionally, explicit use of the device intrinsics __fma_rn() and __fmaf_rn() may be useful.
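
As a minimal sketch (my own illustration, not part of the original answer), the snippet below shows both approaches side by side: fmaf() called in device code exactly as it would be on the host, and the device-only intrinsic __fmaf_rn(), which requests a single-precision FMA with round-to-nearest-even. The kernel name and the operand values are just placeholders, and error checking is omitted for brevity.

```cpp
#include <cstdio>
#include <cmath>

__global__ void fma_kernel (float a, float b, float c, float *result)
{
    result[0] = fmaf (a, b, c);        // standard math function, also usable in host code
    result[1] = __fmaf_rn (a, b, c);   // device-only intrinsic, round to nearest even
}

int main (void)
{
    float h_result[2], *d_result = 0;
    cudaMalloc (&d_result, sizeof (h_result));
    fma_kernel<<<1,1>>> (3.0f, 5.0f, 7.0f, d_result);
    cudaMemcpy (h_result, d_result, sizeof (h_result), cudaMemcpyDeviceToHost);
    printf ("device fmaf()      = %g\n", h_result[0]);
    printf ("device __fmaf_rn() = %g\n", h_result[1]);
    printf ("host   fmaf()      = %g\n", fmaf (3.0f, 5.0f, 7.0f)); // same standard function on the host
    cudaFree (d_result);
    return 0;
}
```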

I concur that educating oneself about the advantages of the fused multiply-add operation is highly recommended in general, as it can be a powerful tool in numerical codes: it computes a*b+c with a single rounding at the end, so the intermediate product is not rounded. Knowledge about it is not as widespread among programmers as it should be, given that “all” modern processor architectures (both CPUs and GPUs) support the operation in hardware.
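
To make that advantage concrete, here is a small sketch (again my own illustration, not from the original discussion) of the single-rounding property. Because fmaf(a, b, c) rounds only once, fmaf(a, a, -p) recovers the exact rounding error of the product p = a*a, something a separate multiply followed by a subtract cannot do. The particular operand value is chosen only to make the residual easy to predict.

```cpp
#include <cstdio>
#include <cmath>

int main (void)
{
    float a = 1.000244140625f;        // 1 + 2^-12, exactly representable in float
    float p = a * a;                  // rounded product; exact value is 1 + 2^-11 + 2^-24
    float err = fmaf (a, a, -p);      // exact residual a*a - p, thanks to the single rounding
    printf ("p   = %.9g\n", p);
    printf ("err = %.9g\n", err);     // prints 5.96046448e-08, i.e. 2^-24
    return 0;
}
```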