On tackling floating-point precision issues in CUDA

Hi,

From reading the CUDA documentation, I have learned that floating-point values in CUDA also follow the IEEE 754 standard. Given that, I would like to ask whether I can tackle floating-point precision issues in CUDA the same way I do in C/C++.

Specifically, my questions are as follows.

  1. For the float and double types, are all values that are precisely representable in C/C++ (including zero) also precisely representable in CUDA?
  2. To compare floating-point variables, I use the following code in C/C++.
#include <cfloat>   // DBL_EPSILON
#include <cmath>    // fabs

// True when d lies within [-DBL_EPSILON, DBL_EPSILON].
bool isZero(double d){
    return d >= -DBL_EPSILON && d <= DBL_EPSILON;
}

// True when d1 and d2 differ by less than DBL_EPSILON (absolute difference).
bool isEqual(double d1, double d2){
    return fabs(d1 - d2) < DBL_EPSILON;
}

Is this code directly transferable to CUDA?

Thank you!

  1. Yes, and the representation is even the same!
  2. Yes, you can keep using that code on GPUs as well, if it has turned out to be sufficient for your needs in CPU code; see the sketch after this list.
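
For illustration, here is a minimal sketch (the kernel and its name are hypothetical, not from the CUDA toolkit) of how the very same helpers can be shared between host and device by marking them __host__ __device__:

#include <cfloat>
#include <cmath>

// The same comparison helpers, callable from both CPU code and kernels.
__host__ __device__ bool isZero(double d){
    return d >= -DBL_EPSILON && d <= DBL_EPSILON;
}

__host__ __device__ bool isEqual(double d1, double d2){
    return fabs(d1 - d2) < DBL_EPSILON;
}

// Hypothetical kernel using the helper device-side.
__global__ void markEqual(const double* a, const double* b, bool* eq, int n){
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) eq[i] = isEqual(a[i], b[i]);
}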

Generally, the fma() intrinsic (which compiles to a single instruction) is hugely useful. It seems to be not that well known among (CPU) programmers, as it has only been added to the x86 instruction set relatively recently.

Nvidia has produced a whole whitepaper about Floating Point and IEEE 754.

I am not aware of an fmad() intrinsic in CUDA. I would suggest using the C++ standard functions fma() and fmaf() as needed, as such code should be portable between host and device. Occasionally the specific use of device function intrinsics like __fma_rn() and __fmaf_rn() may be useful.

I concur that educating oneself about the advantages of the fused multiply-add operation is highly recommended in general, as it can be a powerful tool in numerical codes. Knowledge about it is not as widespread among programmers as it should be, given that “all” modern processor architectures (both CPUs and GPUs) support the operation in hardware.
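
To make the benefit concrete, here is a small sketch (compiled as a .cu file; the axpy helper and the demo values are my own choices) using the portable fma() from <cmath>. The final fma() call exposes the rounding error of a plain product, which no sequence of ordinary multiplies and adds can observe directly:

#include <cmath>
#include <cstdio>

// fma(a, x, y) computes a*x + y with a single rounding at the end,
// so the intermediate product is never rounded to working precision.
__host__ __device__ double axpy(double a, double x, double y){
    return fma(a, x, y);         // portable between host and device
    // return __fma_rn(a, x, y); // device-only intrinsic, round-to-nearest-even
}

int main(){
    // Classic demonstration: fma(a, b, -(a*b)) yields the exact rounding
    // error of the product a*b; computing a*b - a*b naively just gives 0.
    double a = 1.0 + ldexp(1.0, -27);
    double b = 1.0 - ldexp(1.0, -27);
    double err = fma(a, b, -(a * b));
    printf("rounding error of a*b: %g\n", err); // -2^-54, about -5.55e-17
    return 0;
}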

Thanks, Norbert, for pointing out my typo in style. I don’t think there was a big risk of being misunderstood, as I was already linking to the correctly spelled intrinsic, but I have of course corrected the typo nevertheless.

Your equality check is only really valid for numbers near zero. You’re not taking into account the relative size of the values, i.e. you should be looking at the absolute RELATIVE difference.

You should hence compare the following:

rd = |d1 - d2| / max(|d1|, |d2|) < EPSILON

where rd denotes the relative difference.

Let’s take some examples (fp32). Here FP32_EPSILON denotes the comparison tolerance in use, say 1e-5; note that FLT_EPSILON itself (about 1.19e-7) is smaller than every difference below, so a strict fabs(d1 - d2) < FLT_EPSILON test would already reject the first pair.

With your method we get:

d1=1.234567f
d2=1.234568f
=>
|1.234567 - 1.234568| = 0.000001 < FP32_EPSILON => OK! EQUAL!

These numbers are considered equal, great! Let’s multiply them by a large number, say 1000.

d1=1234.567f
d2=1234.568f
=>
|1234.567 - 1234.568| = 0.001 > FP32_EPSILON => NOT OK! NOT EQUAL!

So let’s agree that two equal numbers multiplied by, for example, 1000 should still be considered equal, OK? :)

Now let’s instead use the suggested definition:

d1=1.234567f
d2=1.234568f
=>
rd = |1.234567 - 1.234568|/max(|1.234567|,|1.234568|) = 0.000001 / 1.234568 = 0.00000081

=> 0.00000081 < FP32_EPSILON => OK! EQUAL!

And now AGAIN multiply by 1000.

d1=1234.567f
d2=1234.568f

=>
rd = |1234.567 - 1234.568|/max(|1234.567|,|1234.568|) = 0.001 / 1234.568 = 0.00000081
(just as before, the relative difference is the same)
=> 0.00000081 < FP32_EPSILON => OK! EQUAL!

So just add the additional code to check the relative difference instead of the absolute difference (and make sure to avoid division by zero as well…).
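
Here is a minimal sketch of such a check (the function name isEqualRel and the tolerance defaults are my own choices, not from any library); the absolute-tolerance guard handles values near zero and prevents the division by zero:

#include <cmath>

// Relative comparison with an absolute-tolerance guard near zero.
// relTol and absTol are application-specific choices, not universal constants.
__host__ __device__ bool isEqualRel(float d1, float d2,
                                    float relTol = 1e-5f,
                                    float absTol = 1e-12f){
    float diff = fabsf(d1 - d2);
    if (diff <= absTol)
        return true;  // near-zero case; also avoids dividing by zero below
    return diff / fmaxf(fabsf(d1), fabsf(d2)) < relTol;
}

With these defaults both pairs above, 1.234567f vs 1.234568f and 1234.567f vs 1234.568f, compare equal, since the relative difference is about 0.00000081 in both cases.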