floating point operations

hi everybody

i’m porting some C functions of my program to CUDA kernels.

When i compare the results calculated in these two different ways, to check whether the parallel kernels do their job correctly and both calculations match, i find discrepancies in the floating point operations (i’m working in single precision).
These errors are around 10^(-5) and i don’t think it’s a gpu fault, since i know it supports IEEE 754 (my graphics card is a GTX 295).

Assuming that my algorithm is right (i’m still checking everything…), is it normal to have this kind of error for some other reason?
Do they occur in some particular operations?

i call device functions from the kernel: just internal functions i had in C, adapted to CUDA and turned into device functions.
Can that influence anything?


Do you use something like exp, sincos, etc.?

Yes i do… is there any problem involving the SFUs?

There is no full IEEE 754 single precision conformance on the GT200. Double precision is IEEE 754 compliant. If you are seeing 5 or 6 decimal places of agreement in single precision, that is about as good as you can reasonably expect from single precision arithmetic anyway.

thanks, i didn’t know about this feature of GT200.

One last thing i’m not clear on: using double precision involves the SFUs instead of the SPs, doesn’t it?

are you sure about that??? and i’m using a GTX…

You need to read the programming guide; everything about precision is described there.

The GT200 is the name of the underlying GPU used in the GTX260/275/280/285/295, Tesla C1060/M1060/S1070, Quadro FX4800/5800 and Quadro Plex 2200 series. It has IEEE 754 compliant double precision, but not single precision. IEEE 754-2008 compliant single precision was only introduced with the new GF100 “Fermi” family of parts.

you’re right, i skipped the appendices, where everything about precision is explained.

from the table, it seems full IEEE 754 rounding is available for add and multiply only.

Thanks for your time

IEEE 754 compliance or not, you can never expect two different pieces of hardware to return exactly the same results. 754 compliance in Fermi means specified error bounds for some operations (that had non-754 bounds before) and some rounding improvements. Note that even the specification itself allows error ranges for some functions, so you have to expect different results.

The discrepancy you see, around 1e-5, is normal. Single precision floats can only represent about 6 significant decimal digits, and any kind of computation is likely to introduce errors in the least significant ones. Functions such as exp, sin, etc. will introduce bigger errors. You will see this kind of small difference even when running your code on the CPU with different compile options.

thanks for your complete explanation

now i feel lucky about the other cases in which i found a total match doing an if == comparison between floats
only the effects of sin made the results mismatch

== should never be used with floats, no matter the platform.
Always check whether the two values are within some epsilon of each other.
Always check if the two values are inside some epsilon.

Yes, of course i usually check within an epsilon.

But when i started to find errors where they shouldn’t be, i did some extra checks.

It’s also worth pointing out that IEEE 754 does not specify the precise rounding behavior of transcendental functions. (This is due to the “Table Maker’s Dilemma.”) Two IEEE 754 compliant implementations have to give the same answer for the same sequence of basic floating point operations, but once you throw a sin() in there, things can be different.