I’m porting some C functions from my program to CUDA kernels.
When I compare the results computed both ways, to check that the parallel kernels do their job and that the two calculations match, I find differences in the floating point results (I’m working in single precision).
These differences are around 10^(-5), and I don’t think it’s a GPU fault, since I know it supports IEEE 754 (my graphics card is a GTX 295).
Assuming that my algorithm is right (I’m still checking everything…), is it normal to get this kind of error for some other reason?
Does it occur in particular operations?
I call device functions from the kernel; they are just internal functions I had in C, adapted to CUDA and turned into device functions.
Can that influence anything?
The GT200 is not fully IEEE 754 compliant in single precision; only its double precision is IEEE 754 compliant. If you are seeing 5 or 6 decimal places of agreement in single precision, that is about as good as you can reasonably expect from single precision arithmetic anyway.
GT200 is the name of the underlying GPU used in the GTX260/275/280/285/295, Tesla C1060/M1060/S1070, Quadro FX4800/5800 and Quadro Plex 2200 series. It has IEEE 754 compliant double precision, but not single precision. IEEE 754-2008 compliant single precision was only introduced with the new GF100 “Fermi” family of parts.
IEEE 754 compliance or not, you can never expect two different pieces of hardware to return exactly the same results. 754 compliance in Fermi means specified error bounds for some operations (which had non-754 bounds before) and some changes to rounding. Note that even the specification itself allows for rounding error, so you have to expect different results.
The discrepancy you see, around 1e-5, is normal. Single precision floats can only represent about 6 to 7 significant decimal digits, and almost any computation is likely to introduce errors in the least significant ones. Functions such as exp, sin, etc. will introduce bigger errors. You will see this kind of small difference even when running your code on the CPU with different compile options.
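One reason results can shift between builds or devices is that single precision addition is not associative, so anything that reorders the operations (a parallel reduction, or a compiler that is allowed to reassociate) changes the rounding. Here is a minimal C sketch of that effect; `sum_in_order` is a hypothetical helper name, and the input values are deliberately extreme to make the difference visible:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: add n floats strictly left to right,
 * keeping the running total in single precision. */
float sum_in_order(const float *v, size_t n)
{
    assert(v != NULL || n == 0);
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        acc += v[i];  /* each addition is rounded to a representable float */
    }
    return acc;
}
```

On a platform that evaluates float arithmetic in single precision, summing {1e8f, 1.0f, -1e8f} gives 0, because the 1.0f is smaller than the spacing between floats near 1e8 and is rounded away, while summing the same three values as {1e8f, -1e8f, 1.0f} gives 1. Real kernels usually show much smaller differences, in the last digit or two.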
Now I feel lucky about the other cases, where I found an exact match using an == comparison between floats.
Only the results involving sin failed to match.
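Since an == comparison between floats only matches when every rounding step happens to agree, a tolerance check is the usual substitute when validating kernel output against the CPU version. A minimal sketch in C follows; `nearly_equal` and its two tolerance parameters are hypothetical names chosen for illustration, not part of CUDA or the C library:

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: true when a and b agree within abs_tol (useful near
 * zero) or within rel_tol relative to the larger of the two magnitudes. */
bool nearly_equal(float a, float b, float rel_tol, float abs_tol)
{
    assert(rel_tol >= 0.0f && abs_tol >= 0.0f);
    if (a == b) {
        return true;                 /* exact match, including equal infinities */
    }
    if (!isfinite(a) || !isfinite(b)) {
        return false;                /* NaN, or an infinity against anything else */
    }
    float diff = fabsf(a - b);
    if (diff <= abs_tol) {
        return true;
    }
    float larger = (fabsf(a) > fabsf(b)) ? fabsf(a) : fabsf(b);
    return diff <= rel_tol * larger;
}
```

For differences around 10^(-5) like the ones described above, a call such as nearly_equal(cpu_value, gpu_value, 1e-4f, 1e-6f) might be a reasonable starting point, but the right tolerances depend on the algorithm and on the magnitude of the values.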
It’s also worth pointing out that IEEE 754 does not specify the precise rounding behavior of transcendental functions. (This is due to the “Table Maker’s Dilemma.”) Two IEEE 754 compliant implementations have to give the same answer for the same sequence of basic floating point operations, but once you throw a sin() in there, things can be different.
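Accuracy of functions like sin() is normally stated in ULPs (units in the last place), so that is a natural unit for measuring how far a CPU result and a GPU result are apart. The following is a rough C sketch, assuming 32-bit IEEE 754 floats; `ordered_bits` and `ulp_distance` are hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Map a float's bit pattern to an integer that increases with the float's
 * value, so adjacent floats map to adjacent integers. */
static int64_t ordered_bits(float x)
{
    int32_t bits;
    memcpy(&bits, &x, sizeof bits);  /* well-defined way to read the bits */
    /* Negative floats are stored as sign plus magnitude; mirror them below zero. */
    return (bits < 0) ? (int64_t)INT32_MIN - bits : (int64_t)bits;
}

/* Hypothetical helper: number of representable floats between a and b.
 * NaN inputs are rejected; +0 and -0 count as the same value. */
static int64_t ulp_distance(float a, float b)
{
    assert(a == a && b == b);        /* NaN is the only value not equal to itself */
    int64_t d = ordered_bits(a) - ordered_bits(b);
    return (d < 0) ? -d : d;
}
```

A distance of 0 means the two results are bit-identical (apart from the sign of zero), and a distance of 1 or 2 between a CPU sinf result and the corresponding GPU result would be consistent with two implementations that simply round differently.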