I have the following expression evaluated on both the CPU and the GPU (so that I can compare the results):
(x + 0.00915003f) * (-2.15891f + 4.15652f) * ADF0(-6.04954f), where ADF0 is a function of one argument arg0 (written in C for clarity):
float ADF0(float arg0)
{
    return arg0 / (arg0 - arg0 * arg0 / arg0)
         - arg0 * (arg0 - arg0 * arg0 / (arg0 + arg0 - arg0));
}
x takes only one possible value: 10.0f. ADF0 is a degenerate function that should return infinity for any input value of arg0, because its first denominator, arg0 - arg0 * arg0 / arg0, reduces to zero.
Consequently, the correct result of the source expression is -1.#INF (and the CPU produces it successfully) for any x.
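For reference, here is a minimal standalone C sketch of the check above (ADF0 as defined earlier, plus the whole expression written directly; the function name EvalDirect is mine, just for illustration):

```c
#include <math.h>

float ADF0(float arg0)
{
    return arg0 / (arg0 - arg0 * arg0 / arg0)
         - arg0 * (arg0 - arg0 * arg0 / (arg0 + arg0 - arg0));
}

/* The source expression as a direct C expression, for checking the
   RPN evaluator against. */
float EvalDirect(float x)
{
    return (x + 0.00915003f) * (-2.15891f + 4.15652f) * ADF0(-6.04954f);
}
```

Compiled with strict single precision (SSE math, no x87 extended intermediates), EvalDirect(10.0f) gives negative infinity, which MSVC's printf renders as -1.#INF.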
However, the GPU returns -2.53664e+008.
It is worth noting that on both the CPU and the GPU the expression is parsed using Reverse Polish Notation rather than evaluated as a direct C expression (say, float fResult = (x + 0.00915003f) * (-2.15891f + 4.15652f) * ADF0(-6.04954f)). The expression is evaluated in a loop with pushes/pops to/from a stack and the corresponding math operations.
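The evaluation loop is equivalent to a small stack machine, roughly like the sketch below (the token names and layout are illustrative only, not my actual data structures; ADF0 is repeated so the sketch compiles on its own):

```c
#include <stddef.h>
#include <math.h>

float ADF0(float arg0)
{
    return arg0 / (arg0 - arg0 * arg0 / arg0)
         - arg0 * (arg0 - arg0 * arg0 / (arg0 + arg0 - arg0));
}

/* Token kinds for the postfix (RPN) stream. */
typedef enum { TOK_CONST, TOK_VAR_X, TOK_ADD, TOK_MUL, TOK_CALL_ADF0 } TokKind;
typedef struct { TokKind kind; float value; } Token;

/* Evaluate a postfix token stream with an explicit operand stack. */
float EvalRPN(const Token *toks, size_t n, float x)
{
    float stack[64];
    size_t top = 0;
    for (size_t i = 0; i < n; ++i) {
        switch (toks[i].kind) {
        case TOK_CONST:     stack[top++] = toks[i].value; break;
        case TOK_VAR_X:     stack[top++] = x;             break;
        case TOK_ADD:       --top; stack[top - 1] = stack[top - 1] + stack[top]; break;
        case TOK_MUL:       --top; stack[top - 1] = stack[top - 1] * stack[top]; break;
        case TOK_CALL_ADF0: stack[top - 1] = ADF0(stack[top - 1]); break;
        }
    }
    return stack[0];
}
```

In postfix the source expression reads: x 0.00915003 + -2.15891 4.15652 + * -6.04954 ADF0 *.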
On the GPU, FMADs are eliminated by using __fadd_rz and __fmul_rz; the loops are absolutely identical on the CPU and the GPU. This strange behaviour is not too frequent: I have lots of different expressions with lots of possible values, and such a significant difference occurs rarely, but it does occur.
This fact is very annoying, as I can't rely on the GPU 100%. A couple of times the result from the GPU was not obviously incorrect (not a huge or abnormally small number, but, say, 127), just a number that I could hardly identify as abnormal, while what the GPU should actually return is still infinity or NaN.
How is it possible to make sure that the GPU will work correctly? I'd like to point out that arguments like "floating point is not even associative, and the sequence of math operations plays its role" are correct but do not apply to my case: the sequence of operations is exactly the same on the CPU and the GPU, and FMADs are suppressed.
I’d really appreciate any help on this subject.
Thanks in advance,