GPU results are never quite as precise as CPU results, especially on single-precision devices. Is there any method to improve the precision of the results?
They are not the same, as is to be expected with floating-point calculation. There are numerous ways to improve the stability of the results; it really depends on what you are doing. For example, there is Kahan summation for reduction-style algorithms.
There is a number format called “double-single” precision, which uses two single-precision float variables to represent one number. Double-single numbers have only 48 bits of mantissa, rather than the 53 bits of a true double. A popular implementation of double-single arithmetic is the dsfun90 library. Some of this code has been ported to CUDA in the form of a header file:
Note that double-single arithmetic is very slow (much slower than true double-precision addition on GT200 devices). Adding two numbers takes more than 11 instructions.
If you want to reduce round-off error in sums of many single-precision numbers, there is also a trick called Kahan summation:
One addition with the Kahan summation algorithm takes 4 instructions, making it faster than CUDA double precision. The final answer will not have more precision than a single-precision float, but it will accumulate much less round-off error than a naive sum of floats.