I am working on a project that is very sensitive to floating-point precision, using a GPU with compute capability 1.3. I would like to know whether the floating-point control word can be set on a GPU as it can on a CPU. The main reason is that I want to know whether the FPUs on the GPU use standard double precision by default, or an extended precision like the one in Intel x86 processors. If extended precision is used, I hope I can set a flag to put the GPU into a round-to-double mode.
Like most architectures (legacy non-SSE x86 being the notable exception), NVIDIA GPUs have no precision control flag. They just have single-precision instructions that operate on single-precision data, and double-precision instructions that operate on double-precision data, which map directly to the C float and double datatypes.
No extended-precision format is supported, but there is a fused multiply-add (FMA) in double precision. Its use is enabled by default, and it can sometimes improve accuracy.
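As a minimal sketch (kernel and variable names are illustrative), here is the difference between an explicit fma() call and a plain product-sum that the compiler may or may not contract:

```
// Compare a plain product-sum with an explicit fused multiply-add.
__global__ void fma_demo(const double *a, const double *b,
                         const double *c, double *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // a*b + c may be contracted into a single FMA by the compiler
        // (the nvcc default; recent compilers let you disable
        // contraction with --fmad=false).
        double contracted = a[i] * b[i] + c[i];
        // fma() forces the fused operation: the product a*b is kept
        // exact before the addition, with a single rounding at the end.
        double fused = fma(a[i], b[i], c[i]);
        // The difference is zero wherever the compiler fused anyway.
        out[i] = fused - contracted;
    }
}
```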
Thanks for your nice reply. I have a further question. In the math library documentation, the maximum ULP error table says errors are measured against “a correctly rounded double-precision result”. I am not sure what “a correctly rounded double-precision result” is. Is it obtained from a CPU-based C math library? And why might a result not be correctly rounded, since the five basic operations have zero ULP error?
I find this confusing too, and I already complained about this table… I believe errors should be measured compared to the exact value.
The “correctly rounded result” means the floating-point number that would be obtained by applying the IEEE-754 rounding rules to the exact answer. Usually that boils down to returning the floating-point number that is closest to the exact answer.
The max rounding error compared to the exact result is 0.5 ulps in this case.
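A tiny host-side illustration (plain C, variable names are just for the example): 1/3 is not representable in binary, so the division returns the correctly rounded result, whose error is at most half an ulp:

```
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* 1/3 is not exactly representable in binary floating point, so
       the division returns the correctly rounded result: the double
       closest to the exact value. */
    double x = 1.0 / 3.0;

    /* One ulp at x is the gap to the next representable double. */
    double ulp = nextafter(x, INFINITY) - x;

    /* Correct rounding (round-to-nearest) guarantees
       |x - 1/3| <= ulp/2. */
    printf("x = %.17g, ulp(x) = %g\n", x, ulp);
    return 0;
}
```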
IEEE-754 does not require correct rounding for elementary functions (exp, log, sin, pow…), because it is very hard to achieve (both computationally and theoretically). So most C math libraries do not offer correct rounding, but are generally close (~0.501 ulps).
The CUDA library is a bit less accurate (literally!), but it should not be a problem in practice.
Actually, I also cannot find ULP descriptions for the C math library. So these CUDA errors should be compared with the correctly rounded exact results?
Yes. To NVIDIA’s credit, few math libraries document their error bounds at all. Confusing documentation is better than no documentation. :)
Since the correctly-rounded result has at most 0.5 ulps of error compared to the exact value in round-to-nearest, you can just add 0.5 to all the error bounds of the documentation and get an error bound relative to the exact value. But it may significantly overestimate the error.
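For example, if the table lists a maximum error of 2 ulps for some function relative to the correctly rounded result, the error relative to the exact value is at most 2 + 0.5 = 2.5 ulps.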
Hmm… actually I want to implement some high-precision algorithms on the GPU, and I want to know whether that is reasonable. The most important thing is that double precision on GPUs follows the IEEE-754 standard. However, I still have some concerns. Many high-precision functions (such as exp) are built on top of double-precision functions, so compared with the math functions in the C library, it looks like the GPU may produce larger errors. However, I also cannot find the ULP errors for the C math library in any documentation.
An error of a few ulps is small. For instance, 4 ulps means the 53-bit result is accurate to 51 bits. If the last two bits matter for your algorithm, then it’s probably unstable even on the CPU…
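If it helps, here is a minimal sketch of the usual error-free transformations that double-double arithmetic is built on (function names are illustrative); since GPU doubles follow IEEE-754 round-to-nearest, these behave the same as on the CPU:

```
// Knuth's TwoSum: compute s + e == a + b exactly, where s is the
// rounded sum and e is the rounding error.
__device__ void two_sum(double a, double b, double *s, double *e)
{
    *s = a + b;
    double bb = *s - a;   // the part of b that made it into s
    *e = (a - (*s - bb)) + (b - bb);
}

// With FMA, the product counterpart is a one-liner:
// p + e == a * b exactly (barring overflow/underflow).
__device__ void two_prod(double a, double b, double *p, double *e)
{
    *p = a * b;
    *e = fma(a, b, -*p);  // exact error of the rounded product
}
```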