Tackling floating-point differences caused by CUDA toolkits


The floating-point computation results generated with CUDA toolkit 8.0 and CUDA toolkit 10.2.89 for compute capability 6.1 are different.

There is a minor deviation in the floating-point results, which grows into a considerable difference when those float values are processed further.

I did not notice such a deviation between the floating-point results generated with CUDA toolkit 8.0 and CUDA toolkit 9.2.

Only with CUDA 10.2.89 do I see the deviation.

Why is there a deviation between the floating-point results from the CUDA 8.0 and CUDA 10.2.89 environments when executed with the same compute capability 6.1?

The release notes provided by NVIDIA do not contain details related to floating-point differences caused by different CUDA toolkits.

Is there any other documentation or information provided by NVIDIA related to this issue?

How can I tackle this issue?


I am using a Quadro P5000 card (compute capability 6.1) in the CUDA 8.0, CUDA 9.2, and CUDA 10.2.89 environments.

Choosing known source-code implementations of transcendental functions – such as those frequently posted by Norbert Juffa (njuffa) on this forum – could be a reasonable workaround for undocumented changes that NVIDIA has made to the official implementations.


Changes to optimizations affecting floating-point arithmetic are typically minor. There should be no expectation of bitwise-identical results across different versions of a toolchain, on any platform. Have you read NVIDIA’s floating-point whitepaper for background?

If relatively small changes in the toolchain’s handling of floating-point computation lead to significantly different final results, this is a pretty good indication that your software implementation lacks numerical stability, something you might want to investigate.

Orthogonal to that effort, in order to recommend mitigation steps, you would have to first narrow down which section of code is the root cause of the observed differences. Two common scenarios in the context of CUDA are: (1) compiler changes affecting contraction of FMUL followed by FADD into FMA (fused multiply-add); (2) accuracy improvements to transcendental functions in the standard math library.

If you use floating-point atomics, there is an indeterminate order of the operations, and because floating-point operations are generally not associative, results may differ. I am pretty sure that is spelled out in the whitepaper I mentioned.