If your code uses floating-point computation, the most likely reason is that an FADD that depends on the result of an FMUL will frequently get contracted into a single FMA (fused multiply-add) in a release build. The FMA rounds only once, after the addition, whereas the separate FMUL and FADD each round their result, so the two builds can differ in the low-order bits. To confirm that this explains the differences, you can turn off the contraction by building with -fmad=false. Since this typically has a negative impact on both accuracy and performance, you wouldn't want to use it for your production build, but it is useful for experiments.