Hello!
We are using the CUDA Math Library to study the numerical stability of its math APIs. To check correctness, we compare each CUDA math function with the corresponding C standard library math function. We have found cases where the C and CUDA results differ, apparently because of rounding.
We understand that CUDA and C may round differently. However, we would like to confirm whether these differences are really rounding errors and, if so, what their root cause is. Below are the functions and inputs that produced differing results:
acos - input: 0.0001590810570633039
C result: 1.5706372261047363
CUDA result: 1.5706373453140259
acosh - input: 2326705117069312.0
C result: 36.076377868652344
CUDA result: 36.07637405395508
asinh - input: -4.003921508789062
C result: -2.09566330909729
CUDA result: -2.095663070678711
atan - input: 191.99949645996094
C result: 1.5655879974365234
CUDA result: 1.565588116645813
atanh - input: -0.9530639052391052
C result: -1.864183783531189
CUDA result: -1.8641839027404785
cbrt - input: -3831.995849609375
C result: -15.648582458496094
CUDA result: -15.64858341217041
cosh - input: 0.125
C result: 1.0078226327896118
CUDA result: 1.0078227519989014
erfc - input: -0.00012207029794808477
C result: 1.0001376867294312
CUDA result: 1.0001378059387207
exp10 - input: 0.007812499534338713
C result: 1.0181517601013184
CUDA result: 1.0181516408920288
exp2 - input: 0.06152203306555748
C result: 1.043566107749939
CUDA result: 1.0435662269592285
expm1 - input: 0.9999998211860657
C result: 1.7182813882827759
CUDA result: 1.7182812690734863
j0 - input: 0.008056640625
C result: 0.9999837875366211
CUDA result: 0.9999839067459106
j1 - input: -8192.01953125
C result: 0.00786489900201559
CUDA result: 0.00786471646279096
lgamma - input: 2097664.0
C result: 28436630.0
CUDA result: 28436628.0
log - input: 3276800.25
C result: 15.0023775100708
CUDA result: 15.002378463745117
log10 - input: 25026078.0
C result: 7.398392677307129
CUDA result: 7.398393154144287
log1p - input: 458363.46875
C result: 13.035419464111328
CUDA result: 13.035420417785645
tan - input: 0.9999999403953552
C result: 1.5574074983596802
CUDA result: 1.5574076175689697
tgamma - input: 0.0390625
C result: 25.060091018676758
CUDA result: 25.06009292602539
y0f - input: 0.008666995912790298
C result: -3.096553087234497
CUDA result: -3.096553325653076
y1 - input: 0.12500381469726562
C result: -5.199782848358154
CUDA result: -5.199781894683838
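
To make the size of each gap easier to discuss, one option is to express it in units in the last place (ULPs) rather than as a decimal difference. The C helper below is only a sketch of that idea. It assumes the results listed above are single-precision (IEEE 754 binary32) values, which their spacing suggests but which is our assumption. The names ordered_bits, ulp_distance and print_ulp_report are hypothetical helpers written for this post, not part of CUDA or the C library.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE 754 binary32 value. */

/* Map a float's bit pattern onto an ordered integer line so that
   adjacent floats map to adjacent integers. Both +0.0 and -0.0 map to 0.
   NaN inputs are not handled in this sketch. */
int32_t ordered_bits(float x)
{
    int32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* well-defined way to read the bits */
    return (bits < 0) ? (int32_t)(INT32_MIN - bits) : bits;
}

/* Number of representable floats between a and b (0 means identical). */
int64_t ulp_distance(float a, float b)
{
    int64_t ia = ordered_bits(a);
    int64_t ib = ordered_bits(b);
    return llabs(ia - ib);
}

/* Print one line comparing a C result with a CUDA result. */
void print_ulp_report(const char *name, float c_result, float cuda_result)
{
    printf("%-8s C=%.9g  CUDA=%.9g  distance=%lld ULP\n",
           name, c_result, cuda_result,
           (long long)ulp_distance(c_result, cuda_result));
}
```

For example, calling print_ulp_report("acos", 1.5706372261047363f, 1.5706373453140259f) from main reports a distance of 1 ULP for the acos pair above. A distance of 1 or 2 would be consistent with ordinary rounding differences between two implementations, while a much larger distance would point to something else.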