The GPU hardware operations follow IEEE 754-2008 with respect to the handling of special cases. The CUDA standard math functions follow the requirements of ISO C99 in this regard (the CUDA math library predates the extension of the C++ standard math library, so C99 was the only model available when CUDA functionality was defined; C++ later simply adopted the C99 specifications). Special case handling for CUDA math functions that are not part of the standard C/C++ math library is specified by analogy with the standard functions, e.g. norm3d() is based on hypot().

Defined special case handling for all operations is covered by tests, and I do not recall any bugs ever being reported against this functionality.

Obviously, if you turn on FTZ mode or use device intrinsics instead of the standard library functions, speed is prioritized over special case handling, so you “get what you get”. Note that -use_fast_math implies both use of FTZ and device intrinsics.

As far as additions and subtractions are concerned, NaN will result for (+INF) + (-INF), (-INF) + (+INF), (+INF) - (+INF), (-INF) - (-INF). For multiplications, (+/-INF) * (+/-0) will result in NaN; for divisions, (+/-INF) / (+/-INF) and (+/-0) / (+/-0) will.
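The cases listed above can be checked with a few lines of code. The following is a minimal host-side C++ sketch, not device code: it assumes the host compiler uses IEEE 754 arithmetic and is not run with fast-math style options. The names `InvalidOps` and `invalid_operations` are made up for this illustration.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper type: holds the results of the invalid operations
// listed above. Host-side illustration only; assumes IEEE 754 arithmetic.
struct InvalidOps {
    double inf_minus_inf;     // (+INF) - (+INF)
    double inf_plus_neg_inf;  // (+INF) + (-INF)
    double inf_times_zero;    // (+INF) * (+0)
    double inf_over_inf;      // (+INF) / (+INF)
    double zero_over_zero;    // (+0) / (+0)
};

InvalidOps invalid_operations() {
    // volatile discourages the compiler from evaluating these at compile time
    volatile double inf = std::numeric_limits<double>::infinity();
    volatile double zero = 0.0;

    InvalidOps r;
    r.inf_minus_inf = inf - inf;
    r.inf_plus_neg_inf = inf + (-inf);
    r.inf_times_zero = inf * zero;
    r.inf_over_inf = inf / inf;
    r.zero_over_zero = zero / zero;
    return r;
}
```

Passing each field to std::isnan() should return true on a conforming platform. The same expressions inside a CUDA kernel are expected to behave identically as long as fast-math options are not in effect.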

Note that the question on Stack Overflow involves calls to pow(). That standard math function has the largest number of special cases of any math function; I seem to recall even more than atan2(). The ISO C standard, or one of its final drafts that are freely available on the internet, enumerates these cases in detail.
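To give a flavor of those special cases, here is a small host-side C++ sketch that checks a handful of the pow() rules from Annex F of C99. It is only a sample, not the full list, and it assumes a host math library that implements Annex F. The function name `pow_special_cases_hold` is invented for this example.

```cpp
#include <cmath>
#include <limits>

// Checks a few of the C99 Annex F special cases for pow().
// Host-side illustration only; assumes an Annex F conforming math library.
bool pow_special_cases_hold() {
    const double inf = std::numeric_limits<double>::infinity();
    const double nan = std::numeric_limits<double>::quiet_NaN();

    bool ok = true;
    ok = ok && (std::pow(nan, 0.0) == 1.0);            // pow(x, +/-0) is 1, even for NaN x
    ok = ok && (std::pow(1.0, nan) == 1.0);            // pow(+1, y) is 1, even for NaN y
    ok = ok && (std::pow(-1.0, inf) == 1.0);           // pow(-1, +/-INF) is 1
    ok = ok && (std::pow(0.0, -1.0) == inf);           // pow(+0, negative odd integer) is +INF
    ok = ok && std::isnan(std::pow(-8.0, 1.0 / 3.0));  // negative base, non-integer exponent
    return ok;
}
```

The first two cases tend to surprise people, since most other functions return NaN whenever any argument is NaN.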

FWIW, that Stack Overflow question is only the second time in my life I have seen someone call pow (2.71828183, x) instead of exp (x). Kids, don’t try this at home!
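One reason to prefer exp(), sketched here under the assumption of ordinary double-precision host math: the literal 2.71828183 is only e rounded to nine digits, and raising a slightly wrong base to the power x magnifies the relative error roughly in proportion to |x|. The helper names below are made up for this illustration.

```cpp
#include <cmath>

// Hypothetical helper: the pattern from the question, with e typed in by hand.
double exp_via_pow(double x) {
    return std::pow(2.71828183, x);
}

// Absolute difference between the hand-rolled version and the library exp().
double abs_error_vs_exp(double x) {
    return std::fabs(exp_via_pow(x) - std::exp(x));
}
```

Evaluating abs_error_vs_exp() for a moderately large argument such as 10.0 gives a difference far above double-precision rounding error, on top of pow() typically being the slower function.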