On other programming platforms the documentation for this function (introduced with C99) is more explicit:
This function is used infrequently, and is therefore unlikely to be fast. In CUDA, however, the single-precision version, nanf(), is fast: only one canonical NaN is supported for single precision (there is no NaN payload; this is compliant with IEEE Std 754-2008), so that canonical QNaN is returned regardless of the input.
One typically uses nan() to specify a NaN payload in the form of a small integer, which allows one to track the origin of NaNs to some degree. I do not know of any software currently in use on any platform that actually uses that feature. Historically, Apple’s SANE (Standard Apple Numerical Environment) used it.
nan ("9876"); // decimal
nan ("0765"); // octal
nan ("0xFEC"); // hexadecimal
I don’t know why the C standard committee chose a string-based interface here; it strikes me as poor design. It requires a string parser to insert the payload into the QNaN.
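To make the parsing step concrete, here is a minimal sketch of how an implementation could turn the tag string into a double-precision QNaN. It is not taken from any particular library: the helper names qnan_from_tag() and payload_of() are hypothetical, and placing the payload in the low 51 bits below the quiet bit is an assumption, since the standard leaves the mapping to the implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Low 51 mantissa bits, i.e. everything below the quiet bit (bit 51). */
#define PAYLOAD_MASK UINT64_C(0x0007FFFFFFFFFFFF)

/* Hypothetical helper: build a double QNaN carrying the integer in tag.
   The payload placement is an assumption, not mandated by the C standard. */
static double qnan_from_tag(const char *tag)
{
    assert(tag != NULL);
    /* Base 0 makes strtoull accept decimal, octal (leading 0),
       and hexadecimal (leading 0x) input. */
    uint64_t payload = strtoull(tag, NULL, 0);
    /* Exponent all ones plus the quiet bit, then the truncated payload. */
    uint64_t bits = UINT64_C(0x7FF8000000000000) | (payload & PAYLOAD_MASK);
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Hypothetical helper: recover the payload stored by qnan_from_tag(). */
static uint64_t payload_of(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits & PAYLOAD_MASK;
}
```

With this sketch, the tags "9876", "0765", and "0xFEC" would yield the payloads 9876, 501, and 4076 respectively. A real implementation may place or truncate the payload differently, or ignore it altogether.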
You can look at the generated bit pattern (e.g. with __double2hiint(), __double2loint()) to see what CUDA actually produces based on the passed integer. Be aware that the bit pattern may or may not match what the host toolchain produces. The C standard (and the C++ standard, which simply inherited this) gives an implementation complete freedom here.
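For the host side of that comparison, a small sketch along the following lines prints what the host toolchain produces for the three tags shown earlier. The helper names bits_of() and dump_host_nans() are made up for illustration. No expected output is given because the resulting encodings are implementation-defined.

```c
#include <inttypes.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Return the raw 64-bit encoding of a double; memcpy avoids aliasing issues. */
static uint64_t bits_of(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Print the host encodings for decimal, octal, and hexadecimal tags. */
static void dump_host_nans(void)
{
    printf("nan(\"9876\")  = 0x%016" PRIx64 "\n", bits_of(nan("9876")));
    printf("nan(\"0765\")  = 0x%016" PRIx64 "\n", bits_of(nan("0765")));
    printf("nan(\"0xFEC\") = 0x%016" PRIx64 "\n", bits_of(nan("0xFEC")));
}
```

Calling dump_host_nans() from main() and comparing its output with the values assembled from __double2hiint() and __double2loint() in a kernel shows whether the two toolchains agree for a given tag.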