I am currently testing a kernel that takes a long time to compute its output values, and I just noticed that they are all NaN. Could you tell me whether NaN values can drastically slow down the math operations inside a CUDA kernel?
As opposed to many CPUs, GPUs handle all special floating-point operands at full speed. At least that is the case all the way to the Pascal architecture, which is the latest one I have used.
Depending on what your numerical algorithm does, the presence of NaNs could cause it to iterate indefinitely, however.
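To illustrate how that can happen, here is a minimal sketch in plain C++ (the comparison semantics are the same in device code). It is not taken from the kernel in question: the function `newton_sqrt_iters`, its parameters, and the `bail_on_nan` flag are all hypothetical names made up for this example. Any ordered comparison with a NaN evaluates to false, so a convergence test of the form `err <= tol` never succeeds once a NaN enters the iteration, and the loop runs until its iteration cap (or forever, if there is no cap).

```cpp
#include <cmath>

// Hypothetical example: Newton iteration for sqrt(a).
// Returns the number of iterations executed and stores the estimate in *result.
// max_iter caps the loop so that a NaN cannot make it spin forever.
// bail_on_nan (an assumption of this sketch) enables an explicit NaN check.
int newton_sqrt_iters(float a, float tol, int max_iter, bool bail_on_nan,
                      float* result) {
    float x = a;
    int it = 0;
    while (it < max_iter) {
        float next = 0.5f * (x + a / x);   // NaN in a or x propagates to next
        float err = std::fabs(next - x);   // NaN - NaN is NaN
        x = next;
        ++it;
        if (bail_on_nan && std::isnan(err)) break;  // explicit guard
        if (err <= tol) break;             // always false when err is NaN
    }
    *result = x;
    return it;
}
```

With a finite input the loop exits after a handful of iterations. With a NaN input and no guard it runs all `max_iter` iterations, which is the kind of slowdown that looks like "NaNs are slow" even though each individual operation runs at full speed.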
@njuffa Thank you for the details. I just realized that the NaNs do indeed affect the computation time.