I’m developing a thermal lattice Boltzmann simulation on CUDA and I’m seeing stability issues that I suspect are related to floating-point precision. The simulation runs stably in emulation mode, but not on the device.
While debugging I ran across some behavior in CUDA’s floating-point arithmetic that confuses me: if I add -1.49e-008 and 3.35e-008 on the device, the result comes back as zero. Both values are well within the range of a 32-bit float and have the same decimal exponent; I can’t see any reason the card should round their sum to zero.
Am I missing something? Thanks for any thoughts.