Undefined and NaN results

colede · February 7, 2011, 9:45pm

hello

I have a kernel that uses code I copied directly from a CPU-only version. The two versions give identical results, except that something is causing the CUDA version to give NaN in some spots. I looked through the code, and of course there are the normal operators. The only other things are some pow(), abs(), and similar math operations. I am also using the cuComplex library for working with complex numbers, but it is also being used in the CPU version.

Are there any operations that I should look out for that could be particularly troublesome? Is there any way to set a breakpoint for a particular array index value in cuda-gdb, so that I could inspect the values when that index gets changed?

Gregory_Diamos · February 7, 2011, 11:27pm

Are you running on an SM_2X device? Ocelot (hosted on the Google Code Archive) should faithfully implement the unordered number behavior of an actual Fermi device. What I usually do is run the emulator over an application that produces the NaN, record the entire instruction trace, search for the first instruction that generated it, then look up the corresponding source code line.

SPWorley · February 8, 2011, 12:03am

Even though I use Ocelot myself, I’ve never used it to try to track down NaN instructions like that… makes sense!

That would be a great demo tutorial for Ocelot users, too…

Gregory_Diamos · February 8, 2011, 12:06am

Thanks for the suggestion, I’ll keep it in mind. I am trying to put together some slides for a tutorial…

It might also be a good candidate for a new correctness checking tool. It would be fairly simple to watch the instruction stream and raise an error and print out the source line on the first instruction to generate a NaN. We could also do it as an instrumentation pass and run it on the device, although it would be some extra work.
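
A much coarser, host-side stand-in for that idea (not the Ocelot tool described above, and not an instruction-level trace) is to copy each intermediate buffer back to the host and scan it for the first NaN, which at least narrows down which stage and which element went wrong. The sketch below assumes the results are already in a std::vector<float> on the host; first_nan_index and demo_first_nan are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the index of the first NaN in `values`, or -1 if there is none.
inline std::ptrdiff_t first_nan_index(const std::vector<float>& values) {
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (std::isnan(values[i])) {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    return -1;
}

// Usage sketch: pretend `results` was copied back from the device.
inline void demo_first_nan() {
    const std::vector<float> results{1.0f, 2.0f, std::nanf(""), 4.0f};
    assert(first_nan_index(results) == 2);
}
```

Running such a scan after each kernel launch shows which launch first produced a NaN, and the returned index can then be used as the condition for a closer look at that one element.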

colede · February 8, 2011, 4:13pm

Well, I found out it was a normal float that was being set to NaN, because of a division by zero, I suppose? I never could figure out the exact cause, because the debugger always said my variables were out of scope, but I was able to check that float in the kernel with if(b!=b), and that seems to have fixed the problem.
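
For readers unfamiliar with the if(b!=b) trick: NaN is the only floating-point value that compares unequal to itself, so the comparison is true exactly when b is NaN. Note also that under IEEE 754 a nonzero value divided by zero gives infinity, while 0/0 gives NaN. The following is a minimal host-side sketch of the check, assuming IEEE 754 floats and no fast-math style compiler options (which may optimize the comparison away); safe_div and its fallback parameter are hypothetical and not taken from the original kernel.

```cpp
#include <cassert>
#include <cmath>

// NaN is the only value for which b != b holds.
inline bool is_nan_self_compare(float b) {
    return b != b;
}

// Hypothetical helper: divide, but substitute `fallback` when the
// quotient is NaN (for example from 0/0). Infinities pass through.
inline float safe_div(float num, float den, float fallback) {
    const float q = num / den;
    return is_nan_self_compare(q) ? fallback : q;
}

// Usage sketch showing the three cases on an IEEE 754 host.
inline void demo_nan_checks() {
    assert(safe_div(6.0f, 3.0f, -1.0f) == 2.0f);      // ordinary division
    assert(safe_div(0.0f, 0.0f, -1.0f) == -1.0f);     // 0/0 is NaN, replaced
    assert(std::isinf(safe_div(1.0f, 0.0f, -1.0f)));  // 1/0 is +inf, not NaN
}
```

A guard like this hides the symptom rather than the cause, so it is still worth finding out why the operands reached that state on the device but not on the CPU.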

Seems weird that the same exact code produces correct results when running on the CPU.

Gregory_Diamos · February 8, 2011, 6:31pm

Some operations behave differently in PTX with regard to floating-point ULP error or the treatment of unordered and subnormal numbers. Most of the time the differences come from your exp, log, sin, cos, and div.approx functions (which have poor accuracy), or from using the ftz modifier on many instructions (rather than dividing by a very small number, you end up dividing by 0). There are a lot of different floating-point modifiers available for most instructions, and if you tweak them one instruction at a time you can usually get the behaviour to match the CPU. Most CPUs also let you change these modes (fesetround/fegetround/_controlfp), although there the setting applies to every subsequent operation on the thread rather than to a single instruction. When there is a difference, most of the time in my experience the modifiers that you have selected end up not matching up…
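
The ftz effect can be reproduced on the CPU without a GPU. The sketch below emulates flush-to-zero in software by replacing subnormal inputs and results with a signed zero; it is only an approximation of what the ftz modifier does, and flush_to_zero, div_ftz and demo_ftz are hypothetical names. It assumes an IEEE 754 host whose own flush-to-zero mode is off, which is the usual default.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Emulate flush-to-zero: any subnormal becomes a zero of the same sign.
inline float flush_to_zero(float x) {
    return std::fpclassify(x) == FP_SUBNORMAL ? std::copysign(0.0f, x) : x;
}

// Division with FTZ-style handling of both inputs and of the result.
inline float div_ftz(float num, float den) {
    return flush_to_zero(flush_to_zero(num) / flush_to_zero(den));
}

// Usage sketch: the same operands behave differently with and without FTZ.
inline void demo_ftz() {
    const float tiny = std::numeric_limits<float>::denorm_min();  // subnormal
    assert(tiny / tiny == 1.0f);                  // plain IEEE division
    assert(std::isnan(div_ftz(tiny, tiny)));      // becomes 0/0, which is NaN
    assert(std::isfinite(1.0e-30f / tiny));       // large but finite
    assert(std::isinf(div_ftz(1.0e-30f, tiny)));  // becomes x/0, which is inf
}
```

This is one way identical source code can give a sensible answer on the CPU and a NaN on the device: a subnormal denominator that the CPU keeps is treated as zero when the division is compiled with ftz.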
