Weird Float Arithmetic results in Return Code 30

I have a GPU kernel that throws a CUDA error, return code = 30 [a nebulous "something is wrong" error]. I find that I can eliminate the error by adding a single line of code that overwrites a float variable I calculated.

[I am running CUDA 4.0 on a Win 7 box with a Fermi processor: GT 430]

This throws the error:

    float F = w00 * x00 + w10 * x10 + w01 * x01 + w11 * x11; // x are 4 data values; w are 4 weights between 0 and 1.0
    int pOffset = ((y * nss + x) * MSI_LIVE + nc); // pointer to output pixel x 4 bytes/DTYPE
    float *pOutPix;
    pOutPix = (pOut + pOffset); // pOut is a pointer to global memory
    //F=123.456;
    if (pOffset >= 0 && pOffset < maxyO) // check in bounds
        *pOutPix = F; // store result in global memory

If I uncomment the line F=123.456; then it runs without an error: return code = 0!

So something is fishy about F some of the time. I have tested for F == NULL, but that doesn't help. nvcc won't let me use a try/catch block.


Any other suggestions on how to test for a bad F? How could it be bad? Underflow?

Typical values of F are in the range [-5000.0:5000.0].
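
On testing for a bad F: comparing a float against NULL only checks whether it equals zero, and a NaN or infinity written to global memory is just a bit pattern, so it should not by itself make the kernel fail. If you still want to rule it out, here is a minimal sketch of a finiteness test. The names isFiniteFloat, sanitizeFloat and the HOST_DEVICE macro are made up for illustration and are not from the original code.

```cuda
#include <math.h>
#include <float.h>

// Lets the same helpers build with nvcc (host + device) or a plain C++ compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical helper: true only for ordinary finite values.
// NaN fails every ordered comparison, and +/-infinity exceeds FLT_MAX,
// so both make the test below come out false.
HOST_DEVICE inline bool isFiniteFloat(float f)
{
    return fabsf(f) <= FLT_MAX;
}

// Hypothetical helper: replace a NaN or infinite value with a fallback.
HOST_DEVICE inline float sanitizeFloat(float f, float fallback)
{
    return isFiniteFloat(f) ? f : fallback;
}
```

In the kernel this could be used as `*pOutPix = sanitizeFloat(F, 0.0f);`. Unlike overwriting F with a constant, this keeps the whole calculation alive, so if the error persists with F sanitized, the value of F is not the cause.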

Uncommenting "F=123.456;" makes all previous calculations of F irrelevant, so the compiler optimizes them out. In turn, that may make it unnecessary to compute w00, x00, etc. The end result is that by adding that single line you effectively remove a lot of code (or rather, the compiler does), and that might accidentally remove the actual code that causes the error, which could be 50 lines above the section you quoted, for all we know.
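
To narrow down where the error really comes from, it helps to check the status right after the launch and again after synchronizing, so the failure is pinned to this kernel and not to a later API call. A minimal sketch, assuming the CUDA runtime API; CUDA_CHECK and scaleKernel are hypothetical stand-ins, not the code from the question.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: report where a CUDA call failed, then abort.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            fprintf(stderr, "%s:%d: CUDA error %d (%s)\n", __FILE__,      \
                    __LINE__, (int)err_, cudaGetErrorString(err_));       \
            exit(EXIT_FAILURE);                                           \
        }                                                                 \
    } while (0)

// Stand-in kernel: the index is checked before any read or write.
__global__ void scaleKernel(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = 2.0f * in[i];
}

int main()
{
    const int n = 1024;
    float *dIn = NULL, *dOut = NULL;
    CUDA_CHECK(cudaMalloc((void **)&dIn, n * sizeof(float)));
    CUDA_CHECK(cudaMalloc((void **)&dOut, n * sizeof(float)));
    CUDA_CHECK(cudaMemset(dIn, 0, n * sizeof(float)));

    scaleKernel<<<(n + 255) / 256, 256>>>(dIn, dOut, n);
    CUDA_CHECK(cudaGetLastError());       // invalid launch configuration, etc.
    CUDA_CHECK(cudaDeviceSynchronize());  // errors raised while the kernel ran

    CUDA_CHECK(cudaFree(dIn));
    CUDA_CHECK(cudaFree(dOut));
    printf("kernel completed without error\n");
    return 0;
}
```

Once the failure is tied to the kernel, one way to bisect is to assign a single input to F at a time (F = x00; then F = w00; and so on), so the compiler keeps only that load alive. Running the program under cuda-memcheck may also point at an out-of-bounds access directly.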