Device computed value significantly different from precomputed value displayed in NSight/VS2010

I am computing correlation coefficients of a number of different arrays in parallel.

A few of the results are greater than 1.0, so I am tracking down the discrepancy.

When debugging in NSight, some of the intermediate results appear to be dramatically off.

For the expression: float dfs2 = 250.37335 - 252 * -0.99676728 * -0.99676728 (all operands are float except the int 252)

NSight in VS2010 shows a value of 8.1612998e-06 for the expression before it executes, but after stepping over the line, the value of dfs2 is shown as 1.5258789e-05, almost twice as large.

Excel shows the value as 7.35939E-06 (obviously closer to the value displayed before execution).

Why would the device computed results be twice as large?

Results are from a 525M with Optimus, running the latest release of CUDA 5.0 and the latest release of NSight 3.0 with driver v320.18.

I have attached a project which demonstrates this error.

Run it in VS2010: note the values at the breakpoint, highlight the expression to see the precomputed value, and then single-step.

An additional annoying “feature” is that ints keep being displayed as hex and I have to click the Hex button twice to get decimal display.

So it seems the NSight/VS display may be computing in double precision while the device is computing in single-precision float.

Is that the case?

Computing the correlation in double precision fixes the coefficients that came out greater than 1.0.
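For reference, here is a host-side C++ sketch of what switching the accumulation to double might look like. It assumes the coefficient is built from running sums in the "sum of squares minus n times mean squared" form that the dfs2 expression suggests. The function name pearson, its signature, and the use of std::vector are hypothetical and not taken from the attached project.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Pearson correlation of two equally sized float arrays.
// Acc is the accumulator type: float mimics single-precision sums,
// double is the variant reported above as fixing results greater than 1.0.
// Hypothetical helper for illustration, not the kernel from the project.
template <typename Acc>
Acc pearson(const std::vector<float>& x, const std::vector<float>& y) {
    assert(x.size() == y.size() && !x.empty());
    const Acc n = static_cast<Acc>(x.size());
    Acc sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const Acc xi = x[i], yi = y[i];
        sx += xi;
        sy += yi;
        sxx += xi * xi;
        syy += yi * yi;
        sxy += xi * yi;
    }
    const Acc mx = sx / n, my = sy / n;
    // Assumed form, matching the shape of the dfs2 expression:
    // each term subtracts two nearly equal numbers, so rounding in the
    // accumulator type dominates the result when the difference is tiny.
    const Acc dxx = sxx - n * mx * mx;
    const Acc dyy = syy - n * my * my;
    const Acc dxy = sxy - n * mx * my;
    // Not guarded: a constant input array makes the denominator zero.
    return dxy / std::sqrt(dxx * dyy);
}
```

pearson<float>(x, y) keeps everything in single precision, while pearson<double>(x, y) promotes each element before accumulating. Subtracting the means before squaring (a second pass over the data) is another common way to avoid the cancellation if double precision is too slow on the device.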

Hey robosmith,

will take a look, thanks for the repro.

Yes, the debugger evaluates as double precision. I’ll create a case to see if we can address this.
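The three numbers in the original post can be compared on the host with a small C++ sketch. It evaluates the same expression in single precision, with float inputs promoted to double (an assumption about how a double-precision expression evaluator behaves), and with double literals throughout as a spreadsheet would. The function names and the parameter names sum_sq, n, and r are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Single precision throughout, as the device code would compute it.
float dfs2_float(float sum_sq, int n, float r) {
    // volatile forces the product to be rounded to float and keeps the
    // compiler from fusing the multiply and the subtract into one FMA.
    volatile float prod = n * r * r;
    return sum_sq - prod;
}

// Same float inputs, promoted to double before any arithmetic.
double dfs2_promoted(float sum_sq, int n, float r) {
    return static_cast<double>(sum_sq)
         - n * static_cast<double>(r) * static_cast<double>(r);
}

// Decimal literals evaluated in double throughout.
double dfs2_double() {
    return 250.37335 - 252 * -0.99676728 * -0.99676728;
}

void print_comparison() {
    const float c = 250.37335f, r = -0.99676728f;
    // Gap between adjacent floats in [128, 256).
    const float spacing = std::ldexp(1.0f, -16);
    const float f = dfs2_float(c, 252, r);
    // For these inputs both operands of the subtraction lie in [128, 256),
    // so the float result is a whole number of spacings.
    assert(std::fmod(f, spacing) == 0.0f);
    std::printf("float            : %.8g\n", f);
    std::printf("float -> double  : %.8g\n", dfs2_promoted(c, 252, r));
    std::printf("double literals  : %.8g\n", dfs2_double());
    std::printf("float spacing    : %.8g\n", spacing);
}
```

Calling print_comparison() from main prints the three results next to the float spacing. Adjacent floats between 128 and 256 are 2^-16 = 1.5258789e-05 apart, which is exactly the value NSight shows after stepping. The device result is therefore one unit of float resolution at that magnitude: the subtraction cancels nearly all significant digits, and what remains is rounding error, not a wrong computation.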