Possible bug in cuda behaviour

kellykaneswaran · November 29, 2024, 9:19am

I’m working on an algorithm but the output behavior is not as expected so I’m providing a snippet of code which highlights my issue.

__global__ void TEST(void)
{
    float4 array[2];
    float4 temp;
    float k = .1;
    float k1 = k * (1 / sqrtf(2));

    array[0].x = 2445;
    array[0].y = 2446;
    array[0].z = 2447;
    array[0].w = 2448;
    array[1].x = 2444;
    array[1].y = 2445;
    array[1].z = 2446;
    array[1].w = 2447;

    temp.w = (array[0].w * k1); //+ (array[0].w * (1-k));
    temp.z = (array[0].z * k1) + (array[1].w * (1 - k1));
    temp.y = (array[0].y * k1) + (array[1].z * (1 - k1));
    temp.x = (array[0].x * k1) + (array[1].y * (1 - k1));

    printf("test %f %f %f %f \n", temp.x, temp.y, temp.z, temp.w);     //test 2445.000244 2446.000000 2447.000000 173.099747 , temp.x should be 2445
}

Why is temp.x 2445.000244 and not 2445? Is it a rounding bug in the hardware? And why is the behaviour not the same for the other two results, temp.y and temp.z?

Kelly

cbuchner1 · November 29, 2024, 10:40am

Have you run the same code on the CPU in single precision?
This might be a floating-point precision issue rather than a hardware-specific one.

One thing to know about FP32 is that it has approximately 7 decimal digits of precision. 2445.000 already shows 7 significant digits, so any further printed digits can no longer be represented accurately and should not be relied on.

CUDA offers host side definitions of float4, so CPU vs GPU discrepancies should be easy to check using the same code.

Be aware that the rounding result of sqrtf() might differ between CPU and GPU. All that NVIDIA hardware guarantees is a specific precision in ULPs (units of least precision). It does not necessarily guarantee a faithfully or correctly rounded result.

The nvcc compiler option -prec-sqrt may influence the precision of the sqrt functions.

The CUDA Programming Guide 12.6 says this about the ULP error for sqrtf(x):

Maximum ulp error 0 when compiled with -prec-sqrt=true. Otherwise 1 for compute capability ≥ 5.2 and 3 for older architectures

Robert_Crovella · November 29, 2024, 4:29pm

When posting code on these forums, please format it correctly. One possible method: Edit your post using the pencil icon below it. Select the code. Press the </> button at the top of the edit pane. Save your changes.

Please do that now, thanks.

njuffa · November 29, 2024, 5:06pm

Nothing untoward appears to be happening. I modified the code into a pure C++ program for the host compiler, and used clang with -ffp-model=strict to compile.

On 64-bit ARM:

test 2445.000244 2446.000000 2447.000000 173.099747

On 64-bit x86:

test 2445.000244 2446.000000 2447.000000 173.099747

So this is just an example of normal fixed-precision floating-point rounding effects. float in particular is only accurate to 6 to 7 decimal digits, and within that limitation, the expected result matches the observed result. In particular, we have:

k1 = 0.07071068
1-k1 = 0.92928934
array[0].x * k1 = 172.88761902
array[1].y * (1-k1) = 2272.11254883

The sum of the two products is 2445.00016785. The nearest available float encodings have the values 2445.0 and 2445.000244, of which the latter is closest to the sum, so that is chosen for the final result under the round-to-nearest-or-even rule.
