I was doing some tests and ran into a problem. I wrote a test kernel that does a multiplication, but the result is wrong. I know that it comes from the use of float, because if I use double precision the result is correct.
Here is my test:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define XBLOCK 256
#define YBLOCK 1
#define XGRID 84
#define YGRID 1
/** Kernel function **/
__global__ void KernelTest()
{
    // Thread index
    int idx = (blockIdx.x * YGRID + blockIdx.y) * XBLOCK * YBLOCK + (threadIdx.x * YBLOCK + threadIdx.y);
    float A, result;
    A = 11877.303711f;
    result = A * A;
    if (idx == 0) {
        printf("A=%f - result=%f\n", A, result);
        /* The displayed result is 141070336.0 instead of 141070343.443334 in double */
    }
}

int main()
{
    cudaDeviceReset();
    /** Launch of the kernel **/
    dim3 blockSize(XBLOCK, 1);
    dim3 gridSize(XGRID, 1);
    KernelTest<<<gridSize, blockSize>>>();
    cudaDeviceSynchronize();  // make sure the device printf output is flushed before exit
    return 0;
}
I would like to know if I can get the correct result with float. I tried using __fmul_ru, __fmul_rd and __fmul_rn, but the result is still wrong.
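Roughly what I mean by trying the intrinsics is this (a minimal sketch; the kernel name is made up and the includes are the same as in the code above):
__global__ void RoundingTest()
{
    float A = 11877.303711f;
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        /* Each intrinsic rounds the product toward a different neighbouring
           representable float; none of them can hit the exact value. */
        printf("rn=%f  ru=%f  rd=%f\n",
               __fmul_rn(A, A), __fmul_ru(A, A), __fmul_rd(A, A));
    }
}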
I am afraid there is no solution, but I am still hoping there is one.
EDIT: One idea I had is to compute the result like this
Maybe this sounds stupid, but could you try using something like %26.20f in the printf format string? Rounding errors appear only when you have a large number to which a small number is added over and over. Adding or multiplying two numbers once should not make a difference.
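Something along these lines (a minimal host-side sketch, reusing the numbers from the question) shows where the printed digits stop carrying information:
#include <stdio.h>

int main()
{
    float  af = 11877.303711f;
    double ad = 11877.303711;
    /* With 20 decimal places you can see where the digits stop being
       meaningful: after ~7 significant figures for float, ~16 for double. */
    printf("float : %26.20f\n", (double)(af * af));
    printf("double: %26.20f\n", ad * ad);
    return 0;
}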
It is not wrong. It is correct within the precision limitations of 32-bit floats. These float numbers are only precise to about 7 decimal digits. The mantissa of a 32-bit float has 23 bits, which translates to log10(2^23) = 6.92 significant decimal digits.
Any more digits that you display with printf have no meaning; they are essentially random numbers.
People really need to understand these limitations.
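To make the limitation concrete, here is a small host-side sketch using the standard <float.h>/<math.h> facilities (values taken from the question):
#include <stdio.h>
#include <math.h>
#include <float.h>

int main()
{
    float r = 141070336.0f;
    /* FLT_DIG and the 23-bit mantissa mentioned above are what limit the precision. */
    printf("FLT_DIG = %d, FLT_EPSILON = %g\n", FLT_DIG, FLT_EPSILON);
    /* The next representable float above the kernel's result: every value
       in between has to snap to one of the two neighbours. */
    printf("next float above %.1f is %.1f\n", r, nextafterf(r, INFINITY));
    return 0;
}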
The problem with your first source code is that there is actually no computation happening inside your kernel; what you see in the output is a result that was computed by the compiler, not by the GPU.
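A minimal way to see real GPU arithmetic is to pass the value in from the host, so the compiler cannot pre-compute the product (sketch only; the kernel name is made up):
__global__ void KernelTestArg(float A)
{
    float result = A * A;   /* now an actual GPU multiplication */
    if (blockIdx.x == 0 && threadIdx.x == 0)
        printf("A=%f - result=%f\n", A, result);
}

/* launched e.g. as:  KernelTestArg<<<gridSize, blockSize>>>(11877.303711f); */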
Even if you did not use a constant inside the kernel, you would still have another problem: 11877.303711 cannot be represented exactly as a float. If you try it out, you will see that the value stored in the float variable is 11877.3037.
The next problem is that even if you somehow computed the result with higher precision, you are ultimately assigning it to a float, but 141070343.443334 cannot be represented as a float and gets truncated to 141070336.0.
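Both effects are easy to check on the host with a couple of casts (a small sketch, nothing CUDA-specific):
#include <stdio.h>

int main()
{
    float a = 11877.303711f;            /* nearest float to 11877.303711      */
    float r = (float)141070343.443334;  /* nearest float to the double result */
    /* Printing through double shows what is really stored in each float. */
    printf("a stored as %.10f\n", (double)a);
    printf("r stored as %.4f\n",  (double)r);
    return 0;
}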
So for this concrete number and computation the result is perfectly right and the best you can get with float.
For other numbers and/or computations the result might differ between computing in float and computing in double and assigning the result to a float.
The only way to get higher precision without actually using double is to simulate double precision using single precision (so-called double-single or float-float arithmetic)… Google is your friend in that case :)
@cbuchner1: Thanks for the explanation. It seems logical now.
@RoBiK: For now, I do the computation in double. I know that it is not efficient (in terms of computation time), but I have other problems I would like to fix before improving the computation time. I also searched Google for “simulation of double precision using single precision” but found nothing. Do you have some examples or references, please?
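For what it's worth, here is a minimal sketch of the float-float ("double-single") idea RoBiK mentioned, built on the standard fmaf-based error-free product; the kernel name and layout are just illustrative, not a library API:
#include <stdio.h>

__global__ void FloatFloatSquare(float a)
{
    float hi = a * a;             /* rounded product                      */
    float lo = fmaf(a, a, -hi);   /* exact rounding error of that product */
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        /* The pair (hi, lo) represents the product of the two floats
           exactly; double is used here only to print the combined value. */
        printf("hi=%f  lo=%f  hi+lo=%.6f\n", hi, lo, (double)hi + (double)lo);
    }
}

int main()
{
    FloatFloatSquare<<<1, 32>>>(11877.303711f);
    cudaDeviceSynchronize();
    return 0;
}
Even with this trick the printed sum will not be 141070343.443334, because 11877.303711 already loses digits the moment it is stored in the float argument, exactly as RoBiK explained.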