There is not enough information in these snippets for a meaningful diagnosis. If you post a buildable self-contained piece of code that reproduces the issue, along with the exact nvcc invocation used to build it, that would enable others to analyze what is going on. What CUDA version are you using? What GPU do you run this code on?
Note that powf() is a function that takes ‘float’ arguments, yet in all the calls to this function in the device code you seem to be passing ‘double’ arguments. The constant 3.14159265359 is also of type double. Is there a particular reason for this mixture of double-precision data with single-precision functions?
Your host code is not identical to your device code: all functions called in the host code are double-precision math functions, while you are using single-precision math functions in the device code. Since single precision has a much smaller representable range than double precision, you may simply have an overflow in the device code because of that.
From a performance perspective, pow() and powf() are very expensive functions on any platform (CPU or GPU); you would never want to call them simply to square data, as you seem to be doing in this code.
You really want to cut down on the number of powf() calls. See also the Best Practices Guide in the CUDA Toolkit Documentation. In particular, you would want to substitute:
powf (x, 2) → x*x
powf (x, 0.5f) → sqrtf(x)
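To illustrate, here is a minimal sketch of such a substitution in a device function; the function and variable names (distance_before, distance_after, x, y) are made up for the example and not taken from your code:

__device__ float distance_before (float x, float y)
{
    // original pattern: two powf() calls to square, one more to take the square root
    return powf (powf (x, 2.0f) + powf (y, 2.0f), 0.5f);
}

__device__ float distance_after (float x, float y)
{
    // equivalent but much cheaper: plain multiplies plus a single sqrtf()
    return sqrtf (x * x + y * y);
}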
There is a function dedicated to computing sqrt(x*x + y*y) which you may want to consider: hypotf().
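Continuing the hypothetical example from above:

__device__ float distance_hypot (float x, float y)
{
    // hypotf() returns sqrtf(x*x + y*y) and also guards against
    // intermediate overflow/underflow when x or y is very large or very small
    return hypotf (x, y);
}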
Note that floating-point constants without an 'f' suffix are double precision by default; this makes your computation much more expensive, especially since you are on a consumer GPU with relatively low double-precision operation throughput.
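For example, a sketch using made-up variable names (not taken from your code) to show the difference:

// the double constant promotes the whole expression to double precision,
// which is slow on consumer GPUs; the result is then truncated back to float
float y_slow = x * 3.14159265359;

// the 'f' suffix keeps the constant and the computation in single precision
float y_fast = x * 3.14159265359f;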
By the way, your equations look very unusual. I would be interested to learn what area of science they are from. If you could point me to a relevant paper, that would be ideal. Thanks!
Haha, thanks ;). This is a benchmark; I use it because I'm programming mono-objective and multi-objective metaheuristics. When I finish my work, I would be pleased to share my thesis.