Wrong results in device code

Hi guys!!!

I’m trying out some code but I’m having problems. I hope you can help me…

I have the following device code:

__device__ float GetFitnessENi(float *x,int noVariables){

return -0.0001 * powf( (double)fabsf( sinf(x[0]) * sinf(x[1]) * expf(fabsf(100-( (powf((double)powf((double)x[0],(double)2)+powf((double)x[1],(double)2),(double).5))/(3.14159265359) ))) ) +1 ,(double).1);//Cross-in-tray  -10 - 10
}

and the result is something like this:

x[0]= -5.747676  x[1]= -4.186990  r= -1.#INF00

but when I evaluate the same expression on the host, I get the following result:

printf("\n-==> x= %f  y= %f  %f", x[0],x[1], -0.0001 * pow( (double)fabs( sin(x[0]) * sin(x[1]) * exp(fabs(100-( (pow((double)pow((double)x[0],(double)2)+pow((double)x[1],(double)2),(double).5))/(3.14159265359) ))) ) +1 ,(double).1));

-==> x= -5.747676  y= -4.186990  -1.618566

Can you help me with this please?

There is not enough information in these snippets for a meaningful diagnosis. If you post a buildable self-contained piece of code that reproduces the issue, along with the exact nvcc invocation used to build it, that would enable others to analyze what is going on. What CUDA version are you using? What GPU do you run this code on?

Note that powf() is a function that takes ‘float’ arguments, yet in all the calls to this function in the device code you seem to be passing ‘double’ arguments. The constant 3.14159265359 is also of type double. Is there a particular reason for this mixture of double-precision data with single-precision functions?

Your host code is not identical to your device code because all functions called in the host code are double-precision math functions, while you are using single-precision math functions in the device code. You may simply have an overflow in the device code because of that.
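To make the overflow hypothesis concrete, here is a minimal sketch that evaluates the same expression once in single precision and once in double precision. The function names crossInTrayFloat and crossInTrayDouble and the HD macro are illustrative assumptions, not code from the original post. For the posted inputs the argument of the exponential is about 97.7, and expf() overflows to infinity once its argument exceeds roughly 88.7 (the natural log of the largest finite float), so the float version should return -inf while the double version stays finite.

```cpp
#include <cmath>

// Assumption: this macro lets the same source build with nvcc (host + device)
// or with a plain C++ compiler (host only).
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Single precision throughout, like the device code in the question.
HD inline float crossInTrayFloat(float x, float y) {
    const float pi = 3.14159265359f;
    // expf() returns +inf when its argument exceeds about 88.7
    float e = expf(fabsf(100.0f - sqrtf(x * x + y * y) / pi));
    return -0.0001f * powf(fabsf(sinf(x) * sinf(y) * e) + 1.0f, 0.1f);
}

// Double precision throughout, like the host printf in the question.
HD inline double crossInTrayDouble(double x, double y) {
    const double pi = 3.14159265359;
    double e = exp(fabs(100.0 - sqrt(x * x + y * y) / pi));
    return -0.0001 * pow(fabs(sin(x) * sin(y) * e) + 1.0, 0.1);
}
```

Called with x = -5.747676 and y = -4.186990, the float version should give an infinite result and the double version about -1.6186, which matches the two outputs posted above.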

From a performance perspective, pow() and powf() are very expensive functions on any platform (CPU or GPU); you would never want to call them simply to square data, as you seem to be doing in this code.

Thanks, njuffa. Well, I have the same problem without the double casts.

I’m using a GTX 580, compiling with nvcc arch=sm_21, on Windows 7 with CUDA 6.5. I’m studying when to use float or double, and how overflow behaves in CUDA.

Your advice is helpful, because I didn’t know about the cost of pow().

You are compiling for the wrong compute capability. For the GTX580 you should use arch=sm_20.

You should probably learn how to do proper CUDA error checking as well. It will be useful in situations like this.

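Thanks, txbob, I didn’t know about this.

Well, I still have problems. Can you help me with the best way to implement the following equation?

$$f(x_0, x_1) = -0.0001 \left( \left| \sin(x_0)\,\sin(x_1)\,\exp\left( \left| 100 - \frac{\sqrt{x_0^2 + x_1^2}}{\pi} \right| \right) \right| + 1 \right)^{0.1}$$
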
I tried:

__device__ float GetFitnessENi(float *x,int noVariables){

return -0.0001 * powf( fabsf( sinf(x[0]) * sinf(x[1]) * expf(fabsf(100-( (powf(powf(x[0],2)+powf(x[1],2),.5))/(3.14159265359) ))) ) +1 ,.1);
}

thanks guys :)

You really want to cut down on the number of powf() calls. See also the Best Practices Guide (http://docs.nvidia.com/cuda/cuda-c-best-practices-guide). In particular, you would want to substitute:

powf (x, 2) --> x*x
powf (x, 0.5f) --> sqrtf(x)

There is a function dedicated to computing sqrt(x*x+y*y) which you may want to consider:

hypotf (x,y)

See the math function API documentation for details (http://docs.nvidia.com/cuda/cuda-math-api/index.html)
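As a small illustration of these substitutions, the sketch below computes the radius term both ways. The function names radiusWithPowf and radiusWithHypotf and the HD macro are hypothetical, chosen only for this example. Besides saving three powf() calls, hypotf() is specified to avoid undue overflow or underflow in the intermediate sum of squares.

```cpp
#include <cmath>

// Assumption: this macro lets the same source build with nvcc (host + device)
// or with a plain C++ compiler (host only).
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Form used in the question: three powf() calls for one Euclidean norm.
HD inline float radiusWithPowf(float x, float y) {
    return powf(powf(x, 2.0f) + powf(y, 2.0f), 0.5f);
}

// Suggested form: a single dedicated call.
HD inline float radiusWithHypotf(float x, float y) {
    return hypotf(x, y);
}
```

Both functions should agree to within a few units in the last place for ordinary inputs, so the replacement does not change the result in any meaningful way.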

Note that floating-point constants without an ‘f’ suffix are double precision by default. This makes your computation much more expensive, especially since you are on a consumer GPU with relatively low double-precision operation throughput:

3.14159265359 -> 3.14159265359f
-0.0001 -> -0.0001f
.1 -> .1f
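Putting these suggestions together, one possible float-only version of the function is sketched below. It is not code from this thread: the name crossInTrayLogDomain, the HD macro, and the log-domain rearrangement are all assumptions made for illustration. The rearrangement is there because a direct single-precision evaluation would still call expf() with an argument near 100, which exceeds the float range. Writing s for |sin(x0) sin(x1)| and a for the exponent, the sketch uses (s*e^a + 1)^0.1 = exp(0.1 * log(e^t + 1)) with t = log(s) + a, so the huge intermediate value is never formed.

```cpp
#include <cmath>

// Assumption: this macro lets the same source build with nvcc (host + device)
// or with a plain C++ compiler (host only).
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical float-only rewrite with 'f'-suffixed constants.
HD inline float crossInTrayLogDomain(float x, float y) {
    const float pi = 3.14159265359f;
    float s = fabsf(sinf(x) * sinf(y));
    if (s == 0.0f) return -0.0001f;              // (0 + 1)^0.1 == 1
    float a = fabsf(100.0f - hypotf(x, y) / pi); // exponent, about 97.7 near the optimum
    float t = logf(s) + a;                       // log of s * e^a
    // log(e^t + 1), arranged so expf() only ever sees a non-positive argument
    float l = (t > 0.0f) ? t + log1pf(expf(-t)) : log1pf(expf(t));
    return -0.0001f * expf(0.1f * l);
}
```

For x = -5.747676 and y = -4.186990 this should return a finite value close to the -1.618566 reported by the double-precision host code. Whether the extra logf() and log1pf() calls are cheaper than simply using double precision on a given GPU is something to measure rather than assume.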

By the way, your equations look very unusual. I would be interested to learn what area of science they are from. If you could point me to a relevant paper, that would be ideal. Thanks!

Judging by his forum nickname, it’s mad science ;)

Haha, thanks ;) This is a benchmark function. I use it because I’m programming mono-objective and multi-objective metaheuristics. When I finish my work, I would be pleased to share my thesis.