Equivalent FORTRAN expression in CUDA

Dear all,

being new to CUDA, I was wondering whether the following FORTRAN statement is correctly translated to CUDA:

FORTRAN-77: iarg=nint( 64*i2pi*( xarg-aint(xarg)+1. ) )

CUDA: iarg = (int)rint(64. * i2pi * (xarg - (float)((int)xarg) + 1.));

Would these two be equivalent? I sometimes get a slightly different result and am not sure whether
this is down to using different compilers, numerical accuracy etc.

Also, Windows XP shuts down now and then while the code is running. This happens at arbitrary points. I’m using an
NVIDIA GTX 285. Each kernel runs for much less than 1 second, has ‘__syncthreads()’ at the end, and memory usage should not be a problem.

Has anyone else experienced a similar problem? Is this perhaps related to the GPU getting too hot? Fortunately, the computer
has recovered every time so far.

Other than this, I’m quite impressed with the speedup that can be achieved.

Many thanks in advance for your advice!

No, they are not equivalent. Your C expression contains a mixture of single and double precision values, while the F77 is all single precision.
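To see where the mixing comes from: in C, an unsuffixed constant such as 64. or 1. has type double, so a float operand next to it is promoted and that part of the expression is evaluated in double. A minimal sketch in plain host C (the function names are made up for illustration and are not from the original code):

```c
#include <stddef.h>

/* Hypothetical helpers: report the size of the type C chooses
   for each product when x is a float. */

/* 64. is a double constant, so the product has type double. */
size_t size_with_double_literal(float x)
{
    return sizeof(64. * x);
}

/* 64.f is a float constant, so the product stays float. */
size_t size_with_float_literal(float x)
{
    return sizeof(64.f * x);
}
```

On a typical platform the first function returns 8 and the second returns 4.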

What should it look like in CUDA without mixing precisions?

Many thanks!

In C (“CUDA” is just standard C90 with a handful of syntax extensions to cover dealing with kernels and the non-flat memory space of the GPU), you would want something like:

(int)rintf(64.f * i2pi * (xarg - (float)((int)xarg) + 1.f))

But you should still expect some differences between the floating point results produced on the GPU and CPU, mostly because the CPU will be using either 80-bit extended precision or IEEE-754 double precision internally, and then rounding the results back to single precision afterwards. You should also remember that floating point arithmetic isn’t associative: (a + b) + c can differ from a + (b + c), so decomposing an algorithm into parallel steps often produces a different result than executing the same code serially.
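The effect of evaluation order can be shown with three single precision values. This is a small host-side sketch, not code from the original program; the function names are made up, and the volatile temporaries are only there to force each intermediate result to be rounded to single precision, even on a CPU that would otherwise keep it in a wider register:

```c
/* Hypothetical helpers: add the same three floats in two different orders. */

/* (a + b) + c */
float sum_left_first(float a, float b, float c)
{
    volatile float t = a + b;   /* rounded to float here */
    return t + c;
}

/* a + (b + c) */
float sum_right_first(float a, float b, float c)
{
    volatile float t = b + c;   /* rounded to float here */
    return a + t;
}
```

With a = 1e8f, b = -1e8f and c = 1.0f, the first order gives 1.0f, while the second gives 0.0f, because -1e8f + 1.0f rounds back to -1e8f in single precision. A parallel reduction on the GPU adds terms in a different order than a serial loop, so small discrepancies of this kind are to be expected.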

Many thanks, avidday! This is quite useful to know.