Device Runtime Trigonometric Functions

Hi,

Is anyone mathematically inclined enough to tell me how to compute the inverse tangent function (arctan/atan/tan^-1) from within device code? I know the CUDA library provides sine/cosine/tangent functions, but apparently not their inverses. I scoured Google for two hours last night and came up with nothing.

A bit of reverse engineering came up with this, which could be implemented as a function callable from the kernel:

/* ATAN.C
 * Calculates arctan(x)
 * Range: -infinity <= x <= infinity (Output -pi/2 to +pi/2)
 * Precision: +/- .000,000,04
 * Header: math.h
 * Author: Max R. Dürsteler 9/15/83
 */
double atan(double x)
{
	double xi, q, q2, y;

	/* work on |x| and restore the sign at the end */
	xi = (x < 0. ? -x : x);

	/* atan(xi) = pi/4 + atan(q), with q = (xi - 1)/(xi + 1) in [-1, 1) */
	q = (xi - 1.0) / (xi + 1.0);
	q2 = q * q;

	/* odd polynomial in q approximating atan(q), evaluated by Horner's rule */
	y = ((((((( - .0040540580 * q2
	       + .0218612286) * q2
	       - .0559098861) * q2
	       + .0964200441) * q2
	       - .1390853351) * q2
	       + .1994653599) * q2
	       - .3332985605) * q2
	       + .9999993329) * q + 0.785398163397;

	return (x < 0. ? -y : y);
}

Please tell me if I need not go to the extent of using the inverse tangent function to work out angles…

Cheers in advance

atan/atanh and atanf/atanhf are both listed in sections B.1 and B.2 of the Programming Guide v2.0.

Maybe I’ve misunderstood, but there are atan functions on the device.

NVIDIA CUDA Programming Guide 2.0, Table B-1: Mathematical Standard Library Functions, page 82:
atanf(x)

My bad…

My understanding was that common runtime functions could not be called from the device. So there simply isn’t a faster version of the inverse tangent function available from the device runtime component?

The atan function you call in CUDA kernel code is not the same as the common x86 ATAN function. It’s an NVIDIA implementation that’s probably about as fast as it gets on the device, given the error constraints.