While working with CUDA's math intrinsics I went looking for more information and found this thread:

https://devtalk.nvidia.com/default/topic/960757/cuda-programming-and-performance/a-faster-and-more-accurate-implementation-of-sincosf-/2

Since that thread is from 2016, I’d like to know:

1 - Has the implementation from that thread already been incorporated into the CUDA toolkit? I’m currently using CUDA 9.1.

2 - In either case, is sincosf() a direct replacement for __sinf() and __cosf() in this situation:

```
__global__ void cuda_Euler(const float * __restrict__ real, float *imag, float *output, const float ANGLE, const int LENGTH)
{
    int tid = blockDim.x * blockIdx.x + threadIdx.x,
        offset = gridDim.x * blockDim.x;
    while (tid < LENGTH)
    {
        output[tid] = real[tid] * __cosf(ANGLE) + imag[tid] * __sinf(ANGLE);
        tid += offset;
    }
}
```

Keep in mind that this kernel is clearly memory-bound.
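
To make question 2 concrete, here is a minimal sketch of the variant I have in mind. The kernel name `cuda_Euler_sincosf` is hypothetical, and hoisting the call out of the loop is my own assumption, based on ANGLE being the same for every element. As far as I understand, sincosf() is the full-precision function while __sinf() and __cosf() are the fast reduced-accuracy intrinsics, so I would not expect the two versions to match bit for bit.

```cuda
#include <cuda_runtime.h>

// Hypothetical variant of the kernel above: same signature, but the two
// intrinsic calls are replaced by a single sincosf() call.
__global__ void cuda_Euler_sincosf(const float * __restrict__ real, float *imag, float *output, const float ANGLE, const int LENGTH)
{
    // ANGLE is uniform across all elements, so sine and cosine are
    // evaluated once per thread, outside the grid-stride loop.
    float s, c;
    sincosf(ANGLE, &s, &c);

    int tid = blockDim.x * blockIdx.x + threadIdx.x;
    const int offset = gridDim.x * blockDim.x;
    while (tid < LENGTH)
    {
        output[tid] = real[tid] * c + imag[tid] * s;
        tid += offset;
    }
}
```

It would be launched exactly like the original, for example `cuda_Euler_sincosf<<<blocks, threads>>>(real, imag, output, angle, length);`, where the launch parameters are placeholders.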