Complex arithmetic in CUDA

I frequently use complex arithmetic in CUDA and need to define my own implementations of, for example, transcendental functions (sin, cos, exp, …) on complex numbers. I would now like to solve the problem once and for all by writing a library of such functions, so that it covers a “large” number of cases and I am not held up every time a new implementation is needed.

I’m aware that for some of them, writing the code could be non-trivial in terms of the trade-off between accuracy and computational complexity. Nevertheless, I’m also aware that C++ has its own complex math library.

Is there any way to “recycle” already existing C++ solutions for CUDA purposes?

Can you be more explicit as to why the native functions don’t work for you? Is this for float? double? Even higher than double precision? Would something like this work?
http://mplapack.sourceforge.net/

Mathematical device functions like sin, cos, sinh, atan, log, exp etc. are defined on double-precision arguments, if I understand the NVIDIA CUDA Library documentation correctly:

http://developer.download.nvidia.com/compute/cuda/4_2/rel/toolkit/docs/online/group__CUDA__MATH__DOUBLE_g3ebbca20a2937d1fe51329402880df85.html#g3ebbca20a2937d1fe51329402880df85

I’m searching for implementations thereof on complex numbers.

Okay, I get it now. Here’s my suggestion: why not just keep parallel arrays of the real and imaginary parts? I suppose if this is production code with a spec that calls for some sort of robust complex operations to make it ‘user/coder friendly’ as part of a deliverable, then it makes sense. Just throwing some ideas out there.

See: https://devtalk.nvidia.com/default/topic/414672/complex-library-in-cuda/ and http://en.wikipedia.org/wiki/Euler%27s_formula#Relationship_to_trigonometry

For example, here is a complex product function prototype keeping real and imaginary parts separate:

__device__ void ComplexProd(const double *a_R, const double *a_I,
                            const double *b_R, const double *b_I,
                            double *c_R, double *c_I)
/* Calculates c = a*b. Temporaries keep the result correct even if c aliases a or b. */
{
    double re = a_R[0]*b_R[0] - a_I[0]*b_I[0];
    double im = a_R[0]*b_I[0] + a_I[0]*b_R[0];
    *c_R = re;
    *c_I = im;
}
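
To show how it might be used, here is a minimal hypothetical kernel (names and launch details are just placeholders) that applies it element-wise over parallel real/imaginary arrays:

/* Hypothetical usage sketch: element-wise complex products over parallel
   real/imaginary arrays, one element per thread. */
__global__ void ComplexProdKernel(const double *a_R, const double *a_I,
                                  const double *b_R, const double *b_I,
                                  double *c_R, double *c_I, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        ComplexProd(&a_R[i], &a_I[i], &b_R[i], &b_I[i], &c_R[i], &c_I[i]);
}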

Thanks for your suggestions. I already have my own implementation of a complex wrapper class as well as the related overloaded operators. Let me try to explain myself better.

C/C++ has its own implementations of sin, cos, exp etc. on real numbers. They are optimized for accuracy/efficiency according to various numerical schemes (polynomial expansions, etc.).

C/C++ also has complex implementations of those functions. For many of them (e.g., the trigonometric functions) the implementations seem rather simple: they are combinations of mathematical functions on real arguments. If, for example, the source files of those functions on complex arguments were available, then one could port them to CUDA as easily as adding a __device__ keyword in front. Of course, in this simple scenario invoking sin, cos, exp etc. would mean invoking the native CUDA mathematical functions.
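
Just to make the idea concrete, here is a minimal sketch of that scenario (the struct and function names are placeholders, not from any existing library): a complex sine assembled from CUDA’s real-valued device functions via sin(x + iy) = sin(x)cosh(y) + i cos(x)sinh(y).

/* Minimal sketch with placeholder names: complex sine built from CUDA's
   real-valued device functions using
   sin(x + iy) = sin(x)*cosh(y) + i*cos(x)*sinh(y). */
struct my_complex { double re, im; };

__device__ my_complex my_csin(my_complex z)
{
    my_complex w;
    w.re = sin(z.re) * cosh(z.im);
    w.im = cos(z.re) * sinh(z.im);
    return w;
}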

Here is a bit more insight, now that you have clarified the question:
http://stackoverflow.com/questions/2284860/how-does-c-compute-sin-and-other-math-functions

From the link above I found the fdlibm sources:
http://gnuwin32.sourceforge.net/packages/fdlibm.htm

I should mention that if you choose to go that route (building a library from scratch in CUDA), it is no small feat. There might be other libraries that are cleaner, less bloated, and better suited for porting to CUDA; that is just the first one I found that had source code available.

My solution to this would be: since pretty much any of the trig functions’ real/imaginary parts can be derived from complex exponentials (exp, expf) or other functions (sin, cos), just use the provided CUDA functions, and use your complex wrapper to calculate the real and imaginary parts separately. I don’t think you’d want to re-invent the wheel when CUDA already provides device functions that are tested and work. Just adapt your wrapper class to calculate the parts independently and return a CComplex struct/output or however you do it.
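
As a concrete sketch of what I mean (assuming a plain CComplex struct of two doubles, so adapt it to whatever your wrapper actually uses), the complex exponential follows directly from Euler’s formula, exp(x + iy) = e^x (cos y + i sin y):

/* Sketch only, assuming a simple user-defined CComplex struct (placeholder
   name): complex exp from CUDA's real-valued exp/cos/sin via Euler's
   formula, exp(x + iy) = e^x * (cos(y) + i*sin(y)). */
struct CComplex { double re, im; };

__device__ CComplex complex_exp(CComplex z)
{
    double r = exp(z.re);      /* real-valued device exp */
    CComplex w;
    w.re = r * cos(z.im);
    w.im = r * sin(z.im);
    return w;
}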

It’s entirely possible that you could optimize individual functions beyond the method I have just described, building them up ‘from scratch’ from some library’s sources… but I really think that would not be the most productive thing to do.

Hopefully the fdlibm library sources give you an idea of how to use the native CUDA sin, cos, exp, etc. to build the functions in your template that work on complex numbers… that would be the ideal way to do it, I think.

I am not aware of a package of complex math functions for CUDA C. I believe CUDA Fortran provides the customary complex math in device code, but my experience with CUDA Fortran is limited and does not extend to complex arithmetic. If CUDA Fortran is an option, you may want to investigate that.

As far as support for complex math functions in CUDA C is concerned, consider filing an enhancement request through the bug reporting form linked from the registered developer website. If you decide to do so, please prefix the synopsis (i.e. subject line) with “RFE:” to mark this as a “request for enhancement” rather than a bug.

Someone bumped up this thread a while ago:
https://devtalk.nvidia.com/default/topic/517156/a-new-implementation-of-c-std-complex-for-complex-numbers-on-cuda/

See:
https://github.com/jtravs/cuda_complex

It’s not being maintained, but would be a good starting point.
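
If the header does what its description says (a std::complex work-alike usable in device code), usage should presumably look something like the sketch below; I have not checked the exact names against the repository, so treat this purely as an assumption:

/* Assumed usage only -- not verified against the repository; the class and
   function names may differ from what cuda_complex.hpp actually provides. */
#include "cuda_complex.hpp"

__global__ void demo(complex<double> *out)
{
    complex<double> z(1.0, 2.0);
    out[0] = exp(z) * sin(z);   /* complex overloads, assuming they exist */
}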

First of all, let me thank vacaloca for all his efforts.

Concerning the link

https://devtalk.nvidia.com/default/topic/517156/a-new-implementation-of-c-std-complex-for-complex-numbers-on-cuda/

let me say that I have already implemented something similar.

Regarding

http://stackoverflow.com/questions/2284860/how-does-c-compute-sin-and-other-math-functions

my opinion is the same as yours: exploit as much as possible what CUDA already provides in terms of mathematical functions and develop only what is needed, especially for complex arguments.

I have taken a look at

http://gnuwin32.sourceforge.net/packages/fdlibm.htm

and I think it would be a very good starting point for developing new mathematical functions in CUDA, should NVIDIA not make them available in the future.

So, finally, thank you very much again.

Let me also thank njuffa for his kind answer, as usual. I have filed a “request for enhancement” as suggested.

On the Stack Overflow question

http://stackoverflow.com/questions/17405715/complex-arithmetics-in-cuda/17495149?noredirect=1#17495149

someone directed me to the link

https://github.com/jtravs/cuda_complex/blob/master/cuda_complex.hpp

which is almost exactly what I was searching for.