Possible bug in cuComplex.h: Is typedef of cuFloatComplex correct?

The include file cuComplex.h contains the statement “typedef complex cuFloatComplex”. A bare “complex” with no base type (which gcc accepts as an extension to C99) defaults to double precision. Hence, shouldn’t this statement really be “typedef float complex cuFloatComplex”?

Comments anyone?


I believe with devices that don’t support double precision it is automatically truncated to single precision, but don’t quote me on it.

Hi clamport. You might be right about that, but double precision support is here to stay. So, if there is a problem it needs to be fixed!



The typedef is in a section guarded by

#if (!defined(__CUDACC__) && defined(CU_USE_NATIVE_COMPLEX))

Right now, it is dead code.
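
To make the effect of that guard concrete, here is a minimal sketch. The helper name uses_native_complex is hypothetical and is not part of cuComplex.h; it only reports which side of the same preprocessor condition a translation unit ends up on.

```c
/* Hypothetical helper, not part of cuComplex.h: returns 1 when the
   native-complex branch of the guard is active, 0 otherwise. */
static int uses_native_complex(void)
{
#if (!defined(__CUDACC__) && defined(CU_USE_NATIVE_COMPLEX))
    return 1;  /* host compiler invoked with -DCU_USE_NATIVE_COMPLEX */
#else
    return 0;  /* compiled by nvcc, or the macro was never defined */
#endif
}
```

Under these assumptions, a plain gcc build yields 0, gcc with -DCU_USE_NATIVE_COMPLEX yields 1, and an nvcc build yields 0 regardless of the macro, since nvcc defines __CUDACC__.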

Hi mfatica, thanks for the reply. Your thoughts on the following will be appreciated:

We have an algorithm using complex arithmetic that we want to run on the GPU. As a first step, we implement it using C99 native complex types, compile it with gcc, and test it on the host CPU. The second step is to convert the “float complex” declarations to “cuFloatComplex” and the arithmetic operations themselves to the inline functions from the cuComplex.h header (e.g. replace “z = a + b” with “z = cuCaddf(a, b)”, etc.), then recompile with gcc using “-DCU_USE_NATIVE_COMPLEX”. In principle, this should generate identical CPU machine code. If we’ve done everything right, it will run again on the host and give the same answer it did before we made this transformation. The third and final step, obviously, is to recompile the code unchanged with nvcc and run it on the GPU.
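
A rough sketch of steps one and two, for illustration only. The names my_float_complex and my_caddf are hypothetical stand-ins for cuFloatComplex and cuCaddf, so that the example compiles with a host C99 compiler alone and does not depend on the contents of cuComplex.h.

```c
#include <complex.h>

/* Hypothetical stand-in for cuFloatComplex on the native-complex path. */
typedef float complex my_float_complex;

/* Hypothetical stand-in for cuCaddf: a thin inline wrapper around '+'. */
static inline my_float_complex my_caddf(my_float_complex a, my_float_complex b)
{
    return a + b;
}

/* Step 1: the algorithm written with native C99 operators. */
static float complex sum_native(float complex a, float complex b)
{
    return a + b;
}

/* Step 2: the same operation rewritten to go through the wrapper. */
static my_float_complex sum_wrapped(my_float_complex a, my_float_complex b)
{
    return my_caddf(a, b);
}
```

If the typedef really is single precision, both versions should return the same value for the same inputs, which is the check step two relies on.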

The second step didn’t work. It failed because cuComplex.h contained this line

typedef complex cuFloatComplex;

when it should have had

typedef float complex cuFloatComplex;
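
A small sketch of why the two typedefs are not interchangeable. It assumes, as gcc does, that a bare “complex” means “double complex”, and spells that out explicitly so the example stays standard C99. The typedef and function names are hypothetical.

```c
#include <complex.h>
#include <stddef.h>

/* Assumption: this is what gcc makes of a bare 'complex'. */
typedef double complex bare_complex_equivalent;

/* The single-precision type the name cuFloatComplex suggests. */
typedef float complex intended_float_complex;

/* Storage size of each candidate type, in bytes. */
static size_t size_of_bare(void)     { return sizeof(bare_complex_equivalent); }
static size_t size_of_intended(void) { return sizeof(intended_float_complex); }
```

C99 lays out a complex value as two elements of its real type, so the first is the size of two doubles and the second the size of two floats. Code that expects a pair of floats would therefore see a different layout and precision.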

Now, you can argue that the header is only supposed to work when compiling with NVCC, but then why include the conditional for CU_USE_NATIVE_COMPLEX? If you always have NVCC defined, the preprocessor conditionals make CU_USE_NATIVE_COMPLEX itself dead code.

It seems to me that the header was written to support exactly this style of algorithm development: implement and test your algorithm first on the host using native complex types, then switch to the GPU when you get it working.