The include file cuComplex.h contains a statement: “typedef complex cuFloatComplex”. The C99 complex type defaults to double precision. Hence, shouldn’t this statement really be: “typedef float complex cuFloatComplex”?
Comments anyone?
MMB
I believe with devices that don’t support double precision it is automatically truncated to single precision, but don’t quote me on it.
~clamport
Hi clamport. You might be right about that, but double precision support is here to stay. So, if there is a problem it needs to be fixed!
MMB
bump.
The typedef is in a section guarded by
#if (!defined(__CUDACC__) && defined(CU_USE_NATIVE_COMPLEX))
Right now, it is dead code.
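As a rough illustration of how that guard behaves, here is a minimal sketch that reuses the same condition. The macro NATIVE_BRANCH_ENABLED and the function native_branch_enabled() are made-up names for this example and do not appear in cuComplex.h.

```c
/* Same condition as the guard quoted above. NATIVE_BRANCH_ENABLED is a
   hypothetical helper macro used only in this sketch. */
#if (!defined(__CUDACC__) && defined(CU_USE_NATIVE_COMPLEX))
#define NATIVE_BRANCH_ENABLED 1
#else
#define NATIVE_BRANCH_ENABLED 0
#endif

/* Returns 1 only when the native-complex section would be compiled in. */
int native_branch_enabled(void)
{
    return NATIVE_BRANCH_ENABLED;
}
```

Built with a host compiler and no extra flags, the function returns 0. It returns 1 only when a host compiler is given -DCU_USE_NATIVE_COMPLEX. When nvcc compiles a .cu file it defines __CUDACC__, so the branch stays disabled there.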
Hi mfatica, thanks for the reply. Your thoughts on the following will be appreciated:
We have an algorithm using complex arithmetic that we want to run on
the GPU. As a first step, we implement it using C99 native complex
types, compile it using gcc and test it on the host CPU. The second
step is to convert the “float complex” declarations to
“cuFloatComplex”, and the arithmetic operations themselves to use the
inline functions from the cuComplex.h header (e.g. replace “z = a + b”
with “z = cuCaddf(a, b)”, etc.), then recompile with gcc using
“-DCU_USE_NATIVE_COMPLEX”. In principle, this should generate
identical CPU machine code. If we’ve done everything right, it will
run again on the host and get the same answer it did before we made
this transformation. The third and final step, obviously, is to
recompile the code unchanged with nvcc and run it on the GPU.
The second step didn’t work. It failed because
cuComplex.h contained this line
typedef complex cuFloatComplex;
when it should have had
typedef float complex cuFloatComplex;
Now, you can argue that the header is only supposed to work when
compiling with NVCC, but then why include the conditional for
CU_USE_NATIVE_COMPLEX? If you always have NVCC defined, the
preprocessor conditionals make CU_USE_NATIVE_COMPLEX itself dead code.
It seems to me that the way the header was written was to support
exactly this style of algorithm development: implement and test your
algorithm first on the host using native complex types, then switch to
the GPU when you get it working.
MMB
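For readers following along, the first two steps described above can be sketched on the host roughly as follows. This is only an illustration: host_float_complex, host_caddf and host_cmulf are hypothetical stand-ins for what the native-complex branch of the header would provide with the corrected typedef, not the actual contents of cuComplex.h. For background, GCC accepts a bare “complex” as an extension and treats it as “double complex”.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical stand-in for the corrected typedef discussed above. */
typedef float complex host_float_complex;

/* Hypothetical wrappers in the style of cuCaddf / cuCmulf, written here
   on top of the native C99 operators. */
static inline host_float_complex host_caddf(host_float_complex a,
                                            host_float_complex b)
{
    return a + b;
}

static inline host_float_complex host_cmulf(host_float_complex a,
                                            host_float_complex b)
{
    return a * b;
}

/* Step 1: the algorithm written with native C99 complex arithmetic. */
float complex step1_native(float complex a, float complex b, float complex c)
{
    return a * b + c;
}

/* Step 2: the same algorithm, rewritten to go through the wrappers. */
host_float_complex step2_wrapped(host_float_complex a, host_float_complex b,
                                 host_float_complex c)
{
    return host_caddf(host_cmulf(a, b), c);
}

/* A single-precision complex must occupy exactly two floats. A typedef that
   silently resolved to double complex would fail the second check. */
void check_layout(void)
{
    assert(sizeof(host_float_complex) == 2 * sizeof(float));
    assert(sizeof(host_float_complex) != sizeof(double complex));
}
```

Calling step1_native and step2_wrapped with the same inputs should give the same result, which is the property the second step of the workflow relies on. check_layout is the kind of sanity check that would have caught the typedef problem early.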
Bump
bump