using cuComplex in device and host code

This is a split off of the other thread I had going, where the topic diverged somewhat:

If I am going to have a kernel that uses complex values, I found a thread here that talked about using cuComplex as the datatype for that array. That seems straightforward enough. I am just curious whether there is a way to avoid keeping two copies of the data: one for my original complex data, and the other for my cuComplex array, which is a copy of the complex data. I would also like to know whether there is a way to set a complex equal to a cuComplex… I haven’t found anything in the header file for that yet.


When using a cuComplex, it seems like I will have to allocate it on the host like a normal array of structures, is that correct?

cuComplex *h_test = (cuComplex *)malloc(N*sizeof(cuComplex));

Is this going to be acceptable when I later want to move that array to the GPU with a cudaMemcpy? For example:

cudaMemcpy(d_test, h_test, N*sizeof(cuComplex), cudaMemcpyHostToDevice);

Thanks for the insight. I am sure I am not the only one wondering such things.

The complex types defined by cuComplex.h store complex data such that the low word (.x) contains the real portion, and the high word (.y) contains the imaginary portion. This layout is compatible with the C99 complex types. I do not have a Fortran standard in front of me, so I cannot tell for sure whether this layout is also guaranteed by Fortran, but in my practical experience (e.g. Fortran wrappers for CUBLAS), it matches.

As long as your software’s layout of complex data matches that used by cuComplex.h, no additional copying of the complex data is needed; simply pass the appropriate pointers. […] Thinking about whether differing alignment requirements might cause issues, it seems to me that as long as the cuComplex types are used on the device side and your software’s complex types are used on the host side, you should be fine. You’ll need to copy at least once from host to device anyhow, and as long as the device-side target of that copy is properly aligned, it will work.

Oh ok - I guess I didn’t realize that std::complex used the same format. It is very handy that I can just do the host-to-device copy of my existing complex data into a device cuComplex array and it will work. I guess I will still have to write my kernel code to use the operator functions, since it doesn’t appear that cuComplex has the operators +, *, etc. overloaded.

Note that I stated that cuComplex.h defines complex types whose layout matches C99’s complex types. The std::complex types are C++ types. I am looking at the ISO C++ standard right now, section 26.2 “Complex numbers”, and it seems that the layout matches (low word contains the real part, high word contains the imaginary part), but I am struggling a bit with the specification, so maybe a C++ language lawyer can chime in and confirm.
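For what it is worth, C++11 and later state explicitly that a std::complex<T> can be accessed as an array of two T with the real part first. A minimal host-only sketch of checking that assumption at compile time follows. Float2Like and copyComplexBytes are hypothetical names introduced here; Float2Like stands in for cuComplex so the sketch builds without the CUDA toolkit.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for cuComplex (.x = real, .y = imaginary),
// used so this sketch builds without the CUDA headers.
struct Float2Like {
    float x;  // real part
    float y;  // imaginary part
};

// Both types must occupy exactly two floats for a byte-wise copy to be valid.
static_assert(sizeof(std::complex<float>) == 2 * sizeof(float),
              "unexpected std::complex<float> size");
static_assert(sizeof(Float2Like) == 2 * sizeof(float),
              "unexpected padding in Float2Like");

// Byte-wise copy of n complex values, the same kind of transfer
// a host-to-device cudaMemcpy would perform.
void copyComplexBytes(Float2Like* dst, const std::complex<float>* src,
                      std::size_t n) {
    if (n == 0) return;
    assert(dst != nullptr && src != nullptr);
    std::memcpy(dst, src, n * sizeof(Float2Like));
}
```

In a real program the same static_assert could be written against cuComplex itself, and the memcpy would be replaced by the cudaMemcpy call shown earlier in the thread.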

cuComplex.h was created very early in the life of CUDA, when no C++ features were in the picture; in fact, it follows C99’s complex support. Thus there is no overloading, and only the minimum set of operations needed for CUBLAS and CUFFT was defined.
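If operator syntax is wanted anyway, thin overloads can be layered on top by the user. A minimal sketch follows, again using a hypothetical stand-in struct Float2Like instead of cuComplex so that it builds without the CUDA toolkit. In real device code these functions would need the __host__ __device__ qualifiers and could simply forward to cuCaddf and cuCmulf from cuComplex.h.

```cpp
#include <cassert>

// Hypothetical stand-in for cuComplex (.x = real, .y = imaginary).
struct Float2Like {
    float x;  // real part
    float y;  // imaginary part
};

// Complex addition: component-wise sum (the arithmetic cuCaddf performs).
inline Float2Like operator+(Float2Like a, Float2Like b) {
    return {a.x + b.x, a.y + b.y};
}

// Complex multiplication: (a.x + i a.y) * (b.x + i b.y)
// (the arithmetic cuCmulf performs).
inline Float2Like operator*(Float2Like a, Float2Like b) {
    return {a.x * b.x - a.y * b.y,
            a.x * b.y + a.y * b.x};
}
```

One design caveat: cuComplex is a typedef of float2, so overloads written against the real type would apply to every float2 in scope, not just to values meant as complex numbers.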

So, interestingly enough, I went through some of my functions that previously used std::complex and replaced it with cuComplex. Now, when I try to call those methods with a cuComplex buffer that I have malloc’d, I get something like this:

…/common/Filter.cpp:73: error: no matching function for call to ‘ReaderIF::getData(float2&, int&)’
…/readers/ReaderIF.h:63: note: candidates are: virtual bool ReaderIF::getData(cuComplex*, offset)
…/readers/ReaderIF.h:65: note: virtual bool ReaderIF::getData(std::complex, offset)
…/readers/ReaderIF.h:82: note: virtual bool ReaderIF::getData(float*, offset)

Why am I getting this?

Here is how I called the getData function:

cuComplex *h_hhBuff = (cuComplex *)malloc(memsize);
for (int r = 0; r < rows; r++)
    hhReader->getData(h_hhBuff[r*cols], r);

Since I am clearly casting the malloc result to cuComplex *, shouldn’t I be able to call that first candidate?

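You seem to be indexing into h_hhBuff, which yields a single cuComplex element (hence the float2& in the error message) rather than a pointer, so you are not passing a pointer to getData()?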

I’m not sure what you intend to do, but you could try hhReader->getData(&h_hhBuff[r*cols], r);

to see whether it compiles as expected (notice the & in front of h_hhBuff[r*cols])
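To make the difference concrete, here is a small host-only sketch of the row-by-row fill. RowReader, readAll and Float2Like are hypothetical stand-ins for ReaderIF, the calling loop and cuComplex, and the values written are dummy data; only the &buf[r * cols] expression mirrors the actual fix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cuComplex (.x = real, .y = imaginary).
struct Float2Like {
    float x;
    float y;
};

// Hypothetical reader: fills one row of `cols` values starting at `dst`.
struct RowReader {
    std::size_t cols;

    bool getData(Float2Like* dst, std::size_t row) const {
        assert(dst != nullptr);
        for (std::size_t c = 0; c < cols; ++c) {
            dst[c].x = static_cast<float>(row);  // dummy real part: row index
            dst[c].y = static_cast<float>(c);    // dummy imaginary part: column index
        }
        return true;
    }
};

// Reads every row into one contiguous buffer of rows * cols elements.
std::vector<Float2Like> readAll(const RowReader& reader, std::size_t rows) {
    std::vector<Float2Like> buf(rows * reader.cols);
    if (buf.empty()) return buf;
    for (std::size_t r = 0; r < rows; ++r) {
        // buf[r * cols] alone is a single element (a reference);
        // taking its address yields the pointer getData expects.
        reader.getData(&buf[r * reader.cols], r);
    }
    return buf;
}
```

Writing buf.data() + r * reader.cols would be an equivalent way to form the same pointer.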

Oh, you are right. I did fix this, and forgot to post that I had solved it in this thread! Thanks!