I am trying to learn PyCUDA, so this probably has a simple answer, but I need to ask.
Say I have a device pointer to an array of type cuFloatComplex, and I would like to copy it from device to host so that the results can be printed to standard output. What is the PyCUDA way to do this? I assume I would need to create a NumPy array on the host to store the results, but given that cuFloatComplex is not a Python type, how can this be done?
The memory layout of CUDA's complex types matches what is specified for C, C++, and Fortran: the real part of each number is followed immediately by its imaginary part. So when interfacing with those languages, simple memcpy()-like operations will work for copying data, and zero-copy interfaces can simply be passed a pointer.
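On the Python side, NumPy's complex64 uses the same interleaved layout (two 32-bit floats, real then imaginary), so it should be a suitable host type for cuFloatComplex. Here is a minimal sketch, not a definitive implementation. The helper names `interleaved_to_complex` and `copy_complex_to_host` are made up for illustration. The second one assumes PyCUDA is installed, a CUDA context is active, and `dev_ptr` points at `n` cuFloatComplex values. The layout check at the bottom runs on the host only.

```python
import numpy as np


def interleaved_to_complex(raw):
    """View a flat float32 buffer [re0, im0, re1, im1, ...] as complex64.

    Hypothetical helper: it only illustrates that the two layouts agree.
    """
    raw = np.ascontiguousarray(raw, dtype=np.float32)
    return raw.view(np.complex64)


def copy_complex_to_host(dev_ptr, n):
    """Copy n cuFloatComplex values from the device into a complex64 array.

    Hypothetical helper. Assumes an active CUDA context (for example via
    `import pycuda.autoinit`) and that dev_ptr refers to n elements.
    """
    import pycuda.driver as cuda  # imported lazily so the rest runs without a GPU

    host = np.empty(n, dtype=np.complex64)  # 8 bytes per element, like cuFloatComplex
    cuda.memcpy_dtoh(host, dev_ptr)         # plain byte-for-byte copy, device to host
    return host


# Host-only layout check: two interleaved (real, imag) pairs.
raw = np.array([1.0, 2.0, 3.0, -4.0], dtype=np.float32)
z = interleaved_to_complex(raw)
print(z)
```

Once the data is in a complex64 array it can be printed or processed like any other NumPy array. For cuDoubleComplex the same idea should apply with complex128.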