cudaExtent extent = make_cudaExtent(W*sizeof(float), H, 9);
cudaPitchedPtr covOut;
cudaMalloc3D(&covOut, extent); // allocate the space on the device
Here W, H, and 9 are the dimensions of my 3D array on the host.
Can I access the elements of the 3D array on the device with bracketed notation like
covOut[i][j][k]
Or do I have to use pointer arithmetic like
covOut[i*W*H+j*W+k] to access it as if it were a one-dimensional array? If I have to access it like this, I thought I read somewhere that CUDA arrays are stored in column-major order, so would that change the access to this:
Malloc2D and Malloc3D are only for allocating arrays for textures; you can’t use them to allocate general-purpose global memory. Just use 3D indexing into linear memory, with something like your second indexing scheme.
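For what it's worth, here is a minimal sketch of that second indexing scheme, written as plain host-side C++ so it can be checked without a GPU. The helper names idx3 and makeVolume are made up for illustration and are not part of any CUDA API. It assumes a row-major layout in which the width index varies fastest, then the height index, then the depth index; the same index expression could be applied inside a kernel to a float* obtained from cudaMalloc.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: row-major flat index into a volume with
// height H and width W. x (width) varies fastest, then y, then z.
inline std::size_t idx3(std::size_t z, std::size_t y, std::size_t x,
                        std::size_t H, std::size_t W) {
    return (z * H + y) * W + x;
}

// Hypothetical helper: build a contiguous D x H x W volume in which each
// element encodes its own coordinates as z*100 + y*10 + x.
std::vector<float> makeVolume(std::size_t D, std::size_t H, std::size_t W) {
    std::vector<float> v(D * H * W);
    for (std::size_t z = 0; z < D; ++z) {
        for (std::size_t y = 0; y < H; ++y) {
            for (std::size_t x = 0; x < W; ++x) {
                const std::size_t i = idx3(z, y, x, H, W);
                assert(i < v.size());  // the index stays inside the buffer
                v[i] = static_cast<float>(z * 100 + y * 10 + x);
            }
        }
    }
    return v;
}
```

Because the buffer is one contiguous block of D*H*W floats, a host array stored this way can be handed to a single copy call as is, with no separate flattening step.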
If this is the case, do I have to flatten my host 3D array into a 1D array before I can do a cudaMemcpy to the device 1D array, or is there a function for that?
Oh, and one other thing (I guess this may be my C/C++ noobishness showing through): my code is currently C++, and I was just using new complex to allocate complex data. How would I do this with cudaMalloc?
What is float2? Have the operators been overloaded to allow for things like complex addition and multiplication, absolute value, etc.?
Also, what is cuComplex? I found some reference to it, and apparently I have this library in my lib directory. Can I use cuComplex for device and host code?