Hello,
I am developing a method that iteratively updates the values of a three-dimensional vector field, and I would like to use the hardware interpolation that textures (backed by cudaArrays) provide.
As far as I understand CUDA, one way to do this is to use a 3D cudaArray allocated with cudaMalloc3DArray together with a block of pitched linear memory allocated with cudaMalloc3D.
In each iteration, I read my values from the cudaArray through a texture, write the updated values to the pitched memory, and finally do a device-to-device copy of the written content back into the cudaArray.
My problem is that I find it unclear how the two allocations are sized and how data is copied between them. I represent a 64x64x64 grid of float4 values like this:
cudaArray *darray = NULL;
const cudaExtent volumeSize = make_cudaExtent(64, 64, 64);
cudaChannelFormatDesc desc = cudaCreateChannelDesc<float4>();
cudaMalloc3DArray(&darray, &desc, volumeSize);
cudaPitchedPtr dPtr;
cudaMalloc3D(&dPtr, volumeSize);
cudaMemset3D(dPtr, 0, volumeSize);
// Make a device to device copy
cudaMemcpy3DParms copyParams = {0};
copyParams.srcPtr = dPtr;
copyParams.dstArray = darray;
copyParams.extent = volumeSize;
copyParams.kind = cudaMemcpyDeviceToDevice;
cudaMemcpy3D(&copyParams);
This last copy fails. I guess it is because the sizes of the two allocations do not match.
The cudaArray is allocated with information about the float4 element size (through the channel descriptor), while the pitched memory is not. Should I use a different extent for the pitched memory, like this maybe?
const cudaExtent volumeSize = make_cudaExtent(sizeof(float4)*64, 64, 64);
And finally, I don't understand the pitch of the cudaPitchedPtr returned for a 3D allocation. How am I supposed to use it for correct indexing?
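For reference, the addressing scheme that the CUDA programming guide describes for pitched 3D memory can be sketched on the host roughly as follows. This is only a sketch under stated assumptions, not working device code: Float4, PitchedVolume, paddedPitch and elementAt are hypothetical names invented for this illustration, PitchedVolume merely mirrors the ptr, pitch, xsize and ysize fields of cudaPitchedPtr, and the alignment passed to paddedPitch is made up (the real pitch is whatever cudaMalloc3D returns). It also assumes, as suspected above, that the row width given to cudaMalloc3D is measured in bytes, so a row of 64 float4 values is 64 * sizeof(float4) bytes wide.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for CUDA's float4 so the sketch compiles without CUDA headers.
struct Float4 { float x, y, z, w; };

// Hypothetical mirror of the cudaPitchedPtr fields that matter for indexing.
struct PitchedVolume {
    unsigned char* ptr;  // base address of the allocation
    std::size_t pitch;   // bytes from one row to the next, padding included
    std::size_t xsize;   // logical row width in bytes (no padding)
    std::size_t ysize;   // number of rows per z-slice
};

// Round a row width in bytes up to a multiple of `alignment`.
// The alignment is an assumption; on the device, cudaMalloc3D picks the pitch.
std::size_t paddedPitch(std::size_t widthBytes, std::size_t alignment) {
    assert(alignment > 0);
    return (widthBytes + alignment - 1) / alignment * alignment;
}

// Address of element (x, y, z): step over whole slices and whole rows in
// bytes using the pitch, then over x elements within the row.
Float4* elementAt(const PitchedVolume& v,
                  std::size_t x, std::size_t y, std::size_t z) {
    std::size_t slicePitch = v.pitch * v.ysize;     // bytes per z-slice
    unsigned char* slice = v.ptr + z * slicePitch;  // start of slice z
    unsigned char* row = slice + y * v.pitch;       // start of row y
    return reinterpret_cast<Float4*>(row) + x;      // element x in that row
}
```

The point of the sketch is that y and z are stepped in bytes using the pitch, while x is stepped in elements after the row pointer has been cast to the element type.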