Need more insight into ...Malloc and ...MallocArray

Hello,

I’m creating a wrapper for C# since the CUDA.NET implementation does not fit my needs.
Looking over all the allocation methods, I found that there is some confusion (at least for me) about the difference between

cudaMalloc
cudaMallocPitch
cudaMallocArray
cudaMalloc3D
cudaMalloc3DArray

As I have learned, there is a difference between linear memory and a cudaArray. To allocate linear memory we can call the methods

cudaMalloc
cudaMallocPitch
cudaMalloc3D

and for these types of linear memory (1D, 2D, 3D) we have the methods to set the data

cudaMemset
cudaMemset2D
cudaMemset3D

and the methods to copy the data

cudaMemcpy
cudaMemcpy2D
cudaMemcpy3D.

This is the part I don’t understand. We are still in linear memory, yet for cudaMemcpy3D we suddenly need to work with cudaArrays. Can you explain why?

Furthermore, the documentation says that linear memory allocated with cudaMalloc or cudaMallocPitch must be freed with cudaFree. What about cudaMalloc3D: how do I free the memory there? I assume with cudaFree?

What is the difference between cudaMallocArray and cudaMalloc3DArray?

Can you give me a little more insight into the different methods? The documentation is very confusing to read, and a few more examples would be very helpful.

Thank you very much for your help.
Martin

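Many of these (cudaMallocArray, cudaMalloc3DArray) are used only with textures. The rest (cudaMallocPitch, cudaMalloc3D) automatically pad your rows so that each one starts at an aligned address, which is sometimes good for performance. If you only ever use cudaMalloc(), you’ll be fine.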
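To make the row alignment concrete, here is a rough host-only sketch of the address arithmetic that a pitched allocation implies. It is plain C with no CUDA calls, so it runs anywhere. ROW_ALIGN, padded_pitch and pitched_at are hypothetical names made up for illustration, and the 256-byte alignment is only an assumption: with cudaMallocPitch the runtime chooses the pitch and returns it to you, and you should always use that returned value.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed alignment, for illustration only. The real pitch is chosen by
   the CUDA runtime and returned by cudaMallocPitch. */
#define ROW_ALIGN 256u

/* Round a row width in bytes up to the next multiple of ROW_ALIGN. */
static size_t padded_pitch(size_t width_bytes)
{
    return (width_bytes + ROW_ALIGN - 1) / ROW_ALIGN * ROW_ALIGN;
}

/* Address of element (row, col) in a pitched buffer of floats.
   The row offset is applied in bytes, the column offset in elements. */
static float *pitched_at(void *base, size_t pitch, size_t row, size_t col)
{
    return (float *)((char *)base + row * pitch) + col;
}
```

With 100 floats per row (400 bytes) and the assumed alignment, padded_pitch(400) gives 512, so every row starts 512 bytes after the previous one and the last 112 bytes of each row are unused padding. The same byte-offset-then-cast pattern is what a kernel uses to index memory that came from cudaMallocPitch.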

It’s quite a mess that the CUDA team has made here. It would be nice if things were more orthogonal.

Hello Alex,

Thank you for your advice. In the meantime I have dropped down to the driver API, which seems more intuitive and, at least for my needs, also more appropriate.

Martin

On the same topic: I use cudaMalloc to allocate a 2D matrix with the following loop:

for (int i = 0; i < dimGrid.x; i++)
    CUDA_SAFE_CALL(cudaMalloc((void**)&ActivePix2D[i], dimBlock.x * sizeof(ushort2)));

and then, logically:

for (int i = 0; i < dimGrid.x; i++)
    CUDA_SAFE_CALL(cudaFree(ActivePix2D[i]));

I get an access violation when the last line above (the cudaFree call) executes.

Of course I can simply allocate a single 1D memory block, but I would still like to be clear: does this usage seem correct?

Thanks

It is correct.

Probably you have an out-of-bounds error in the middle of your code that is corrupting the value of ActivePix2D, dimGrid, or the Runtime’s internal state.
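For comparison, the single 1D block mentioned in the question could look roughly like the sketch below. It is host-only C, with calloc standing in for the device allocation. pix2, alloc_flat and flat_index are hypothetical names for illustration (pix2 stands in for ushort2). With the runtime API the block would instead come from one cudaMalloc of rows * cols * sizeof(ushort2) bytes and be released by one cudaFree.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical element type standing in for CUDA's ushort2. */
typedef struct { unsigned short x, y; } pix2;

/* One contiguous, zero-initialized block for rows * cols elements.
   A single free() releases the whole matrix. */
static pix2 *alloc_flat(size_t rows, size_t cols)
{
    return (pix2 *)calloc(rows * cols, sizeof(pix2));
}

/* Row-major offset of element (row, col) in a matrix with cols columns. */
static size_t flat_index(size_t row, size_t cols, size_t col)
{
    return row * cols + col;
}
```

This replaces the table of per-row pointers with one pointer, so there is no pointer array that stray writes could corrupt, and element (row, col) is simply m[flat_index(row, cols, col)].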