I’m creating a wrapper for C# since the CUDA.NET implementation does not fit my needs.
Looking over all the allocation methods, I found some confusion (at least for me) about the differences between them.
As I have learned, there is a difference between linear memory and a cudaArray. To allocate linear memory we can call the methods
cudaMalloc
cudaMallocPitch
cudaMalloc3D
and for these types of linear memory (1D, 2D, 3D) we have the methods to set the data
cudaMemset
cudaMemset2D
cudaMemset3D
and the methods to copy the data
cudaMemcpy
cudaMemcpy2D
cudaMemcpy3D.
This is the part I don’t understand. We are still in linear memory, and suddenly for cudaMemcpy3D we need to work with cudaArrays. Can you explain why?
Furthermore, the documentation says that linear memory allocated with cudaMalloc or cudaMallocPitch must be freed with cudaFree. What about cudaMalloc3D: how do I free the memory there? I assume with cudaFree?
What is the difference between cudaMallocArray and cudaMalloc3DArray?
Can you give me a little more insight into the different methods? The documentation is very confusing to read. A few more examples would be very helpful.
Many of these are used only with textures. The rest automatically align your rows/columns, which is sometimes good for performance. If you only ever use cudaMalloc(), you’ll be fine.
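To make the split a bit more concrete, here is a minimal host-side sketch of the three linear-memory allocators next to the two cudaArray allocators. The sizes and the CHECK macro are made up for illustration, and error handling is kept to the bare minimum. As far as I understand the runtime API, memory from cudaMalloc3D is released by passing the ptr field of the cudaPitchedPtr to cudaFree, and cudaMemcpy3D accepts either pitched pointers or cudaArrays on each side, so plain linear memory works there too.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: print the error and leave main() if a call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            return 1;                                                 \
        }                                                             \
    } while (0)

int main() {
    // Arbitrary example dimensions, counted in float elements (assumption).
    const size_t width = 100, height = 64, depth = 8;
    const size_t rowBytes = width * sizeof(float);

    // 1D linear memory: cudaMalloc / cudaMemset / cudaFree.
    float* d_lin = nullptr;
    CHECK(cudaMalloc((void**)&d_lin, rowBytes));
    CHECK(cudaMemset(d_lin, 0, rowBytes));

    // 2D linear memory: rows are padded to `pitch` bytes by the runtime.
    float* d_2d = nullptr;
    size_t pitch = 0;
    CHECK(cudaMallocPitch((void**)&d_2d, &pitch, rowBytes, height));
    std::vector<float> h_2d(width * height, 1.0f);
    CHECK(cudaMemcpy2D(d_2d, pitch,              // destination and its pitch
                       h_2d.data(), rowBytes,    // source and its pitch
                       rowBytes, height,         // copied width (bytes), rows
                       cudaMemcpyHostToDevice));

    // 3D linear memory: for pitched pointers the extent width is in bytes.
    cudaExtent extent = make_cudaExtent(rowBytes, height, depth);
    cudaPitchedPtr d_3d;
    CHECK(cudaMalloc3D(&d_3d, extent));
    CHECK(cudaMemset3D(d_3d, 0, extent));

    // cudaMemcpy3D between two pitched pointers, no cudaArray involved.
    std::vector<float> h_3d(width * height * depth, 2.0f);
    cudaMemcpy3DParms parms = {};                // unused fields must be zero
    parms.srcPtr = make_cudaPitchedPtr(h_3d.data(), rowBytes, width, height);
    parms.dstPtr = d_3d;
    parms.extent = extent;
    parms.kind   = cudaMemcpyHostToDevice;
    CHECK(cudaMemcpy3D(&parms));

    // cudaArrays: opaque layout, mainly for textures. Sizes are in elements.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t arr2d = nullptr;
    cudaArray_t arr3d = nullptr;
    CHECK(cudaMallocArray(&arr2d, &desc, width, height));
    CHECK(cudaMalloc3DArray(&arr3d, &desc,
                            make_cudaExtent(width, height, depth)));

    // Linear memory is freed with cudaFree, arrays with cudaFreeArray.
    CHECK(cudaFree(d_lin));
    CHECK(cudaFree(d_2d));
    CHECK(cudaFree(d_3d.ptr));
    CHECK(cudaFreeArray(arr2d));
    CHECK(cudaFreeArray(arr3d));

    std::printf("pitch chosen by the runtime: %zu bytes\n", pitch);
    return 0;
}
```

Inside a kernel, the usual way to reach element (row, col) of the pitched 2D buffer is to step by pitch bytes per row, i.e. (float*)((char*)d_2d + row * pitch) + col, rather than assuming rows are width * sizeof(float) apart.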
It’s quite a mess that the CUDA team has made here. It would be nice if things were more orthogonal.
Thank you for your advice. In the meantime I have downgraded my code to the driver API, which seems more intuitive and, at least for my needs, also more appropriate.
Probably you have an out-of-bounds error in the middle of your code that is corrupting the value of ActivePix2D, dimGrid, or the Runtime’s internal state.