Predicting the pitch returned by cudaMallocPitch

I need to estimate the sizes of my 2-D arrays before I allocate them with cudaMallocPitch(), just so that I can make all arrays fit in the memory that is available. If I know the nominal dimensions and data type of an array and the compute capability of my device (2.0 in my case), how can I predict the pitch value that cudaMallocPitch() will use?

I did read the relevant sections in the Programmer’s Guide, and it seemed to say that the pitch would have to be a multiple of the warp size (32), but it wasn’t clear to me.


If NVIDIA wanted you to know that, they would have stated it in the programming manual.
So there's no way you can find all of this out.
You need to change your line of thought.

I remember writing some code snippets where I set the pitch to a multiple of 32, and it worked. This was on CUDA 2.3.

Indexing with a constant, known pitch can sometimes help performance…

How about just using cudaMalloc() and choosing some suitable pitch yourself?

That's usually what I do. Then you know exactly what is happening :)
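For anyone taking the cudaMalloc() route, here is a minimal host-side sketch of picking your own pitch. The helper names paddedPitch and pitchedBytes are made up for illustration, and the alignment is whatever value you decide on yourself; it is not a documented hardware requirement, and a pitch chosen this way is not guaranteed to satisfy texture binding.

```cpp
#include <cstddef>

// Round a row width (in bytes) up to the next multiple of `alignment`.
// `alignment` is a caller-chosen assumption, not a value taken from NVIDIA docs.
constexpr std::size_t paddedPitch(std::size_t widthBytes, std::size_t alignment) {
    return (widthBytes + alignment - 1) / alignment * alignment;
}

// Total number of bytes to request from cudaMalloc() for a 2-D array
// whose rows are padded to that self-chosen pitch.
constexpr std::size_t pitchedBytes(std::size_t widthBytes, std::size_t height,
                                   std::size_t alignment) {
    return paddedPitch(widthBytes, alignment) * height;
}
```

With a row of 1000 floats (4000 bytes) and an assumed alignment of 512 bytes, paddedPitch gives 4096, so 600 rows need 4096 * 600 = 2457600 bytes. Since you picked the pitch, the footprint is known before anything is allocated.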

The reason cudaMallocPitch() returns the pitch (and the reason texture binding API calls return a texture offset) is to future-proof code, should hardware requirements change in the future. This is not just a theoretical concern. For example, texture alignment requirements increased between sm_1x and sm_2x which broke any code that made assumptions about the required alignment and therefore did not examine the texture offset returned by the binding APIs. Since one of the main uses of cudaMallocPitch() is to provide underlying storage for 2D textures, I would strongly recommend not trying to guess the required pitch.
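In practice that means treating the pitch as an opaque byte count and using it for all row addressing. Below is a rough host-side sketch of that arithmetic; pitchedElement is a hypothetical helper name, and pitchBytes stands in for whatever value cudaMallocPitch() wrote into its pitch argument.

```cpp
#include <cstddef>

// Address of element (row, col) in a pitched 2-D allocation.
// `pitchBytes` should be the value reported by cudaMallocPitch(), never a guess.
// Rows are stepped in bytes; columns are stepped in elements of type T.
template <typename T>
T* pitchedElement(void* base, std::size_t pitchBytes,
                  std::size_t row, std::size_t col) {
    unsigned char* rowStart = static_cast<unsigned char*>(base) + row * pitchBytes;
    return reinterpret_cast<T*>(rowStart) + col;
}
```

The same expression works unchanged inside a kernel. The memory actually consumed by the array is the returned pitch times the number of rows, so for budgeting purposes one option is to allocate first and then call cudaMemGetInfo() to see how much free memory remains.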