Significance of Pitch for Allocation of 2D Arrays

I am writing a program to create a 2D array on the device. The concept of pitch vis-a-vis width is not clear to me.

I read somewhere that…

Why should a row be padded?

Why should the pitch be different from width?

What is this pitch after all?

And why should this pitch be passed to the cudaMemcpy2D() function?

Could somebody please explain the underlying idea in detail with an example or two??

Thanks

The pitch is not necessarily different from the actual width. Arrays are padded so that memory accesses are more efficient.
If you look at the coalescing rules, you see that neighbouring threads should access neighbouring elements, but the base address (the address accessed by thread 0) should also be aligned to a fixed number of bytes.
If you do not pad the array in the X dimension, then it is possible that, for example, accesses starting at the first element of the second row cannot be coalesced, because that row does not start at an aligned address.
When you allocate the array with a pitched allocation call such as cudaMallocPitch(), the driver selects a suitable padding for you and returns the pitch (the number of bytes in each row, padding included), which you need to use to access the elements in rows other than row 0.
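
To make the arithmetic concrete, here is a minimal host-only C++ sketch of the idea; it is not CUDA code and does not touch the GPU. It assumes a 256-byte row alignment purely for illustration (the real pitch is whatever cudaMallocPitch() returns on your device), and the helper names round_up_pitch, byte_offset, copy_2d and round_trip are hypothetical. With that assumption, a row of 100 floats is 400 bytes wide but gets a pitch of 512 bytes, so row 1 starts at byte 512 rather than at byte 400, which is not a multiple of 256. The copy_2d helper mirrors the argument order of cudaMemcpy2D() (dst, dpitch, src, spitch, width in bytes, height) to show why a 2D copy needs both pitches: the source and destination rows can start at different strides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: round a row width (in bytes) up to a multiple of
// `alignment`. The alignment is an assumption; a real driver picks its own.
std::size_t round_up_pitch(std::size_t width_bytes, std::size_t alignment) {
    return (width_bytes + alignment - 1) / alignment * alignment;
}

// Hypothetical helper: byte offset of float element (row, col) in a pitched
// buffer. Rows are stepped by the pitch (a byte count), not by the width.
std::size_t byte_offset(std::size_t pitch, std::size_t row, std::size_t col) {
    return row * pitch + col * sizeof(float);
}

// Hypothetical host-side analogue of a 2D copy: copies `height` rows of
// `width_bytes` bytes each. Source and destination may have different pitches.
void copy_2d(void* dst, std::size_t dpitch, const void* src, std::size_t spitch,
             std::size_t width_bytes, std::size_t height) {
    assert(width_bytes <= dpitch && width_bytes <= spitch);
    auto* d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);
    for (std::size_t row = 0; row < height; ++row) {
        std::memcpy(d + row * dpitch, s + row * spitch, width_bytes);
    }
}

// Usage sketch: copy a tightly packed width x height float array into a
// padded buffer and back, as a host-to-device then device-to-host copy would.
// Assumes packed.size() == width * height.
std::vector<float> round_trip(const std::vector<float>& packed,
                              std::size_t width, std::size_t height) {
    const std::size_t width_bytes = width * sizeof(float);
    const std::size_t pitch = round_up_pitch(width_bytes, 256);  // assumed alignment
    std::vector<unsigned char> padded(pitch * height);  // stands in for device memory
    // Packed source: its pitch equals its width in bytes.
    copy_2d(padded.data(), pitch, packed.data(), width_bytes, width_bytes, height);

    std::vector<float> out(width * height);
    copy_2d(out.data(), width_bytes, padded.data(), pitch, width_bytes, height);
    return out;
}
```

In real CUDA code the same pattern would apply with the pitch returned by the allocation call used in place of the assumed 256-byte rounding, and with the row address inside a kernel computed from the byte pitch in the same way as byte_offset does here.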

N.

What do you mean by coalescing two rows? And where can I find the coalescing rules?

Read Chapter 5 of the programming guide.