I’m using CUDA 6.5 for image processing on a GTX 780 and a GTX 750. I noticed some problems with my indexing caused by cudaMallocPitch: it seems that each row is padded to a multiple of 512 bytes.
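For context, here is a minimal host-side sketch of the row addressing that the CUDA documentation describes for memory from cudaMallocPitch: the pitch is a stride in bytes, so the row offset is applied to a byte pointer before casting to the element type. The names `pitched_element` and `demo_float_offset` are made up for illustration, and a plain `std::vector<char>` stands in for the device allocation, so this only shows the arithmetic and is not real device code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Address of element (row, col) in a pitched 2D buffer.
// `pitch` is the row stride in BYTES (what cudaMallocPitch reports),
// so the row offset is applied on a char pointer before casting to T*.
template <typename T>
T* pitched_element(void* base, std::size_t pitch, std::size_t row, std::size_t col) {
    char* row_start = static_cast<char*>(base) + row * pitch;
    return reinterpret_cast<T*>(row_start) + col;
}

// Hypothetical host-side demo: a byte vector stands in for the device
// allocation. Returns the byte offset of float element (row, col).
inline std::size_t demo_float_offset(std::size_t pitch, std::size_t row, std::size_t col) {
    assert(pitch % sizeof(float) == 0);          // keeps every row float-aligned
    assert((col + 1) * sizeof(float) <= pitch);  // element lies inside the row
    std::vector<char> buffer(pitch * (row + 1), 0);
    float* p = pitched_element<float>(buffer.data(), pitch, row, col);
    return static_cast<std::size_t>(reinterpret_cast<char*>(p) - buffer.data());
}
```

With this pattern the index never depends on how many padding elements a row has, only on the pitch value returned for that particular allocation.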
I understand the advantage of row alignment, but I do not understand why 512 bytes are used. This seems excessive, and 2D arrays with different data types end up with different numbers of padding elements. My code would work with 128-byte alignment.
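To make the padding difference concrete, here is a small sketch of the round-up arithmetic. It assumes the pitch is simply the row width in bytes rounded up to the next multiple of the alignment; the 512-byte value is only what is observed above, not something the API guarantees, and the helper names `padded_pitch` and `padding_elements` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Round a row width in bytes up to the next multiple of `alignment`.
inline std::size_t padded_pitch(std::size_t width_bytes, std::size_t alignment) {
    assert(alignment != 0);
    return (width_bytes + alignment - 1) / alignment * alignment;
}

// Number of padding ELEMENTS of size `elem_size` appended to each row.
inline std::size_t padding_elements(std::size_t width_elems,
                                    std::size_t elem_size,
                                    std::size_t alignment) {
    assert(elem_size != 0);
    const std::size_t width_bytes = width_elems * elem_size;
    return (padded_pitch(width_bytes, alignment) - width_bytes) / elem_size;
}
```

For a hypothetical image 300 elements wide, 512-byte alignment gives 212 padding elements for 1-byte pixels and 84 for 4-byte floats, while 128-byte alignment would give 84 and 20. The counts differ per data type in both cases, which is why indexing through the byte pitch is more robust than counting padding elements.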