cudaBindTexture2D documentation incomplete? Requires cudaMallocPitch-style row alignment, but this is not stated

I believe that when binding memory using cudaBindTexture2D, there are certain alignment constraints. That isn’t entirely unexpected, but they are not specified in the reference manual (or anywhere else that I can find). It seems that the row size must be a multiple of 32 bytes to get correct results. Using cudaMallocPitch will always result in correctly aligned memory, BUT this has some problems: for example, CUFFT’s routines do not take a pitch argument as input, so memory allocated in this way can’t be used with CUFFT.

This is not the alignment relating to the offset parameter of the cudaBindTexture2D function, which is always returned as 0. It relates to the width in bytes of the individual rows.
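To make the padding concrete, here is a minimal host-side sketch of rounding a row width up to the next multiple of 32 bytes. The 32-byte figure is only what I observed, not something taken from the reference manual, and the names kRowAlignBytes and paddedPitchBytes are made up for illustration.

```cpp
#include <cstddef>

// Assumed row alignment in bytes. This comes from observation only;
// it is not a documented value and may differ between devices.
constexpr std::size_t kRowAlignBytes = 32;

// Round a row width in bytes up to the next multiple of `align`.
// Hypothetical helper, not part of the CUDA runtime API.
constexpr std::size_t paddedPitchBytes(std::size_t widthBytes,
                                       std::size_t align = kRowAlignBytes) {
    return (widthBytes + align - 1) / align * align;
}

// Worked example: 400 bytes per row (e.g. 100 four-byte elements)
// is padded to 13 * 32 = 416 bytes.
static_assert(paddedPitchBytes(400) == 416, "400 bytes pads to 416");
// A width that is already a multiple of 32 is left unchanged.
static_assert(paddedPitchBytes(128) == 128, "aligned widths are unchanged");
```

The padded value would be used as the pitch when allocating and binding by hand. cudaMallocPitch may choose a larger pitch than this, so the sketch only reflects the minimum that appeared to work here.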

Attached is some code that should illustrate the problem.

Perhaps the reference manual should be updated to reflect this requirement. And in the longer term it would be nice if CUFFT could handle memory allocated with a pitch != width.
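Until then, one possible workaround is to keep the padded layout for the texture and copy the rows into a tightly packed buffer before handing the data to CUFFT. Below is a host-side sketch of that repacking using std::memcpy. The helper name packRows is hypothetical. On the device, I would expect the same copy to be possible with cudaMemcpy2D by setting the destination pitch equal to the row width in bytes, but I have not benchmarked the cost of the extra copy.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Copy `height` rows of `widthBytes` bytes each out of a pitched buffer
// (rows start every `srcPitch` bytes) into a tightly packed buffer.
// Hypothetical helper for illustration; assumes srcPitch >= widthBytes.
std::vector<unsigned char> packRows(const unsigned char* src,
                                    std::size_t srcPitch,
                                    std::size_t widthBytes,
                                    std::size_t height) {
    std::vector<unsigned char> dst(widthBytes * height);
    if (widthBytes == 0) {
        return dst;  // nothing to copy
    }
    for (std::size_t y = 0; y < height; ++y) {
        // Skip the padding bytes at the end of each source row.
        std::memcpy(dst.data() + y * widthBytes, src + y * srcPitch, widthBytes);
    }
    return dst;
}
```

The packed buffer has a pitch equal to its width, which is the layout CUFFT expects according to the discussion above.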

I faced the same problem a while ago and came to the same conclusion (need to pad to a multiple of 32 bytes).
I would really like to have an official answer on that subject.

Today, I spent a while trying to fix a new kernel because I had totally forgotten this problem.

eelsen: what’s the behavior of tex2D(tex, x, y) when rows are not a multiple of 32 bytes? Do you get an offset value or zero? Or even a crash?

Now I remember

However, in my experience, 32 bytes is enough.

Plus, I remember getting errors from cudaBindTexture2D() in 2.3, while with 3.0b I get no error and offset = 0.

I was getting an offset value.

And what version of CUDA were you using?