Coalesced access and pointer offset using 2D linear memory

Hello to all,

I have to perform calculations on an (MxN) matrix, but to simplify my kernel my code allocates an (M+2)x(N+2) matrix.

I want my kernel to start computing at element (1,1), and I can do that by passing a pointer to (v+pitch+1) instead of v, where pitch is the row pitch in elements.

Here is a snippet of code to clarify:

[codebox]

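// d_v is the device pointer referred to as v in the text
cudaMallocPitch( (void**)&d_v, &pitch_byte,
		(width+2)*sizeof(REAL4), (height+2) );

// row pitch expressed in REAL4 elements rather than bytes
pitch_element = pitch_byte / sizeof(REAL4);

// start the kernel at element (1,1) of the padded matrix
kernel<<<dGrid, dBlock>>>(d_v+pitch_element+1);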

[/codebox]

I don’t understand why I get coalesced access when I pass v itself, but not when I pass v+pitch_element+1.
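To make the offset concrete, here is a small host-side sketch of the address arithmetic. It is only an illustration: the REAL4 layout (four floats, 16 bytes, like float4), the helper names byte_offset and misalignment, the 512-byte pitch, and the 128-byte alignment granularity are all assumptions, not values taken from my actual run.

```c
#include <stddef.h>

/* Hypothetical stand-in for REAL4: four floats, 16 bytes, like CUDA's float4. */
typedef struct { float x, y, z, w; } REAL4;

/* Byte distance from the allocation base to element (row, col),
   given the row pitch in bytes. */
static size_t byte_offset(size_t pitch_byte, size_t row, size_t col)
{
    return row * pitch_byte + col * sizeof(REAL4);
}

/* How far a byte offset sits past the previous multiple of `segment`
   (the assumed alignment granularity). 0 means the address is aligned. */
static size_t misalignment(size_t offset, size_t segment)
{
    return offset % segment;
}
```

With these assumed numbers, byte_offset(512, 1, 1) is 528 and misalignment(528, 128) is 16, so v+pitch_element+1 starts one REAL4 past an aligned boundary, whereas v itself (offset 0) has a misalignment of 0. That shift may be related to the difference I am seeing.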

Thanks in advance

Francesco