global memory misalignment

Hello,

When I write to a global array at [index + i] in one kernel and then read from the same array at [index + i] in a later kernel, I consistently get the value that was written at [index + i + 1].

It seems that the whole array is shifted one position to the left, with the very first value truncated or discarded.

The array is of type double, so I would expect the memory alignment requirements to be met automatically…?

It’s hard to speculate without any code to look at, but I feel safe saying that something is clobbering one of *, index or i. Can you single-step through the code in cuda-gdb to see where they depart from their expected values?
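If single-stepping is awkward, a small round-trip test can help narrow things down. The sketch below is not the original poster's code: the names writeKernel, readKernel and roundTripMismatches are made up for illustration, and it assumes a plain cudaMalloc'd buffer. Each element stores its own position, so any shift between the write and the read shows up as a non-zero mismatch count.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical write kernel: stores a value that encodes its own position.
__global__ void writeKernel(double *data, int index, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[index + i] = static_cast<double>(index + i);
}

// Hypothetical read kernel: counts elements whose value does not match their position.
__global__ void readKernel(const double *data, int index, int n, int *mismatches)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && data[index + i] != static_cast<double>(index + i))
        atomicAdd(mismatches, 1);
}

// Runs both kernels on the same buffer; returns the mismatch count, or -1 on a CUDA error.
// Assumes index >= 0 and n > 0.
int roundTripMismatches(int index, int n)
{
    double *data = nullptr;
    int *dMismatches = nullptr;
    int hMismatches = -1;

    if (cudaMalloc(&data, (index + n) * sizeof(double)) != cudaSuccess)
        return -1;
    if (cudaMalloc(&dMismatches, sizeof(int)) != cudaSuccess) {
        cudaFree(data);
        return -1;
    }
    cudaMemset(dMismatches, 0, sizeof(int));

    const int threads = 128;
    const int blocks = (n + threads - 1) / threads;
    writeKernel<<<blocks, threads>>>(data, index, n);
    readKernel<<<blocks, threads>>>(data, index, n, dMismatches);

    // Launch and execution errors surface here.
    cudaError_t err = cudaDeviceSynchronize();
    if (err == cudaSuccess)
        err = cudaMemcpy(&hMismatches, dMismatches, sizeof(int), cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        hMismatches = -1;
    }

    cudaFree(dMismatches);
    cudaFree(data);
    return hMismatches;
}

int main()
{
    printf("mismatches: %d\n", roundTripMismatches(3, 1000));
    return 0;
}
```

If this reports zero mismatches but the real code still shows the shift, the difference between the two (how the pointer, index or i is computed or passed) is the place to look.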

Memory alignment isn’t the problem: all types declared in CUDA C receive the necessary alignment. If a double pointer were shifted by anything other than a multiple of 8 bytes, all the doubles read through it would be garbage; a clean shift by exactly one element points to an off-by-one in the pointer or index instead. I’m not sure what will actually happen if you try to load from a double * that got misaligned at runtime…
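To rule alignment out explicitly, the address can be checked against the 8-byte size of a double. This is only a sketch: isDoubleAligned is a hypothetical helper, not a CUDA API, and the misaligned address is computed for illustration but never dereferenced.

```cuda
#include <cstdint>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: true when the address is a multiple of sizeof(double) (8 bytes).
__host__ __device__ bool isDoubleAligned(const void *p)
{
    return reinterpret_cast<uintptr_t>(p) % sizeof(double) == 0;
}

int main()
{
    double *data = nullptr;
    if (cudaMalloc(&data, 16 * sizeof(double)) != cudaSuccess)
        return 1;

    // Memory from cudaMalloc is suitably aligned for any variable type,
    // so the base pointer and every element address pass the check.
    printf("base aligned:      %d\n", isDoubleAligned(data));
    printf("element 3 aligned: %d\n", isDoubleAligned(data + 3));

    // A 4-byte offset gives an address that is not a valid double address.
    const char *bytes = reinterpret_cast<const char *>(data);
    printf("base + 4 bytes:    %d\n", isDoubleAligned(bytes + 4));

    cudaFree(data);
    return 0;
}
```

Indexing a double * with an integer always moves in whole 8-byte steps, so ordinary [index + i] arithmetic cannot produce a misaligned address; that takes a cast or byte-level pointer arithmetic.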