Correct offset treatment for texture access


I'm having some trouble correctly handling the offset returned by cudaBindTexture2D. I have reduced the problem to a small example; see input.jpg. This image is not well aligned in memory, so cudaBindTexture2D returns an offset of 64 bytes. If I ignore the offset and simply copy each pixel to an output image in a kernel, I get output-no-offset.jpg, which is clearly wrong. However, if I add those 64 bytes to the x coordinates, the output is still not correct, because the accesses get clamped at the border of the texture (the border ends up 64 pixels inside the image, since I add that much to the x coordinates). Can I get both: a correct offset and no clamping inside the image? Put differently: can I access all data inside the region specified to cudaBindTexture2D if the offset is non-zero? If so, how?
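To make the symptom concrete, here is a minimal host-side C++ sketch of the arithmetic described above. It is only a model of the behaviour I observe, not a statement of what the hardware does. The helper names are hypothetical, and the 1-byte texel size and image width of 512 used in the usage note are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: convert the byte offset reported by cudaBindTexture2D
// into a texel offset. texelSize is the size of one texel in bytes
// (assumption: the offset is a whole multiple of it).
inline std::size_t offsetInTexels(std::size_t offsetBytes, std::size_t texelSize) {
    return offsetBytes / texelSize;
}

// Hypothetical model of a clamped fetch: the kernel adds the texel offset to x,
// and any coordinate past the last bound column is clamped to that column.
// Assumes boundWidth > 0.
inline std::size_t shiftedFetchX(std::size_t x, std::size_t offsetTexels,
                                 std::size_t boundWidth) {
    return std::min(x + offsetTexels, boundWidth - 1);
}

// First image column whose shifted fetch gets clamped in this model.
// Assumes offsetTexels <= boundWidth.
inline std::size_t firstClampedColumn(std::size_t offsetTexels,
                                      std::size_t boundWidth) {
    return boundWidth - offsetTexels;
}
```

With 1-byte texels, a 64-byte offset, and an assumed bound width of 512, firstClampedColumn(64, 512) gives 448: every column from 448 onward reads the same clamped texel, which matches the border appearing 64 pixels inside the image.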

Thanks && kind regards