3D Surfaces corrupting data?

Hi,

In toolkit 4.2, are there any restrictions (apart from the maximum extent) on the size in each dimension of a 3D surface bound to a cudaArray that was allocated with the flag cudaArraySurfaceLoadStore?

I am seeing strange behaviour when, in the same kernel, I read from one surface bound to cudaArray A and write to two other surfaces bound to cudaArrays B and C, respectively: after the kernel has run, the data in A has changed at some points(!).

My surfaces (float2) are of size
A (reading): 17x17x249
B (writing): 9x9x125
C (writing): 9x9x9
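
The configuration above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual code: the kernel name `restrictKernel`, the helper `alloc3D`, the "every second point" mapping from A to B, and the rule for what goes into C are all assumptions made up for the example. It uses the surface reference API of the 4.x toolkits (removed in CUDA 12). Note that `surf3Dread` and `surf3Dwrite` take the x coordinate in bytes, while y and z are element indices.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Surface references, one per cudaArray (file scope is required).
surface<void, cudaSurfaceType3D> surfA;  // read only
surface<void, cudaSurfaceType3D> surfB;  // written
surface<void, cudaSurfaceType3D> surfC;  // written

// Hypothetical kernel: copies every second point of A into B and,
// for the first dC z-slices, also into C.
__global__ void restrictKernel(int wB, int hB, int dB, int dC)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int z = blockIdx.z * blockDim.z + threadIdx.z;
    if (x >= wB || y >= hB || z >= dB) return;

    const int elem = (int)sizeof(float2);  // x is addressed in bytes
    float2 v;
    surf3Dread(&v, surfA, 2 * x * elem, 2 * y, 2 * z);
    surf3Dwrite(v, surfB, x * elem, y, z);
    if (z < dC)
        surf3Dwrite(v, surfC, x * elem, y, z);
}

// Hypothetical helper: allocates a float2 3D array usable as a surface.
// For cudaArrays the extent width is given in elements, not bytes.
static cudaArray* alloc3D(int w, int h, int d)
{
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float2>();
    cudaArray* arr = 0;
    cudaMalloc3DArray(&arr, &desc, make_cudaExtent(w, h, d),
                      cudaArraySurfaceLoadStore);
    return arr;
}

int main()
{
    cudaArray* A = alloc3D(17, 17, 249);
    cudaArray* B = alloc3D(9, 9, 125);
    cudaArray* C = alloc3D(9, 9, 9);

    cudaBindSurfaceToArray(surfA, A);
    cudaBindSurfaceToArray(surfB, B);
    cudaBindSurfaceToArray(surfC, C);

    dim3 block(4, 4, 4);
    dim3 grid((9 + 3) / 4, (9 + 3) / 4, (125 + 3) / 4);
    restrictKernel<<<grid, block>>>(9, 9, 125, 9);

    cudaError_t err = cudaDeviceSynchronize();
    printf("kernel status: %s\n", cudaGetErrorString(err));

    cudaFreeArray(A);
    cudaFreeArray(B);
    cudaFreeArray(C);
    return 0;
}
```

A sketch like this needs a device of compute capability 2.0 or higher (for example `nvcc -arch=sm_20` on the 4.x toolkits) and a toolkit that still provides surface references.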

I cannot find any such restrictions in the CUDA C Programming Guide or the NVIDIA CUDA Library documentation, but I want to make sure I am not making a stupid mistake of that kind. If this should work in principle, I will try to explain my problem in detail and extract a code snippet that reproduces the “error”.

By the way: I have already noticed that boundary handling does not seem to work properly for the third dimension of a 3D surface. That is not the problem here, since I do not rely on it.

Any help is welcome.
JRO