I can set up a 3D matrix, copy it to the device, and copy it back successfully, but accessing it from a kernel seems to be a problem.
If the matrix is x by y by z, then as long as x = z everything works as it should when a single thread loops over the matrix as in the programming guide (p. 19).
This also works with a 3D matrix of threads (actually a 2D matrix of blocks in the code I include here).
Anyway, as soon as x != y it all goes wrong, and I can't figure out why.
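For reference, here is a minimal host-side sketch of the pitched addressing that the programming guide loop appears to rely on. It is not the attached code: `elementAt` and `checkVolume` are hypothetical names, and the 64-byte row padding is only an assumption standing in for whatever pitch the runtime actually returns. The same arithmetic should carry over to a kernel that reads the `ptr` and `pitch` fields of a `cudaPitchedPtr`. The point it illustrates is that the slice stride is pitch times height (rows per slice), so mixing up the dimensions can go unnoticed while they are equal.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: address of element (x, y, z) in a pitched 3D buffer.
// pitch  = size of one row in BYTES, including any padding
// height = number of rows in one z-slice
inline float* elementAt(char* base, std::size_t pitch, std::size_t height,
                        std::size_t x, std::size_t y, std::size_t z) {
    std::size_t slicePitch = pitch * height;   // bytes per z-slice
    char* slice = base + z * slicePitch;       // start of slice z
    return reinterpret_cast<float*>(slice + y * pitch) + x;  // row y, column x
}

// Hypothetical check: write a unique value to every element of a
// width x height x depth volume, read it back, and count mismatches.
int checkVolume(std::size_t width, std::size_t height, std::size_t depth) {
    // Assumption: rows are padded to a multiple of 64 bytes. On the device
    // the real pitch comes from the allocation call, not from this formula.
    std::size_t pitch = ((width * sizeof(float) + 63) / 64) * 64;

    std::vector<float> storage(pitch / sizeof(float) * height * depth, -1.0f);
    char* base = reinterpret_cast<char*>(storage.data());

    for (std::size_t z = 0; z < depth; ++z)
        for (std::size_t y = 0; y < height; ++y)
            for (std::size_t x = 0; x < width; ++x)
                *elementAt(base, pitch, height, x, y, z) =
                    static_cast<float>(x + 100 * y + 10000 * z);

    int mismatches = 0;
    for (std::size_t z = 0; z < depth; ++z)
        for (std::size_t y = 0; y < height; ++y)
            for (std::size_t x = 0; x < width; ++x)
                if (*elementAt(base, pitch, height, x, y, z) !=
                    static_cast<float>(x + 100 * y + 10000 * z))
                    ++mismatches;
    return mismatches;
}
```

Calling `checkVolume` with unequal dimensions, for example 5 by 3 by 2, exercises the case where width, height, and depth all differ. If the stride were computed from the wrong dimension, elements would overlap and the mismatch count would be nonzero.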
Can anyone help, please?
3d_array.cu (2.47 KB)