Three-dimensional threads and memory alignment


I am using CUDA for my research work. A question came up while I was converting an algorithm to CUDA code.

How are the elements of a three-dimensional matrix stored in shared memory?
The above link has an example of a two-dimensional array and its memory layout in row-major order. How does this work for a three-dimensional matrix?
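
To make the question concrete: my assumption is that an array declared as a[DEPTH][HEIGHT][WIDTH] follows the ordinary C/C++ row-major rule, with the last index varying fastest, so that element [z][y][x] sits at offset (z * HEIGHT + y) * WIDTH + x. The host-side C++ sketch below only checks that rule for a plain array. The dimension names and the helpers flat_index, layout_matches and demo are hypothetical, made up for illustration, and I am assuming a statically declared __shared__ array is laid out the same way.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical dimensions, chosen only for illustration.
constexpr std::size_t DEPTH = 2, HEIGHT = 3, WIDTH = 4;

// Row-major offset (in elements) of a[z][y][x] for an array
// declared as a[DEPTH][HEIGHT][WIDTH]: x varies fastest, z slowest.
constexpr std::size_t flat_index(std::size_t z, std::size_t y, std::size_t x) {
    return (z * HEIGHT + y) * WIDTH + x;
}

// Returns true if the compiler's actual layout of a 3D array agrees
// with flat_index for every element.
bool layout_matches() {
    static float a[DEPTH][HEIGHT][WIDTH];
    const char* base = reinterpret_cast<const char*>(a);
    for (std::size_t z = 0; z < DEPTH; ++z) {
        for (std::size_t y = 0; y < HEIGHT; ++y) {
            for (std::size_t x = 0; x < WIDTH; ++x) {
                const char* p = reinterpret_cast<const char*>(&a[z][y][x]);
                // Byte distance from the start of the array to this element.
                std::size_t bytes = static_cast<std::size_t>(p - base);
                if (bytes != flat_index(z, y, x) * sizeof(float)) {
                    return false;
                }
            }
        }
    }
    return true;
}

// Small self-check: the last element of a 2 x 3 x 4 array is at offset 23.
void demo() {
    assert(flat_index(0, 0, 0) == 0);
    assert(flat_index(1, 2, 3) == DEPTH * HEIGHT * WIDTH - 1);
    assert(layout_matches());
}
```

If that assumption is right, consecutive values of the last index (x) are adjacent in memory, which is the part I want to confirm for shared memory.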

Sheshank Kodam