Hi all,

I have a grid set with dim3 GridDim(1,128,1)

Each block is set with dim3 BlockDim(16,1,1)

I allocate a 3D CUDA array of size (x=512, y=129, z=128).

Each block writes to its own 2D sub-array of the 3D array (the sub-array is selected with blockIdx.y).

Each sub-array lies in the x-z plane. There are 129 sub-arrays, one per y index.

Each thread writes to one cell of its block's sub-array.

The 16 threads of a block write to the sub-array in a coalesced way along the x axis (thread k writes to cell x=k).

The writes are repeated in a loop until each of the 128 blocks has finished filling its own sub-array.
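
To make the write pattern concrete, here is a rough host-side sketch in plain C++ (not the actual kernel) of the element offsets I expect the 16 threads of one block to touch in a single loop step. It assumes the array is stored linearly with x varying fastest, then y, then z, and it ignores any pitch padding. The names `offset` and `write_offsets`, and the loop-step parameters `x0` and `z`, are hypothetical and only there for illustration.

```cpp
#include <cstddef>
#include <vector>

// Dimensions from the post: x = 512, y = 129, z = 128.
constexpr std::size_t NX = 512, NY = 129, NZ = 128;

// ASSUMPTION: linear layout with x fastest, then y, then z (no pitch padding).
constexpr std::size_t offset(std::size_t x, std::size_t y, std::size_t z) {
    return x + NX * (y + NY * z);
}

// Element offsets written by the 16 threads of block `by` (blockIdx.y)
// in one loop step covering x in [x0, x0 + 16) at depth z.
// `x0` and `z` are hypothetical loop variables, not taken from the post.
std::vector<std::size_t> write_offsets(std::size_t by, std::size_t x0, std::size_t z) {
    std::vector<std::size_t> out;
    for (std::size_t k = 0; k < 16; ++k) {
        out.push_back(offset(x0 + k, by, z));  // thread k handles x = x0 + k
    }
    return out;
}
```

Under that layout assumption, the 16 offsets returned for one step are consecutive, which is what I mean by "coalesced along the x axis".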

When that is done, another kernel is launched with 64 blocks of 1024 threads.

The goal of that kernel is to fill the last (the 129th) sub-array on the y axis.

Each thread receives its own (x,z) coordinates and reads along the y axis. Therefore the reads should be coalesced, since all 65536 threads (64 blocks of 1024) of that kernel read from the same y “layer” at the same time.
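
Here is a similar host-side C++ sketch of the read pattern as I picture it, again only modeling the index arithmetic. It assumes the same linear layout (x fastest, then y, then z, no pitch padding) and a hypothetical mapping of the global thread id `t` to x = t % 512 and z = t / 512. The helper names are made up for this example.

```cpp
#include <cstddef>

// Dimensions from the post: x = 512, y = 129, z = 128.
constexpr std::size_t NX = 512, NY = 129, NZ = 128;

// ASSUMPTION: linear layout with x fastest, then y, then z (no pitch padding).
constexpr std::size_t offset(std::size_t x, std::size_t y, std::size_t z) {
    return x + NX * (y + NY * z);
}

// ASSUMPTION: global thread id t in [0, 65536) maps to x = t % NX, z = t / NX.
constexpr std::size_t read_offset(std::size_t t, std::size_t y) {
    return offset(t % NX, y, t / NX);
}

// Distance in elements between what threads t and t + 1 read in layer y.
constexpr std::size_t neighbor_stride(std::size_t t, std::size_t y) {
    return read_offset(t + 1, y) - read_offset(t, y);
}

// Distance in elements between two successive reads (y, then y + 1)
// made by the same thread t.
constexpr std::size_t layer_stride(std::size_t t, std::size_t y) {
    return read_offset(t, y + 1) - read_offset(t, y);
}
```

With these assumptions, neighboring threads that share the same z are one element apart within a layer, while each thread jumps 512 elements when it moves from one layer to the next. Whether the hardware treats this the same way as the writes along x is the part I am unsure about.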

I would like to know whether coalescing holds along all three axes of a 3D array, given that I write along the x axis and read along the y axis.

Thanks ^^