Hi,

I have to transpose a 3D volume in a specified direction.

You can think of it as a cube that has to be rotated to the left, or to the front.

I’m using the same implementation as the 2D transpose in the SDK, with a block of 8x8x8 threads (= 512, which is the maximum per block).

As the blocks are only 8 threads wide, I was wondering whether the reads and writes are coalesced, and if not, whether there is a way to coalesce them?
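
To make the question concrete, here is a small host-side C++ sketch (not my actual kernel) of which elements the threads of one half-warp would touch. It assumes that threads are numbered with x fastest, then y, then z, that the volume is stored row-major, and that coalescing needs 16 consecutive elements starting at a multiple of 16. The names `halfWarpOffsets` and `isContiguousAligned` are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element offsets touched by half-warp `hw` of a thread block that is
// bx threads wide and by threads high, placed at the volume origin.
// The volume is assumed row-major: offset = x + y*width + z*width*height.
// Threads are assumed to be numbered x fastest, then y, then z.
std::vector<long> halfWarpOffsets(int hw, int bx, int by, long width, long height) {
    assert(bx > 0 && by > 0);
    std::vector<long> offsets;
    for (int lane = 0; lane < 16; ++lane) {
        int tid = hw * 16 + lane;      // linear thread index inside the block
        int x = tid % bx;
        int y = (tid / bx) % by;
        int z = tid / (bx * by);
        offsets.push_back(x + y * width + z * width * height);
    }
    return offsets;
}

// True if the offsets are consecutive and the first one is a multiple of 16.
bool isContiguousAligned(const std::vector<long>& o) {
    if (o.empty() || o[0] % 16 != 0) return false;
    for (std::size_t i = 1; i < o.size(); ++i)
        if (o[i] != o[i - 1] + 1) return false;
    return true;
}
```

With an 8x8x8 block on a 256-wide volume, `halfWarpOffsets(0, 8, 8, 256, 256)` gives two runs of 8 elements (0..7 and 256..263), so under these assumptions a half-warp spans two rows. With a block that is 16 threads wide, the same call gives one run of 16.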

Another, more general question:

If the width of a 2D image is not a multiple of 16, the start of the memory that a block accesses won’t be at (begin + n*16) but at (begin + n*16 + m*width), so HalfWarpBaseAddress - BaseAddress won’t be a multiple of 16. Is that right? If so, will the reads and writes still be coalesced?
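
Here is the arithmetic I have in mind, as a host-side C++ sketch with offsets counted in elements. `halfWarpBase` and `paddedPitch` are hypothetical helpers for this example only. The second one shows what I guess a workaround would look like: rounding each row up to a multiple of 16 elements, which as far as I understand is the kind of padding `cudaMallocPitch` is meant to provide.

```cpp
#include <cassert>

// Offset (in elements, relative to the start of the image) of the first
// element read by a half-warp that starts at column `col` of row `row`,
// for an image stored with `pitch` elements per row.
long halfWarpBase(long pitch, long row, long col) {
    assert(pitch > 0 && row >= 0 && col >= 0);
    return row * pitch + col;
}

// Smallest multiple of 16 that is >= width, i.e. a padded row length.
long paddedPitch(long width) {
    assert(width > 0);
    return (width + 15) / 16 * 16;
}
```

For a width of 100, `halfWarpBase(100, 1, 16)` is 116, which is not a multiple of 16. With `paddedPitch(100)`, which is 112, the same half-warp starts at 128.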

Thanks.