Can anyone point me to information on memory coalescing (or optimising cache efficiency) when writing to surfaces? I’m primarily interested in surfaces bound to 2D and 3D CUDA arrays. As far as I know, the storage format for CUDA arrays is undocumented, other than that it is optimised for 2D or 3D locality. Is that all the information available on the optimal way to write to them?
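To make the question concrete, here is a minimal sketch of the kind of write I mean. It assumes the surface object API (`cudaSurfaceObject_t`) and a single-channel `float` array; the kernel name `fillSurface`, the helper `runSurfaceWriteDemo`, and the 32x8 block shape are just placeholders I picked for illustration. Mapping `threadIdx.x` to the x coordinate so that a warp writes adjacent texels in one row is my guess at a sensible pattern, not something I have found documented as optimal.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread writes one float texel. threadIdx.x maps to x, so the threads
// of a warp write adjacent texels in the same row (assumed, not documented,
// to suit the array's 2D-locality layout).
__global__ void fillSurface(cudaSurfaceObject_t surf, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        float value = static_cast<float>(y * width + x);
        // The x coordinate of surf2Dwrite is a byte offset, not an element index.
        surf2Dwrite(value, surf, x * static_cast<int>(sizeof(float)), y);
    }
}

// Hypothetical helper: fills a width x height array through a surface and
// returns the number of mismatched texels, or -1 on a CUDA error.
int runSurfaceWriteDemo(int width, int height)
{
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t array = nullptr;
    // The array must be allocated with the surface load/store flag.
    if (cudaMallocArray(&array, &desc, width, height,
                        cudaArraySurfaceLoadStore) != cudaSuccess)
        return -1;

    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = array;
    cudaSurfaceObject_t surf = 0;
    if (cudaCreateSurfaceObject(&surf, &res) != cudaSuccess) {
        cudaFreeArray(array);
        return -1;
    }

    dim3 block(32, 8);  // one warp per row of the block; placeholder choice
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);
    fillSurface<<<grid, block>>>(surf, width, height);
    cudaError_t err = cudaDeviceSynchronize();

    // Copy the array back to linear host memory to check the result.
    std::vector<float> host(static_cast<size_t>(width) * height, -1.0f);
    if (err == cudaSuccess)
        err = cudaMemcpy2DFromArray(host.data(), width * sizeof(float),
                                    array, 0, 0,
                                    width * sizeof(float), height,
                                    cudaMemcpyDeviceToHost);

    cudaDestroySurfaceObject(surf);
    cudaFreeArray(array);
    if (err != cudaSuccess)
        return -1;

    int mismatches = 0;
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            if (host[static_cast<size_t>(y) * width + x] !=
                static_cast<float>(y * width + x))
                ++mismatches;
    return mismatches;
}
```

Calling `runSurfaceWriteDemo(64, 32)` should return 0 if every texel was written where expected. What I can't tell is whether this thread-to-texel mapping is actually the efficient one, or whether some other block shape would match the internal layout better.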
Edit: I’ve just noticed that 3D surfaces don’t seem to exist (yet?).