Um, what? That’s the correct way to access elements in a 3D volume stored in a 1D array. However, if your access pattern is not really random, then there’s no need to read from a 3D texture. You can read in the same fashion as you’re writing.
I use CUDA arrays for reads in a 4th-order finite difference code. The access pattern is not “random” but quite “dense”, and cannot be completely coalesced. I use a 3D CUDA array to benefit from the spatial and temporal locality of accesses through the cache mechanism.
I can’t find in the documentation whether CUDA arrays are row-major or column-major.
Thread blocks are in column-major order (the thread ID of a thread with index (x,y) is x + y*sizex).
C arrays are row-major.
On the one hand, I guess column-major order would be more consistent with the thread blocks (avoiding a coordinate transformation for every thread that accesses an array element); on the other hand, row-major order would be more consistent with C, i.e. the order in which the arrays are stored on the host.
Some say one, some say the other; who to believe? :blink:
I usually store a global 1D array with a column-major pattern, because when I access my1Darray[k*sizex*sizey + j*sizex + i] with
Majorness has to do with which spatial index changes ‘fastest’ as you progress linearly through the data as it is laid out in memory. If we assume that row-major in the context of 3D means x changes fastest, then y, and z slowest, then your last example is actually row-major. Your example with thread indices is also row-major, as the x index changes faster than the y index.
Also, you don’t access memory in a cudaArray directly. You access it via textures, which the programming guide tells you will be spatially coherent. Thus it shouldn’t matter to you how the data is internally stored in the array, provided your texture lookups return the expected values.