Probably row major order versus column major order. I have only ever used 3D textures with a code with column major ordered storage, and it just worked, but I have no proof that it is really ordering that is in play here. The 3D texture documentation is rather, err scant…
IMO, this is normal behaviour which also applies to 2D textures. It just depends on how you store the texture data.
e.g. it is specified somewhere in the documentation that the data for a 3D texture should be a concatenation of the various slices, where each slice is a concatenation of rows an each row is a sequential array of data.
if you have 2D linear memory in CPU as
then mem[y] is actually a pointer to the first element (x=0) at row y and mem[y] is the actual data at index x of row y, but if you want to access the same data through a texture, you have to use (x,y) in the tex2D call.
The same applies for 3D textures, where you have to use [z][y] to access the value at coordinates (x,y,z) instead of (z,y,x).
Thanks! The Progrmaming guide explains how to access 3D arrays in the section under “device memory - section 3.2.1”. There in, the Z,Y,X relation shows up correctly… I think it is left to the reader to assume that “cudaMalloc3Darray()” works similarly and texture acceses also need to be done that way…
–edit-- but if you look @ it from the CPU side, an array of [Y][Z] is represented as X amount of YZ slices in memory… When these slices are copied to GPU, the GPU sees this as Z amount of YX slices. This is an important point to note. And hence texture accesses need to be Z,Y,X – This really needs some documentation somewhere.
Anyway, it works for me now and performance rocks! Good enough…
While the documentation probably could be clearer, the real problem is your lack of understanding of concepts like row-major ordering. The CPU and the GPU are using the same ordering, which is the one you decide on. The texture function parameter names imply that they expect row-major ordering in your data, but they should work fine with column-major ordering as well provided you are aware of it and supply the coordinates correctly.
Yeah they use the same ordering… CPU calls it as X,Y,Z and GPU prefers to call it as Z,Y,X. This is just how they prefer to call it.
If you read my previous post, I have mentioned the place in the programmers guide where this is quite clear (Section 3.2.1).
So, therez nothing to row-major and column-major ordering here… unless i am missing something drastic… I have never used textures in the graphical sense and probably thats why I am not seeing something that you see.
I don’t understand how the function parameter names imply something about ordering… Can you explain it?
This seems to be the source of your confusion. The CPU and the GPU “prefer to call it” the same thing. Consider this array of data:
The element at (x = 0, y = 1) is 2. However you arrange these four numbers in memory, the conceptual layout should still be the same: a 2 x 2 matrix of data where the element in the first column of the second row is 2.
In C, with a (standard, row-major) 2D array, you would access element (0, 1) as follows:
In C, with a 1D, row-major array, you would access element (0, 1) as follows:
data[1 * 2 + 0]
With a CUDA texture where you filled it using data from a row-major array, you would access element (0, 1) like so:
tex2D(data, 0, 1);
If we reverse things, though, and store data in a column-major fashion, it would look like this:
data[0 * 2 + 1]
tex2D(data, 1, 0);
A 3D array and texture would behave analogously.
If you take a look at my examples, I’m accessing the element (x = 0, y = 1). If you look at the names of the tex2D parameters, you’ll find that they are: texRef, x, y. In the row-major example, these match up. I’m supplying 0 for the x parameter and 1 for the y parameter. However, in the column-major example, they don’t match. Hence, the expected ordering is row-major, but really you can use it either way you want as long as you are consistent.
I suggest you read this Wikipedia article as it may help you understand row-major vs. column-major ordering and what it means:
You prefer to call [0,1] as 2 because of “graphics” knowledge… i.e. If you look @ your screen, the top-left most is 0,0 and as you descend down “Y” increases… This is where the confusion starts… This ordering is opposite to the memory ordering in “C” language (i.e. row-major)
Now my point is simple: I dont deal with graphics. i deal with a molecular chemistry application and I use a 3D texture… When I copy to GPU, I need to know that this kind of ordering change happens… It is very important that this is documented in the manual. Thats my point.
You’re absolutely right. It’s mostly due to my OpenGL experience that I don’t have a problem understanding this coordinate mapping and I can imagine that it can be difficult to interpret without graphics knowledge.
The “upper left corner” you’re referring to is mentioned in the CUDA reference manual though.