Multidimensional Arrays multidimensional array allocation

spiglerg · December 8, 2007, 3:53pm

Is it possible to allocate multidimensional arrays in the GPU’s global memory?
Currently, I allocate 2-dimensional arrays using linear addressing, hence using just one dimension.
Then, as I need arrays of 2D arrays, I managed to do the following:
-A type *a[10] is allocated in host memory ([10] is static, but it could just as well be dynamic);
-A type **a_device is allocated on the device, in this case with size 10*sizeof(type *);
-For i=0…9, a[i] is allocated on the device with the correct size (width*height of the 2D array);
-Finally, I copy the contents of a into a_device, and it works.

Now, for further dimensions this trick is not possible, as I cannot access a_device[n].

Is there a way to allocate multidimensional arrays in device memory?

Simon_Green · December 8, 2007, 4:38pm

CUDA’s support for multi-dimensional arrays is essentially the same as C’s.

Personally I usually find it easier to just allocate a linear array and do the math to calculate the address in code.
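
As a rough sketch of that approach in plain C (the same arithmetic can be used inside a kernel): the helper names flat_index and flat_size are hypothetical, and row-major order with the last index varying fastest is an assumption, not something fixed by CUDA.

```c
#include <assert.h>
#include <stddef.h>

/* Row-major offset of an N-dimensional index into one flat buffer.
   dims[d] is the extent of dimension d; idx[d] is the coordinate. */
size_t flat_index(const size_t *dims, const size_t *idx, size_t ndim)
{
    size_t offset = 0;
    for (size_t d = 0; d < ndim; ++d) {
        assert(idx[d] < dims[d]);           /* bounds check for host-side debugging */
        offset = offset * dims[d] + idx[d]; /* Horner-style accumulation */
    }
    return offset;
}

/* Number of elements to allocate for the flat buffer. */
size_t flat_size(const size_t *dims, size_t ndim)
{
    size_t n = 1;
    for (size_t d = 0; d < ndim; ++d)
        n *= dims[d];
    return n;
}
```

With this layout a 5D array needs a single allocation of flat_size(dims, 5) elements, and no table of pointers has to be built on the device.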

If you’re using 2D arrays you may find it more efficient to use 2D textures since there is hardware that will do the addressing for you.

spiglerg · December 8, 2007, 4:46pm

Well, 2D linear addressing is really easy, and so would be 3D, but I am thinking of at least 5 dimensions.
Unfortunately, such a structure is needed by the algorithms I am implementing.
I was also thinking of using something like [dim1*dim2][dim3*dim4], which would work the way I’m currently doing it, but it didn’t work.
It just returned 0s.

MisterAnderson42 · December 8, 2007, 5:15pm

I use 4D arrays in part of my code. Note that coalescing reads starts to get a little complicated in these cases. I use cudaMallocPitch to allocate a “2D” array with width “L” and height Mx*My*Mz. Then I index into the array by doing all the index calculations by hand. Because of my memory access pattern, a single block accesses all elements along the L axis, and those accesses are coalesced because I used cudaMallocPitch. To get the right element, I just need to access the array at index (i*(Mz*My) + j*My + k)*pitch + threadIdx.x, where pitch is in elements, not bytes.
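
A minimal sketch of that index calculation, written as plain C so it can be checked on the host. The helper names are hypothetical, and the row order (i, j, k row-major over Mx x My x Mz) is an assumption; in a kernel, l would be threadIdx.x.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (i, j, k, l) in a pitched buffer.
   (i, j, k) selects one of the Mx*My*Mz rows in row-major order;
   l runs along the contiguous axis of length L <= pitch.
   pitch is the row stride in elements, not bytes. */
size_t pitched_index(size_t i, size_t j, size_t k, size_t l,
                     size_t My, size_t Mz, size_t pitch)
{
    size_t row = (i * My + j) * Mz + k;
    return row * pitch + l;
}

/* cudaMallocPitch reports the pitch in bytes; convert it to elements. */
size_t pitch_in_elements(size_t pitch_bytes, size_t elem_size)
{
    assert(pitch_bytes % elem_size == 0);
    return pitch_bytes / elem_size;
}
```

Because consecutive values of l map to consecutive addresses within a row, threads that differ only in threadIdx.x read neighbouring elements.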

spiglerg · December 8, 2007, 5:22pm

Isn’t indexing with so many calculations slow?
I’m trying to achieve really high performance, because I’m running algorithms that would take seconds on an ordinary computer.

kuisma · December 8, 2007, 6:23pm

Calculations are very cheap, and uncoalesced device memory accesses are very expensive. We are talking about a factor on the order of 100 here. Sometimes it’s even cheaper to recalculate a result than to fetch it from global memory.

spiglerg · December 8, 2007, 6:38pm

Oh, that’s really interesting.
Thank you for the answers.
I think I will write a few macros for more human-readable 2D, 3D and 4D linear mapping, and then convert my current array code to access elements in 3D through a linear form.
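
A minimal sketch of what such macros could look like, assuming x is the fastest-varying index and W, H, D are the extents along x, y and z. The names IDX2, IDX3 and IDX4 are hypothetical.

```c
#include <stddef.h>

/* Linear offset for 2D, 3D and 4D coordinates; x varies fastest.
   Arguments are parenthesised so expressions can be passed safely. */
#define IDX2(x, y, W) \
    ((size_t)(y) * (W) + (x))
#define IDX3(x, y, z, W, H) \
    (((size_t)(z) * (H) + (y)) * (W) + (x))
#define IDX4(x, y, z, w, W, H, D) \
    ((((size_t)(w) * (D) + (z)) * (H) + (y)) * (W) + (x))
```

An access then reads data[IDX3(x, y, z, W, H)] instead of data[z][y][x], with a single flat allocation behind it.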
