I am writing a piece of linear algebra software, and I would like to ask: what is the best method for storing dense matrices in CUDA?

My current method is like so:

Initialize a matrix with cudaMalloc((void **) &matrix, sizeof(int) * rows * cols), meaning a matrix is represented as a single contiguous block of rows, so we address it using the standard flattening trick: matrix[i][j] becomes matrix[i * cols + j] (the stride between rows is the number of columns).

This works well for row operations: the matrix lives in global memory, and since each row is a contiguous array laid out immediately after the previous one, threads reading along a row generate coalesced memory accesses when loading tiles into shared memory. Needless to say, this layout is terrible for column operations.

What is the standard method for storing a matrix in CUDA? Is it this flat row-major layout, or is there a better strategy?