Matrix Size Limitation

virati · September 1, 2009, 7:43pm

I’m trying to do a basic matrix calculation on a square matrix. My kernel works great up to 1000x1000 matrices (or about that order of magnitude). I’m using CUDA 2.3 on a GTX 295.

However, the moment I bump it up to 10000x10000, my kernel merely returns an equally large matrix of zeros. It does the same for all larger sizes. I’m using a block size of 512 threads, which gives me a grid of 202,450 blocks (well below the limitation of 65535^2 = 4.29e9).

Memory-wise, even the 10kx10k matrix of doubles only takes up 8 megabytes. It IS currently running on my display device, but I’m sure the device has at least 10MB free out of 1.7GB.

I’m not breaking out of the int range (even the 10kx10k only has 1e8 elements, well below the 2.1e9 range of a basic signed int), so where does this problem stem from?

Is there something else I need to be considering in all this?
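
An all-zero result is often a sign that the kernel never ran at all. On compute capability 1.x hardware such as the GTX 295, each grid dimension is limited to 65535 blocks, so a one-dimensional launch of roughly 200,000 blocks is rejected, and the failure is silent unless the error code is checked. The following is a minimal sketch of that check, not the poster's code: `scaleKernel`, `blocksNeeded`, and `launchChecked` are hypothetical names, and the calls are from the current runtime API rather than CUDA 2.3.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the poster's kernel: doubles every element.
__global__ void scaleKernel(double *a, size_t n) {
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) a[i] *= 2.0;
}

// Number of blocks needed to cover n elements, rounding up.
size_t blocksNeeded(size_t n, size_t threadsPerBlock) {
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Launches on a 1D grid and reports any error. Returns true on success.
bool launchChecked(double *d_a, size_t n, int threadsPerBlock) {
    unsigned int blocks = (unsigned int)blocksNeeded(n, threadsPerBlock);
    scaleKernel<<<blocks, threadsPerBlock>>>(d_a, n);

    // An invalid launch configuration (e.g. too many blocks) shows up here.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("launch failed: %s\n", cudaGetErrorString(err));
        return false;
    }
    // Errors raised while the kernel executes show up here.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("kernel failed: %s\n", cudaGetErrorString(err));
        return false;
    }
    return true;
}
```

For a 10000x10000 matrix with 512 threads per block, `blocksNeeded(100000000, 512)` evaluates to 195313, which can be compared against `maxGridSize[0]` from `cudaGetDeviceProperties` before launching.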

avidday · September 1, 2009, 9:12pm

I think your maths is slightly off. 10k x 10k = 100e6 elements, times 8 bytes per double = 800e6 bytes, or about 763 MB. Your card only has 896 MB of RAM per GPU, so it seems pretty likely you are running out of memory.
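
The arithmetic above can be checked against what the device actually has free. This is a rough sketch, assuming a toolkit recent enough to provide `cudaMemGetInfo` in the runtime API; `matrixBytes` and `fitsOnDevice` are hypothetical helper names.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Bytes required to store a rows x cols matrix of elemSize-byte elements.
size_t matrixBytes(size_t rows, size_t cols, size_t elemSize) {
    return rows * cols * elemSize;
}

// Compares a requested size with the free memory on the current device.
// Returns false if the query fails or the request does not fit.
bool fitsOnDevice(size_t bytes) {
    size_t freeBytes = 0, totalBytes = 0;
    if (cudaMemGetInfo(&freeBytes, &totalBytes) != cudaSuccess) return false;

    const double MiB = 1024.0 * 1024.0;
    printf("need %.1f MiB, free %.1f MiB of %.1f MiB\n",
           bytes / MiB, freeBytes / MiB, totalBytes / MiB);
    return bytes <= freeBytes;
}
```

`matrixBytes(10000, 10000, sizeof(double))` gives 800000000 bytes, matching the figure in the reply. A 20000x20000 matrix of doubles needs 3.2e9 bytes, far more than one GPU of a GTX 295 provides.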

virati · September 1, 2009, 10:27pm

Argh, I caught that a little later than I should have. It wasn’t a memory problem though… I actually figured out a way around it (finally learned about arranging blocks into 2D grids).
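
One way the 2D-grid workaround might look is sketched below, assuming an elementwise operation (the poster's actual kernel is not shown). The block count is folded into an x-by-y grid so that neither dimension exceeds 65535, and the kernel rebuilds a linear index from both block coordinates. `makeGrid` and `scaleKernel2D` are hypothetical names.

```cuda
#include <cuda_runtime.h>

// Folds a 1D block count into a 2D grid whose dimensions each stay
// within maxDim (65535 on compute capability 1.x devices).
dim3 makeGrid(size_t totalBlocks, unsigned int maxDim = 65535) {
    if (totalBlocks == 0) return dim3(1, 1, 1);
    unsigned int gx = (unsigned int)(totalBlocks < maxDim ? totalBlocks : maxDim);
    unsigned int gy = (unsigned int)((totalBlocks + gx - 1) / gx);
    return dim3(gx, gy, 1);
}

// Hypothetical elementwise kernel indexed through a 2D grid of 1D blocks.
__global__ void scaleKernel2D(double *a, size_t n) {
    // Linear block id across both grid dimensions.
    size_t block = (size_t)blockIdx.y * gridDim.x + blockIdx.x;
    size_t i = block * blockDim.x + threadIdx.x;
    // The grid may contain more threads than elements, so guard the access.
    if (i < n) a[i] *= 2.0;
}
```

For the 195313 blocks a 10000x10000 matrix needs at 512 threads per block, `makeGrid(195313)` yields a 65535 x 3 grid, and a launch would read `scaleKernel2D<<<makeGrid(195313), 512>>>(d_a, n)`. The bounds check in the kernel matters because the rounded-up grid covers slightly more than `n` elements.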

So NOW I’m memory limited (can’t get to 20k x 20k), but I suppose I can just split the matrix and tackle each quadrant separately. Thanks for the help!
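
Splitting the matrix could be sketched as follows. This version assumes the operation is elementwise, so pieces are independent, and it uses contiguous bands of rows rather than quadrants because a band of a row-major matrix can be copied with a single `cudaMemcpy`. An operation with dependencies between pieces (such as a matrix product) would need a different decomposition. `bandRows`, `processInBands`, and `scaleKernel2D` are hypothetical names.

```cuda
#include <algorithm>
#include <cuda_runtime.h>

// Hypothetical elementwise kernel indexed through a 2D grid of 1D blocks.
__global__ void scaleKernel2D(double *a, size_t n) {
    size_t block = (size_t)blockIdx.y * gridDim.x + blockIdx.x;
    size_t i = block * blockDim.x + threadIdx.x;
    if (i < n) a[i] *= 2.0;
}

// Rows per band so that one band of doubles fits in budgetBytes.
// Always returns at least 1 for a non-empty matrix.
size_t bandRows(size_t rows, size_t cols, size_t budgetBytes) {
    if (rows == 0 || cols == 0) return 0;
    size_t r = budgetBytes / (cols * sizeof(double));
    return std::max<size_t>(1, std::min(r, rows));
}

// Applies the kernel to a row-major rows x cols host matrix, one band
// at a time, reusing a single device buffer. Returns true on success.
bool processInBands(double *h_a, size_t rows, size_t cols, size_t budgetBytes) {
    const int threads = 256;
    size_t band = bandRows(rows, cols, budgetBytes);
    if (band == 0) return true;  // empty matrix, nothing to do

    double *d_band = nullptr;
    if (cudaMalloc(&d_band, band * cols * sizeof(double)) != cudaSuccess)
        return false;

    bool ok = true;
    for (size_t r0 = 0; r0 < rows && ok; r0 += band) {
        size_t n = std::min(band, rows - r0) * cols;  // elements in this band
        size_t bytes = n * sizeof(double);
        double *h_band = h_a + r0 * cols;

        // Fold the block count into a 2D grid (65535 per dimension assumed).
        size_t blocks = (n + threads - 1) / threads;
        unsigned int gx = (unsigned int)std::min<size_t>(blocks, 65535);
        unsigned int gy = (unsigned int)((blocks + gx - 1) / gx);

        ok = cudaMemcpy(d_band, h_band, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
        if (ok) {
            scaleKernel2D<<<dim3(gx, gy), threads>>>(d_band, n);
            ok = cudaGetLastError() == cudaSuccess;
        }
        // The blocking copy back also waits for the kernel to finish.
        ok = ok && cudaMemcpy(h_band, d_band, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_band);
    return ok;
}
```

With a budget of 800000000 bytes, `bandRows(20000, 20000, 800000000)` gives 5000 rows per band, so a 20k x 20k matrix would be processed in four passes.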
