I don’t know how large your matrices are, but arithmetically, matrix multiplication can be decomposed into sub-problems. The beginning part of the answer here:
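Independently of that link, here is a minimal sketch of the decomposition idea: a product C = A·B can be computed one tile of C at a time, so only small sub-blocks of A and B are touched at any moment. The tile size and the row-major, square layout below are arbitrary choices made purely for illustration.

```c
#include <stddef.h>

/* Minimal sketch: C = A*B computed tile by tile.
 * Matrices are row-major and square (n x n) for simplicity; TILE is arbitrary.
 * Each (ib, jb) tile of C is accumulated from TILE-sized panels of A and B,
 * so the arithmetic only ever involves small sub-blocks at a time.
 */
enum { TILE = 64 };

static void blocked_matmul(const float *A, const float *B, float *C, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            C[i * n + j] = 0.0f;

    for (size_t ib = 0; ib < n; ib += TILE)
        for (size_t jb = 0; jb < n; jb += TILE)
            for (size_t kb = 0; kb < n; kb += TILE)
                /* multiply the (ib,kb) block of A by the (kb,jb) block of B
                   and accumulate into the (ib,jb) block of C */
                for (size_t i = ib; i < ib + TILE && i < n; i++)
                    for (size_t k = kb; k < kb + TILE && k < n; k++) {
                        float a = A[i * n + k];
                        for (size_t j = jb; j < jb + TILE && j < n; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
}
```

The same blocking idea carries over to GEMM calls on the GPU, where each tile product becomes one library call on device sub-arrays.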
What kind of GPU do you use? 21 MB is not particularly large as matrices go, and with even budget GPUs offering at least 1 GB of on-board memory, you should be able to keep several such matrices resident on the GPU. You may want to look more closely at the memory management performed by your application.
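As part of that review, it can help to confirm how much device memory is actually free at various points in the application. A small sketch using the CUDA runtime API (the output format is just an illustration):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

// Minimal sketch: query free/total device memory to see how much headroom
// the application really has at a given point in its execution.
int main()
{
    size_t free_bytes = 0, total_bytes = 0;
    cudaError_t err = cudaMemGetInfo(&free_bytes, &total_bytes);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaMemGetInfo failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("free: %.1f MB of %.1f MB total\n",
                free_bytes / (1024.0 * 1024.0),
                total_bytes / (1024.0 * 1024.0));
    return 0;
}
```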
Agreed. Your out-of-memory problem should be investigated rather than sidestepped by switching to a different library.
If you have an out-of-memory issue that is actually due to this matrix size (perhaps because you have ~50 such matrices in memory), then the correct solution would be to manage that situation somehow. cufftXt won’t solve any problem like that for you.
There are a number of different Tesla GPUs with different amounts of memory, but it is reasonably safe to assume that your Tesla has at least 4 GB of on-board memory and thus enough memory for many instances of a 21 MB matrix. Since you have not shown any code that would allow others to reproduce your issue, I can only give the general recommendations to:
review the number of memory allocations, and the size of each
properly check the return status of each CUDA API call, each CUBLAS API call, and each kernel launch (a sketch of such checks follows below)
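As a rough sketch of that kind of status checking (a common pattern, not code from this thread; the names in the usage comment are placeholders):

```cpp
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>
#include <cublas_v2.h>

// Check the return status of a CUDA runtime API call.
#define CHECK_CUDA(call)                                                    \
    do {                                                                    \
        cudaError_t err_ = (call);                                          \
        if (err_ != cudaSuccess) {                                          \
            std::fprintf(stderr, "CUDA error %s at %s:%d\n",                \
                         cudaGetErrorString(err_), __FILE__, __LINE__);     \
            std::exit(EXIT_FAILURE);                                        \
        }                                                                   \
    } while (0)

// Check the return status of a CUBLAS API call.
#define CHECK_CUBLAS(call)                                                  \
    do {                                                                    \
        cublasStatus_t st_ = (call);                                        \
        if (st_ != CUBLAS_STATUS_SUCCESS) {                                 \
            std::fprintf(stderr, "CUBLAS error %d at %s:%d\n",              \
                         (int)st_, __FILE__, __LINE__);                     \
            std::exit(EXIT_FAILURE);                                        \
        }                                                                   \
    } while (0)

// Typical usage (names and sizes are placeholders):
//   float *d_A = nullptr;
//   CHECK_CUDA(cudaMalloc(&d_A, bytes));
//   cublasHandle_t handle;
//   CHECK_CUBLAS(cublasCreate(&handle));
//   my_kernel<<<grid, block>>>(d_A);
//   CHECK_CUDA(cudaGetLastError());        // catches kernel launch errors
//   CHECK_CUDA(cudaDeviceSynchronize());   // catches kernel execution errors
```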
Yes, but a 21 MB matrix is not a very large matrix. There would be no need or reason to split it up. Rather than pursuing this path, I would suggest getting a very crisp understanding of the out-of-memory issue. A 21 MB matrix cannot by itself cause an out-of-memory issue on any GPU.