How to Solve a Very Large Dense Least-Squares Problem with CUDA

Hello Everyone,

I am currently working on a problem that requires solving min||Ax-b||, where A is a very large dense complex matrix and b is a complex vector. Here "very large" means a matrix of at least 10000*10000.

I searched the cuSOLVER documentation, but it seems that cuSOLVER cannot handle this problem: as far as I can tell, its dense solver assumes A is a square matrix. Its sparse solver does support non-square problems, but that does not help here because my matrix A is dense.

I wonder if anyone in this forum has solved a similar problem before. Thank you!



The MAGMA library supports solving least-squares problems with dense matrices. See . A 10000*10000 complex matrix occupies about 0.8 GB in single precision and 1.6 GB in double precision, so you will want a GPU with as much memory as possible. What kind of engineering problem gives rise to such large dense matrices?
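
For reference, the memory estimate can be checked with a few lines of C++. This is only a back-of-the-envelope sketch: the helper name dense_matrix_bytes is made up for illustration, and it counts a single copy of A, not the workspace a solver would need on top of that.

```cpp
#include <complex>
#include <cstddef>

// Hypothetical helper: bytes needed to store one dense rows x cols matrix
// whose elements have type T (no padding, no solver workspace).
template <typename T>
constexpr std::size_t dense_matrix_bytes(std::size_t rows, std::size_t cols) {
    return rows * cols * sizeof(T);
}

// Size quoted in the question: 10000 x 10000.
constexpr std::size_t kN = 10000;

// Single-precision complex: 8 bytes per element.
constexpr std::size_t kBytesSingle = dense_matrix_bytes<std::complex<float>>(kN, kN);

// Double-precision complex: 16 bytes per element.
constexpr std::size_t kBytesDouble = dense_matrix_bytes<std::complex<double>>(kN, kN);
```

So one copy of A is 800,000,000 bytes in single-precision complex and 1,600,000,000 bytes in double-precision complex.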

Thank you for providing this useful information.

It is a numerical acoustics problem, which I am solving with the boundary element method. For high-frequency components, however, the data set becomes extremely large.

I chose the GPU because every element of A has to be computed individually, so generating A alone would take too long otherwise. Once A is assembled, the least-squares problem mentioned above still has to be solved.
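
To make the structure of that assembly step concrete, here is a rough sketch in plain C++ rather than CUDA. It assumes each entry A(i, j) depends only on its indices; the ElementFn callback and the fill_panel name are hypothetical placeholders, not the actual boundary element formula. Because no entry depends on another, each (i, j) iteration of the loop could in principle become one GPU thread, and A can be built one column panel at a time.

```cpp
#include <complex>
#include <cstddef>
#include <functional>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical element formula: returns A(i, j) from the indices alone.
using ElementFn = std::function<cplx(std::size_t, std::size_t)>;

// Compute columns [col_begin, col_end) of a matrix with m rows and return
// them as a column-major panel. Entries are independent of each other,
// so the two loops carry no dependencies.
std::vector<cplx> fill_panel(std::size_t m, std::size_t col_begin,
                             std::size_t col_end, const ElementFn& element) {
    std::vector<cplx> panel(m * (col_end - col_begin));
    for (std::size_t j = col_begin; j < col_end; ++j) {
        for (std::size_t i = 0; i < m; ++i) {
            panel[i + (j - col_begin) * m] = element(i, j);
        }
    }
    return panel;
}
```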

Is it possible to generate the matrix on the GPU as a separate step and then use another library to solve the least-squares problem entirely on the host?
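
To illustrate what the host-side half of such a split would have to do, here is a minimal sketch, assuming A has already been copied back to host memory in column-major order and has full column rank. It uses a textbook modified Gram-Schmidt QR factorization written in plain C++. The function name solve_least_squares is made up for this example. For a matrix of the size discussed here, a tuned library routine such as LAPACK's zgels would be the realistic choice; this only shows the computation.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Solve min ||A x - b||_2 for a dense column-major m x n matrix A (m >= n,
// full column rank) with modified Gram-Schmidt QR.
// A and b are passed by value because both are overwritten.
std::vector<cplx> solve_least_squares(std::vector<cplx> A, std::vector<cplx> b,
                                      std::size_t m, std::size_t n) {
    assert(A.size() == m * n && b.size() == m && m >= n);
    std::vector<cplx> R(n * n, cplx(0.0)), y(n), x(n);

    for (std::size_t k = 0; k < n; ++k) {
        cplx* qk = &A[k * m];

        // Normalize column k; its norm becomes the diagonal entry R(k, k).
        double norm2 = 0.0;
        for (std::size_t i = 0; i < m; ++i) norm2 += std::norm(qk[i]);
        const double rkk = std::sqrt(norm2);
        assert(rkk > 0.0);  // a zero column means A is rank deficient
        R[k + k * n] = rkk;
        for (std::size_t i = 0; i < m; ++i) qk[i] /= rkk;

        // Remove the q_k component from every remaining column.
        for (std::size_t j = k + 1; j < n; ++j) {
            cplx* aj = &A[j * m];
            cplx r = 0.0;
            for (std::size_t i = 0; i < m; ++i) r += std::conj(qk[i]) * aj[i];
            R[k + j * n] = r;
            for (std::size_t i = 0; i < m; ++i) aj[i] -= r * qk[i];
        }

        // Project b onto q_k: y_k = q_k^H b, then deflate b.
        cplx yk = 0.0;
        for (std::size_t i = 0; i < m; ++i) yk += std::conj(qk[i]) * b[i];
        y[k] = yk;
        for (std::size_t i = 0; i < m; ++i) b[i] -= yk * qk[i];
    }

    // Back substitution for the upper-triangular system R x = y.
    for (std::size_t k = n; k-- > 0;) {
        cplx s = y[k];
        for (std::size_t j = k + 1; j < n; ++j) s -= R[k + j * n] * x[j];
        x[k] = s / R[k + k * n];
    }
    return x;
}
```

Calling solve_least_squares(A, b, m, n) returns the n-element solution vector. The sketch keeps a second full copy of A on the host, which matters at the sizes mentioned above.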