 # Fastest factorization algorithm for banded matrices

Hi everyone,

I am looking for a matrix factorization algorithm for banded matrices that is also efficient to implement in CUDA. I’ll be using this to solve linear equations. The matrices I’ll be working with are about 6000x6000 elements with a bandwidth of about 60. Looking at vvolkov’s work, QR factorization is the most efficient factorization in terms of flops for dense matrices. Since my matrices are symmetric positive definite, I can also use the Cholesky decomposition, or solve the system of linear equations with the conjugate gradient method. I’d appreciate any suggestions for which method is fastest on a GPU.
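For scale, here are my own back-of-envelope operation counts for these sizes, using the standard flop estimates (not measured numbers); they show why exploiting the band matters far more than the choice between dense factorizations:

```python
# Rough flop estimates for the sizes above (my own back-of-envelope numbers,
# using standard operation counts; not benchmarks).
n, k = 6000, 30            # matrix order and half-bandwidth (total bandwidth ~60)

dense_cholesky = n**3 / 3          # ~n^3/3 flops for a dense Cholesky factorization
band_cholesky = n * k**2           # ~n*k^2 flops when only the band is factored
band_matvec = 2 * n * (2 * k + 1)  # one banded mat-vec, e.g. per CG iteration

print(f"dense Cholesky: {dense_cholesky:.1e} flops")
print(f"banded Cholesky: {band_cholesky:.1e} flops")
print(f"banded mat-vec: {band_matvec:.1e} flops")
```

So a banded Cholesky is roughly four orders of magnitude cheaper than the dense one, and CG only wins if it converges in well under (band Cholesky flops / mat-vec flops) iterations.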

If anyone can suggest libraries that supply this functionality, I’d greatly appreciate it.

Is that the same as Sparse Matrix Factorization?

http://glaros.dtc.umn.edu/gkhome/node/68

My best shot… ;)

Thanks for the link. Banded matrices are a type of sparse matrix in which only the main diagonal and a few adjacent diagonals hold nonzero values. The paper you linked to is probably more general than I need, but it’s still a starting point, so I’ll see what I can make of it.
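To make the structure concrete, here is a minimal pure-Python sketch (names and storage layout are my own) of compact band storage and a symmetric banded matrix-vector product, which is the kernel an iterative method would need:

```python
# Sketch: compact storage for a symmetric banded matrix (hypothetical layout).
# Only the main diagonal and the k super-diagonals are stored; symmetry
# supplies the lower triangle. A 6000x6000 matrix with half-bandwidth 30
# then needs 6000*31 entries instead of 36 million.

def band_matvec(diags, k, x):
    """y = A @ x for symmetric banded A, where diags[d][i] holds A[i, i+d]."""
    n = len(x)
    y = [diags[0][i] * x[i] for i in range(n)]  # main diagonal contribution
    for d in range(1, k + 1):
        for i in range(n - d):
            y[i] += diags[d][i] * x[i + d]      # super-diagonal term
            y[i + d] += diags[d][i] * x[i]      # mirrored sub-diagonal term
    return y

# Tridiagonal example: A = [[2,1,0],[1,2,1],[0,1,2]]
diags = [[2, 2, 2], [1, 1]]
print(band_matvec(diags, 1, [1, 1, 1]))  # [3, 4, 3]
```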

I’ve also been searching for anything that compares the various methods (Cholesky, QR, conjugate gradient, etc.) so that I can see which one is fastest for banded matrices. On a hunch, I’d guess the answer is Cholesky factorization, simply because that’s the one mentioned most often in the literature I’ve seen. However, I would like to hear someone’s experience here, since I haven’t yet seen a paper comparing the various methods with regard to the time taken to do the factorization or solve the linear system of equations.

Thanks for the help.

I’m currently solving a problem where I’m doing hundreds of small QR decompositions in parallel and then finally solving my systems of equations with a backsolve.

But my problem doesn’t seem to match yours, since I’m working on many much smaller matrices.

Since you are working with sparse matrices, is the conjugate gradient method perhaps the way to go? I’m not sure how parallelizable it is, though…

Looking through a book I have that describes the conjugate gradient method, it seems simple enough, so I implemented an unoptimized version today. I still need to find a good reference that describes how to work out the preconditioner matrix. This method looks like something I can use, but only if it can solve a problem in a small number of iterations. Since CUBLAS has a function for multiplying a banded matrix by a vector, it should be easy to optimize for my banded matrices.
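For reference, an unpreconditioned CG iteration is short; here is a minimal pure-Python sketch (dense mat-vec for clarity; on the GPU that line would be the banded CUBLAS call):

```python
# Minimal unpreconditioned conjugate gradient sketch for SPD A.
# Pure Python with a dense mat-vec for readability; the mat-vec is the
# only expensive step and is what the banded CUBLAS routine would replace.

def cg(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive definite A."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                      # residual r = b - A x (x starts at 0)
    p = r[:]                      # search direction
    rs = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol:          # converged: squared residual small enough
            break
        p = [r[i] + (rs_new / rs) * p[i] for i in range(n)]
        rs = rs_new
    return x

A = [[4.0, 1.0], [1.0, 3.0]]
x = cg(A, [1.0, 2.0])
print(x)  # close to the exact solution [1/11, 7/11]
```

In exact arithmetic CG converges in at most n steps, but in practice you stop far earlier; the whole point of a preconditioner is to make that happen in few iterations.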

Still, I would like to know how the various methods compare, without having to implement and test all of them.

Your matrix is banded, but is it SPD? That can have a big bearing on which method to use. In my (not exhaustive) experience with linear problems and SPD matrices, the preconditioned CG method is probably more efficient than a direct factorisation, provided you have a suitable preconditioner that is easy to implement in CUDA and lets you keep the data on the GPU for a complete CG step at a time. That can be a pretty big if, depending on what sort of problems you are working on.

Yes, it is SPD. It comes from a quadratic programming minimization problem where it has to be SPD. The preconditioner is a pretty big if. Right now, my implementation doesn’t converge on the test matrix I am using. I’ve tried the preconditioning matrix from the Symmetric Successive Over-Relaxation (SSOR) method, but that isn’t working yet either.
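Before debugging SSOR, it may be worth sanity-checking with the simplest possible preconditioner, the diagonal (Jacobi) one, M = diag(A). A sketch under that assumption (pure Python, dense mat-vec for clarity; all names are my own):

```python
# Sketch: preconditioned CG with the Jacobi preconditioner M = diag(A).
# Applying M^{-1} is just an elementwise scale, so it is trivially parallel
# and a useful baseline before trying SSOR. Not a tuned implementation.

def pcg_jacobi(A, b, tol=1e-12, max_iter=200):
    n = len(b)
    inv_d = [1.0 / A[i][i] for i in range(n)]   # M^{-1} for M = diag(A)
    x = [0.0] * n
    r = b[:]
    z = [inv_d[i] * r[i] for i in range(n)]     # z = M^{-1} r
    p = z[:]
    rz = sum(r[i] * z[i] for i in range(n))
    for _ in range(max_iter):
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        if sum(ri * ri for ri in r) < tol:
            break
        z = [inv_d[i] * r[i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x

sol = pcg_jacobi([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(sol)  # close to the exact solution [1/11, 7/11]
```

If even unpreconditioned CG diverges on an SPD matrix, that usually points to a bug in the iteration or the mat-vec rather than the preconditioner, so this kind of baseline can help isolate the problem.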

How much more efficient is the CG method?

Thanks for the help.