CUDA and Linear Algebra

Hi, everybody.
I’m a high school senior really interested in CUDA programming. I was supposed to do some research last year with CUDA and linear algebra, but my mentor couldn’t actually do it in the end. I ended up reading a lot, but not putting much into practice. Right now, I’m looking for someone experienced to give me some advice and guidelines. Is anybody experienced interested in helping me?

For a start, I’d like to code an efficient solver for a linear system. As far as I know, CUBLAS can solve a system if it’s given in echelon (triangular) form, correct? Well, for some of my first work, how difficult would it be to compute the echelon form? Has anybody tried doing it? I’ve read some of Volkov’s papers, but they’re way too advanced for me right now. I need some introduction to the topic.

If anybody is interested in working in this area, I’d gladly exchange emails. I’m really interested in working with university students, but if someone with higher education would like to help me a little bit, it would be greatly appreciated. However, I guess someone with a PhD doesn’t have time for this :), so if anyone can help me, please either write here or ask me for my email. I am really interested in solving linear algebra problems with CUDA. However, I need someone to guide me, introduce me to the concepts, or at least recommend good books (both on linear algebra and on CUDA/parallel programming).

Thank you so much in advance.

Most people will use ready-made libraries for this, in particular LAPACK. Ports of LAPACK to the GPU are available, but not necessarily for free.

If you just want to learn and “roll your own”, I would suggest familiarizing yourself with the underlying algorithms. Do your own CUDA implementations and don’t worry about the performance until you have a solid grasp. LU decomposition is a standard algorithm used for dense solvers. The basic ideas behind it are older, so I would suggest looking up Gaussian elimination and the Gauss-Jordan algorithm. Last I checked, the Wikipedia articles for these three were reasonable starting points.
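To make the starting point concrete, here is a minimal CPU sketch of Gaussian elimination with partial pivoting followed by back substitution, in plain C++. It is only one possible way to write it: the function name `gauss_solve`, the row-major storage, and the `1e-12` singularity threshold are assumptions chosen for illustration, not anything taken from LAPACK or CUBLAS.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Solves A x = b for a dense n x n system stored row-major in `a`.
// `a` and `b` are overwritten; on success `b` holds the solution x.
// Returns false if a pivot is (numerically) zero, i.e. A looks singular.
// The 1e-12 threshold is an arbitrary illustrative choice.
bool gauss_solve(std::vector<double>& a, std::vector<double>& b, std::size_t n) {
    for (std::size_t k = 0; k < n; ++k) {
        // Partial pivoting: find the row with the largest |entry| in column k.
        std::size_t p = k;
        for (std::size_t i = k + 1; i < n; ++i)
            if (std::fabs(a[i * n + k]) > std::fabs(a[p * n + k])) p = i;
        if (std::fabs(a[p * n + k]) < 1e-12) return false;
        if (p != k) {
            for (std::size_t j = 0; j < n; ++j) std::swap(a[k * n + j], a[p * n + j]);
            std::swap(b[k], b[p]);
        }
        // Eliminate column k from every row below the pivot row.
        for (std::size_t i = k + 1; i < n; ++i) {
            const double m = a[i * n + k] / a[k * n + k];
            for (std::size_t j = k; j < n; ++j) a[i * n + j] -= m * a[k * n + j];
            b[i] -= m * b[k];
        }
    }
    // Back substitution on the resulting upper-triangular system.
    for (std::size_t i = n; i-- > 0;) {
        double s = b[i];
        for (std::size_t j = i + 1; j < n; ++j) s -= a[i * n + j] * b[j];
        b[i] = s / a[i * n + i];
    }
    return true;
}
```

For example, calling it on the system 2x + y = 5, x + 3y = 10 (so `a = {2, 1, 1, 3}`, `b = {5, 10}`, `n = 2`) leaves `b` holding x = 1, y = 3. Getting a serial version like this right first gives you a reference to check any later CUDA version against.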

Any number of books are available on the topic, and which one is most suitable is a matter of mathematical background and personal taste. One standard reference is Golub and Van Loan, “Matrix Computations”.

No, I’m not really interested in sticking with ready-made libraries for a long time. I want to have a deep understanding of the underlying algorithms and parallel programming concepts. The whole area is really interesting to me.

I’m familiar with Gauss Elimination as well as the Gauss-Jordan algorithm, but I must admit I would have no idea how to implement a good parallel algorithm in CUDA! As for the last book you mentioned, it seems like a good one for supplementing my learning. Thank you so much!

I would still appreciate learning with someone, though. I’ll make sure I look into what you wrote, njuffa, thanks!

If you are already familiar with Gauss elimination, you will readily observe that it naturally exposes some amount of parallelism. While there are dependencies between row operations, there are no such dependencies between columns. This means that each column can be handled by a separate CUDA thread, which provides a good starting point for your work.

As an engineer, my recipe for tackling big, seemingly overwhelming problems is to start at one corner, with a subproblem I have some idea how to attack, and to learn something in the process of solving it. The knowledge gained then hopefully allows me to tackle the next subproblem, and so on, until the entire problem is well understood and solved. For big problems, that whole process may take years, so persistence is helpful. I imagine it is similar in science, where some hard problems resist a comprehensive solution for decades, maybe centuries, with each generation of scientists nibbling away at various aspects of the problem until finally it is solved.

Best of luck with your work!

@momonga You can get infinitely deep by reading some of these papers http://hgpu.org/?s=matrix along with @njuffa’s recommendations.

@allanmac That link definitely looks interesting!
@njuffa I have decent programming experience myself, but I have never done much parallel programming before. Your advice is still the best advice anyone can give to someone interested in these things, and I’m glad it was also given to me a long time ago. Thanks anyway!