CULA sgels or alternative

I have been using the CULA sgels routine for least-squares solving. However, it no longer seems to be maintained, and the last release was for CUDA 5.0.

I need something that works with CUDA 5.5 and future releases. Is there an alternative available?

An SGELS routine for the GPU is available in the MAGMA library.

In magma_s.h, look for the routines magma_sgels_gpu and magma_sgels3_gpu: they rely on the magma_sgeqrf and magma_sgeqrf3 QR factorizations, respectively. The “3” version is faster, at the expense of some extra workspace.

Thanks, I’ll give that a go.

OK, I had a look at MAGMA. It does seem to contain the functionality I need.

However, I don’t think it’s the right solution for me. I am working on an open source project aimed at Windows users, with an emphasis on users working from the source code rather than the binaries, so I don’t want to create a dependency on an external library that is non-trivial to build.