Linear regression using CUDA

Hello CUDA programmers,

Is there a CUDA library that computes the regression coefficients (beta hat) from a matrix X and a vector y in a single call?
(https://www.facebook.com/groups/22207525146/permalink/10160102353845147/)

So far, I have found three components needed to calculate the coefficients: matrix multiplication (matmul), matrix inversion (cuSOLVER), and matrix transposition.
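For reference, these three operations are the ones that appear in the usual normal-equations form of ordinary least squares. This assumes X is an n x p matrix with full column rank and y has n entries (the symbols n and p are my notation, not from any library):

```latex
\hat{\beta} = (X^{\top} X)^{-1} X^{\top} y
\quad\Longleftrightarrow\quad
(X^{\top} X)\,\hat{\beta} = X^{\top} y
```

The right-hand form suggests that the explicit inverse can be replaced by solving a p x p linear system.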

It seems overwhelming to combine these three components.
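In case it helps, here is a minimal sketch of how the pieces might be combined with cuBLAS and cuSOLVER. It rests on several assumptions: X is n x p, stored column-major, with full column rank; double precision is used; and the explicit inversion is swapped for a Cholesky factorization and solve of the normal equations. The function name `ols_normal_equations` is hypothetical, and the return codes of the CUDA calls are not checked, to keep it short. No separate transpose step is needed because `cublasDgemm` and `cublasDgemv` accept a transpose flag.

```cuda
#include <cuda_runtime.h>
#include <cublas_v2.h>
#include <cusolverDn.h>

// Hypothetical helper: solves (X^T X) beta = X^T y on the GPU.
// X: n x p, column-major, host memory. y: n entries. beta: p entries (output).
// Returns 0 on success, or the nonzero potrf info value if the Cholesky
// factorization fails (X^T X not positive definite).
// Assumption: CUDA/cuBLAS/cuSOLVER return codes are not checked here.
int ols_normal_equations(const double* X, const double* y, int n, int p, double* beta) {
    double *dX, *dy, *dA, *db, *dWork;
    int *dInfo, info = 0, lwork = 0;
    const double one = 1.0, zero = 0.0;

    cudaMalloc(&dX, sizeof(double) * n * p);
    cudaMalloc(&dy, sizeof(double) * n);
    cudaMalloc(&dA, sizeof(double) * p * p);   // holds X^T X, then its Cholesky factor
    cudaMalloc(&db, sizeof(double) * p);       // holds X^T y, then beta
    cudaMalloc(&dInfo, sizeof(int));
    cudaMemcpy(dX, X, sizeof(double) * n * p, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, y, sizeof(double) * n, cudaMemcpyHostToDevice);

    cublasHandle_t blas;
    cusolverDnHandle_t solver;
    cublasCreate(&blas);
    cusolverDnCreate(&solver);

    // A = X^T X (p x p); the transpose is requested via CUBLAS_OP_T.
    cublasDgemm(blas, CUBLAS_OP_T, CUBLAS_OP_N, p, p, n,
                &one, dX, n, dX, n, &zero, dA, p);
    // b = X^T y (p entries).
    cublasDgemv(blas, CUBLAS_OP_T, n, p, &one, dX, n, dy, 1, &zero, db, 1);

    // Cholesky factorization of A, used here in place of an explicit inverse.
    cusolverDnDpotrf_bufferSize(solver, CUBLAS_FILL_MODE_LOWER, p, dA, p, &lwork);
    cudaMalloc(&dWork, sizeof(double) * lwork);
    cusolverDnDpotrf(solver, CUBLAS_FILL_MODE_LOWER, p, dA, p, dWork, lwork, dInfo);
    cudaMemcpy(&info, dInfo, sizeof(int), cudaMemcpyDeviceToHost);

    if (info == 0) {
        // Solve A * beta = b in place; db is overwritten with beta.
        cusolverDnDpotrs(solver, CUBLAS_FILL_MODE_LOWER, p, 1, dA, p, db, p, dInfo);
        cudaMemcpy(beta, db, sizeof(double) * p, cudaMemcpyDeviceToHost);
    }

    cusolverDnDestroy(solver);
    cublasDestroy(blas);
    cudaFree(dWork); cudaFree(dInfo);
    cudaFree(db); cudaFree(dA); cudaFree(dy); cudaFree(dX);
    return info;
}
```

Something like this would need to be compiled with nvcc and linked against cuBLAS and cuSOLVER (for example `-lcublas -lcusolver`). Forming X^T X squares the condition number of X, so for ill-conditioned data a QR-based solve would likely be a safer choice than this sketch.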