In my case, solving a linear system Ax = b on the GPU, where A is a 30000 x 30000 symmetric sparse matrix (so its CSC representation uses the same arrays as its CSR representation) with at most 13k nonzeros, is AT LEAST 10 times slower than even a single-threaded solver on a laptop CPU.

I use an RTX 2080 running at 1.9 GHz, and its core utilization is near 99%. The CPU is a laptop i7-9750H running at 2.6 GHz.

In both cases I prefactorized (numerical analysis) once and then ran the solve directly 20 times. The RTX 2080 needs 50-55 ms for each consecutive Cholesky solve, about 1.1 s in total. The CPU solves it in at most 0.07 s on a single thread with hyperthreading enabled, a configuration that should be far slower.
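The factor-once, solve-many timing pattern described above can be sketched as follows. This is only a toy illustration in pure Python: a small dense tridiagonal SPD matrix stands in for the real sparse one, and the helper names (`cholesky_factor`, `cholesky_solve`, `make_spd_tridiagonal`, `time_repeated_solves`) are hypothetical, not the GPU or CPU solver calls used in the measurements.

```python
import math
import time


def cholesky_factor(a):
    """Return lower-triangular L with a = L * L^T (a must be SPD)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def cholesky_solve(l, b):
    """Solve (L L^T) x = b by forward, then backward substitution."""
    n = len(l)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x


def make_spd_tridiagonal(n):
    """Toy SPD matrix (assumption): 2 on the diagonal, -1 next to it."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]


def time_repeated_solves(a, rhs_list):
    """Factor once, then time each solve on its own.

    Returns (solutions, seconds_per_solve); the factorization is
    deliberately excluded from the timings.
    """
    l = cholesky_factor(a)
    solutions, timings = [], []
    for b in rhs_list:
        t0 = time.perf_counter()
        solutions.append(cholesky_solve(l, b))
        timings.append(time.perf_counter() - t0)
    return solutions, timings


if __name__ == "__main__":
    n = 50
    a = make_spd_tridiagonal(n)
    rhs_list = [[float(i + k) for i in range(n)] for k in range(20)]
    _, timings = time_repeated_solves(a, rhs_list)
    mean_ms = sum(timings) / len(timings) * 1e3
    print(f"20 solves: total {sum(timings):.6f} s, mean {mean_ms:.3f} ms")
```

Timing each solve separately, with the factorization outside the timed region, matches how the per-solve and total figures above are reported. On a GPU the timer would also need a device synchronization before each reading, otherwise it may only measure the kernel launch.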

My professor suggests a higher-end GPU, but I really don’t think that would deliver the 10x or more speedup needed.