Is cuSOLVER sparse Cholesky necessarily slower than a single-threaded CPU solver?

czhan169 · August 30, 2020, 7:29pm

In my case, solving a linear system Ax=b with cuSOLVER, where A is a 30000*30000 symmetric sparse matrix (so its CSC representation has the same arrays as its CSR representation) with at most 13k nonzeros, is AT LEAST 10 times slower than even a single-threaded solver on a laptop CPU.

I use an RTX 2080 running at 1.9 GHz, and the core utilization is near 99%. The CPU is a laptop i7-9750H running at 2.6 GHz.

In both cases I prefactorize (numerical analysis) once and then solve directly 20 times. The RTX 2080 needs 50-55 ms for each consecutive Cholesky solve, about 1.1 s in total. The CPU solves it in at most 0.07 s on a single thread with hyperthreading enabled, a setup I would expect to be way slower.
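For reference, the measurement pattern described above (factorize once, then time the repeated solves separately) can be sketched roughly as follows. This is only a minimal illustration in pure Python on a small dense tridiagonal matrix, not the cuSOLVER or CPU solver code used in this post. All function names, the test matrix, and the right-hand sides are assumptions made up for the example.

```python
import math
import time


def cholesky(a):
    """Return lower-triangular L with a = L * L^T (a must be symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def solve_with_factor(low, b):
    """Solve (L L^T) x = b by forward substitution, then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def make_spd_tridiagonal(n):
    """Hypothetical test matrix: 2 on the diagonal, -1 on the off-diagonals."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]


def residual_norm(a, x, b):
    """Max-norm of A x - b."""
    n = len(b)
    return max(abs(sum(a[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n))


def run_benchmark(n=50, num_solves=20):
    """Time one factorization and num_solves solves separately."""
    a = make_spd_tridiagonal(n)

    t0 = time.perf_counter()
    low = cholesky(a)  # done once
    t_factor = time.perf_counter() - t0

    results = []
    t0 = time.perf_counter()
    for s in range(num_solves):  # reuse the factor for every right-hand side
        b = [float((i + s) % 7 + 1) for i in range(n)]
        results.append((b, solve_with_factor(low, b)))
    t_solve = time.perf_counter() - t0

    worst = max(residual_norm(a, x, b) for b, x in results)
    return t_factor, t_solve, worst


if __name__ == "__main__":
    t_factor, t_solve, worst = run_benchmark()
    print(f"factorization: {t_factor:.6f} s, 20 solves: {t_solve:.6f} s, "
          f"worst residual: {worst:.2e}")
```

Keeping the two timers separate makes it clear whether the cost sits in the factorization or in the repeated triangular solves, which is the part being compared here.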

My professor suggests I get a higher-end GPU, but I really don’t think that would gain the 10x or more in performance that is needed.

ytz15 · September 17, 2020, 9:22pm

Hi, just out of curiosity, how did you manage the data transfer between host and device? Do you transfer the data 20 times or just once?

czhan169 · September 18, 2020, 2:39am

Just once, and even if I transferred 20 times, the total transfer time would be far smaller than a single solve with the already factorized matrix.

I moved back to CUDA 7.5, and it is at least twice as fast as CUDA 11.0. Strange, though it is still laughably slower than the CPU.
