Why is cusparseDcsrsv_solve so slow?

duanjiawei · March 8, 2018, 2:17am

I’m using a GTX 1080 Ti to implement the BiCGSTAB algorithm following Dr. Maxim Naumov’s papers, but the performance is disappointing. There are two problems:

1. I have also implemented this method on a GTX 1050 (2 GB). A single run there is 30-40% slower than on the GTX 1080 Ti, but the GTX 1050 can run several instances simultaneously with little performance loss, while the GTX 1080 Ti cannot (e.g. one instance takes 4 s on the GTX 1050 and three run together take 5 s, but one instance takes 3 s on the GTX 1080 Ti and three run together take 8 s).
2. There are many gaps in the timeline when solving the sparse system with cusparseDcsrsv_solve (see the screenshot below), so only a small part of the elapsed time is actually spent computing. Is there a way to improve this?
https://s1.ax1x.com/2018/03/08/92KdKS.png
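
For readers unfamiliar with the routine: cusparseDcsrsv_solve performs a sparse triangular solve on a CSR matrix. The sketch below is a plain sequential CPU version of the lower-triangular case in C, included only to make the operation concrete. It is not the cuSPARSE implementation, and the function name `csr_lower_solve` and the assumption that every row stores its diagonal entry are illustrative choices, not taken from the post. It shows that each x[i] needs previously computed entries x[j] with j < i, which limits how many rows can be processed in parallel.

```c
#include <assert.h>

/* Hypothetical helper (not a cuSPARSE call): solve L * x = b by forward
 * substitution, where L is an n-by-n lower triangular matrix in CSR form.
 * Assumes every row stores a nonzero diagonal entry.
 * Returns 0 on success, -1 if a row has a missing or zero diagonal. */
int csr_lower_solve(int n, const int *row_ptr, const int *col_idx,
                    const double *val, const double *b, double *x)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        double sum = b[i];
        double diag = 0.0;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            int j = col_idx[k];
            if (j < i) {
                /* Row i depends on x[j], which must already be known. */
                sum -= val[k] * x[j];
            } else if (j == i) {
                diag = val[k];
            }
            /* Entries with j > i are ignored (upper part). */
        }
        if (diag == 0.0) {
            return -1;
        }
        x[i] = sum / diag;
    }
    return 0;
}
```

Rows that do not depend on each other could in principle be solved concurrently, while a long chain of dependent rows has to be processed one after another.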

