CUSPARSE implementation of SpMV

slt · June 14, 2022, 11:11am

Hi all,

I am using cuSPARSE to implement a Preconditioned Conjugate Gradient solver, which computes the SpMV product many times.
I am working on an optimization of the solver, and for that I need to know whether cuSPARSE implements SpMV with the scalar kernel, the vector kernel, or some other variant (see https://www.nvidia.com/docs/IO/77944/sc09-spmv-throughput.pdf).

Would it be possible to obtain this information?

Thank you,
Sergi

fbusato · June 14, 2022, 5:20pm

cuSPARSE does not implement the algorithms proposed in the paper you linked. cusparseSpMV follows a nonzero-splitting approach; see “Merge-Based Parallel Sparse Matrix-Vector Multiplication”.
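
To illustrate what nonzero splitting means for a CSR matrix, here is a minimal CPU sketch in C++. It is not cuSPARSE's code and not the full merge-path algorithm from the paper: it only shows the core idea of giving each worker an equal share of nonzeros instead of a share of rows. The `Csr` struct, the function name `spmv_nnz_split`, and the `workers` parameter (a stand-in for GPU threads) are hypothetical names chosen for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// CSR matrix: row_ptr has rows + 1 entries; col_idx and val have nnz entries.
struct Csr {
    int rows;
    std::vector<int> row_ptr, col_idx;
    std::vector<double> val;
};

// Computes y = A * x with the nonzeros split evenly across `workers`.
// Worker w owns the nonzero range [begin, end), regardless of row boundaries.
// Assumes workers >= 1.
std::vector<double> spmv_nnz_split(const Csr& a, const std::vector<double>& x,
                                   int workers) {
    std::vector<double> y(a.rows, 0.0);
    const int nnz = a.row_ptr[a.rows];
    const int chunk = (nnz + workers - 1) / workers;  // nonzeros per worker
    for (int w = 0; w < workers; ++w) {  // iterations are independent workers
        const int begin = std::min(w * chunk, nnz);
        const int end = std::min(begin + chunk, nnz);
        if (begin == end) continue;
        // Binary search for the row that contains nonzero `begin`.
        int row = static_cast<int>(
            std::upper_bound(a.row_ptr.begin(), a.row_ptr.end(), begin) -
            a.row_ptr.begin()) - 1;
        double sum = 0.0;
        for (int k = begin; k < end; ++k) {
            // Flush the running sum whenever the range crosses a row boundary
            // (the loop also steps over empty rows).
            while (k >= a.row_ptr[row + 1]) {
                y[row] += sum;
                sum = 0.0;
                ++row;
            }
            sum += a.val[k] * x[a.col_idx[k]];
        }
        // The last row may be shared with the next worker; a parallel version
        // would need an atomic add or a separate fix-up pass here.
        y[row] += sum;
    }
    return y;
}
```

Because the work per worker depends only on the number of nonzeros, a few very long rows do not unbalance the load, which is the main motivation for this family of methods.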

slt · June 15, 2022, 7:36am

Does cusparsecsrmv() in CUDA Toolkit version 10.2.89 also use this merge-based approach?

The paper you cited states that version 7.5 of cuSPARSE uses vectorization, so I understand that at some point the implementation changed to nonzero splitting.

Also, is the nonzero splitting done per thread, per warp, or per SM?

Thank you,
Sergi

EDIT: Added question and changed misunderstood text.

fbusato · June 21, 2022, 5:49pm

cusparsecsrmv() is a deprecated API that has been replaced by cusparseSpMV(). cusparsecsrmv() used a subwarp-to-row mapping. As for nonzero splitting, all state-of-the-art approaches use a per-thread strategy.
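
For comparison, a subwarp-to-row mapping can be sketched on the CPU as below. This is only an emulation under stated assumptions, not the legacy cuSPARSE kernel: the function name `spmv_subwarp_rows` and the `lanes` parameter (the subwarp size, assumed to be a power of two) are hypothetical. Each row is handled by a group of `lanes` cooperating lanes that stride across the row's nonzeros and then reduce their partial sums.

```cpp
#include <cassert>
#include <vector>

// Computes y = A * x for a CSR matrix, emulating one subwarp of `lanes`
// threads per row. `lanes` is assumed to be a power of two (1, 2, 4, ...).
std::vector<double> spmv_subwarp_rows(int rows,
                                      const std::vector<int>& row_ptr,
                                      const std::vector<int>& col_idx,
                                      const std::vector<double>& val,
                                      const std::vector<double>& x,
                                      int lanes) {
    std::vector<double> y(rows, 0.0);
    std::vector<double> partial(lanes);
    for (int row = 0; row < rows; ++row) {  // one subwarp per row
        // On a GPU the lanes run concurrently; here they run one after another.
        for (int lane = 0; lane < lanes; ++lane) {
            double sum = 0.0;
            // Lane `lane` takes nonzeros lane, lane + lanes, ... of this row.
            for (int k = row_ptr[row] + lane; k < row_ptr[row + 1]; k += lanes)
                sum += val[k] * x[col_idx[k]];
            partial[lane] = sum;
        }
        // Tree reduction of the partial sums; a GPU kernel would typically
        // do this step with warp shuffle instructions.
        for (int stride = lanes / 2; stride > 0; stride /= 2)
            for (int lane = 0; lane < stride; ++lane)
                partial[lane] += partial[lane + stride];
        y[row] = partial[0];
    }
    return y;
}
```

With this mapping the work per subwarp depends on the row length, so rows much longer than average can dominate the runtime. Nonzero splitting avoids that imbalance.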
