Implementation of sparse triangular solver in cuSparse

Astolfo · December 27, 2019, 9:58am

I am writing a sparse triangular solver (Ax = b) based on the paper “Parallel Solution of Sparse Triangular Linear Systems in the Preconditioned Iterative Methods on the GPU”. Here is the link: https://research.nvidia.com/sites/default/files/publications/nvr-2011-001.pdf
I use the level sets and chains described in the paper, store and access the matrix in CSR format, and, as the paper says, have each thread process one row.
However, my solver performs a lot worse than cusparseDcsrsv2_solve from cuSPARSE when the DAG has more than 10 levels. I am looking for more implementation details of the kernel function, which the paper leaves out. Are there any tricks for memory access in the kernel?
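
To make the setup concrete, here is a minimal CPU-side sketch of the level-set idea for a lower-triangular CSR matrix. It is only an illustration under assumptions, not the paper's or cuSPARSE's actual implementation: the names `CsrLower`, `buildLevels`, and `solveByLevels` are hypothetical, the diagonal entry is assumed to be stored in every row, and chains are not modelled. The loop over the rows of one level is the part that would map to one thread per row on the GPU.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical container: lower-triangular matrix in CSR format.
// Assumption: every row stores its (nonzero) diagonal entry.
struct CsrLower {
    int n;                      // number of rows
    std::vector<int> rowPtr;    // size n + 1
    std::vector<int> colIdx;    // column index of each stored entry
    std::vector<double> val;    // value of each stored entry
};

// Analysis phase: a row's level is one more than the highest level among
// the rows it depends on. Rows with no off-diagonal entries are level 0.
std::vector<std::vector<int>> buildLevels(const CsrLower& L) {
    std::vector<int> level(L.n, 0);
    int maxLevel = 0;
    for (int i = 0; i < L.n; ++i) {
        for (int k = L.rowPtr[i]; k < L.rowPtr[i + 1]; ++k) {
            int j = L.colIdx[k];
            if (j < i) level[i] = std::max(level[i], level[j] + 1);
        }
        maxLevel = std::max(maxLevel, level[i]);
    }
    // Group row indices by level so each level can be processed as a batch.
    std::vector<std::vector<int>> levels(maxLevel + 1);
    for (int i = 0; i < L.n; ++i) levels[level[i]].push_back(i);
    return levels;
}

// Solve phase: levels are processed in order; rows inside one level are
// independent of each other, so on a GPU each could go to its own thread.
std::vector<double> solveByLevels(const CsrLower& L,
                                  const std::vector<std::vector<int>>& levels,
                                  const std::vector<double>& b) {
    std::vector<double> x(L.n, 0.0);
    for (const auto& rows : levels) {
        for (int i : rows) {
            double sum = b[i];
            double diag = 0.0;
            for (int k = L.rowPtr[i]; k < L.rowPtr[i + 1]; ++k) {
                int j = L.colIdx[k];
                if (j == i) diag = L.val[k];
                else        sum -= L.val[k] * x[j];
            }
            assert(diag != 0.0);  // the sketch assumes a stored nonzero diagonal
            x[i] = sum / diag;
        }
    }
    return x;
}
```

In this form the cost per level is one pass over the rows in that level, so a deep DAG with few rows per level leaves most threads idle and adds one synchronization point per level, which may be part of the gap I am seeing.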
