Hello, CUDA developers.
I have a simple question on the use of CUBLAS and CUSPARSE.
My whole matrix consists of thousands of triangular dense block matrices.
In this case, are there any better routines I can use than csrsv or csrsv2?
Or will the analysis phase detect the independence between the block matrices?
It seems that there is a batched version of trsm, but my problem is essentially trsv.
I could pass each right-hand-side vector to trsm as a single-column matrix, but I presume that would not be efficient.
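To make the question concrete, here is a minimal CPU sketch in plain C of the computation I have in mind. It is only a reference for the math, not a cuBLAS or cuSPARSE call. The function names (trsv_lower, trsv_lower_batched) are hypothetical, and I am assuming for illustration that all blocks are lower triangular with a non-unit diagonal, have the same size n, and are stored contiguously in column-major order. Each block is solved independently of the others, which is the parallelism I would like the library to exploit.

```c
#include <assert.h>
#include <stddef.h>

/* Solve L x = b in place for one n-by-n lower-triangular block.
 * L is column-major with leading dimension n and a non-unit diagonal.
 * On entry x holds b; on exit it holds the solution. */
void trsv_lower(size_t n, const double *L, double *x)
{
    for (size_t i = 0; i < n; ++i) {
        double s = x[i];
        for (size_t j = 0; j < i; ++j)
            s -= L[i + j * n] * x[j];   /* subtract already-solved terms */
        x[i] = s / L[i + i * n];        /* divide by the diagonal entry */
    }
}

/* Solve `batch` independent triangular systems.
 * blocks: batch consecutive n*n column-major blocks (assumed layout).
 * rhs:    batch consecutive length-n vectors, overwritten with solutions.
 * No iteration of the outer loop depends on another, so on a GPU each
 * block could in principle be handled by its own thread group. */
void trsv_lower_batched(size_t batch, size_t n,
                        const double *blocks, double *rhs)
{
    assert(n > 0);
    for (size_t k = 0; k < batch; ++k)
        trsv_lower(n, blocks + k * n * n, rhs + k * n);
}
```

With the batched trsm, I believe each of these per-block solves would correspond to one batch entry with a single right-hand-side column.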
Thank you!