What 'cusparseXcscsort' or 'cusparseXcsrsort' use for?

I notice that in cusparse, there is cusparseXcscsort. What is this function used for? There is few references on Google. Can anyone give a specific example?

The CSR format used by cusparse has the concept of both unsorted and sorted elements within a given row; for the sorted case, the order of elements presented will be such that the column index is in increasing order.

Likewise the CSC sorted format assumes that within a given column, the order of elements presented will be such that the row index is in increasing order.

If you look at the example given in the function description, this is not the case:

csrColInd[nnz] = { 2, 1, 0, 0, 2, 1, 1, 2, 0}; // on device
csrVal[nnz] =    { 3, 2, 1, 4, 6, 5, 8, 9, 7}; // on device

The presented order is: 2,1,0, 0, 2, 1, 1, 2, 0

when, if it were sorted, it should be: 0, 1, 2, 0, 1, 2, 0, 1, 2

Therefore at the completion of that example code, since the sorting is done in place, we would expect the column indices and the values to be reordered like this:

csrColInd[nnz] = { 0, 1, 2, 0, 1, 2, 0, 1, 2}; // on device
csrVal[nnz] =    { 1, 2, 3, 4, 5, 6, 7, 8, 9}; // on device

Although cusparse advertises support for sorted and unsorted cases, AFAIK this was not always true. Furthermore, it appears that some functions such as geam2 still expect sorted input.

Reasons why unsorted format and sorted format might both be of interest are given in a deprecated function description:

This function transfers unsorted CSR format to CSR format, and vice versa. The operation is in-place.

This function is a wrapper of csrsort and gthr. The usecase is the following scenario.

If the user has a matrix A of CSR format which is unsorted, and implements his own code (which can be CPU or GPU kernel) based on this special order (for example, diagonal first, then lower triangle, then upper triangle), and wants to convert it to CSR format when calling CUSPARSE library, and then convert it back when doing something else on his/her kernel. For example, suppose the user wants to solve a linear system Ax=b by the following iterative scheme

The code heavily uses SpMv and triangular solve. Assume that the user has an in-house design of SpMV (Sparse Matrix-Vector multiplication) based on special order of A. However the user wants to use CUSAPRSE library for triangular solver.

Its evident that that now-deprecated function description was written in an era when cusparse still mostly expected a sorted order.

Hi, is there any performance difference known for cusparse SpMv and Spgemm when use sorted vs unsorted CSR matrix?

Nothing is documented that I know of. I don’t have any knowledge about perf differences.

In general, sorted CSR provides better performance thanks to a better cache utilization, especially for SpMV. However, I could expect a few cases where this is not true.

1 Like