Which compressed matrix format is faster with the cuSPARSE library?

Using cusparseSpMM, I compute the product A (compressed) * B.
According to the documentation, row-major is faster than column-major.
I compressed the A matrix to CSR and CSC formats respectively (row-major order).
Then I ran the matrix multiplications A (CSR format) * B and A (CSC format) * B.
I thought the speed difference would be negligible, but the CSR version was 1.5 to 2 times as fast.
Why is the speed difference greater than I expected?

Hi @jeus5771. Can you tell us what SpMM algorithm you used?

Thank you for your reply.
I used the CUSPARSE_SPMM_ALG_DEFAULT algorithm for both SpMM calls.
M, N, and K are all 2048, and the sparsity of matrix A is set to 99%.

CSC SpMM has the same performance as CSR SpMM with A^T * B, because the computation cannot be performed in the same way as CSR A * B.
Indeed, atomic operations are needed to produce the final result, and this affects performance.
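
To make the difference concrete, here is a minimal CPU sketch in Python. It is not how cuSPARSE implements its kernels, and the function names (spmm_csr, spmm_csc) are hypothetical; it only shows why a row-oriented format lets each output row be written by a single worker, while a column-oriented format scatters updates into rows of C that other columns also touch.

```python
# Illustrative CPU sketch only; NOT the cuSPARSE implementation.
# Function names are hypothetical and exist only for this example.

def spmm_csr(m, n, row_ptr, col_idx, vals, B):
    """C = A * B with A in CSR: row i of C depends only on row i of A."""
    C = [[0.0] * n for _ in range(m)]
    for i in range(m):  # a GPU could assign one thread/warp per row
        for p in range(row_ptr[i], row_ptr[i + 1]):
            k, a = col_idx[p], vals[p]
            for j in range(n):
                C[i][j] += a * B[k][j]  # only the owner of row i writes C[i]
    return C

def spmm_csc(m, n, col_ptr, row_idx, vals, B):
    """C = A * B with A in CSC: column k of A scatters into many rows of C."""
    C = [[0.0] * n for _ in range(m)]
    for k in range(len(col_ptr) - 1):  # parallelizing here is over columns
        for p in range(col_ptr[k], col_ptr[k + 1]):
            i, a = row_idx[p], vals[p]
            for j in range(n):
                # Different columns k can target the same C[i][j], so a
                # parallel version would need atomic adds (or a reduction).
                C[i][j] += a * B[k][j]
    return C

# A = [[1, 0, 2],
#      [0, 3, 0]]
csr = ([0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0])     # row_ptr, col_idx, vals
csc = ([0, 1, 2, 3], [0, 1, 0], [1.0, 3.0, 2.0])  # col_ptr, row_idx, vals
B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

C_csr = spmm_csr(2, 2, *csr, B)
C_csc = spmm_csc(2, 2, *csc, B)
```

Both functions return the same C; the point is only that the write pattern into C differs, which is where the atomic operations come from in a parallel setting.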

Does this mean that the CSC and CSR representations of matrix A must have the same memory layout (csrValA, cscValA) to get the same performance?

No. It means that SpMM with CSC (CSR^T) and SpMM with CSR use entirely different algorithms. The reason is that the two formats represent the same matrix in different ways (by columns or by rows).
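
As a small illustration of "same matrix, different representation", the sketch below converts CSR arrays to CSC arrays on the CPU in plain Python. The helper name csr_to_csc is hypothetical and this is not a cuSPARSE routine. It shows that the value arrays (the csrValA and cscValA mentioned in the question) generally hold the same numbers in a different order, and that the CSC arrays of A are exactly the CSR arrays of A^T.

```python
# Illustrative sketch only; csr_to_csc is a hypothetical helper,
# not a cuSPARSE API.

def csr_to_csc(m, k, row_ptr, col_idx, vals):
    """Convert an m x k matrix from CSR arrays to CSC arrays."""
    nnz = len(vals)
    # Count nonzeros per column, then prefix-sum into column pointers.
    col_ptr = [0] * (k + 1)
    for c in col_idx:
        col_ptr[c + 1] += 1
    for c in range(k):
        col_ptr[c + 1] += col_ptr[c]
    # Place each nonzero into the next free slot of its column.
    row_idx = [0] * nnz
    csc_vals = [0.0] * nnz
    next_slot = col_ptr[:k]  # copy of the column start offsets
    for i in range(m):
        for p in range(row_ptr[i], row_ptr[i + 1]):
            c = col_idx[p]
            q = next_slot[c]
            row_idx[q] = i
            csc_vals[q] = vals[p]
            next_slot[c] += 1
    return col_ptr, row_idx, csc_vals

# A = [[1, 0, 2],
#      [0, 3, 0]]
csr_row_ptr, csr_col_idx, csr_vals = [0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0]

csc_col_ptr, csc_row_idx, csc_vals = csr_to_csc(
    2, 3, csr_row_ptr, csr_col_idx, csr_vals
)
# csr_vals is [1.0, 2.0, 3.0] (row order); csc_vals is [1.0, 3.0, 2.0]
# (column order). Read as CSR, the CSC arrays describe the 3 x 2 matrix A^T.
```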