If I can freely choose the matrix storage format when working with cuSPARSE, which one should I choose for best performance?
I receive my matrix on the CPU from another library, in a format that cuSPARSE does not support. I then convert it to a cuSPARSE format and transfer it to the GPU to compute several matrix-vector products for use in a large distributed equation solver. I might also want to compute a preconditioner for this matrix.
The matrix originates from an FEM discretization and therefore has an irregularly banded structure. The sparsity pattern does not change between iterations of the process described above, so I can use a precomputed index map for the conversion from the CPU format to the GPU format.
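
To make the index-map idea concrete, here is a minimal host-side sketch of what I have in mind. It rests on assumptions that are not fixed yet: the source library is assumed to hand over unordered (row, column, value) triplets without duplicate entries, and CSR is used as the target only for illustration, not because I know it to be the best choice. The names `CsrPattern`, `build_pattern` and `fill_values` are hypothetical helpers, not part of cuSPARSE.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical container for the fixed sparsity pattern in CSR layout.
struct CsrPattern {
    int rows = 0;
    std::vector<int> row_ptr;            // size rows + 1
    std::vector<int> col_idx;            // size nnz
    std::vector<std::size_t> src_index;  // CSR slot k reads source entry src_index[k]
};

// Done once: sort the source entries by (row, column) and remember the permutation.
// Assumes the source triplets contain no duplicate (row, column) pairs.
CsrPattern build_pattern(int rows,
                         const std::vector<int>& src_row,
                         const std::vector<int>& src_col) {
    assert(src_row.size() == src_col.size());
    const std::size_t nnz = src_row.size();

    CsrPattern p;
    p.rows = rows;
    p.src_index.resize(nnz);
    std::iota(p.src_index.begin(), p.src_index.end(), std::size_t{0});
    std::sort(p.src_index.begin(), p.src_index.end(),
              [&](std::size_t a, std::size_t b) {
                  return src_row[a] != src_row[b] ? src_row[a] < src_row[b]
                                                  : src_col[a] < src_col[b];
              });

    p.row_ptr.assign(static_cast<std::size_t>(rows) + 1, 0);
    p.col_idx.resize(nnz);
    for (std::size_t k = 0; k < nnz; ++k) {
        const std::size_t s = p.src_index[k];
        p.col_idx[k] = src_col[s];
        ++p.row_ptr[static_cast<std::size_t>(src_row[s]) + 1];  // count entries per row
    }
    for (int r = 0; r < rows; ++r) {
        p.row_ptr[r + 1] += p.row_ptr[r];  // prefix sum turns counts into offsets
    }
    return p;
}

// Done every iteration: gather the new values into CSR order using the stored map.
void fill_values(const CsrPattern& p,
                 const std::vector<double>& src_val,
                 std::vector<double>& csr_val) {
    assert(src_val.size() == p.src_index.size());
    csr_val.resize(src_val.size());
    for (std::size_t k = 0; k < csr_val.size(); ++k) {
        csr_val[k] = src_val[p.src_index[k]];
    }
}
```

With this layout, `row_ptr` and `col_idx` would only need to be copied to the GPU once, and each later iteration would only gather the values with `fill_values` and upload the value array.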