I am using the function ‘cusparseScsr2csc’ of the CUSPARSE library to convert a matrix from CSR format to CSC format.
The matrix has about 512^3 (roughly 134 million) non-zero single-precision floating-point values.
The memory for both the input CSR matrix and the output CSC matrix is successfully allocated on the GPU, but ‘cusparseScsr2csc’ still fails with a CUSPARSE_STATUS_ALLOC_FAILED error.
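For scale, here is a rough sketch of how much storage the two matrices need on their own, before any temporary workspace the library might allocate internally. It assumes 32-bit integer indices (which is what the `int` arguments of ‘cusparseScsr2csc’ imply). The helper names `compressed_bytes` and `csr_plus_csc_bytes` are hypothetical, introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to store one compressed sparse matrix (CSR or CSC) with
 * single-precision values and 32-bit indices.
 * n_ptr is the number of compressed rows (CSR) or columns (CSC).
 * Hypothetical helper, not part of cuSPARSE. */
static size_t compressed_bytes(size_t n_ptr, size_t nnz)
{
    /* Assumption: nnz must be representable as a 32-bit signed int. */
    assert(nnz <= (size_t)INT32_MAX);

    size_t values  = nnz * sizeof(float);            /* value array        */
    size_t indices = nnz * sizeof(int32_t);          /* column/row indices */
    size_t ptrs    = (n_ptr + 1) * sizeof(int32_t);  /* row/column offsets */
    return values + indices + ptrs;
}

/* Combined footprint of the CSR input and the CSC output. */
static size_t csr_plus_csc_bytes(size_t rows, size_t cols, size_t nnz)
{
    return compressed_bytes(rows, nnz) + compressed_bytes(cols, nnz);
}
```

With nnz = 512^3 = 134,217,728, the value and index arrays alone take 512 MiB each, so about 1 GiB per format and about 2 GiB for input plus output, not counting the pointer arrays (whose size depends on the matrix dimensions, which I have not included here).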
My question: what is the memory overhead required by ‘cusparseScsr2csc’ to compute the conversion? It seems significant…
Thanks for your help.