Cusparse coo2csr not working with arrays in host memory, but works when in device memory

amirbarda · July 29, 2021, 1:26pm

I am quite new to CUDA, and I am interested in using its sparse solver for a project.
From the documentation, I understand that I need to convert my COO-formatted sparse matrices to CSR format for use in the sparse solver, so I am using the cusparseXcoo2csr routine supplied in the cuSPARSE library:

cusparseStatus_t
cusparseXcoo2csr(cusparseHandle_t   handle,
                 const int*         cooRowInd,
                 int                nnz,
                 int                m,
                 int*               csrRowPtr,
                 cusparseIndexBase_t idxBase)

From testing, it seems to work when the cooRowInd and csrRowPtr arguments are in device memory:

    int* csrRowPtr =0;

    int nnz = 9;

    int n = 4;

    int* cooRowIndex = 0;

    std::vector<int> temp_coo_ind = {0,0,0,1,2,2,2,3,3};

    cudaError_t cudaStat1 = cudaMalloc ((void**)&cooRowIndex , nnz * sizeof(cooRowIndex[0]));

    cudaStat1 = cudaMemcpy (cooRowIndex , temp_coo_ind.data() ,
                             (size_t) (nnz * sizeof (cooRowIndex[0])) , cudaMemcpyHostToDevice ) ;

    cudaStat1 = cudaMalloc ((void**) &csrRowPtr , (n+1) * sizeof (csrRowPtr[ 0 ])) ;

    int* h_csrRowPtr = (int*)malloc((n+1)*sizeof(int));

    auto status = cusparseXcoo2csr(cusparseHandle, /* used in residual evaluation */
                                   cooRowIndex, //on device memory
                                   nnz,
                                   n,
                                   csrRowPtr, //on device memory
                                   CUSPARSE_INDEX_BASE_ZERO );

    cudaMemcpy (h_csrRowPtr , csrRowPtr ,
                (size_t) ((n+1) * sizeof (h_csrRowPtr[0])) , cudaMemcpyDeviceToHost );

    std::vector<int> vec_csrRow(h_csrRowPtr,h_csrRowPtr+n+1); //vec_csrRow = [0,3,4,7,9] - OK!

When the cooRowInd and csrRowPtr arguments are in host memory, csrRowPtr remains unchanged:

    int* csrRowPtr =0;

    int nnz = 9;

    int n = 4;

    int* cooRowIndex = 0;

    std::vector<int> temp_coo_ind = {0,0,0,1,2,2,2,3,3};

    cudaError_t cudaStat1 = cudaMalloc ((void**)&cooRowIndex , nnz * sizeof(cooRowIndex[0]));

    cudaStat1 = cudaMemcpy (cooRowIndex , temp_coo_ind.data() ,
                             (size_t) (nnz * sizeof (cooRowIndex[0])) , cudaMemcpyHostToDevice ) ;

    cudaStat1 = cudaMalloc ((void**) &csrRowPtr , (n+1) * sizeof (csrRowPtr[ 0 ])) ;

    int* h_csrRowPtr = (int*)malloc((n+1)*sizeof(int));

    auto status = cusparseXcoo2csr(cusparseHandle, /* used in residual evaluation */
                                   temp_coo_ind.data(), //on host memory
                                   nnz,
                                   n,
                                   h_csrRowPtr, //on host memory
                                   CUSPARSE_INDEX_BASE_ZERO );

    std::vector<int> vec_csrRow(h_csrRowPtr,h_csrRowPtr+n+1); // vec_csrRow != [0,3,4,7,9]

Why is this? What am I doing wrong here, and how can I get the conversion to work in host memory? Thanks!

Robert_Crovella · July 29, 2021, 2:18pm

Because the data is expected to be in device memory. I’m not aware of cusparse providing conversion routines that work in host memory.
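
If the conversion has to stay on the host, one option is to compute the row pointer on the CPU yourself. The sketch below is plain C++ and is not a cuSPARSE API; the function name cooToCsrRowPtrHost is hypothetical. It assumes zero-based indexing (the equivalent of CUSPARSE_INDEX_BASE_ZERO) and counts the nonzeros per row, then takes a running sum to get the offsets.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of cuSPARSE: builds the CSR row-pointer
// array on the host from zero-based COO row indices.
//   cooRowInd : row index of each nonzero
//   m         : number of matrix rows
std::vector<int> cooToCsrRowPtrHost(const std::vector<int>& cooRowInd, int m)
{
    std::vector<int> csrRowPtr(m + 1, 0);

    // Count the nonzeros of each row; the count for row r is stored at r + 1.
    for (int row : cooRowInd) {
        assert(row >= 0 && row < m);  // every index must name a valid row
        ++csrRowPtr[row + 1];
    }

    // Running sum: csrRowPtr[r] becomes the offset of row r's first nonzero,
    // and csrRowPtr[m] ends up equal to the total number of nonzeros.
    for (int r = 0; r < m; ++r) {
        csrRowPtr[r + 1] += csrRowPtr[r];
    }
    return csrRowPtr;
}
```

For the row indices in the question, cooToCsrRowPtrHost({0,0,0,1,2,2,2,3,3}, 4) yields {0,3,4,7,9}. The row pointer only matches the column-index and value arrays if the COO entries are already sorted by row, which is the same assumption the device routine makes.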

amirbarda · July 29, 2021, 8:21pm

Thank you!
I suspected that was the case, but it is not stated definitively in the cuSPARSE documentation, and I had to make sure because the cuSOLVER library does have routines that work in host memory.
