Hi,

I’ve put together a little demo of my problem. See the attached file. Here is the output of my program:

Initializing CUSPARSE…done

This test shows that the CUSPARSE format conversion functions are not working as expected. We have a matrix in device memory that we want to convert to CSR, but things don’t work correctly. The example below is taken from page 10 of the CUSPARSE Library PDF. This was tested on CUDA 3.2 Nov 2010.

Yes, I know that the matrix is already sparse, but the use case is that we already have a sparse matrix in device memory that we want to convert to CSR format.

h_A =
1 4 0 0 0
0 2 3 0 0
5 0 0 7 8
0 0 9 0 6

Calling cusparseSnnz with lda = 4. We are using CUSPARSE_DIRECTION_ROW (i.e. nnzPerVector stores nnz per row)

nnz= 9 - CORRECT!

h_nnzPerVector - WRONG

1 3 3 2

Should be: 2 2 3 2

Calling cusparseSdense2csr

h_csrValA - WRONG

1 4 7 9 2 5 8 3 6

Should be: 1 4 2 3 5 7 8 9 6

h_csrRowPtrA - WRONG, though first and last entries are correct

0 1 4 7 9

Should be: 0 2 4 7 9

h_csrColIndA - WRONG

0 0 3 4 1 2 3 1 4

Should be: 0 1 1 2 0 3 4 2 4
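For reference, the "Should be" values can be reproduced on the host with a small sketch like the one below. It is not part of the attached program; the names `dense_to_csr_ref` and `h_A_example` are hypothetical. The stride parameters let the same buffer be read either row by row or column by column. One thing that may be worth noting: reading the row-by-row buffer as column-major with lda = 4 gives exactly the values flagged WRONG above, so the storage order of the dense input could be what is going on here.

```c
#include <assert.h>

/* Host-side reference (hypothetical helper, not a CUSPARSE call):
 * build CSR arrays from a dense m x n matrix.
 * The strides select how the buffer is interpreted:
 *   row-major:     row_stride = n, col_stride = 1
 *   column-major:  row_stride = 1, col_stride = lda
 * Returns the number of nonzeros. */
static int dense_to_csr_ref(const float *a, int m, int n,
                            int row_stride, int col_stride,
                            float *val, int *row_ptr, int *col_ind)
{
    int nnz = 0;
    assert(m > 0 && n > 0);
    for (int i = 0; i < m; ++i) {
        row_ptr[i] = nnz;                 /* start of row i in val/col_ind */
        for (int j = 0; j < n; ++j) {
            float v = a[i * row_stride + j * col_stride];
            if (v != 0.0f) {
                val[nnz] = v;
                col_ind[nnz] = j;
                ++nnz;
            }
        }
    }
    row_ptr[m] = nnz;                     /* one past the last entry */
    return nnz;
}

/* The 4 x 5 example matrix, stored row by row as printed in h_A. */
static const float h_A_example[20] = {
    1, 4, 0, 0, 0,
    0, 2, 3, 0, 0,
    5, 0, 0, 7, 8,
    0, 0, 9, 0, 6
};
```

Calling `dense_to_csr_ref(h_A_example, 4, 5, 5, 1, ...)` (row-major read) yields row pointers 0 2 4 7 9, while `dense_to_csr_ref(h_A_example, 4, 5, 1, 4, ...)` (column-major read with lda = 4) yields 0 1 4 7 9 and values 1 4 7 9 2 5 8 3 6.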

So as you can see, the results are just wrong. If we instead do things by column (I forgot the exact setup), then we get the correct results for the above variables. The problem then is that if you later want to use your CSR in, say, a call to cusparseScsrmv, you have to specify for transA that the matrix is a transpose. This brings the multiplication down to a crawl, and a regular CUBLAS dense multiply is 15x faster! Go figure!
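One possible way to avoid the transposed multiply, sketched here under the assumption that the dense-to-CSR routine reads its input in column-major order (not verified against the attached program), would be to repack the row-by-row host array into column-major order before copying it to the device. The helper name `row_major_to_col_major` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Copy an m x n matrix stored row by row (src) into a column-major
 * buffer (dst) with leading dimension lda, where lda >= m.
 * dst must hold at least lda * n floats. */
static void row_major_to_col_major(const float *src, int m, int n,
                                   float *dst, int lda)
{
    assert(lda >= m);
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            /* element (i, j): row-major index i*n + j,
             * column-major index j*lda + i */
            dst[(size_t)j * lda + i] = src[(size_t)i * n + j];
        }
    }
}
```

For the 4 x 5 example with lda = 4, the repacked buffer starts 1 0 5 0 4 2 0 0, i.e. the first two columns of h_A. Whether this actually removes the need for the transposed cusparseScsrmv call would still have to be confirmed on the device.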

So I think this *may* be a bug. Any help is greatly appreciated.

cusparse-conversion-test-krunal.zip (9.83 KB)