cusparseSpMV() returns the wrong result on the latest driver

Hi,

I’m using the cusparseSpMV() function in one of my projects, and it was working as expected until now. Today I updated my driver to the latest version, 572.83, through the NVIDIA App. When I then tested my code, the result vector from cusparseSpMV() was suddenly all zeros! I had to write a custom kernel that does the SpMV computation with one thread per row to get the right result.

Is anyone else experiencing the same issue with the latest driver? I tested with both CUDA 12.4 and 12.6 on my RTX 4090, which is an Ada architecture GPU. The OS is Windows 11.

This is supposed to work and this is the first time such an issue has been reported. I’d like to figure this out.

Just to double-check, could you provide the output of cusparseGetVersion()? And could you please try with the latest version of cuSPARSE (12.8 Update 1)?

What matrix format, matrix, data types, and other parameters are you using?

If you can make a small program that reproduces the issue, that would be very helpful. Perhaps something derived from the spmv_csr sample (CUDALibrarySamples/cuSPARSE/spmv_csr in the NVIDIA/CUDALibrarySamples repository on GitHub)?
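When building a reproducer like this, a host-side reference can help confirm whether the GPU result is wrong or merely different in rounding. The following is a minimal sketch, not part of cuSPARSE: the names `spmv_csr_reference` and `all_zero` are hypothetical, and it assumes a CSR matrix with 32-bit indices and float values. Each iteration of the outer loop does the work that a one-thread-per-row kernel would assign to a single thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference for y = alpha * A * x + beta * y,
// with A stored in CSR format (row offsets, column indices, values).
// y is taken by value, updated, and returned.
std::vector<float> spmv_csr_reference(const std::vector<int>& row_offsets,
                                      const std::vector<int>& col_indices,
                                      const std::vector<float>& values,
                                      const std::vector<float>& x,
                                      std::vector<float> y,
                                      float alpha, float beta) {
    assert(row_offsets.size() == y.size() + 1);   // one offset per row, plus the end
    assert(col_indices.size() == values.size());  // one column index per nonzero
    for (std::size_t row = 0; row < y.size(); ++row) {
        float sum = 0.0f;
        // Accumulate the nonzeros of this row only.
        for (int k = row_offsets[row]; k < row_offsets[row + 1]; ++k) {
            sum += values[k] * x[col_indices[k]];
        }
        y[row] = alpha * sum + beta * y[row];
    }
    return y;
}

// True if every entry is exactly zero, the symptom reported in this thread.
bool all_zero(const std::vector<float>& v) {
    for (float e : v) {
        if (e != 0.0f) return false;
    }
    return true;
}
```

For example, with row offsets {0, 3, 4, 7, 9}, column indices {0, 2, 3, 1, 0, 2, 3, 1, 3}, values 1 through 9, x = {1, 2, 3, 4}, alpha = 1 and beta = 0, the reference returns {19, 8, 51, 52}. Comparing that against the vector copied back from the device makes an all-zero result easy to flag.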

A couple more things to check:

If you set the environment variable CUSPARSE_LOG_LEVEL=5, then a lot more information should be printed to stdout. What does it say?

Are the cuSPARSE functions and cudaGetLastError() all returning “success”, with no errors printed out?

Hi Edwards,

Thank you for replying. The output of cusparseGetVersion() is 12300 when compiled with 12.4 and 12.6.
When I installed the latest CUDA 12.8 and recompiled the code, the output of cusparseGetVersion() was 12508, and the code is working now for all three versions of nvcc.

This is the first time I have encountered something like this, and there weren’t any error messages printed out when I was debugging last night, which is exactly why I was so confused. Could you please elaborate on what is happening here? Is it because the older version of cuSPARSE is deprecated on the latest driver? My code was working just fine on 12.4 and 12.6 before I updated the driver.

The output of cusparseGetVersion() is 12300 when compiled with 12.4 and 12.6.

That is a little strange. Perhaps your PATH was causing the same cusparse DLL to be loaded in both cases? CTK 12.6 should have cuSPARSE version 12.5.4, not 12.3.0.
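To make the comparison between 12300 and 12.3.0 explicit, here is a small sketch that decodes the integer returned by cusparseGetVersion(). It assumes the encoding major * 1000 + minor * 100 + patch, which matches the numbers in this thread (12300 for 12.3.0) but should be checked against the cusparse.h header you actually compile with. The type and function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical holder for a decoded cuSPARSE version.
struct CusparseVersionParts {
    int major_ver;
    int minor_ver;
    int patch_ver;
};

// Decodes a version integer, ASSUMING the layout
// major * 1000 + minor * 100 + patch (an assumption, see the text above).
CusparseVersionParts decode_cusparse_version(int version) {
    assert(version >= 0);
    return {version / 1000, (version % 1000) / 100, version % 100};
}

// Formats the decoded parts as "major.minor.patch".
std::string format_cusparse_version(const CusparseVersionParts& v) {
    return std::to_string(v.major_ver) + "." + std::to_string(v.minor_ver) +
           "." + std::to_string(v.patch_ver);
}
```

Under that assumption, 12300 decodes to 12.3.0 and 12508 to 12.5.8, while the 12.5.4 expected with CTK 12.6 would be reported as 12504. Seeing the same value from builds against two different toolkits suggests that the same DLL is being loaded at run time in both cases.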

Could you please elaborate on what is happening here? Is it because the older version of cuSPARSE is deprecated on the latest driver?

I can’t explain it. Old versions of cuSPARSE are supposed to work with new drivers. Everything is supposed to be compatible in that way. Backwards compatibility is only supposed to break at major CTK versions, which shouldn’t apply here.

the code is working now for all three versions of nvcc.

I’m happy to hear that. I would still like to understand what went wrong and reproduce this myself. Could you provide any additional information about how to reproduce this?

You are right. I think the issue is indeed the PATH setting, which was preventing my cuSPARSE library from being loaded correctly. I later uninstalled both CUDA 12.4 and 12.8, and the cuSPARSE version is now 12.5.4, with CUDA 12.6 being the only nvcc version. cusparseSpMV() now executes correctly. I don’t know what caused the PATH issue, though, so I’m not sure how to reproduce this error. But thank you very much for your help!
