I just moved from CUDA 10.1, where I used the deprecated cusparseZcsrmm2 and cusparseDcsrmm2, to CUDA 11.0, where I now use cusparseSpMM. I also see a significant performance decrease. A similar issue has been reported in the thread "Performance Downgrade when changing [deprecated] cusparsecsrmm() to cusparseSpMM()".
Is there any progress here?
A bug has been filed.
Regarding the code snippet: the one given in the initial post by “dleonard”, which I referred to in my post, matches my case pretty well. Is this enough?