Any suggested guides or blog posts on writing sparse GEMM in a .cu file?

202476410arsmart · July 6, 2022, 11:39am

Hi! I am learning sparse GEMM. Could anyone suggest some blogs or guidelines? Thank you!!

I have found out that cuSPARSE is not open source… Sad…

Robert_Crovella · July 6, 2022, 5:40pm

this may be of interest

this may be of interest, although it focuses on an ASIC implementation

armadillo is a C++ library that includes sparse matrix routines in source form, and CUSP is a CUDA library that includes sparse matrix routines in source form.

202476410arsmart · July 7, 2022, 6:59am

Thank you very much!!! Actually I have a sparse case like this (very simple):
(dense) 30720 * 3072 @ (dense with sparsity) 3072 * 64, where some columns of B can be all zero. It seems we can just skip those columns!

But actually we cannot change the data: the input should still have the zero columns and the output should still have zero columns… Just like the method mentioned here in CUTLASS: https://developer.nvidia.com/blog/cutlass-linear-algebra-cuda/ we want coalesced reads of B, and skipping some columns would make the reads uncoalesced… I am thinking maybe we can store B transposed and skip rows instead, so that we read A and B in the same way. (Slow…)
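To make the "store B transposed and skip the zero rows" idea concrete, here is a minimal host-side reference sketch, not a tuned kernel. It assumes A is M x K row-major and B is supplied already transposed as Bt (N x K, row-major); the names `nonzero_rows` and `gemm_skip_zero_cols` are made up for illustration. The inputs are left untouched and the output keeps its all-zero columns, so it could serve as a correctness check for a GPU version.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: indices of the rows of Bt (N x K, row-major, i.e. B
// stored transposed) that contain at least one nonzero element.
std::vector<int> nonzero_rows(const std::vector<float>& Bt, int N, int K) {
    assert(Bt.size() == static_cast<std::size_t>(N) * K);
    std::vector<int> rows;
    for (int n = 0; n < N; ++n) {
        bool any = false;
        for (int k = 0; k < K && !any; ++k)
            any = Bt[static_cast<std::size_t>(n) * K + k] != 0.0f;
        if (any) rows.push_back(n);
    }
    return rows;
}

// C (M x N, row-major) = A (M x K, row-major) * B, with B passed as Bt.
// Only the nonzero rows of Bt are multiplied; the remaining columns of C
// keep their initial value of zero, so the output shape is unchanged.
std::vector<float> gemm_skip_zero_cols(const std::vector<float>& A,
                                       const std::vector<float>& Bt,
                                       int M, int N, int K) {
    assert(A.size() == static_cast<std::size_t>(M) * K);
    std::vector<float> C(static_cast<std::size_t>(M) * N, 0.0f);
    const std::vector<int> active = nonzero_rows(Bt, N, K);
    for (int m = 0; m < M; ++m) {
        for (int n : active) {
            float acc = 0.0f;
            // A's row and Bt's row are both walked contiguously along k.
            for (int k = 0; k < K; ++k)
                acc += A[static_cast<std::size_t>(m) * K + k] *
                       Bt[static_cast<std::size_t>(n) * K + k];
            C[static_cast<std::size_t>(m) * N + n] = acc;
        }
    }
    return C;
}
```

In a GPU version, the list of active rows would presumably be built once and each block would look up its row of Bt through that list, so the reads within a row stay contiguous; whether that beats a plain dense GEMM at N = 64 would need measuring.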

(threadIdx.x / 32) * m + blockIdx.y * 128 + threadIdx.x % 32 * 4 (read global A and write to register)
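One way to read the index expression above, written as a small host-side check. The interpretation is an assumption, not something stated in the thread: `m` is taken to be the leading dimension of a column-major A, each warp handles one column, and each lane loads 4 consecutive elements (for example as a `float4`). The function names are hypothetical; `tid` and `block_y` stand in for `threadIdx.x` and `blockIdx.y`.

```cpp
#include <cassert>
#include <cstddef>

// Mirrors (threadIdx.x / 32) * m + blockIdx.y * 128 + threadIdx.x % 32 * 4.
std::size_t a_load_offset(unsigned tid, unsigned block_y, std::size_t m) {
    const unsigned warp = tid / 32;  // assumed: one column of A per warp
    const unsigned lane = tid % 32;  // position of the thread inside its warp
    return warp * m + static_cast<std::size_t>(block_y) * 128 +
           static_cast<std::size_t>(lane) * 4;
}

// Returns true when the 32 lanes of a warp start exactly 4 elements apart,
// i.e. a 4-element load per lane covers 128 contiguous elements of A.
bool warp_is_contiguous(unsigned warp, unsigned block_y, std::size_t m) {
    const std::size_t base = a_load_offset(warp * 32, block_y, m);
    for (unsigned lane = 0; lane < 32; ++lane) {
        if (a_load_offset(warp * 32 + lane, block_y, m) != base + 4u * lane)
            return false;
    }
    return true;
}
```

Under these assumptions a warp touches one contiguous 128-element run per column, which is the access pattern the transposed-B idea is trying to reproduce for B.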

Well…Any suggestions? (Haha…maybe too complicated…Anyway, thank you~)

Robert_Crovella · July 7, 2022, 4:41pm

no suggestions

The last time I tried to write an actual sparse multiplication routine (it was spmv, not spmm) was about 10 years ago. I got to within 10 percent of cusparse and decided I was wasting my time.

202476410arsmart · July 8, 2022, 12:47am

Haha, you are right. Well, thank you!!!
