Coding problem with CSR sparse matrix multiplication

Hello everyone, I’ve recently started learning CUDA programming and I have run into a problem.

At this point, I’m trying to understand how CSR sparse matrix multiplication works and how to measure or quantify its performance. I would be very grateful if you could provide some code and explain it.

Whether or not you can help me, thank you for clicking in, and I wish you a good day.

For now, what I know is the concept of CSR and the math behind it. Because I do not have much CUDA coding background, it is really hard for me to understand the code.
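
To make the CSR concept concrete, here is a minimal CPU-only sketch in plain C++ (not CUDA), assuming "multiplication" means a sparse matrix times a dense vector. The names `CsrMatrix`, `row_ptr`, `col_idx`, `values`, and `csr_spmv` are hypothetical, chosen only for illustration, and this is a reference for checking results rather than a tuned implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CSR storage for a sparse matrix (hypothetical struct and field names).
struct CsrMatrix {
    std::size_t rows;             // number of rows
    std::vector<int> row_ptr;     // size rows + 1; row i owns entries [row_ptr[i], row_ptr[i + 1])
    std::vector<int> col_idx;     // column index of each stored nonzero
    std::vector<double> values;   // value of each stored nonzero
};

// Computes y = A * x on the CPU, one row at a time.
// Assumes every index in col_idx is a valid position in x.
std::vector<double> csr_spmv(const CsrMatrix& a, const std::vector<double>& x) {
    assert(a.row_ptr.size() == a.rows + 1);
    assert(a.col_idx.size() == a.values.size());

    std::vector<double> y(a.rows, 0.0);
    for (std::size_t i = 0; i < a.rows; ++i) {
        double sum = 0.0;
        // Walk only the nonzeros stored for row i.
        for (int k = a.row_ptr[i]; k < a.row_ptr[i + 1]; ++k) {
            sum += a.values[k] * x[a.col_idx[k]];
        }
        y[i] = sum;
    }
    return y;
}
```

For example, the 2x3 matrix [[1, 0, 2], [0, 3, 0]] would be stored as `row_ptr = {0, 2, 3}`, `col_idx = {0, 2, 1}`, `values = {1, 2, 3}`. Each row's sum is independent of the others, so a simple GPU version could assign one thread per row in place of the outer loop.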