Exploiting NVIDIA Ampere Structured Sparsity with cuSPARSELt

Originally published at: https://developer.nvidia.com/blog/exploiting-ampere-structured-sparsity-with-cusparselt/

Deep neural networks achieve outstanding performance in a variety of fields, such as computer vision, speech recognition, and natural language processing. The computational power needed to process these neural networks is rapidly increasing, so efficient models and computation are crucial. Neural network pruning, removing unnecessary model parameters to yield a sparse network, is a useful…


Does cuSPARSELt also support SM86 for the RTX 3090 GPU?
Currently, I only see that it limits its support to SM80.
How could I leverage cuSPARSELt on SM86?

Hi Daniel,

Yes, cuSPARSELt is currently limited to SM80. Support for SM86 will be added soon.
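Until SM86 support lands, one way to avoid hard failures is to query the device's compute capability at runtime and only take the cuSPARSELt path on supported architectures. A minimal sketch using the standard CUDA runtime API (the SM80-only check reflects the limitation described above and may need updating as support broadens):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Returns true if the given device reports compute capability 8.0 (SM80),
// the only architecture cuSPARSELt supports at the time of this thread.
bool cusparselt_supported(int device_id) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device_id) != cudaSuccess) {
        return false;
    }
    return prop.major == 8 && prop.minor == 0;
}

int main() {
    if (cusparselt_supported(0)) {
        std::printf("SM80 detected: cuSPARSELt path available\n");
        // ... initialize cusparseLtHandle_t and proceed ...
    } else {
        std::printf("Falling back to dense GEMM (e.g. cuBLAS)\n");
    }
    return 0;
}
```

On an RTX 3090 (SM86, i.e. `major == 8`, `minor == 6`) this check fails and the code can fall back to a dense path instead of attempting an unsupported sparse one.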


Ok, Thanks.

I recently found the sparse Tensor Core GEMM example (15_ampere_sparse_tensorop_gemm) in CUTLASS.

However, it seems to support only INT4 input with INT32 output on SM86. When I change the input data type to float, half, or int8, it compiles successfully but always fails to launch during initialization of the GEMM object.

Is there any reason or solution for this? Thanks!
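One way to narrow down why the launch fails is to ask CUTLASS whether the chosen configuration is actually supported before running it. A hedged fragment (not a complete program; it assumes a `Gemm` type and `arguments` object already defined as in the 15_ampere_sparse_tensorop_gemm example):

```cuda
#include <iostream>
#include <cutlass/cutlass.h>

// Assuming Gemm is the cutlass::gemm::device::SparseGemm<...> instantiation
// and `arguments` the Gemm::Arguments from the example.
Gemm gemm_op;

// can_implement() reports whether this problem size / data-type combination
// is supported on the target architecture, instead of failing at launch.
cutlass::Status status = gemm_op.can_implement(arguments);
if (status != cutlass::Status::kSuccess) {
    std::cerr << "Unsupported GEMM configuration: "
              << cutlass::cutlassGetStatusString(status) << std::endl;
}
```

If `can_implement` rejects the float/half/int8 variants on SM86, that would confirm the limitation is in the kernel configuration rather than in your launch code.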

Sorry, I'm not involved in the CUTLASS project. My suggestion is to post your question on the official GitHub issue tracker.