Accelerating GEMM with 2:4 Structured Sparsity

Hello, is the 2:4 structured sparsity of the Ampere architecture only effective for GEMM? Does it not work with Winograd or FFT convolution algorithms?

Access to 2:4 structured sparsity depends on the library.
Among the math libraries, it is exposed through cuSPARSELt.
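
If you want to exercise the sparse kernels without calling the cuSPARSELt C API directly, here is a minimal sketch using PyTorch's semi-structured sparsity support (PyTorch 2.1+ on an Ampere-or-newer GPU; it dispatches to NVIDIA's 2:4 sparse Tensor Core kernels, CUTLASS or cuSPARSELt depending on the build). The shapes and mask are purely illustrative:

```python
import torch
from torch.sparse import to_sparse_semi_structured

# Illustrative sizes; 2:4 sparsity needs fp16/bf16/int8 on Ampere or newer.
A = torch.rand(128, 128, device="cuda").half()

# Zero two of every four consecutive elements in each row (the 2:4 pattern).
mask = torch.tensor([1, 1, 0, 0], dtype=torch.bool, device="cuda").tile(128, 32)
A = A * mask

A_sparse = to_sparse_semi_structured(A)  # compress to the hardware 2:4 format
B = torch.rand(128, 128, device="cuda").half()

C = A_sparse @ B  # runs on the sparse Tensor Cores
```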

Thanks. When I run the trtexec command with sparsity enabled, inference speed only increased by about 1%, and I don't know why. I used ResNet50 pruned with apex/ASP.

The short answer is that ResNet-50 is full of operations that aren't math-limited GEMMs or convolutions, so it will never see a dramatic end-to-end speedup. The long answer is that the result depends on:

  • What hardware you’re using
  • What clocks you’re using, and how efficiently the hardware is cooled
  • What version of TRT you’re using
  • What data type you’re using
  • What batch size you’re using
  • Whether you used ASP correctly and saved the pruned model (see the sketch below)
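
On that last point, the usual ASP recipe looks roughly like the sketch below. It's illustrative only, assuming apex is installed with its contrib extensions; the model and optimizer are placeholders:

```python
import torch
import torchvision
from apex.contrib.sparsity import ASP

# Placeholder model/optimizer; substitute your own trained network.
model = torchvision.models.resnet50(weights="IMAGENET1K_V1").cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Computes 2:4 masks for the eligible layers and patches the optimizer so
# that masked weights stay zero during the fine-tuning that follows.
ASP.prune_trained_model(model, optimizer)

# ... fine-tune for a few epochs here to recover accuracy ...

torch.save(model.state_dict(), "resnet50_sparse.pth")
```

If the saved checkpoint doesn't actually contain 2:4-sparse weights (for example, because fine-tuning ran without the masks applied), TensorRT silently falls back to dense kernels for those layers and you see essentially no speedup.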

I would start with this TensorRT blog post and try the ResNeXt-101 model shown in it.
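
On the TensorRT side, a typical trtexec invocation looks like this (the ONNX file name is a placeholder):

```
# Benchmark an exported ONNX model with the sparse kernels enabled.
trtexec --onnx=resnet50_sparse.onnx --fp16 --sparsity=enable

# Upper-bound check: --sparsity=force rewrites even non-pruned weights to the
# 2:4 pattern, so it's only useful for measuring the best-case speedup,
# not for accuracy.
trtexec --onnx=resnet50_sparse.onnx --fp16 --sparsity=force
```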

Thanks! I will try the ResNeXt-101 model.