Hello, is 2:4 structured sparsity on the Ampere architecture only effective for GEMM? Does it not work with Winograd and FFT convolutions?
Accessing 2:4 structured sparsity depends on the library.
Among the math libraries, access is available through cuSPARSELt.
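For context on what the 2:4 pattern means, here is a minimal NumPy sketch of structured pruning: keep the two largest-magnitude values in each group of four and zero the rest. This is only an illustration of the pattern (the function name and layout are my own); the actual accelerated path compresses and multiplies such matrices through cuSPARSELt.

```python
import numpy as np

def prune_2_to_4(weights):
    """Zero the two smallest-magnitude values in each contiguous group
    of four, producing the 2:4 pattern Ampere sparse Tensor Cores use."""
    w = weights.reshape(-1, 4).copy()
    # Indices of the two smallest |w| in each group of four
    drop = np.argsort(np.abs(w), axis=1)[:, :2]
    np.put_along_axis(w, drop, 0.0, axis=1)
    return w.reshape(weights.shape)

w = np.array([[0.9, -0.1, 0.4, 0.05, -0.7, 0.2, 0.3, 0.8]], dtype=np.float32)
pruned = prune_2_to_4(w)
# Each group of four now has exactly two nonzeros:
# [[0.9, 0.0, 0.4, 0.0, -0.7, 0.0, 0.0, 0.8]]
```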
Thanks! When I run the trtexec command with sparsity enabled, inference speed increases by only about 1%, and I don't know why. I used ResNet50 with apex/ASP sparsity pruning.
The short answer is that RN50 is full of operations that aren’t math-limited GEMMs (CONVs), so it’ll never see a fantastic end-to-end speedup. The long answer is that everything will depend on:
- What hardware you’re using
- What clocks you’re using, and how efficiently the hardware is cooled
- What version of TRT you’re using
- What data type you’re using
- What batch size you’re using
- Whether you used ASP and saved the model correctly
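On that last point: TRT can only pick sparse kernels for layers whose exported weights actually obey the 2:4 pattern; if the ASP masks were lost when saving, everything silently runs dense. A minimal sanity check you could run on the saved weight tensors (function name and layout are illustrative, assuming groups of four along the flattened weights):

```python
import numpy as np

def is_2_to_4_sparse(weights):
    """Return True if every contiguous group of four values contains
    at most two nonzeros, i.e. the weights follow the 2:4 pattern."""
    groups = np.asarray(weights).reshape(-1, 4)
    return bool(((groups != 0).sum(axis=1) <= 2).all())

# A correctly pruned tensor passes; a dense one does not.
is_2_to_4_sparse(np.array([0.9, 0.0, 0.0, -0.3, 0.0, 0.5, 0.1, 0.0]))  # True
is_2_to_4_sparse(np.ones(8))  # False
```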
I would start with this TRT blog and try the ResNeXt-101 model within it.
Thanks! I will try the ResNeXt-101 model.