TensorRT/Faster Transformer for GPT2/MT-NLG with Sparsity

hemant.hbti · October 11, 2022, 12:51pm

Hi,
Is NVIDIA working on a TensorRT/FasterTransformer implementation for GPT2 or other larger models, e.g., the Megatron-Turing Natural Language Generation model (MT-NLG), that supports 2:4 sparsity?

As of now, the NVIDIA/FasterTransformer GitHub repository (Transformer related optimization, including BERT, GPT) states that sparsity is available only for BERT and Encoder.
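
For context, 2:4 structured sparsity means that at most two of every four consecutive weights are non-zero. The following is a minimal sketch of that pattern in plain Python, not a FasterTransformer or TensorRT API: the helper names `prune_2_4` and `is_2_4_sparse` are hypothetical, and keeping the two largest-magnitude values per group is an assumed selection rule chosen only for illustration.

```python
from typing import List


def prune_2_4(row: List[float]) -> List[float]:
    """Return a copy of `row` with the two smallest-magnitude values
    in every group of four consecutive weights set to zero.

    Hypothetical helper for illustration; magnitude-based selection
    is an assumption, not a statement about any NVIDIA tool.
    """
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    pruned: List[float] = []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Positions of the two largest-magnitude weights in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


def is_2_4_sparse(row: List[float]) -> bool:
    """Check that every group of four has at most two non-zero values."""
    if len(row) % 4 != 0:
        return False
    return all(
        sum(1 for v in row[start:start + 4] if v != 0) <= 2
        for start in range(0, len(row), 4)
    )


if __name__ == "__main__":
    weights = [0.1, -0.9, 0.5, 0.05, 1.0, 2.0, -3.0, 0.2]
    sparse = prune_2_4(weights)
    print(sparse)
    print(is_2_4_sparse(weights), is_2_4_sparse(sparse))
```

A dense weight row generally fails the `is_2_4_sparse` check, while the pruned copy passes it; whether a given inference library can exploit the pattern for a particular model is a separate question, which is what this thread asks.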

spolisetty · October 12, 2022, 12:02pm

Hi,

Please refer to the following post,

Thank you.

hemant.hbti · October 12, 2022, 2:54pm

Thank you.
As per the link, 2:4 structured sparsity is only available for Megatron.
Is there any plan to support sparsity for the 6.7-billion-parameter GPT2 model?

spolisetty · November 2, 2022, 9:00am

Hi,

Currently, we are not sure about it. It may be available in future releases.

Thank you.

JeffWang16 · April 3, 2023, 10:08am

Hi, @spolisetty

Why doesn't NVIDIA support sparsity for the ONNX BERT model? Could you describe the reason? Thank you.
