Does weight pruning help improve the inference speed of pruned models on TX2?

Hello, I would like to prune my models and run them on the TX2. I plan to use weight pruning, i.e. making the model weights as sparse as possible.
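For context, here is a minimal sketch of the kind of weight pruning I mean, using PyTorch's built-in `torch.nn.utils.prune` (the layer sizes and the 50% amount are just illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(10, 10)

# zero out the 50% of weights with the smallest absolute value
prune.l1_unstructured(layer, name="weight", amount=0.5)

# half of the 10x10 weight entries are now exactly zero
sparsity = (layer.weight == 0).float().mean().item()
print(f"weight sparsity: {sparsity:.0%}")
```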

From what I have read, whether pruning actually improves inference speed depends on several factors.

On the software side, a sparse PyTorch model does not necessarily run faster than a dense one, but a sparse ONNX model could.
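One way to see why sparsity alone does not speed up PyTorch: pruning only writes zeros into the weight tensor, which is still stored in the dense (strided) layout, so the same dense kernels run and every zero is still multiplied. A small illustration (sizes are arbitrary):

```python
import torch

w = torch.randn(256, 256)
w[torch.rand_like(w) < 0.9] = 0.0   # ~90% of the values are now zero

# the tensor is still dense: zeros are stored explicitly, so this matmul
# does the same number of multiply-adds as it would with a dense weight
x = torch.randn(1, 256)
y = x @ w
```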

It also seems to depend on the hardware. Could a sparse model run faster on the TX2?


This depends on the software you use.

Taking TensorRT as an example, weight pruning may not noticeably improve performance: we don't check for sparsity before inference, so the same kernels are launched either way.

You could try layer pruning instead, which reduces the amount of computation and therefore improves performance directly.
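As a sketch of what layer pruning looks like (the model and the choice of which block to drop are purely illustrative): removing a whole layer means its kernel is never launched at all, so the speedup shows up regardless of backend.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),   # candidate block to prune away
    nn.Linear(128, 10),
)

# layer pruning: drop modules 2-3 entirely; input/output shapes still match
# because the removed block maps 128 features to 128 features
pruned = nn.Sequential(*(m for i, m in enumerate(model) if i not in (2, 3)))
out = pruned(torch.randn(4, 128))
```

In practice you would fine-tune the pruned model afterward to recover accuracy.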