Hello,
TensorRT is a tool to speed up neural network inference.
I was wondering if there is an NVIDIA tool to prune neural networks, in order to speed up inference and reduce the memory footprint.
Thanks
If this is specific to a TensorFlow-based model, consider Graph Surgeon:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/python_api/graphsurgeon/graphsurgeon.html
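For context on what pruning itself does, here is a minimal sketch of magnitude-based weight pruning, the most common approach: weights with the smallest absolute values are zeroed out, producing a sparse matrix that can reduce memory and, with sparse-aware runtimes, speed up inference. This is a generic NumPy illustration, not part of TensorRT or Graph Surgeon; the function name and threshold logic are my own.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude entries of `weights`
    until roughly `sparsity` fraction of them are zero.
    (Hypothetical helper for illustration only.)"""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of entries to prune
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune half of a small weight matrix
w = np.array([[0.9, -0.05, 0.4],
              [0.01, -0.7, 0.2]])
pruned = prune_by_magnitude(w, 0.5)
```

After pruning, the three smallest-magnitude weights (0.05, 0.01, 0.2) are zeroed while the large ones survive; real frameworks typically follow this with fine-tuning to recover accuracy.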