Is it possible to integrate my own model optimization technique into TensorRT?

Hello

TensorRT is an inference optimizer. I would like to know if I can use my own optimization techniques with it. For example, let’s say I design an algorithm that compresses AI models further and I want to use it on the TX2. How can I make TensorRT use it? Do I have to implement it using CUDA or PyCUDA?

Thank you

Hi,

Would you mind sharing more details about the optimization?

If it is applied to the model architecture, you can do it directly and pass the pruned or compressed model to TensorRT.
If it is an implementation optimization, you can try writing your own code as a plugin (C++); a minimal skeleton is sketched below:
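For reference, on TX2-era TensorRT (5.x, as shipped with JetPack) a custom layer is a C++ class implementing the IPluginV2 interface. The sketch below is only a minimal skeleton under that assumption: the class name MyCompressedLayer is a placeholder, and the identity-copy enqueue() marks where your own CUDA kernel would run.

```cpp
// Minimal sketch of a TensorRT custom layer, assuming the IPluginV2
// interface from the TensorRT 5.x era shipped in JetPack for the TX2.
// "MyCompressedLayer" and the identity-copy enqueue() are placeholders.
#include <cstring>
#include <string>
#include <cuda_runtime_api.h>
#include <NvInfer.h>

using namespace nvinfer1;

class MyCompressedLayer : public IPluginV2
{
public:
    MyCompressedLayer() = default;

    const char* getPluginType() const override { return "MyCompressedLayer"; }
    const char* getPluginVersion() const override { return "1"; }
    int getNbOutputs() const override { return 1; }

    // This sketch keeps the input shape unchanged.
    Dims getOutputDimensions(int index, const Dims* inputs, int nbInputDims) override
    {
        return inputs[0];
    }

    bool supportsFormat(DataType type, PluginFormat format) const override
    {
        return type == DataType::kFLOAT && format == PluginFormat::kNCHW;
    }

    void configureWithFormat(const Dims* inputDims, int nbInputs, const Dims* outputDims,
                             int nbOutputs, DataType type, PluginFormat format,
                             int maxBatchSize) override
    {
        mVolume = 1;
        for (int i = 0; i < inputDims[0].nbDims; ++i)
            mVolume *= inputDims[0].d[i];
    }

    int initialize() override { return 0; }
    void terminate() override {}
    size_t getWorkspaceSize(int maxBatchSize) const override { return 0; }

    // Called at inference time: launch your custom CUDA kernel here.
    // The sketch just copies input to output on the given stream.
    int enqueue(int batchSize, const void* const* inputs, void** outputs,
                void* workspace, cudaStream_t stream) override
    {
        size_t bytes = static_cast<size_t>(batchSize) * mVolume * sizeof(float);
        cudaMemcpyAsync(outputs[0], inputs[0], bytes, cudaMemcpyDeviceToDevice, stream);
        return 0;
    }

    size_t getSerializationSize() const override { return sizeof(mVolume); }
    void serialize(void* buffer) const override { std::memcpy(buffer, &mVolume, sizeof(mVolume)); }
    void destroy() override { delete this; }

    IPluginV2* clone() const override
    {
        auto* p = new MyCompressedLayer();
        p->mVolume = mVolume;
        return p;
    }

    void setPluginNamespace(const char* ns) override { mNamespace = ns; }
    const char* getPluginNamespace() const override { return mNamespace.c_str(); }

private:
    size_t mVolume{0};
    std::string mNamespace;
};
```

You would then expose the plugin to the builder, for example by inserting it into the network with INetworkDefinition::addPluginV2(), and register a matching IPluginCreator if the engine needs to be deserialized later.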

Thanks.

Thank you very much for your answer.
I would like to implement the algorithm that is being discussed here:
https://deepai.org/publication/deep-compression-compressing-deep-neural-networks-with-pruning-trained-quantization-and-huffman-coding

Some more questions:
Are those plugins related to the graphsurgeon API?
Is it possible to see all the optimizations that TensorRT applies?

Thank you

Hi,

There are three stages mentioned in the paper: pruning, trained quantization, and Huffman coding.
Based on that description, this optimization modifies the network itself rather than the inference-time implementation.

So ideally, you should follow the paper to get a pruned and quantized model first.
Then compress the model with Huffman coding to get the output described in the paper.

At inference time, you can feed the model directly to TensorRT after Huffman decoding, as in the sketch below.
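To make that concrete, here is a sketch assuming the usual TX2 workflow of the time (a UFF export) and a hypothetical huffmanDecode() that inverts your encoder; neither the file name nor the tensor names come from TensorRT. The point is that TensorRT only ever sees the plain, decoded model.

```cpp
// Sketch: undo the Huffman stage first, then hand the decoded model to
// TensorRT as usual. huffmanDecode() and the file/tensor names are
// placeholders for your own pipeline; the TensorRT calls are the stock
// TensorRT 5.x UFF path.
#include <cstdio>
#include <fstream>
#include <iterator>
#include <vector>
#include <NvInfer.h>
#include <NvUffParser.h>

// Placeholder for your own Huffman decoder (the inverse of the paper's
// encoder). The identity body only exists so the sketch compiles.
std::vector<char> huffmanDecode(const std::vector<char>& compressed)
{
    return compressed;
}

class Logger : public nvinfer1::ILogger
{
    void log(Severity severity, const char* msg) override
    {
        if (severity != Severity::kINFO) std::printf("%s\n", msg);
    }
} gLogger;

int main()
{
    // 1. Read the Huffman-compressed model and decode it in memory.
    std::ifstream in("model.uff.huff", std::ios::binary);
    std::vector<char> compressed((std::istreambuf_iterator<char>(in)),
                                 std::istreambuf_iterator<char>());
    std::vector<char> uffData = huffmanDecode(compressed);

    // 2. Feed the decoded (plain UFF) model to TensorRT as usual.
    auto builder = nvinfer1::createInferBuilder(gLogger);
    auto network = builder->createNetwork();
    auto parser  = nvuffparser::createUffParser();
    parser->registerInput("input", nvinfer1::Dims3(3, 224, 224),
                          nvuffparser::UffInputOrder::kNCHW);
    parser->registerOutput("prob");
    parser->parseBuffer(uffData.data(), uffData.size(), *network,
                        nvinfer1::DataType::kFLOAT);

    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(1 << 28);
    nvinfer1::ICudaEngine* engine = builder->buildCudaEngine(*network);
    // ... create an execution context from `engine` and run inference ...
    return 0;
}
```

Note the design implication: the Huffman stage only shrinks the model file in storage, so decoding happens once at load time and adds nothing to the per-inference cost.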

The optimization is independent of the TensorRT implementation.
You don’t need to build it into TensorRT.

Thanks