**TensorRT Version**: 8.0
**GPU Type**: Jetson Nano [Maxwell]
**Operating System + Version**: Jetson Nano

I pruned my deep learning model, and I will convert it to TensorRT.
Will the pruned network speed up inference in TensorRT? (My current platform is a Jetson Nano.)

If yes, is there a GitHub repository or blog post that I can refer to?

Please share the model, script, profiler, and performance output, if you have not already, so that we can help you better.
Alternatively, you can try running your model with the `trtexec` command.
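
As a rough illustration of the `trtexec` suggestion, the sketch below assembles a command line for benchmarking an ONNX export of the model. The helper name `build_trtexec_command` and the file names are hypothetical; `--onnx`, `--saveEngine`, and `--fp16` are standard `trtexec` options, but check `trtexec --help` on your JetPack install before relying on them.

```python
import shlex


def build_trtexec_command(onnx_path, engine_path=None, fp16=False):
    """Return a trtexec argument list for building and timing an ONNX model.

    Hypothetical helper: it only assembles the arguments and does not run them.
    """
    cmd = ["trtexec", f"--onnx={onnx_path}"]
    if engine_path is not None:
        # Save the built engine so later runs can skip the build step.
        cmd.append(f"--saveEngine={engine_path}")
    if fp16:
        # Allow FP16 kernels where the hardware supports them.
        cmd.append("--fp16")
    return cmd


# Example with placeholder file names; paste the printed line into a shell.
command = build_trtexec_command("pruned_model.onnx", "pruned_model.engine", fp16=True)
print(shlex.join(command))
```

Running the same command on the unpruned and the pruned export gives a like-for-like comparison of the reported latency and throughput.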

When measuring model performance, make sure you consider only the latency and throughput of the network inference, excluding the data pre- and post-processing overhead.
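
A minimal sketch of such a measurement is shown here. It assumes `infer` is a callable you supply that runs one synchronous inference on an already preprocessed batch (that is, it returns only after the GPU work has finished); the function name `benchmark_inference` and its parameters are illustrative, not part of TensorRT.

```python
import statistics
import time


def benchmark_inference(infer, batch, warmup=10, iterations=100, batch_size=1):
    """Time only the inference call; pre/post-processing stays outside the loop.

    `infer` is assumed to block until the result is ready.
    """
    # Warm-up runs let clocks, caches, and lazy initialisation settle.
    for _ in range(warmup):
        infer(batch)

    latencies_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        infer(batch)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    mean_ms = statistics.mean(latencies_ms)
    return {
        "mean_ms": mean_ms,
        "median_ms": statistics.median(latencies_ms),
        # Samples processed per second, derived from the mean latency.
        "throughput_per_s": batch_size * 1000.0 / mean_ms,
    }
```

Preprocess the input once before calling `benchmark_inference`, and run it on both the original and the pruned engine so the two sets of numbers are comparable.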
Please refer to the link below for more details:


Thanks for the reply.

Before I try measuring performance, I would like some advice on TensorRT optimization.
I used the underlined method to prune my model, and I think this pruning method relies on sparse layers to accelerate model inference.

So, does TensorRT support sparse computation for the convolution layers of a network, so that my pruning work will accelerate inference in TensorRT? Or is what I'm doing already done automatically by TensorRT, so that my work will not improve speed?

If you channel prune models in the right way (and then compress them), you should get an increase in speed in TensorRT, because the result is simply a smaller dense network. GPUs are very good at dense math, so unless sparsity is appropriately structured or the weights are very sparse, sparse computations are unlikely to improve performance.
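
To make the distinction between zeroing weights and compressing pruned channels concrete, here is a small sketch that counts multiply-accumulate operations (MACs) for one convolution layer. The layer shape and the 50% pruning ratio are made-up numbers, and the MAC count is only a proxy: the actual speed depends on which kernels TensorRT selects for your GPU.

```python
def conv_macs(h_out, w_out, c_in, c_out, kernel):
    """Multiply-accumulates for a dense 2D convolution with a square kernel."""
    return h_out * w_out * c_in * c_out * kernel * kernel


# Hypothetical layer: 56x56 output, 64 -> 128 channels, 3x3 kernel.
dense = conv_macs(56, 56, 64, 128, 3)

# Zeroing individual weights keeps the tensor shape, so a dense kernel
# still executes the same number of MACs (many of them multiply by zero).
masked = conv_macs(56, 56, 64, 128, 3)

# Channel pruning followed by compression physically removes half of the
# output channels, so the layer becomes a smaller dense convolution.
compressed = conv_macs(56, 56, 64, 64, 3)

print("dense MACs:     ", dense)
print("masked MACs:    ", masked)
print("compressed MACs:", compressed)
print("compressed / dense:", compressed / dense)
```

Removing output channels also shrinks the input channel count of the following layer, so the saving usually carries over to the next convolution as well.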


Thanks for the reply, @spolisetty. Your answer helps me a lot.