Recently, I’ve been using ASP in Apex to sparsify my network model. It reduces inference time by over 40% on a 3060 GPU, but on Orin it only reduces inference time by 18%. Why is that?
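For context, the sparsification follows the standard `apex.contrib.sparsity` recipe. A minimal sketch of what I'm doing (the model, optimizer, and fine-tuning step here are placeholders, not my actual network):

```python
import torch
from apex.contrib.sparsity import ASP

# Placeholder model and optimizer; the real network and training setup differ.
model = torch.nn.Sequential(
    torch.nn.Linear(256, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# Apply 2:4 structured sparsity masks to the trained weights.
# prune_trained_model() initializes the masks on the model and optimizer
# and computes the sparse masks in one call.
ASP.prune_trained_model(model, optimizer)

# ... fine-tune the pruned model here, then export to ONNX for TensorRT ...
```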
Hi,
Which frameworks do you use for inference?
Is it TensorRT?
Thanks.
It’s TensorRT.
Sorry, I just found that other factors were inflating the pre-sparsification timing on the GPU. After accounting for them, ASP also reduces inference time by about 18% on the GPU.
Is an 18% reduction a normal range for ASP on Orin?
Thanks.
Hi,
The speedup ratio depends on the model architecture.
Which precision did you use for inference?
It’s recommended to try INT8 or FP16.
Thanks.
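In case it helps, sparse tactics are only used when the engine is built with sparse weights enabled. With the TensorRT Python API that looks roughly like this (a minimal sketch; network construction, ONNX parsing, and INT8 calibration are omitted):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.VERBOSE)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# Allow TensorRT to pick sparse (2:4) tactics for eligible layers.
config.set_flag(trt.BuilderFlag.SPARSE_WEIGHTS)

# Reduced precision usually gives the largest speedup on Orin.
config.set_flag(trt.BuilderFlag.FP16)
config.set_flag(trt.BuilderFlag.INT8)  # requires a calibrator or a Q/DQ network

# ... create the network, parse the ONNX model, then build:
# engine = builder.build_serialized_network(network, config)
```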
Thanks, I used INT8.
Hi,
Could you run trtexec with --verbose and share the log with us?
It should contain the sparsity information.
Thanks.
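For example, rebuilding the engine with a command along the lines of `trtexec --onnx=model.onnx --int8 --sparsity=enable --verbose` (the model path is a placeholder; adjust the flags to your workflow) produces a verbose build log that shows whether sparse tactics were actually selected for your layers.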
Could you please share some empirical results you have obtained previously from sparsifying networks such as ResNet or others?
Thanks!
Hi,
We don’t have a dense vs. sparse performance comparison for the GPU.
But there is some data for DLA:
Thanks.