Trtexec layer optimization

Hi,

I'm using a Jetson AGX Xavier with JetPack 4.3. I converted a modified SqueezeNet caffemodel to a TensorRT engine with trtexec, using the following command:

./trtexec --deploy=deploy.prototxt --model=squeezenet.caffemodel --output=out1,out2 --fp16 --useDLACore=0 --allowGPUFallback --verbose

Among other things, I got the following output:

[02/31/2021-14:11:25] [V] [TRT] Applying generic optimizations to the graph for inference.
[02/31/2021-14:11:25] [V] [TRT] Original: 169 layers
[02/31/2021-14:11:25] [V] [TRT] After dead-layer removal: 169 layers
[02/31/2021-14:11:26] [W] [TRT] Internal DLA error for layer conv7_3. Switching to GPU fallback.
[02/31/2021-14:11:26] [W] [TRT] Internal DLA error for layer conv7_3. Switching to GPU fallback.
[02/31/2021-14:11:26] [W] [TRT] Internal DLA error for layer conv7_3. Switching to GPU fallback.
[02/31/2021-14:11:26] [W] [TRT] Internal DLA error for layer conv1_6. Switching to GPU fallback.
[02/31/2021-14:11:26] [W] [TRT] Internal DLA error for layer conv1_6. Switching to GPU fallback.
[02/31/2021-14:11:26] [W] [TRT] Internal DLA error for layer conv1_6. Switching to GPU fallback.
[02/31/2021-14:11:26] [V] [TRT] After DLA optimization: 11 layers
[02/31/2021-14:11:26] [V] [TRT] After scale fusion: 11 layers
[02/31/2021-14:11:26] [V] [TRT] After vertical fusions: 11 layers
[02/31/2021-14:11:26] [V] [TRT] After final dead-layer removal: 11 layers
[02/31/2021-14:11:26] [V] [TRT] After tensor merging: 11 layers
[02/31/2021-14:11:26] [V] [TRT] Eliminating concatenation concat_stage6
[02/31/2021-14:11:26] [V] [TRT] Generating copy for conv7_3 to concat6
[02/31/2021-14:11:26] [V] [TRT] Generating copy for conv7_3 to concat6
[02/31/2021-14:11:26] [V] [TRT] Generating copy for conv4_4 to concat6
[02/31/2021-14:11:26] [V] [TRT] After concat removal: 13 layers
.
.
.
[02/31/2021-14:11:48] [V] [TRT] After reformat layers: 25 layers

My questions are the following:

  1. Is it possible for the network to be reduced this dramatically, or could it be a hint that something went wrong?
  2. What explains the different layer counts?
  3. Does the resulting network have 25 layers?

Thanks in advance.

Hi,

1. It should be okay. TensorRT tends to fuse layers to accelerate inference, which reduces the reported layer count (see the sketch below).

2. You can check this tutorial to learn more about how TensorRT works internally.

3. Yes.
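
To see what the built engine actually contains, one option is to let trtexec print a per-layer profile of the serialized engine; fused layers then show up as a single entry with a combined name, and the part offloaded to DLA appears as one node. A rough sketch, assuming your trtexec build supports --saveEngine, --loadEngine and --dumpProfile (the engine file name is just a placeholder):

# Build as before, but also serialize the engine to disk
./trtexec --deploy=deploy.prototxt --model=squeezenet.caffemodel --output=out1,out2 --fp16 --useDLACore=0 --allowGPUFallback --saveEngine=squeezenet_dla.engine

# Reload the engine and print a per-layer timing profile
./trtexec --loadEngine=squeezenet_dla.engine --useDLACore=0 --dumpProfile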

Thanks.

Hi, thanks for your answer.
But if I generate the TensorRT engine with the GPU only, I get 105 layers after the reformat step.

The GPU-only TensorRT engine also works and shows the desired output, but the DLA engine doesn't work and doesn't produce any output.
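
One way to narrow this down is to dump the raw outputs of both builds and check whether the DLA engine produces anything sensible at all. A rough sketch, assuming this trtexec build supports --dumpOutput (trtexec feeds random input by default, so the values will not match between the two runs; the check is only whether the DLA run returns non-zero, non-NaN outputs):

# GPU-only build: run inference and print the raw output tensors
./trtexec --deploy=deploy.prototxt --model=squeezenet.caffemodel --output=out1,out2 --fp16 --dumpOutput > gpu_out.log

# DLA build with GPU fallback: same network, same flags plus DLA
./trtexec --deploy=deploy.prototxt --model=squeezenet.caffemodel --output=out1,out2 --fp16 --useDLACore=0 --allowGPUFallback --dumpOutput > dla_out.log

# Compare the two logs; all-zero or NaN outputs from the DLA run point at
# the DLA/fallback path rather than at the application code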