How to increase inference speed on JETSON NANO (4GB)

JPCHZ · September 16, 2021, 5:44am

@Morganh @andiyael
I am using an SSD (Caffe) model for person detection on a Jetson Nano.
I have 2 files:

Mobile_SSD_deploy.caffemodel
Mobile_SSD_deploy_prototxt.txt

I am using Python for this project.
I am using OpenCV with CUDA backend for video processing.

When I run the script with this model, inference is very slow. Please suggest how I can improve the performance so that it runs smoothly on the Jetson Nano.

AastaLLL · September 16, 2021, 6:34am

Hi,

First, please make sure you have maximized the device performance:

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

Do you also use OpenCV for DNN inference?
If so, we recommend using TensorRT instead.

For example, the commands below benchmark the MNIST sample model with trtexec, first at the default precision and then with FP16 enabled.
You can use them to check the achievable performance before switching.

$ /usr/src/tensorrt/bin/trtexec --deploy=/usr/src/tensorrt/data/mnist/mnist.prototxt --model=/usr/src/tensorrt/data/mnist/mnist.caffemodel --output=prob
$ /usr/src/tensorrt/bin/trtexec --deploy=/usr/src/tensorrt/data/mnist/mnist.prototxt --model=/usr/src/tensorrt/data/mnist/mnist.caffemodel --output=prob --fp16
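
Since the project is in Python, the two benchmark commands above could also be assembled and launched from a script. This is only a minimal sketch: the helper names `build_trtexec_cmd` and `run_benchmark` are hypothetical, and it uses nothing beyond the trtexec path and the `--deploy`, `--model`, `--output`, and `--fp16` flags already shown above. It avoids newer `subprocess` options so that it should also work on Python 3.6.

```python
import subprocess

# trtexec location as used in the commands above.
TRTEXEC = "/usr/src/tensorrt/bin/trtexec"


def build_trtexec_cmd(deploy, model, outputs, fp16=False):
    """Return the trtexec argument list for a Caffe model (hypothetical helper)."""
    cmd = [TRTEXEC, "--deploy={}".format(deploy), "--model={}".format(model)]
    # One --output flag per output blob name.
    cmd += ["--output={}".format(name) for name in outputs]
    if fp16:
        cmd.append("--fp16")
    return cmd


def run_benchmark(cmd):
    """Run trtexec and return its combined stdout/stderr as text.

    Only meaningful on a device where trtexec is installed.
    """
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


if __name__ == "__main__":
    data = "/usr/src/tensorrt/data/mnist"
    for fp16 in (False, True):
        cmd = build_trtexec_cmd(
            data + "/mnist.prototxt", data + "/mnist.caffemodel", ["prob"], fp16
        )
        # Print the command; call run_benchmark(cmd) on the Jetson to execute it.
        print(" ".join(cmd))
```

Running the script prints the same two command lines as above; passing each list to `run_benchmark` on the Jetson returns the trtexec log for comparison.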

Thanks.

JPCHZ · September 16, 2021, 7:00am

Hi @AastaLLL
Thanks for the quick reply.
I have tried the following command:

/usr/src/tensorrt/bin/trtexec --deploy=/home/hp/Documents/model/MobileNetSSD_deploy.prototxt.txt --model=/home/hp/Documents/model/MobileNetSSD_deploy.caffemodel --output=prob

But I am getting the following error:
Caffe Parser: Invalid axis in softmax layer - TensorRT expects NCHW input. Negative axis is not supported in TensorRT, please use positive axis indexing
error parsing layer type Softmax index 116

I am using the following files:

MobileNetSSD_deploy.caffemodel (22.1 MB)
MobileNetSSD_deploy.prototxt.txt (28.7 KB)

JPCHZ · September 20, 2021, 4:49am

@AastaLLL @Morganh please help.

Thanks.

AastaLLL · October 5, 2021, 7:08am

Hi,

We checked your issue in a JetPack 4.6 environment.

The error is caused by the unsupported Flatten layer.
To work around this, you can replace the Flatten layer with a Reshape layer on the same axis.

For example:

--- a/MobileNetSSD_deploy.prototxt.txt
+++ b/MobileNetSSD_deploy.prototxt.txt
@@ -1183,11 +1183,16 @@ layer {
 }
 layer {
   name: "conv11_mbox_loc_flat"
-  type: "Flatten"
+  type: "Reshape"
   bottom: "conv11_mbox_loc_perm"
   top: "conv11_mbox_loc_flat"
-  flatten_param {
-    axis: 1
+  reshape_param {
+    shape {
+      dim: 0
+      dim: -1
+      dim: 1
+      dim: 1
+    }
   }
 }
 layer {

After this update, we can run your model successfully with the following command:

$ /usr/src/tensorrt/bin/trtexec --deploy=MobileNetSSD_deploy.prototxt.txt --model=MobileNetSSD_deploy.caffemodel --output=detection_out --output=keep_count

The modified prototxt file is attached for your reference: MobileNetSSD_deploy.prototxt.txt (29.4 KB)
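
If you would rather script the change than edit each layer by hand, the Flatten-to-Reshape rewrite from the diff could be applied with a small text substitution. This is a rough sketch, not an official tool: the function names are hypothetical, and it assumes every Flatten layer to be converted is written like the one in the diff, with `flatten_param { axis: 1 }` following the `type` line inside the same layer. Layers that do not match this pattern are left untouched.

```python
import re

# A Flatten type line followed (within the same layer, so no braces in
# between) by a flatten_param block with axis: 1, as in the diff.
FLATTEN_LAYER = re.compile(
    r'type:\s*"Flatten"(?P<mid>[^{}]*?)'
    r'(?P<indent>[ \t]*)flatten_param\s*\{\s*axis:\s*1\s*\}'
)


def _reshape_block(match):
    """Build the Reshape replacement using the original indentation."""
    ind = match.group("indent")
    dims = "".join("{}    dim: {}\n".format(ind, d) for d in (0, -1, 1, 1))
    return (
        'type: "Reshape"' + match.group("mid")
        + ind + "reshape_param {\n"
        + ind + "  shape {\n"
        + dims
        + ind + "  }\n"
        + ind + "}"
    )


def flatten_to_reshape(prototxt_text):
    """Return (new_text, number_of_layers_rewritten)."""
    return FLATTEN_LAYER.subn(_reshape_block, prototxt_text)


def convert_file(src_path, dst_path):
    """Read a prototxt, rewrite matching Flatten layers, write the result."""
    with open(src_path) as f:
        new_text, count = flatten_to_reshape(f.read())
    with open(dst_path, "w") as f:
        f.write(new_text)
    return count
```

Writing the result to a new file and comparing it against the original with `diff -u` is a simple way to confirm that each rewritten layer looks like the example above before running trtexec again.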

Thanks.
