Performance statistics of Jetson Nano on deep learning inference

awiegersm · March 23, 2019, 11:13am

I would like to know how the deep learning inference speeds for the Jetson Nano reported in the Nvidia blog post (https://devblogs.nvidia.com/jetson-nano-ai-computing/), summarized in this chart (https://devblogs.nvidia.com/wp-content/uploads/2019/03/imageLikeEmbed-1024x510.png), can be reproduced.

For example: SSD Mobilenet-V2 (300x300) on the Jetson Nano is reported at 39 fps, which is faster than the TensorRT performance on the Jetson TX2 I have access to, which runs at about 20 fps. (That figure is similar to the benchmarks listed in the Nvidia tf_trt_models repository: GitHub - NVIDIA-AI-IOT/tf_trt_models: TensorFlow models accelerated with NVIDIA TensorRT.)

The review on Phoronix ranks Jetson Nano deep learning inference performance consistently below Jetson TX2 performance: NVIDIA Jetson Nano: A Feature-Packed Arm Developer Kit For $99 USD - Phoronix. This seems about right, as the Nano has an older Maxwell architecture with half the number of CUDA cores.

Are there any new TensorRT optimizations on the Nano? How can the performance statistics from the Nvidia blog be reproduced?

AastaLLL · March 25, 2019, 4:24am

Hi,

Please use pure TensorRT rather than TF-TRT for benchmarking.
And the results should be generated with the Caffe-based model.

Thanks.

awiegersm · March 25, 2019, 9:04pm

Thanks for your reply. I will try TensorRT. Is there any performance reason for using Caffe? I prefer TensorFlow because I have difficulty getting the Mobilenets to converge in Caffe.

AastaLLL · March 29, 2019, 8:36am

Hi,

You should be able to get similar performance results with pure TensorRT.

TensorRT started out with caffemodel support, so we keep using it in order to compare against our previous scores.
This should not make much difference, since all models are converted into TensorRT in the end.

Another reason is that a caffemodel uses the NCHW format, which is more GPU-friendly.
Thanks.
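
For readers unfamiliar with the layout names mentioned above, here is a small illustrative sketch of how the same tensor element is addressed in NCHW versus NHWC memory order. It is plain Python index arithmetic, not TensorRT or Caffe code; the function names and the 3x300x300 shape are assumptions chosen to match the SSD input size discussed in this thread.

```python
# Illustrative only: flat memory offsets for a tensor with
# N images, C channels, H rows, W columns.

def nchw_offset(n, c, h, w, C, H, W):
    """Offset of element (n, c, h, w) when channels come before height and width."""
    return ((n * C + c) * H + h) * W + w

def nhwc_offset(n, h, w, c, H, W, C):
    """Offset of element (n, h, w, c) when channels are the innermost dimension."""
    return ((n * H + h) * W + w) * C + c

# Assumed input shape: 3 channels, 300x300 pixels.
C, H, W = 3, 300, 300

# Distance in memory between two horizontally adjacent pixels of channel 0.
nchw_step = nchw_offset(0, 0, 0, 1, C, H, W) - nchw_offset(0, 0, 0, 0, C, H, W)
nhwc_step = nhwc_offset(0, 0, 1, 0, H, W, C) - nhwc_offset(0, 0, 0, 0, H, W, C)

print("NCHW step between neighbouring pixels:", nchw_step)
print("NHWC step between neighbouring pixels:", nhwc_step)
```

In NCHW each channel is stored as one contiguous plane, while in NHWC the channel values of a pixel are interleaved. Both layouts hold the same data, only the ordering differs.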

BMohit · April 9, 2019, 3:53am

Hello,
Were the fps numbers from (https://devblogs.nvidia.com/wp-content/uploads/2019/03/imageLikeEmbed-1024x510.png) generated while running the Jetson Nano in maximum performance mode with jetson_clocks.sh?

AastaLLL · April 12, 2019, 8:03am

Hi,

Yes. And please set nvpmodel to the performance mode first:

sudo nvpmodel -m 0

Thanks.
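
Once the board is in performance mode, an fps figure can be measured along the following lines. This is only a minimal sketch of a timing loop, not the method used for the blog numbers: `measure_fps` and `dummy_infer` are hypothetical names, and `dummy_infer` is a stand-in that must be replaced with a call that runs one inference and blocks until the result is ready.

```python
import time

def measure_fps(infer, warmup=10, iterations=100):
    """Average frames per second over `iterations` calls to `infer`.

    `infer` is assumed to process exactly one frame per call and to
    return only after the result is available.
    """
    # Warm-up calls are excluded from the timing so that one-time
    # initialisation costs do not skew the average.
    for _ in range(warmup):
        infer()

    start = time.perf_counter()
    for _ in range(iterations):
        infer()
    elapsed = time.perf_counter() - start
    return iterations / elapsed

def dummy_infer():
    # Placeholder for a real inference call (assumption, not a real model).
    time.sleep(0.001)

if __name__ == "__main__":
    print(f"{measure_fps(dummy_infer):.1f} fps")
```

Averaging over many iterations after a warm-up tends to give more stable numbers than timing a single frame.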

dusty_nv · April 12, 2019, 4:49pm

See here for the instructions on running SSD-Mobilenet-v2 with TensorRT:

https://devtalk.nvidia.com/default/topic/1049802/jetson-nano/object-detection-with-mobilenet-ssd-slower-than-mentioned-speed/post/5327974/#5327974
