Hi all, below you will find the procedures to run the Jetson Nano deep learning inferencing benchmarks from this blog post with TensorRT.
Note: for updated JetPack 4.4 benchmarks, please use github.com/NVIDIA-AI-IOT/jetson_benchmarks
While using one of the recommended power supplies, make sure your Nano is in 10W performance mode (which is the default mode):
$ sudo nvpmodel -m 0
$ sudo jetson_clocks
Using other lower-capacity power supplies may lead to system instabilities or shutdown during the benchmarks.
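If you want to double-check the active power mode before starting the benchmarks, nvpmodel can query it (a quick sanity check; the exact output wording varies slightly between JetPack releases):
$ sudo nvpmodel -q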
SSD-Mobilenet-V2
- Copy the ssd-mobilenet-v2 archive from here to the ~/Downloads folder on Nano.
$ cd ~/Downloads/
$ wget --no-check-certificate 'https://nvidia.box.com/shared/static/8oqvmd79llr6lq1fr43s4fu1ph37v8nt.gz' -O ssd-mobilenet-v2.tar.gz
$ tar -xvf ssd-mobilenet-v2.tar.gz
$ cd ssd-mobilenet-v2
$ sudo cp -R sampleUffSSD_rect /usr/src/tensorrt/samples
$ sudo cp sample_unpruned_mobilenet_v2.uff /usr/src/tensorrt/data/ssd/
$ sudo cp image1.ppm /usr/src/tensorrt/data/ssd/
- Apply the following patches to the sample, depending on your JetPack version:
JetPack 4.4 or newer
- patch for
/usr/src/tensorrt/samples/sampleUffSSD_rect/sampleUffSSD.cpp
20,21d19
< using namespace sample;
< using namespace std;
23c21
< /*static Logger gLogger;*/
---
> static Logger gLogger;
171c169
< builder->setMaxWorkspaceSize(1024 * 1024 * 128); // We need about 1GB of scratch space for the plugin layer for batch size 5.
---
> builder->setMaxWorkspaceSize(128_MB); // We need about 1GB of scratch space for the plugin layer for batch size 5.
- patch for
/usr/src/tensorrt/samples/sampleUffSSD_rect/Makefile
3d2
< EXTRA_DIRECTORIES = ../common
JetPack 4.3 or JetPack 4.2.1
- patch for
/usr/src/tensorrt/samples/sampleUffSSD_rect/sampleUffSSD.cpp
19a20
> using namespace std;
21c22
< static Logger gLogger;
---
> /*static*/ Logger gLogger;
169c170
< builder->setMaxWorkspaceSize(128_MB); // We need about 1GB of scratch space for the plugin layer for batch size 5.
---
> builder->setMaxWorkspaceSize(1024 * 1024 * 128); // We need about 1GB of scratch space for the plugin layer for batch size 5.
- Compile the sample
$ cd /usr/src/tensorrt/samples/sampleUffSSD_rect
$ sudo make
- Run the sample to measure inference performance
$ cd /usr/src/tensorrt/bin
$ sudo ./sample_uff_ssd_rect
Image Classification (ResNet-50, Inception V4, VGG-19)
- The resources needed to run these models are available here. Copy each of these .prototxt files to the /usr/src/tensorrt/data/googlenet folder on your Jetson Nano.
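For example, if you downloaded the three .prototxt files to ~/Downloads (adjust the source path to wherever you saved them), they can be copied over like this:
$ cd ~/Downloads
$ sudo cp ResNet50_224x224.prototxt inception_v4.prototxt VGG19_N2.prototxt /usr/src/tensorrt/data/googlenet/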
- ResNet-50
$ cd /usr/src/tensorrt/bin
$ ./trtexec --output=prob --deploy=../data/googlenet/ResNet50_224x224.prototxt --fp16 --batch=1
- Inception V4
$ cd /usr/src/tensorrt/bin
$ ./trtexec --output=prob --deploy=../data/googlenet/inception_v4.prototxt --fp16 --batch=1
- VGG-19
$ cd /usr/src/tensorrt/bin
$ ./trtexec --output=prob --deploy=../data/googlenet/VGG19_N2.prototxt --fp16 --batch=1
U-Net Segmentation
- Copy the output_graph.uff model file from here to the home folder on your Jetson Nano or any directory of your preference.
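For example, assuming the file was saved to ~/Downloads (adjust the path if you downloaded it elsewhere), move it to the home folder like this:
$ mv ~/Downloads/output_graph.uff ~/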
- Run the U-Net inference benchmark:
$ cd /usr/src/tensorrt/bin
$ sudo ./trtexec --uff=~/output_graph.uff --uffInput=input_1,1,512,512 --output=conv2d_19/Sigmoid --fp16
Pose Estimation
- Copy the pose_estimation.prototxt file from here to the /usr/src/tensorrt/data/googlenet folder of your Nano.
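For example, assuming pose_estimation.prototxt was saved to ~/Downloads:
$ sudo cp ~/Downloads/pose_estimation.prototxt /usr/src/tensorrt/data/googlenet/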
- Run the OpenPose inference benchmark:
$ cd /usr/src/tensorrt/bin/
$ sudo ./trtexec --output=Mconv7_stage2_L2 --deploy=../data/googlenet/pose_estimation.prototxt --fp16 --batch=1
Super Resolution
- Download the required files to run inference on the Super Resolution neural network.
$ sudo wget --no-check-certificate 'https://nvidia.box.com/shared/static/a99l8ttk21p3tubjbyhfn4gh37o45rn8.gz' -O Super-Resolution-BSD500.tar.gz
- Unzip the downloaded file
$ sudo tar -xvf Super-Resolution-BSD500.tar.gz
- Run the Super Resolution inferencing benchmark:
$ cd /usr/src/tensorrt/bin
$ sudo ./trtexec --output=output_0 --onnx=<path to the .onnx file in the unzipped folder above> --fp16 --batch=1
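As a sketch, assuming the archive was downloaded and extracted under ~/Downloads (the extracted folder and .onnx file names may differ on your system), you can locate the model with find and pass its path to trtexec:
$ ONNX_MODEL=$(find ~/Downloads -name '*.onnx' | head -n 1)
$ cd /usr/src/tensorrt/bin
$ sudo ./trtexec --output=output_0 --onnx="$ONNX_MODEL" --fp16 --batch=1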
Tiny YOLO v3
- Install pre-requisite packages
$ sudo apt-get install libgstreamer-plugins-base1.0-dev libgstreamer1.0-dev libgflags-dev
- Download trt-yolo-app
$ cd ~
$ git clone -b restructure https://github.com/NVIDIA-AI-IOT/deepstream_reference_apps
- If you are using JetPack 4.3 or newer, apply the following git patch to the deepstream_reference_apps source:
diff --git a/yolo/config/yolov3-tiny.txt b/yolo/config/yolov3-tiny.txt
index ec12c53..47e46a6 100644
--- a/yolo/config/yolov3-tiny.txt
+++ b/yolo/config/yolov3-tiny.txt
@@ -47,7 +47,7 @@
 # nms_thresh : IOU threshold for bounding box candidates. Default value is 0.5

 #Uncomment the lines below to use a specific config param
-#--precision=kINT8
+--precision=kHALF
 #--calibration_table_path=data/calibration/yolov3-tiny-calibration.table
 #--engine_file_path=
 #--print_prediction_info=true
diff --git a/yolo/lib/ds_image.cpp b/yolo/lib/ds_image.cpp
index 36a394c..9e4ff5b 100644
--- a/yolo/lib/ds_image.cpp
+++ b/yolo/lib/ds_image.cpp
@@ -88,7 +88,7 @@ DsImage::DsImage(const std::string& path, const int& inputH, const int& inputW)
     cv::copyMakeBorder(m_LetterboxImage, m_LetterboxImage, m_YOffset, m_YOffset, m_XOffset,
                        m_XOffset, cv::BORDER_CONSTANT, cv::Scalar(128, 128, 128));
     // converting to RGB
-    cv::cvtColor(m_LetterboxImage, m_LetterboxImage, CV_BGR2RGB);
+    cv::cvtColor(m_LetterboxImage, m_LetterboxImage, cv::COLOR_BGR2RGB);
 }

 void DsImage::addBBox(BBoxInfo box, const std::string& labelName)
@@ -106,7 +106,7 @@ void DsImage::addBBox(BBoxInfo box, const std::string& labelName)
         = cv::getTextSize(labelName, cv::FONT_HERSHEY_COMPLEX_SMALL, 0.5, 1, nullptr);
     cv::rectangle(m_MarkedImage, cv::Rect(x, y, tsize.width + 3, tsize.height + 4), color, -1);
     cv::putText(m_MarkedImage, labelName.c_str(), cv::Point(x, y + tsize.height),
-                cv::FONT_HERSHEY_COMPLEX_SMALL, 0.5, cv::Scalar(255, 255, 255), 1, CV_AA);
+                cv::FONT_HERSHEY_COMPLEX_SMALL, 0.5, cv::Scalar(255, 255, 255), 1, cv::LINE_AA);
 }

 void DsImage::showImage() const
@@ -142,4 +142,4 @@ std::string DsImage::exportJson() const
         json << "}";
     }
     return json.str();
-}
\ No newline at end of file
+}
diff --git a/yolo/lib/trt_utils.h b/yolo/lib/trt_utils.h
index 359bfea..96a5a39 100644
--- a/yolo/lib/trt_utils.h
+++ b/yolo/lib/trt_utils.h
@@ -28,11 +28,12 @@ SOFTWARE.
 #define __TRT_UTILS_H__

 /* OpenCV headers */
-#include <opencv/cv.h>
+//#include <opencv/cv.h>
 #include <opencv2/core/core.hpp>
 #include <opencv2/dnn/dnn.hpp>
 #include <opencv2/highgui/highgui.hpp>
 #include <opencv2/imgproc/imgproc.hpp>
+#include <opencv2/imgcodecs/legacy/constants_c.h>

 #include <set>
diff --git a/yolo/lib/yolo.cpp b/yolo/lib/yolo.cpp
index 117a49f..2b7435e 100644
--- a/yolo/lib/yolo.cpp
+++ b/yolo/lib/yolo.cpp
@@ -423,7 +423,7 @@ void Yolo::createYOLOEngine(const nvinfer1::DataType dataType, Int8EntropyCalibr
               << " precision : " << m_Precision << " and batch size :" << m_BatchSize
               << std::endl;
     m_Builder->setMaxBatchSize(m_BatchSize);
-    m_Builder->setMaxWorkspaceSize(1 << 20);
+    m_Builder->setMaxWorkspaceSize(1024 * 1024 * 8);

     if (dataType == nvinfer1::DataType::kINT8)
     {
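One way to apply the changes above is to save the diff to a file (the jp43.patch name below is just an example) and use git apply. Note that the diff has been re-wrapped for readability here, so git apply may reject hunks whose line wrapping differs from your checkout; in that case make the edits by hand.
$ cd ~/deepstream_reference_apps
$ git apply ~/jp43.patch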
- Install other requirements
$ cd ~/deepstream_reference_apps/yolo
$ sudo sh prebuild.sh
- Compile and install app
$ cd apps/trt-yolo
$ mkdir build && cd build
$ cmake -D CMAKE_BUILD_TYPE=Release ..
$ make && sudo make install
$ cd ../../..
- For the sample image data set, you can download 500 images (they need to be in .png format) to any folder on your Jetson Nano, use just one image file, or use the test set of 5 images that we've provided here.
- Navigate your terminal to:
$ cd ~/deepstream_reference_apps/yolo/data
- Open the file "test_images.txt"
- In the above file, you need to provide the full path to each of the 500 images you downloaded. For example, if your first image is located in the Downloads directory, the path you would enter in line 1 would be:
/home/<username>/Downloads/<image file name>.png
- Alternatively, you could provide the path to just one image and repeat that line 500 times in the file (see the snippet below for one way to generate it).
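For instance, the following loop fills test_images.txt with 500 copies of a single image path; replace the placeholder path with one of your own images:
$ cd ~/deepstream_reference_apps/yolo/data
$ for i in $(seq 1 500); do echo "/home/$USER/Downloads/<image file name>.png"; done > test_images.txt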
- A sample set of images (5 images of varying resolutions, repeated 100 times) along with the test_images.txt file has been uploaded here. You can use this data set if you don't want to download your own images.
- Go to the "config" folder and open the file "yolov3-tiny.txt"
- In the file yolov3-tiny.txt, search for "--precision=kINT8" and replace "kINT8" with "kHALF" to change the inference precision to FP16 mode. You will also need to uncomment this line. (If you applied the patch for JetPack 4.3 above, this step has already been done.)
- Save the file
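If you prefer to make this change from the command line rather than a text editor, a sed one-liner like the following does the same thing (it swaps the commented kINT8 line for an uncommented kHALF line):
$ cd ~/deepstream_reference_apps/yolo
$ sed -i 's/#--precision=kINT8/--precision=kHALF/' config/yolov3-tiny.txt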
- Now run the Tiny YOLO inference:
$ cd ~/deepstream_reference_apps/yolo
$ sudo trt-yolo-app --flagfile=config/yolov3-tiny.txt