RetinaNet on Jetson Nano

gutierrez.ge · March 23, 2021, 4:44pm

I trained a RetinaNet model using TLT. On my laptop I am able to generate the .engine file and run it inside the Triton container. I am now trying to move it to a Jetson Nano flashed using SDK Manager, but I get this error when running it:

[ERROR] UffParser: Validator error: FirstDimTile_4: Unsupported operation _BatchTilePlugin_TRT

On the Jetson I am running it in the Docker container: deepstream-l4t:5.1-21.02-iot
On my host machine I have Ubuntu 18.04, DeepStream 5.1, TLT v3 with CUDA 11.1
On the Jetson I have Jetpack 4.5.1 with DeepStream 5.1 and CUDA 10.2

How can I convert the .etlt model into a .engine within the Jetson Nano?
Thanks in advance.

AastaLLL · March 24, 2021, 2:58am

Hi,

Please check the document below:

https://docs.nvidia.com/metropolis/TLT/tlt-user-guide/text/quickstart/deepstream_integration.html#convert-to-trt-engine

Thanks.

gutierrez.ge · March 29, 2021, 4:00pm

I found the solution. Here are the steps to get RetinaNet working on a Jetson Nano. Basically, you have to get TensorRT installed properly, because JetPack does not have it installed properly (or at all). To do that you need CMake 3.13, and then the CMake and TensorRT environment variables need to be configured properly for the tlt-converter to work.

Instructions for Jetson

These are the instructions for running this project on a Jetson device.

System Requirements

Jetson Nano 4GB
128 GB SD Card
Jetpack 4.5.1
- DeepStream 5.1 used via Docker image (No need to install it)

Prepare the Jetson

Update and upgrade all packages

apt update
apt upgrade

Check cmake version

cmake --version

If the version is lower than 3.13, install curl and remove the current cmake version. Otherwise, continue with the "Remove any unused packages" step and skip the "Install cmake 3.13" step.

apt install unzip curl libssl-dev libcurl4-openssl-dev
apt remove cmake
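The version check above can also be scripted. The following is a minimal sketch, not part of the original steps: the helper names version_ge and installed_cmake_version are made up for illustration, and it assumes GNU sort with the -V (version sort) option is available, as it is on Ubuntu-based JetPack images.

```shell
#!/bin/sh
# Succeeds (exit 0) when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Prints the version from the first line of `cmake --version`,
# e.g. "cmake version 3.10.2" -> "3.10.2". Prints nothing if cmake is absent.
installed_cmake_version() {
    cmake --version 2>/dev/null | head -n 1 | awk '{print $3}'
}

current="$(installed_cmake_version)"
if [ -n "$current" ] && version_ge "$current" 3.13; then
    echo "cmake $current is new enough; skip the source build"
else
    echo "cmake ${current:-not found} is older than 3.13; build it from source"
fi
```

The script only reports which path to take; it does not remove or install anything.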

Remove any unused packages

apt autoremove

Install cmake 3.13

Installing cmake could take time (30 min - 1 hour)

cd /opt/nvidia
wget http://www.cmake.org/files/v3.13/cmake-3.13.0.tar.gz
tar xpvf cmake-3.13.0.tar.gz cmake-3.13.0/
rm cmake-3.13.0.tar.gz
cd cmake-3.13.0/
./bootstrap --system-curl
make -j4
echo 'export PATH=/opt/nvidia/cmake-3.13.0/bin:$PATH' >> ~/.bashrc
source ~/.bashrc

Install TensorRT

TensorRT must be installed in order to properly convert certain networks like RetinaNet
This could take about 30 minutes to 1 hour

cd /opt/nvidia/
git clone -b release/7.1 https://github.com/NVIDIA/TensorRT.git ./tensorrt
cd tensorrt
git submodule update --init --recursive
mkdir -p build && cd build
cmake .. -DGPU_ARCHS="53" -DTRT_LIB_DIR=/usr/lib/aarch64-linux-gnu/ -DCMAKE_C_COMPILER=/usr/bin/gcc -DCMAKE_CUDA_COMPILER:PATH=/usr/local/cuda/bin/nvcc
make -j$(nproc)
echo 'export LD_LIBRARY_PATH=/opt/nvidia/tensorrt/build:$LD_LIBRARY_PATH' >> ~/.bashrc
echo 'export PATH=$LD_LIBRARY_PATH:$PATH' >> ~/.bashrc
source ~/.bashrc
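Before moving on, it may be worth checking that the freshly built plugin library actually contains the plugin named in the original error. This is a rough sketch under assumptions: the helper plugin_lib_has is hypothetical, the library is assumed to be named libnvinfer_plugin.so* and to sit directly in the build directory used above, and the check is only a text search for the plugin name inside the binary, not a real load test.

```shell
#!/bin/sh
# Succeeds when a libnvinfer_plugin shared object in directory $1 contains
# the plugin name $2. `grep -a` treats the binary file as text.
plugin_lib_has() {
    dir="$1"
    name="$2"
    for lib in "$dir"/libnvinfer_plugin.so*; do
        # Skip the unexpanded glob pattern when nothing matches.
        [ -f "$lib" ] || continue
        if grep -a -q "$name" "$lib"; then
            return 0
        fi
    done
    return 1
}

# Assumption: the build output directory from the steps above.
build_dir="/opt/nvidia/tensorrt/build"
if plugin_lib_has "$build_dir" "BatchTilePlugin_TRT"; then
    echo "BatchTilePlugin_TRT found in $build_dir"
else
    echo "BatchTilePlugin_TRT not found in $build_dir; check the build output"
fi
```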

Install TLT Converter

This step is optional, as you could let DeepStream run the converter for you, but if you wish to install it, follow these steps

cd /opt/nvidia/
wget https://developer.nvidia.com/cuda102-trt71-jp45
unzip cuda102-trt71-jp45
mv cuda10.2_trt7.1_jp4.5/tlt-converter ./tlt-converter
chmod u+x tlt-converter
rm cuda102-trt71-jp45
rm -rdf cuda10.2_trt7.1_jp4.5

Convert the model

Execute the following command to convert the model from .etlt to an fp32 engine
Update the parameters as required, especially if you wish to convert to fp16 or int8

Note: this command will not work inside the container. If you wish to do the conversion inside the container, let DeepStream run it for you instead.

/opt/nvidia/tlt-converter \
               -k <Your NGC API KEY>  \
               -d 3,<Model Input Image Height>,<Model Input Image Width> \
               -o NMS \
               -e ~/models/trt.fp32.engine \
               -t fp32 \
               -i nchw \
               -m 8 \
               ~/models/retinanet_resnet18.etlt
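As an illustration of changing the precision, here is a small sketch that only prints the command line for a given precision instead of running it. The helper print_convert_cmd and the 384x1248 input size are made up for the example; the flags, key placeholder, and paths are the ones from the command above. int8 is left out on purpose, since it additionally needs a calibration cache.

```shell
#!/bin/sh
# Prints (does not run) a tlt-converter command line for the given precision.
# $1 = precision (fp32 or fp16), $2 = input height, $3 = input width.
# The "~" paths are printed literally; the shell expands them once the
# printed command is pasted and run.
print_convert_cmd() {
    precision="$1"
    height="$2"
    width="$3"
    case "$precision" in
        fp32|fp16) ;;
        *) echo "unsupported precision in this sketch: $precision" >&2; return 1 ;;
    esac
    printf '%s ' /opt/nvidia/tlt-converter \
        -k '<Your NGC API KEY>' \
        -d "3,$height,$width" \
        -o NMS \
        -e "~/models/trt.$precision.engine" \
        -t "$precision" \
        -i nchw \
        -m 8 \
        '~/models/retinanet_resnet18.etlt'
    printf '\n'
}

# Example dimensions only; use your model's real input height and width.
print_convert_cmd fp16 384 1248
```

Compared with the fp32 command, only the -t value and the engine file name change.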

Run DeepStream container

You could use the deepstream-l4t base or iot image. In my case I use the iot image, and it will be downloaded if you don’t have it yet.

Update the parameters as required. For example, make sure the device/video parameter works for your environment and that the shared directory “~/” points to the right place.

xhost +
sudo docker run -it --rm --net=host --runtime nvidia  -e DISPLAY=$DISPLAY \
    -v ~/:/opt/nvidia/deepstream/deepstream-5.1/mysharedpath \
    -v /opt/nvidia/tensorrt/build/:/opt/nvidia/deepstream/deepstream-5.1/tensorrt/ \
    --device /dev/video0:/dev/video0 \
    -w /opt/nvidia/deepstream/deepstream-5.1/mysharedpath \
    -v /tmp/.X11-unix/:/tmp/.X11-unix \
    nvcr.io/nvidia/deepstream-l4t:5.1-21.02-iot

Preparing to run DeepStream Application

Once inside the container, we need to set up the system to use our new build of TensorRT instead of the one that comes with the image. These commands have to be executed every time the container is started.

export LD_LIBRARY_PATH=/opt/nvidia/deepstream/deepstream-5.1/tensorrt/
export PATH=$LD_LIBRARY_PATH:$PATH

To run DeepStream use the following command

deepstream-test5-app \
          -c /opt/nvidia/deepstream/deepstream-5.1/mysharedpath/deepstream_app_config.txt

If you let DeepStream generate the .engine file for you, make sure your configuration sets tlt-model-key and tlt-encoded-model. DeepStream will generate the .engine file in the same location as your tlt-encoded-model.
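A quick way to confirm that both keys are present is to search the inference configuration file for them. This is a minimal sketch: the helper has_tlt_keys and the default file name are hypothetical, and it assumes the keys live in the nvinfer configuration file that your application config references.

```shell
#!/bin/sh
# Succeeds when config file $1 sets both tlt-encoded-model and tlt-model-key.
# Lines commented out with "#" do not count.
has_tlt_keys() {
    cfg="$1"
    grep -E -q '^[[:space:]]*tlt-encoded-model[[:space:]]*=' "$cfg" &&
    grep -E -q '^[[:space:]]*tlt-model-key[[:space:]]*=' "$cfg"
}

# Hypothetical file name; pass the nvinfer config your application uses.
cfg="${1:-config_infer_primary_retinanet.txt}"
if [ ! -f "$cfg" ]; then
    echo "config file not found: $cfg"
elif has_tlt_keys "$cfg"; then
    echo "$cfg sets tlt-encoded-model and tlt-model-key"
else
    echo "$cfg is missing tlt-encoded-model or tlt-model-key"
fi
```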

AastaLLL · March 30, 2021, 9:28am

Good to know you solved it.
Thanks a lot for sharing this with us!

ahsanmuhammadakram1996 · April 29, 2021, 10:57am

Hi there!
I have trained yolo_v4 in TLT 3.0 and now am trying to deploy it in deepstream 5.1 using a sample python app, i.e: deepstream-test1. Currently, I am facing issues with the config files. Would you be kind enough to assist me with that?
Thank you

gutierrez.ge · April 30, 2021, 6:09pm

YOLOv3 and YOLOv4 require a lot of effort to get working, as you need to build a custom plugin (in C++) to extract the information from the network, which is very painful. For that reason I moved to something more out of the box like RetinaNet, where the configuration in DeepStream is very straightforward. Sorry I can’t help you with this.

ahsanmuhammadakram1996 · May 1, 2021, 9:00am

Thanks for the response. Please do let me know if you find some configuration for the sample yolov4 TLT notebook.
Good day
