I found the solution. Here are the proper steps to get RetinaNet working on a Jetson Nano. Basically, you have to get TensorRT installed properly (JetPack does not install it correctly, or at all), and for that you need CMake 3.13; then the CMake and TensorRT system variables need to be configured properly for the tlt-converter to work.
Instructions for Jetson
These are the instructions for running this project on a Jetson device.
- Jetson Nano 4GB
- 128 GB SD Card
- Jetpack 4.5.1
- DeepStream 5.1 used via Docker image (No need to install it)
Prepare the Jetson
- Update and upgrade all packages
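One way to do this (assuming a standard apt-based JetPack install):
sudo apt update && sudo apt upgrade -y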
- Check cmake version
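You can check the installed version with:
cmake --version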
- If the version is lower than 3.13, install curl and remove the current cmake version; otherwise proceed to step 4 and skip step 5
sudo apt install unzip curl libssl-dev libcurl4-openssl-dev
sudo apt remove cmake
- Remove any unused packages
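For example:
sudo apt autoremove -y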
- Install cmake 3.13
Installing cmake could take time (30 min - 1 hour)
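The steps below assume the cmake 3.13.0 source archive has already been downloaded; a typical way to fetch it from the official cmake.org mirror:
wget https://cmake.org/files/v3.13/cmake-3.13.0.tar.gz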
tar xpvf cmake-3.13.0.tar.gz cmake-3.13.0/
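The PATH entry below points at /opt/nvidia/cmake-3.13.0/bin, so the following is a sketch assuming an in-source build of the extracted tree under /opt/nvidia (the compile is where most of the 30 minutes to 1 hour goes):
sudo mv cmake-3.13.0 /opt/nvidia/
cd /opt/nvidia/cmake-3.13.0
sudo ./bootstrap
sudo make -j$(nproc)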
echo 'export PATH=/opt/nvidia/cmake-3.13.0/bin:$PATH' >> ~/.bashrc
Install TensorRT
TensorRT must be built from source so that certain networks like RetinaNet can be converted properly.
This could take about 30 minutes to 1 hour
git clone -b release/7.1 https://github.com/NVIDIA/TensorRT.git ./tensorrt
cd tensorrt
git submodule update --init --recursive
mkdir -p build && cd build
cmake .. -DGPU_ARCHS="53" -DTRT_LIB_DIR=/usr/lib/aarch64-linux-gnu/ -DCMAKE_C_COMPILER=/usr/bin/gcc -DCMAKE_CUDA_COMPILER:PATH=/usr/local/cuda/bin/nvcc
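The cmake step only configures the build; you still need to compile the TensorRT OSS components, which is where most of the time goes. A minimal sketch (building only the nvinfer_plugin target is usually enough for TLT models, but a full build also works):
make -j$(nproc)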
echo 'export LD_LIBRARY_PATH=/opt/nvidia/tensorrt/build:$LD_LIBRARY_PATH' >> ~/.bashrc
echo 'export PATH=/opt/nvidia/tensorrt/build:$PATH' >> ~/.bashrc
Install TLT Converter
This step is optional, as you could let DeepStream run the conversion for you, but if you wish to install it, follow these steps.
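The commands below assume you have already downloaded the tlt-converter package for JetPack 4.5 (cuda10.2_trt7.1_jp4.5) from NVIDIA's Transfer Learning Toolkit download page and extracted it in the current directory, e.g. (the archive name may differ):
unzip cuda10.2_trt7.1_jp4.5.zip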
mv cuda10.2_trt7.1_jp4.5/tlt-converter ./tlt-converter
chmod u+x tlt-converter
rm -rdf cuda10.2_trt7.1_jp4.5
Convert the model
Execute the following command to convert the model from .etlt to an fp32 TensorRT engine.
Update the parameters as required, especially if you wish to convert to fp16 or int8.
Note: this command will not work inside the container; if you wish to do the conversion inside the container, let DeepStream run it for you instead.
./tlt-converter \
-k <Your NGC API KEY> \
-d 3,<Model Input Image Height>,<Model Input Image Width> \
-o NMS \
-e ~/models/trt.fp32.engine \
-t fp32 \
-i nchw \
-m 8 \
<Path to your RetinaNet .etlt model>
Run DeepStream container
You can use the deepstream-l4t base or iot image; in my case I use the iot image, and it will be downloaded automatically if you don't have it yet.
Update the parameters as required; for example, make sure the video device parameter works for your environment and that the shared directory ~/ points to the right place.
sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY \
-v ~/:/opt/nvidia/deepstream/deepstream-5.1/mysharedpath \
-v /opt/nvidia/tensorrt/build/:/opt/nvidia/deepstream/deepstream-5.1/tensorrt/ \
--device /dev/video0:/dev/video0 \
-w /opt/nvidia/deepstream/deepstream-5.1/mysharedpath \
-v /tmp/.X11-unix/:/tmp/.X11-unix \
nvcr.io/nvidia/deepstream-l4t:5.1-21.02-iot
Preparing to run DeepStream Application
Once inside the container, we need to set up the system to use our new version of TensorRT instead of the one that comes with the image. This has to be done every time the container is started, since the container is run with --rm and changes do not persist.
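A minimal sketch, assuming the goal is to load the rebuilt plugin library from the tensorrt/ directory mounted by the docker run command above, ahead of the stock one shipped with the image (the library file name and location within the build directory are assumptions and may differ on your build):
export LD_PRELOAD=/opt/nvidia/deepstream/deepstream-5.1/tensorrt/libnvinfer_plugin.so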
To run DeepStream, use the following command.
If DeepStream will generate the .engine file for you, make sure you are using tlt-model-key and tlt-encoded-model; DeepStream will generate the .engine file in the same location as your tlt-encoded-model.
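A sketch of the usual invocation, assuming your DeepStream configuration file is in the shared directory (the config file path is a placeholder):
deepstream-app -c <path to your deepstream config file>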