I found the solution. Here are the proper steps to get RetinaNet working on a Jetson Nano. Basically, you have to get TensorRT installed properly (JetPack does not install it correctly, or at all), and for that you need CMake 3.13; then the CMake and TensorRT system variables need to be configured properly for the tlt-converter to work.
Instructions for Jetson
These are the instructions for running this project on a Jetson device.
- Jetson Nano 4GB
- 128 GB SD Card
- Jetpack 4.5.1
- DeepStream 5.1 used via Docker image (No need to install it)
Prepare the Jetson
- Update and upgrade all packages
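One way to do this (assuming a standard apt-based JetPack install):
sudo apt update && sudo apt upgrade -y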
- Check cmake version
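You can check the installed version with:
cmake --version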
- If the version is lower than 3.13, install curl and remove the current cmake version; otherwise proceed to step 4 and skip step 5
sudo apt install unzip curl libssl-dev libcurl4-openssl-dev
sudo apt remove cmake
- Remove any unused packages
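For example:
sudo apt autoremove -y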
- Install cmake 3.13
Installing cmake could take time (30 min - 1 hour)
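The steps below assume the cmake 3.13.0 source archive has already been downloaded; a typical way to fetch it from the official cmake.org mirror:
wget https://cmake.org/files/v3.13/cmake-3.13.0.tar.gz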
tar xpvf cmake-3.13.0.tar.gz cmake-3.13.0/
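The PATH entry below points at /opt/nvidia/cmake-3.13.0/bin, so the following is a sketch assuming an in-source build of the extracted tree under /opt/nvidia (the compile is where most of the 30 minutes to 1 hour goes):
sudo mv cmake-3.13.0 /opt/nvidia/
cd /opt/nvidia/cmake-3.13.0
sudo ./bootstrap
sudo make -j$(nproc)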
echo 'export PATH=/opt/nvidia/cmake-3.13.0/bin:$PATH' >> ~/.bashrc
Install TensorRT
TensorRT must be built from source so that certain networks like RetinaNet can be converted properly.
This could take about 30 minutes to 1 hour
git clone -b release/7.1 https://github.com/NVIDIA/TensorRT.git ./tensorrt
cd tensorrt
git submodule update --init --recursive
mkdir -p build && cd build
cmake .. -DGPU_ARCHS="53" -DTRT_LIB_DIR=/usr/lib/aarch64-linux-gnu/ -DCMAKE_C_COMPILER=/usr/bin/gcc -DCMAKE_CUDA_COMPILER:PATH=/usr/local/cuda/bin/nvcc
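The cmake step only configures the build; you still need to compile the TensorRT OSS components, which is where most of the time goes. A minimal sketch (building only the nvinfer_plugin target is usually enough for TLT models, but a full build also works):
make -j$(nproc)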
echo 'export LD_LIBRARY_PATH=/opt/nvidia/tensorrt/build:$LD_LIBRARY_PATH' >> ~/.bashrc
echo 'export PATH=/opt/nvidia/tensorrt/build:$PATH' >> ~/.bashrc
Install TLT Converter
This step is optional, as you could let DeepStream run the conversion for you, but if you wish to install it, follow these steps.
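The commands below assume you have already downloaded the tlt-converter package for JetPack 4.5 (cuda10.2_trt7.1_jp4.5) from NVIDIA's Transfer Learning Toolkit download page and extracted it in the current directory, e.g. (the archive name may differ):
unzip cuda10.2_trt7.1_jp4.5.zip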
mv cuda10.2_trt7.1_jp4.5/tlt-converter ./tlt-converter
chmod u+x tlt-converter
rm -rdf cuda10.2_trt7.1_jp4.5
Convert the model
Execute the following command to convert the model from .etlt to an fp32 TensorRT engine.
Update the parameters as required, especially if you wish to convert to fp16 or int8.
Note: this command will not work inside the container; if you wish to do the conversion inside the container, let DeepStream run it for you instead.
./tlt-converter \
-k <Your NGC API KEY> \
-d 3,<Model Input Image Height>,<Model Input Image Width> \
-o NMS \
-e ~/models/trt.fp32.engine \
-t fp32 \
-i nchw \
-m 8 \
<Path to your RetinaNet .etlt model>
Run DeepStream container
You can use the deepstream-l4t base or iot image; in my case I use the iot image, and it will be downloaded automatically if you don't have it yet.
Update the parameters as required; for example, make sure the video device parameter works for your environment and that the shared directory ~/ points to the right place.
sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY \
-v ~/:/opt/nvidia/deepstream/deepstream-5.1/mysharedpath \
-v /opt/nvidia/tensorrt/build/:/opt/nvidia/deepstream/deepstream-5.1/tensorrt/ \
--device /dev/video0:/dev/video0 \
-w /opt/nvidia/deepstream/deepstream-5.1/mysharedpath \
-v /tmp/.X11-unix/:/tmp/.X11-unix \
nvcr.io/nvidia/deepstream-l4t:5.1-21.02-iot
Preparing to run DeepStream Application
Once inside the container, we need to set up the system to use our new version of TensorRT instead of the one that comes with the image. This has to be done every time the container is started, since the container is run with --rm and changes do not persist.
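A minimal sketch, assuming the goal is to load the rebuilt plugin library from the tensorrt/ directory mounted by the docker run command above, ahead of the stock one shipped with the image (the library file name and location within the build directory are assumptions and may differ on your build):
export LD_PRELOAD=/opt/nvidia/deepstream/deepstream-5.1/tensorrt/libnvinfer_plugin.so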
To run DeepStream, use the following command.
If DeepStream will generate the .engine file for you, make sure you are using tlt-model-key and tlt-encoded-model; DeepStream will generate the .engine file in the same location as your tlt-encoded-model.
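A sketch of the usual invocation, assuming your DeepStream configuration file is in the shared directory (the config file path is a placeholder):
deepstream-app -c <path to your deepstream config file>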