Can't execute TRT engine on Jetson Nano

Hi,

I’ve been trying to run a retrained model from the Tensorflow Object Detection model zoo, the SSD Mobilenet V2 FPNLite 320x320. I’ve trained it with my own dataset and converted it to ONNX and TRT on my computer sucessfully.

Now I’m trying to run this model as a TRT engine on my Jetson Nano. I’ve sucessfully converted the ONNX model to TRT via trtexec --onnx=<model>, but when I try to run it I get the following error:

This happens both with trtexec and the inference.py script provided here, that is supposed to be the officialy supported one (I used it on my PC).

Any clues on why this might be happening?
Thanks a lot.

This is a sample when running TF2OD with JetPack 4.6.1 and docker.

# launch docker container
sudo docker run --runtime nvidia -it --rm -v /data:/data --network host nvcr.io/nvidia/l4t-tensorflow:r32.7.1-tf2.7-py3

# Install packages
apt-get update
apt-get install --no-install-recommends -y git cmake

pip3 install gdown
gdown https://drive.google.com/uc?id=1JMVY3hRKW8r6EytPoRqvB8U9t82pz0po -O onnx_graphsurgeon-0.3.17-py2.py3-none-any.whl
pip3 install --no-deps onnx_graphsurgeon-0.3.17-py2.py3-none-any.whl
wget https://nvidia.box.com/shared/static/pmsqsiaw4pg9qrbeckcbymho6c01jj4z.whl -O onnxruntime_gpu-1.11.0-cp36-cp36m-linux_aarch64.whl
pip3 install --no-deps onnxruntime_gpu-1.11.0-cp36-cp36m-linux_aarch64.whl
pip3 install -U onnx==1.11.0
pip3 install -U tf2onnx==1.10.1
pip3 install -U pillow==8.4.0

# download TensorRT sample code
mkdir ~/github
cd ~/github
git clone -b release/8.2 --recursive https://github.com/NVIDIA/TensorRT

# Install Tensorflow models
cd ~/github/TensorRT/samples/python/tensorflow_object_detection_api
git clone https://github.com/tensorflow/models
cp -r models/research/object_detection .
protoc object_detection/protos/*.proto --python_out=.
sed -i 's/tile_node/#tile_node/g' create_onnx.py

# download TF2OD model
wget http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz
tar -xvf ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz

# TF2OD to ONNX
python3 create_onnx.py --pipeline_config ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8/pipeline.config --saved_model ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8/saved_model --onnx ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.onnx 

# ONNX to TensorRT
python3 build_engine.py --onnx ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.onnx --engine ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.engine --precision fp16

# download cat image from wikipedia
mkdir image_in
wget https://upload.wikimedia.org/wikipedia/commons/0/0b/Cat_poster_1.jpg -O image_in/Cat_poster_1.jpg

# infer
python3 infer.py --engine ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.engine --input image_in --output image_out --preprocessor fixed_shape_resizer -t 0.4

# copy results into the shared directory
cp image_out/* /data/

In the jetson localhost terminal, you can see the inference results in the /data/ directory

1 Like

Hey!

Thanks for the quick response. Will this work on JetPack 4.6? Anyway I think I can update to 4.6.1, but it’d be quicker the other way. I’ll test it and tell you if it works.

For JetPack 4.6, change docker container

# launch docker container
sudo docker run --runtime nvidia -it --rm -v /data:/data --network host nvcr.io/nvidia/l4t-tensorflow:r32.6.1-tf2.5-py3
1 Like

For JetPack 4.6 when running either pip3 install -U onnx==1.11.0 or pip3 install -U tf2onnx==1.10.1 it throws the following error:

Command "/usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-36tx39w_/onnx/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-zfj0zurg-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-36tx39w_/onnx/

Hi @naisy, I managed to run my model using JetPack 4.6.1, thanks a lot!

I’m concerned about the GPU usage though. When running inferences I only get around a 5% usage, as jtop shows:

This is while running my SSD Mobilenet model, which takes around 3 - 5 seconds per inference. Is this an issue on how the docker is built, of my model, of the script…?

Thanks again!

Large sized images, such as wikipedia image, require more time for pre-processing and post-processing.
Also, when you run infer.py, it loads the import libraries, TensorRT model, allocates the memory for the model, and initializes the memory.
Therefore, inference on a single image is inefficient.

You can try with multiple 300x300 images.

1 Like

You were right, it was the size of the images. Now the GPU is being fully used and it takes around 0.1 seconds to do an inference.

Thanks for everything! You can close now.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.