Exporting a YOLOv5s best.pt Model to .engine Format (TensorRT)

Description

I want to convert a PyTorch model into a TensorRT model, but I have the impression that the device where I’m trying to perform the conversion doesn’t have enough memory, causing the conversion to fail. I would like to know whether there is any way, with this Python version (3.6.9) and this hardware (NVIDIA Tegra X2, 3832MiB), to obtain the .engine (TensorRT) model and make it work properly.

The model has been trained on a dedicated machine (AWS EC2 g4dn.xlarge) using the YOLOv5 repository. The resulting .pt files from the training command are attached below.

The export from PyTorch format (.pt) to TensorRT format (.engine) has been attempted on the TX2 NX device.
Throughout the export process, these errors and warnings appear:

[TensorRT] ERROR: Tactic Device request: 277MB Available: 264MB. Device memory is insufficient to use tactic.
[TensorRT] WARNING: Skipping tactic 3 due to oom error on requested size of 277 detected for tactic 5.
[TensorRT] ERROR: Tactic Device request: 356MB Available: 267MB. Device memory is insufficient to use tactic.
[TensorRT] WARNING: Skipping tactic 3 due to oom error on requested size of 356 detected for tactic 4.

Command used for model training (result: best.pt):
python train.py --data data/config_training.yaml --weights yolov5s.pt --img 640 --epochs 300 --device 0 --batch -1 --patience 15

Environment

TensorRT Version: 8.0.1.6
GPU Type: NVIDIA® Jetson™ TX2 NX - 4 Core ARM A57 Complex and 2 core NVIDIA Denver2 64-bit CPU - 4GB LPDDR4 - NVIDIA Tegra X2, 3832MiB
Nvidia Driver Version: nvidia-smi not installed
CUDA Version: cuda_10.2_r440.TC440_70.29663091_0
CUDNN Version:
Operating System + Version: Linux aaeon-desktop 4.9.253-tegra #1 SMP PREEMPT Mon Jul 26 12:19:28 PDT 2021 aarch64 GNU/Linux
Python Version: 3.6.9
PyTorch Version: 1.8.0
ONNX Version: 1.11.0
Baremetal or Container (if container which image + tag): baremetal (yolov5==6.2.1 installed via pip)

Relevant Files

For export to model.engine

For testing the model.engine

Steps To Reproduce

  1. Create a virtual environment with Python 3.6.9
python3 -m venv ./newenvdeleteafter
source newenvdeleteafter/bin/activate
  2. Install YOLOv5 (you may be prompted to install additional libraries)
pip install yolov5==6.2.1
  3. Run the export command (I’ve also tried with sudo):
    yolov5 export --weights best.pt --include [engine] --device 0 --imgsz [800,608] --data mydata.yaml --half --workspace 1

During export, the log shows:

First warning:

WARNING: ⚠️ Python 3.7.0 is required by YOLOv5, but Python 3.6.9 is currently installed

Second set of warnings and errors:

[TensorRT] ERROR: Tactic Device request: 277MB Available: 264MB. Device memory is insufficient to use tactic.
[TensorRT] WARNING: Skipping tactic 3 due to oom error on requested size of 277 detected for tactic 5.
[TensorRT] ERROR: Tactic Device request: 356MB Available: 267MB. Device memory is insufficient to use tactic.
[TensorRT] WARNING: Skipping tactic 3 due to oom error on requested size of 356 detected for tactic 4.
  4. Load the model.engine and show the results (attached: badResults)
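(The command for this last step was of roughly the following form; the flags shown here are illustrative rather than copied from my history, using the same yolov5 pip CLI as above:)

yolov5 detect --weights model.engine --imgsz [800,608] --source test_images/ --device 0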

Concerns

On my PC, which has a larger amount of memory and Python 3.10, the export process does not yield any warnings and works well. Because I can’t increase the memory of my TX2 NX, I would like to understand how relevant the warnings are and what would be the best way to avoid them. I believe upgrading to Python 3.7 is one option, but I would like to know if there are possibilities to make it work on 3.6.9.
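(For anyone else on a Jetson hitting this: the 4GB is shared between CPU and GPU, so two commonly suggested ways to give the builder more headroom, neither of which I have verified myself, are stopping the desktop session and adding swap:)

sudo systemctl isolate multi-user.target
sudo fallocate -l 4G /swapfile && sudo chmod 600 /swapfile
sudo mkswap /swapfile && sudo swapon /swapfile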

Below is the result when I export the model on my PC.

I hope the question is clear. I apologize if any of the information is inaccurate or imprecise. I am not an expert in this matter and am learning as we develop this project. Thank you very much in advance.

I’ve also shared this information in the YOLOv5 GitHub repository.

Hi @nicolasrodriguezlucena, I believe it’s best if you try using Python 3.7 first. Try to set up a virtual environment; this is what I’ve used. I also had better luck using Python directly or trtexec to export the models. The Python way is pretty similar to this one, and the trtexec method is just running a command similar to this one:

trtexec --onnx=migrated.onnx --saveEngine=trt_engine.trt 

You can add the --fp16 flag to try to make the engine a little lighter.
Also, I’ve seen those warnings before, and even with them the models worked fine.
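For example, a variant of the command above with FP16 and a capped builder workspace (the 256MiB value is just a guess for the TX2; trtexec’s --workspace takes a size in MiB):

trtexec --onnx=migrated.onnx --saveEngine=trt_engine.trt --fp16 --workspace=256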

Regards,
Andres
Embedded SW Engineer at RidgeRun
Contact us: support@ridgerun.com
Developers wiki: https://developer.ridgerun.com
Website: www.ridgerun.com

Hi @andres.artavia, thank you for your suggestion.

I have recently installed Python 3.7.12 on the Jetson TX2 NX, and in a virtual environment I have tried to install PyTorch from source (1.8.0). The issue is that when I attempt to install torchvision (0.9.0), it says it does not recognize the torch installation. I’m not sure what could be causing this. I have posted about it on Stack Overflow and also in the unofficial PyTorch Discord help channel. I’m taking the opportunity to provide evidence in this thread as well, in case someone is searching for this specific problem on the Jetson TX2 NX architecture. Any ideas regarding the torch installation?
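(For anyone debugging the same thing, this one-liner shows which torch installation the interpreter actually resolves:)

python3 -c "import torch; print(torch.__version__, torch.__file__)"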

Update.

I have detected an error.

I ran the command:

sudo find / -type d -name "torch"

And I found the path: /home/aaeon/pytorch/torch/lib/python3.7/site-packages/caffe2/torch
For this reason, I added that path to my shell environment using export.

However, the site-packages directory of the Python 3.7.12 virtual environment does not contain torch, which means it has not been installed correctly within the environment.

Running pip list does indicate that torch 1.8.0 is installed.
I’ve decided to recompile PyTorch (python setup.py bdist_wheel), but it’s still not working.
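(A useful check here is where pip thinks torch is installed, versus what the venv’s interpreter resolves:)

pip show torch | grep -i location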

I’ve not encountered that same issue. But the config that worked when I compiled torch on the TX2 NX, with torch version 1.8 and Python 3.8, was:

export USE_NCCL=0
export USE_DISTRIBUTED=1
export USE_QNNPACK=1
export USE_PYTORCH_QNNPACK=1
export TORCH_CUDA_ARCH_LIST="6.2"
export PYTORCH_BUILD_NUMBER=1
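
With those exported, the build step would then be the usual wheel build from the PyTorch source root (a sketch; adjust the path to your checkout):

cd ~/pytorch
python3 setup.py bdist_wheel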

Lastly, if possible, you can try a local install instead of using the venv, to verify the path that it’s using.

Thanks @andres.artavia,

I finally managed to install PyTorch.

After running the command python setup.py install, I had to go to the /dist/ folder and execute pip install torch-1.8.0a0+37c1f4a-cp37-cp37m-linux_aarch64.whl.
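(As a quick sanity check that the wheel works inside the venv:)

python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"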

After that, I also succeeded in installing TorchVision 0.9.0, but now I’m facing the issue of there being no valid candidate for a tensorrt wheel. Everything I’ve read suggests that I need to compile TensorRT from source, but I don’t know how to do it*, and I’m not even sure which version would be compatible with my architecture (aarch64) and CUDA 10.2.

*I saw here that I can build it from source, but the info in the GitHub README is not so clear to me. I tried it but failed.

If someone can give me more detailed steps, I would be very grateful.
For example, I don’t know whether TensorRT 8.2 is suitable for Python 3.7 and PyTorch 1.8, or whether I will need to move to another TensorRT version.

Thank you very much!

I actually did this quite recently; it’s a bit involved, but here is what I noted from back then:
TRT main build

git clone -b master https://github.com/nvidia/TensorRT TensorRT
cd TensorRT
export TRT_LIBPATH=/usr/lib/aarch64-linux-gnu/
export TRT_OSSPATH=$(pwd)
git submodule update --init --recursive
cd $TRT_OSSPATH
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_LIBPATH -DTRT_OUT_DIR=`pwd`/out -DTRT_PLATFORM_ID=aarch64 -DCUDA_VERSION=10.2
sudo CC=/usr/bin/gcc make -j$(nproc)
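
If the build finishes cleanly, the resulting binaries and libraries (including trtexec) should land under the out directory set above; a quick check:

ls $TRT_OSSPATH/build/out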

Python bindings (replace with your version; the headers are from Python’s website):

export EXT_PATH=~/external
mkdir -p $EXT_PATH && cd $EXT_PATH
git clone https://github.com/pybind/pybind11.git
wget https://www.python.org/ftp/python/3.8.0/Python-3.8.0.tar.xz
tar -xvf Python-3.8.0.tar.xz

Copy headers (replace with your version):

cd $EXT_PATH/pybind11/include
cp /usr/include/python3.8/Python.h .
cp -r $EXT_PATH/Python-3.8.0/Include/* .

Apply this patch (it adds a -DPY_INCLUDE pointing at the headers copied above, and lowers the make parallelism so the build fits in memory):

diff --git a/python/build.sh b/python/build.sh
index cedbf3ad..25c07fc9 100755
--- a/python/build.sh
+++ b/python/build.sh
@@ -35,8 +35,9 @@ cmake .. -DCMAKE_BUILD_TYPE=Release \
          -DEXT_PATH=${EXT_PATH} \
          -DCUDA_INCLUDE_DIRS=${CUDA_ROOT}/include \
          -DTENSORRT_ROOT=${ROOT_PATH} \
-         -DTENSORRT_BUILD=${ROOT_PATH}/build/
-make -j12
+         -DTENSORRT_BUILD=${ROOT_PATH}/build/ \
+         -DPY_INCLUDE=/home/nvidia/external/pybind11/include/
+make -j6

Finally, compile (replace with your version):

PYTHON_MAJOR_VERSION=3 PYTHON_MINOR_VERSION=8 TARGET_ARCHITECTURE=aarch64 ./build.sh

Then install the Python wheel (a sketch of that step is below).
The steps were obtained from here.
Check out the TRT version; I believe I used this version.
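
The wheel install itself looks like this (the exact output path under the python build tree may differ, so locate the wheel first; the install path below is a placeholder):

find $TRT_OSSPATH/python -name "tensorrt-*.whl"
pip install <path-to-the-wheel-found-above>
python3 -c "import tensorrt; print(tensorrt.__version__)"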
Regards,
Andres

Thanks @andres.artavia,

I’ve used your guide and adapted some things for my setup:

  • python version --> 3.7.12
export EXT_PATH=~/external
mkdir -p $EXT_PATH && cd $EXT_PATH
git clone https://github.com/pybind/pybind11.git
wget https://www.python.org/ftp/python/3.7.12/Python-3.7.12.tar.xz
tar -xvf Python-3.7.12.tar.xz
  • python folder --> 3.7m
cd $EXT_PATH/pybind11/include
cp /usr/include/python3.7m/Python.h .
cp -r $EXT_PATH/Python-3.7.12/Include/* .
  • user folder --> aaeon
-DPY_INCLUDE=/home/aaeon/external/pybind11/include/
  • python minor version --> 7
PYTHON_MAJOR_VERSION=3 PYTHON_MINOR_VERSION=7 TARGET_ARCHITECTURE=aarch64 ./build.sh

I got some errors with the last command:

~/TensorRT/python$ PYTHON_MAJOR_VERSION=3 PYTHON_MINOR_VERSION=7 TARGET_ARCHITECTURE=aarch64 ./build.sh
~/TensorRT/python/build ~/TensorRT/python
Building for TensorRT version: 8.2.0, library version: 8
CMake Error at /opt/cmake-3.27.9-linux-aarch64/share/cmake-3.27/Modules/CMakeDetermineCUDACompiler.cmake:277 (message):
  CMAKE_CUDA_ARCHITECTURES must be non-empty if set.
Call Stack (most recent call first):
  CMakeLists.txt:48 (project)


-- Configuring incomplete, errors occurred!
CMake Error: Unable to open check cache file for write. /home/aaeon/TensorRT/python/CMakeFiles/cmake.check_cache
make: *** No targets specified and no makefile found.  Stop.
Generating python 3.7 bindings for TensorRT 8.2.0.6
~/TensorRT/python/packaging ~/TensorRT/python/build ~/TensorRT/python
~/TensorRT/python/build ~/TensorRT/python
/home/aaeon/venvs/venv37/lib/python3.7/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!

        ********************************************************************************
        Please avoid running ``setup.py`` directly.
        Instead, use pypa/build, pypa/installer or other
        standards-based tools.

        See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
        ********************************************************************************

!!
  self.initialize_options()
~/TensorRT/python

For reference, CMakeLists.txt:48 in my checkout is:

Line 48: message(STATUS "TENSORRT_BUILD: ${TENSORRT_BUILD}")

I tried searching for solutions, but I found none.
Any help is appreciated. Thank you very much for the effort and time invested.

Hi,
Did you compile TensorRT first? I believe you have to compile TRT first and then compile the TRT Python bindings.
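
Also, newer CMake errors out when CMAKE_CUDA_ARCHITECTURES ends up empty, as in your log. As a general CMake workaround (not something I have verified on this exact setup), you can try setting it explicitly before running build.sh; the TX2’s compute capability is 6.2:

export CMAKE_CUDA_ARCHITECTURES=62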
Regards,
Andres