Hi! I have trained a yolov4_tiny model with the TAO 5 framework. With the tao deploy yolo_v4_tiny gen_trt_engine command I converted the exported .onnx file to a .engine file. When I run inference with this .engine file on the same machine that was used for training, I get errors. The system configuration and the commands I used are detailed below:
The inference environment must have the same TensorRT version as the environment where the engine was built. I assume you are running your own script. You can run your script inside the docker container where the engine was built:
$ tao deploy yolo_v4_tiny run /bin/bash
$ python your.py
Or run the engine directly with the command below:
$ tao deploy yolo_v4_tiny inference xxx
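As a quick sanity check, you can compare the runtime TensorRT version against the one from the build docker programmatically. A minimal sketch; the build-time version string below is a placeholder you should replace with the version reported in your engine-building environment:

```python
# Minimal sanity check: a serialized .engine file is only valid with the same
# TensorRT version that built it. The "8.5.3.1" build version is a placeholder;
# substitute the version from your engine-building docker.

def versions_match(build_version: str, runtime_version: str) -> bool:
    """Compare major.minor.patch; engines are not portable across versions."""
    return build_version.split(".")[:3] == runtime_version.split(".")[:3]

try:
    import tensorrt as trt  # only available where TensorRT is installed
    print("runtime TensorRT:", trt.__version__)
    if not versions_match("8.5.3.1", trt.__version__):
        print("WARNING: engine/runtime TensorRT version mismatch")
except ImportError:
    print("tensorrt is not installed in this environment")

# pure-Python behaviour of the helper:
print(versions_match("8.5.3.1", "8.5.3.1"))  # True
print(versions_match("8.5.3.1", "8.6.1.6"))  # False
```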
Thank you for your reply.
Yes, running tao deploy yolo_v4_tiny inference xxx works for me (I have not yet tried running under the same docker as the engine).
What prevents me from using the tao ... inference module is that I am already running my script in an Isaac Sim container. Using tao deploy yolo_v4_tiny inference xxx would mean installing TAO inside the Isaac Sim container, which is probably not the best solution…
If I ensure that the same TensorRT version is used in the training and inference environments, do you think I could run the inference directly with a script such as TRTInferencer, without calling the tao model yolo_v4_tiny inference module? Thank you!
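For reference, something along the lines of the sketch below is what I have in mind. Untested assumptions: a single input binding at index 0, the TensorRT 8.x Python API, and pycuda for the device buffers; the imports are done lazily so the file parses even without TensorRT installed:

```python
def infer(engine_path: str, image):
    """Run one inference pass on a serialized TensorRT engine.

    Sketch only: assumes binding 0 is the (already preprocessed) input and
    uses the TensorRT 8.x implicit-shape API. Imports are lazy so the module
    can be loaded on machines without TensorRT/pycuda.
    """
    import numpy as np
    import tensorrt as trt
    import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
    import pycuda.driver as cuda

    logger = trt.Logger(trt.Logger.WARNING)
    with open(engine_path, "rb") as f, trt.Runtime(logger) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
    context = engine.create_execution_context()

    # Allocate paired host/device buffers for every binding.
    bindings, host_bufs, dev_bufs = [], [], []
    for i in range(engine.num_bindings):
        shape = engine.get_binding_shape(i)
        dtype = trt.nptype(engine.get_binding_dtype(i))
        host = np.empty(trt.volume(shape), dtype=dtype)
        dev = cuda.mem_alloc(host.nbytes)
        bindings.append(int(dev))
        host_bufs.append(host)
        dev_bufs.append(dev)

    # Copy the input image to the device and execute.
    np.copyto(host_bufs[0], image.ravel())
    cuda.memcpy_htod(dev_bufs[0], host_bufs[0])
    context.execute_v2(bindings)

    # Copy every output binding back to the host.
    outputs = []
    for i in range(1, engine.num_bindings):
        cuda.memcpy_dtoh(host_bufs[i], dev_bufs[i])
        outputs.append(host_bufs[i].copy())
    return outputs
```

The decoding of the outputs (bounding-box layout, NMS results) would of course still have to follow whatever the engine's output bindings actually contain.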
Thank you for your reply.
Your solution has worked. I would like to clarify the steps I have taken.
First, I installed the TensorRT version used in TAO 5, which is 18.104.22.168 as specified in the release notes.
This can be done with pip: pip install tensorrt==22.214.171.124
Then, if you want everything to work out of the box, make sure your ONNX model does not contain custom layers. To remove them, you can perform some operations on the ONNX graph; I performed the operation described in the following post. Without doing this, I was getting the following error:

[09/06/2023-15:10:06] [TRT] [E] 1: [pluginV2Runner.cpp::load::300] Error Code 1: Serialization (Serialization assertion creator failed.Cannot deserialize plugin since corresponding IPluginCreator not found in Plugin Registry)

I suppose the custom layers could instead be registered with the REGISTER_TENSORRT_PLUGIN API, as described in this post, but I have not tried that yet. After obtaining the trimmed .onnx file, I could successfully deserialize the engine file with runtime.deserialize_cuda_engine(engine_data).
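For the plugin-registration route I have not tried, I believe something like the sketch below would work from Python: loading the TensorRT plugin shared library with global symbol visibility and initializing the bundled plugin creators before deserializing the engine. The library name/path is an assumption and may differ on your system:

```python
def load_engine_with_plugins(engine_path: str,
                             plugin_lib: str = "libnvinfer_plugin.so"):
    """Deserialize an engine that contains custom plugin layers.

    Untested sketch: plugin_lib is an assumed path to the TensorRT plugin
    library (it may be a TAO-specific .so on your system). Imports are lazy
    so the module parses without TensorRT installed.
    """
    import ctypes
    import tensorrt as trt

    # Load the plugin library with RTLD_GLOBAL so its symbols are visible,
    # then register all bundled plugin creators with the plugin registry.
    ctypes.CDLL(plugin_lib, mode=ctypes.RTLD_GLOBAL)
    logger = trt.Logger(trt.Logger.WARNING)
    trt.init_libnvinfer_plugins(logger, "")

    with open(engine_path, "rb") as f, trt.Runtime(logger) as runtime:
        return runtime.deserialize_cuda_engine(f.read())
```

With the creators registered up front, the "IPluginCreator not found in Plugin Registry" error above should not occur, so trimming the graph might be avoidable.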