Errors while reading TensorRT engine file produced by TAO 5

Hi! I have trained a yolov4_tiny model with the TAO 5 framework. With the tao deploy yolo_v4_tiny gen_trt_engine command I converted the exported .onnx file to a .engine file. When I run inference with this .engine file on the same machine that was used for training, I get errors. The system configuration and the commands I used are given below:

• Hardware: AWS g4dn.2xlarge (1 NVIDIA T4 GPU, 8 vCPUs, 32 GiB RAM)
• Network Type: Yolo_v4_tiny
• Training spec file:
yolov4_tiny.txt (1.8 KB)

• How to reproduce the issue?
Converting .onnx to .engine:

 tao deploy yolo_v4_tiny gen_trt_engine -m /tao-experiments/models/cardbox_detection_with_yolov4_tiny.onnx -e /tao-experiments/specs/yolov4_tiny.txt -r /tao-experiments/results/trt_gen --key $KEY --engine-file /tao-experiments/models/cardbox_detection_with_yolov4_tiny.engine

Reading the .engine file:

import tensorrt as trt

# Path to the engine produced by gen_trt_engine
engine_path = "tao-experiments/results/trt_gen/cardbox_detection_with_yolov4_tiny.engine"

# Read the serialized engine bytes and try to deserialize them into a CUDA engine
with open(engine_path, 'rb') as f:
    engine_data = f.read()
runtime = trt.Runtime(trt.Logger(trt.Logger.WARNING))
engine = runtime.deserialize_cuda_engine(engine_data)

Error received:

[08/29/2023-17:54:00] [TRT] [E] 1: [runtime.cpp::parsePlan::314] Error Code 1: Serialization (Serialization assertion plan->header.magicTag == rt::kPLAN_MAGIC_TAG failed.)

Thank you in advance, any help will be appreciated!

The inference environment must have the same TensorRT version as the environment where the engine was built. I assume you are running your own script. You can run your script inside the docker where the engine was built, i.e.,
$ tao deploy yolo_v4_tiny run /bin/bash
$ python your.py

Or run the engine directly via the command below:
$ tao deploy yolo_v4_tiny inference xxx
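
If it helps, a quick sanity check is to print the TensorRT version inside each environment (a minimal sketch, assuming a Python environment with the tensorrt wheel installed):

import tensorrt as trt
# The engine can only be deserialized by the same TensorRT version
# that serialized it, so compare this output in both environments.
print(trt.__version__)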


Dear Morganh,
Thank you for your reply.
Yes, running tao deploy yolo_v4_tiny inference xxx works for me (I have not yet tried running under the same docker that was used to build the engine).
What prevents me from using the tao ... inference module is that I am already running my script inside an Isaac Sim container. Using tao deploy yolo_v4_tiny inference xxx would mean installing TAO inside the Isaac Sim container, which is probably not the best solution…
If I ensure that the same TensorRT version is used in the training and inference environments, do you think I could use scripts such as TRTInferencer to run the inference directly, without calling the tao model yolo_v4_tiny inference module? Thank you!

Yes, you can leverage the inference code. Just make sure the TensorRT version is the same.
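
For reference, below is a minimal standalone sketch of such an inference script (it is not the TAO TRTInferencer itself; it assumes pycuda is installed, a single input binding, and an input array already preprocessed to the engine's input shape and dtype — the names load_engine and infer are only for illustration):

import numpy as np
import pycuda.autoinit  # noqa: F401 - creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def load_engine(engine_path):
    # Deserialize with the same TensorRT version that built the engine
    with open(engine_path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        return runtime.deserialize_cuda_engine(f.read())

def infer(engine, input_array):
    context = engine.create_execution_context()
    bindings, outputs = [], []
    for i in range(engine.num_bindings):
        shape = engine.get_binding_shape(i)
        dtype = trt.nptype(engine.get_binding_dtype(i))
        # Allocate device memory for every binding
        device_mem = cuda.mem_alloc(trt.volume(shape) * np.dtype(dtype).itemsize)
        bindings.append(int(device_mem))
        if engine.binding_is_input(i):
            cuda.memcpy_htod(device_mem, np.ascontiguousarray(input_array.astype(dtype)))
        else:
            outputs.append((np.empty(shape, dtype=dtype), device_mem))
    context.execute_v2(bindings)
    results = []
    for host_arr, device_mem in outputs:
        cuda.memcpy_dtoh(host_arr, device_mem)
        results.append(host_arr)
    return results

# engine = load_engine("cardbox_detection_with_yolov4_tiny.engine")
# detections = infer(engine, preprocessed_image)  # NCHW array matching the input binding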


Dear Morganh,
Thank you for your reply.
Your solution worked. I would like to clarify the steps I have taken.
First, I installed the TensorRT version used in TAO 5, which is 8.5.3.1, as specified in the release notes.
It can be done with pip:
pip install tensorrt==8.5.3.1
Then, if you want everything to work out of the box, make sure your ONNX model does not have custom layers. For this, you can perform some operations on the ONNX graph; I performed the operation described in the following post. Without doing this I was getting the following error:

[09/06/2023-15:10:06] [TRT] [E] 1: [pluginV2Runner.cpp::load::300] Error Code 1: Serialization (Serialization assertion creator failed.Cannot deserialize plugin since corresponding IPluginCreator not found in Plugin Registry)

I guess it is also possible to register the custom layers with the REGISTER_TENSORRT_PLUGIN API, as described in this post, but I have not tried it yet. After obtaining a trimmed .onnx file, I could successfully deserialize the engine file with runtime.deserialize_cuda_engine(engine_data).
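
For completeness, I guess the plugin-registry route could look roughly like this on the Python side (untested on my side; this assumes the missing creator is one of the standard plugins shipped in libnvinfer_plugin, while a fully custom layer would still need the C++ REGISTER_TENSORRT_PLUGIN route):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Register the standard TensorRT plugins (e.g. the BatchedNMS plugin used by
# TAO detection models) so their IPluginCreator entries are in the registry
# before the engine is deserialized.
trt.init_libnvinfer_plugins(TRT_LOGGER, "")

engine_path = "tao-experiments/results/trt_gen/cardbox_detection_with_yolov4_tiny.engine"
with open(engine_path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())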

Thanks for the info. Glad to know it works.

