Errors while reading ONNX file produced by TAO 5

Hi! I have trained a yolov4_tiny model with the TAO 5 framework. With the tao model yolo_v4_tiny export command I exported the .hdf5 file obtained during training to an .onnx file. When I run inference with the resulting .onnx file on the same machine that was used for training, I get errors. The system configuration and the commands I used are given below:

• Hardware: AWS g4dn.2xlarge (1 NVIDIA T4 GPU, 8 CPUs, 32 GiB RAM)
• Network type: Yolo_v4_tiny
• Training spec file:
yolov4_tiny.txt (1.8 KB)

• How to reproduce the issue?

Exporting the .hdf5 file to .onnx:

tao model yolo_v4_tiny export -m /tao-experiments/results/yolov4_tiny_cardbox/weights/yolov4_resnet18_epoch_036.hdf5 -o /tao-experiments/models/cardbox_detection_with_yolov4.onnx -e /tao-experiments/specs/yolov4_tiny.txt -k $KEY --gen_ds_config --verbose

Code for the ONNX model inference:

import onnx
import onnxruntime

# Create a CPU-only ONNX Runtime session from the exported model
model_path = "tao-experiments/models/cardbox_detection_with_yolov4_tiny.onnx"
ort_session = onnxruntime.InferenceSession(model_path, None, providers=['CPUExecutionProvider'])

Error received:

File "test_engine.py", line 12, in <module>
    ort_session = onnxruntime.InferenceSession(model_path, None, providers=['CPUExecutionProvider'])
  File "/home/d.HUMENIUK/.local/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 383, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/d.HUMENIUK/.local/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 424, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from tao-experiments/models_28_08_23/cardbox_detection_with_yolov4.onnx failed:This is an invalid model. In Node, ("BatchedNMS_N", BatchedNMSDynamic_TRT, "", -1) : ("box": tensor(float),"cls": tensor(float),) -> ("BatchedNMS": tensor(int32),"BatchedNMS_1": tensor(float),"BatchedNMS_2": tensor(float),"BatchedNMS_3": tensor(float),) , Error No Op registered for BatchedNMSDynamic_TRT with domain_version of 12

Code for ONNX model verification:

import onnx
import onnxruntime
model_path = "tao-experiments/models_29_08_23/cardbox_detection_with_FasterRCNN.onnx"
onnx_model = onnx.load(model_path)
try:
    onnx.checker.check_model(onnx_model)
except onnx.checker.ValidationError as e:
    print("The model is invalid: %s" % e)
else:
    print("The model is valid!")

Error received:

The model is invalid: Field 'shape' of 'type' is required but missing.

I would be grateful for your help!

P.S. I have also tried training the Yolov4 model as well as the FasterRCNN, and I get the same error for the produced ONNX files (I used specification files dedicated to these models, different from the one I attached).

Please check if Netron can open the onnx files.
Then, please use tao deploy to generate a TensorRT engine from the onnx file and run inference with that engine.
Reference:
https://github.com/NVIDIA/tao_tutorials/blob/main/notebooks/tao_launcher_starter_kit/yolo_v4_tiny/yolo_v4_tiny.ipynb
https://github.com/NVIDIA/tao_deploy/blob/main/nvidia_tao_deploy/cv/yolo_v4/scripts/gen_trt_engine.py
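For reference, the engine-generation step from that notebook looks roughly like the command below. The flag names are taken from the notebook and gen_trt_engine.py linked above; please verify them with tao deploy yolo_v4_tiny gen_trt_engine --help and adjust paths, key and precision to your setup:

tao deploy yolo_v4_tiny gen_trt_engine \
    -m /tao-experiments/models/cardbox_detection_with_yolov4.onnx \
    -e /tao-experiments/specs/yolov4_tiny.txt \
    -k $KEY \
    --data_type fp16 \
    --engine_file /tao-experiments/models/cardbox_detection_with_yolov4.engine

The notebook then runs evaluation and inference on the generated engine via tao deploy as well.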

Dear Morganh,
Thank you very much for your reply.
Yes, Netron can successfully open the model.
I am attaching the exported “.png” file with the model architecture.
In my application, it would be very useful to run the inference without using tao model yolo_v4_tiny inference, as I want to run the inference in an Isaac Sim standalone app, and Isaac Sim is running in a container on the AWS instance. Do you think that would be possible, or is running the inference with the TensorRT engine the only way?
Thank you!

It is not the only way, but it is suggested to run inference with the TensorRT engine. May I know the doc link for how you “run the inference in the Isaac Sim standalone app”? Also, could you please share the .onnx files? Thanks.

Dear Morganh,
Thank you for your reply.
I am attaching the .onnx file.
cardbox_detection_with_yolov4_tiny.onnx (22.6 MB)
In my application, I want to read images from a camera and pass them to the model to run the inference. Reading the images from the camera in a standalone app is well described in the documentation: camera example.
It was also shown that it is possible to run the inference in Isaac Sim: inference in Isaac Sim.
What I would like to do is run the inference on the image obtained from a camera in Isaac Sim with the object detection model that I produced with TAO 5.

It seems like the ONNX model contains an unsupported layer that performs NMS (non-maximum suppression). This is a proprietary TensorRT op (BatchedNMSDynamic_TRT) that is not supported by a standard ONNX runtime framework such as ONNX Runtime.

I just got the same error. This raises several questions for me:

  1. Can I export the model without the included NMS layer? I can perform NMS or Soft-NMS myself.
  2. If not, how do you propose we run the ONNX model? If it cannot be run with ONNX Runtime, is there another supported runtime that works on non-NVIDIA hardware?

Best regards,
Ben

Please trim the exported onnx file into a new onnx file, then perform NMS yourself.

Thank you!

I tried to extract a sub-model using the Python ONNX utils as documented here. But as usual it’s not going to be easy (error that occurs).
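For reference, the extraction call I was attempting looks roughly like this; the tensor names are assumptions (box and cls appear in the error message above) and have to match the actual graph, and this is the route that produced the linked error for me:

import onnx.utils

# Tensor names ("Input", "box", "cls") are assumptions; verify them in Netron first.
onnx.utils.extract_model(
    "cardbox_detection_with_yolov4_tiny.onnx",      # exported TAO model
    "cardbox_detection_with_yolov4_tiny_cut.onnx",  # sub-model stopping before the NMS node
    input_names=["Input"],
    output_names=["box", "cls"],
)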

I also found documentation on how to insert NMS for another type of network into the ONNX graph. But again this is very model specific.

Before spending a day on this, I would like to know if there is another way. My second question: how would you run an exported ONNX model? As far as I can tell there is no runtime that can support it, because it does not comply with ONNX opset 15. Also, the NMS layer that is used is not supported by any opset.

Please use the method below to trim the onnx. For example, to trim https://forums.developer.nvidia.com/uploads/short-url/tc4tmv15H16NiFPOqOa1QNswBnM.onnx:

import onnx_graphsurgeon as gs
import numpy as np
import onnx

model = onnx.load("cardbox_detection_with_yolov4_tiny.onnx")
graph = gs.import_onnx(model)

tensors = graph.tensors()

# Redefine the graph inputs/outputs so the BatchedNMS node is no longer reachable:
# keep the original "Input" tensor and stop at the raw "box" and "cls" tensors.
graph.inputs = [tensors["Input"].to_variable(dtype=np.float32, shape=("N", 3, 512, 768))]
graph.outputs = [tensors["box"].to_variable(dtype=np.float32), tensors["cls"].to_variable(dtype=np.float32)]

# Drop the now-dangling NMS node and any unused tensors, then save the trimmed model.
graph.cleanup()

onnx.save(gs.export_onnx(graph), "cardbox_detection_with_yolov4_tiny_cut.onnx")


Then, there is no issue when running it with:

import onnx
import onnxruntime
#model_path = "./cardbox_detection_with_yolov4_tiny.onnx"
model_path = "./cardbox_detection_with_yolov4_tiny_cut.onnx"
ort_session = onnxruntime.InferenceSession(model_path, None, providers=['CPUExecutionProvider'])

For the BatchedNMS implementation, you can refer to https://github.com/NVIDIA/TensorRT/tree/23.08/plugin/batchedNMSPlugin
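If you want to stay entirely in Python instead of reimplementing the plugin, a plain greedy NMS over the trimmed model’s box/cls outputs is enough for many cases. Below is a minimal sketch. The assumed output layouts (box roughly [batch, num_boxes, 1, 4] in corner format, cls [batch, num_boxes, num_classes]), the thresholds and the dummy input are assumptions; check the real shapes in Netron and apply the same preprocessing used for training before trusting the results:

import numpy as np
import onnxruntime

def nms(boxes, scores, iou_thresh=0.5):
    # Greedy NMS over corner-format boxes [x1, y1, x2, y2]; returns indices of kept boxes.
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # Intersection of the current best box with all remaining boxes.
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter + 1e-9)
        order = rest[iou <= iou_thresh]
    return keep

ort_session = onnxruntime.InferenceSession(
    "./cardbox_detection_with_yolov4_tiny_cut.onnx", None, providers=['CPUExecutionProvider'])

# Dummy input only to exercise the graph; a real image must be preprocessed
# (resized to 768x512, CHW layout, normalized) exactly as during training.
dummy = np.random.rand(1, 3, 512, 768).astype(np.float32)
box_out, cls_out = ort_session.run(["box", "cls"], {"Input": dummy})

boxes = box_out[0].reshape(-1, 4)   # assumed [num_boxes, 4]
class_scores = cls_out[0]           # assumed [num_boxes, num_classes]
scores = class_scores.max(axis=1)
labels = class_scores.argmax(axis=1)

mask = scores > 0.3                 # confidence threshold, tune as needed
keep = nms(boxes[mask], scores[mask], iou_thresh=0.5)  # indices into the filtered arrays
print("kept detections:", len(keep))

Note that this is class-agnostic NMS on the best class per box; the BatchedNMS plugin performs per-class NMS, so results can differ slightly.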

Dear Morganh and bke,
Thank you very much for your inputs.
The last script you provided @Morganh works well and I could successfully open the trimmed model and observe its outputs. I did it on the AWS g4dn.2xlarge instance. Thank you very much for providing this answer! If you don’t have further questions @bke, I think we can mark this issue as solved. Thank you again!

@Morganh, highly appreciated, this allows us to proceed. @d.humeniuk, you can mark the post as solved.
