Hi! I trained a yolov4_tiny model with the TAO 5 framework and used the tao model yolo_v4_tiny export command to export the .hdf5 file produced during training to an .onnx file. When I run inference with the resulting .onnx file on the same machine that was used for training, I get errors. The system configuration and the commands I used are listed below:
• Hardware AWS g4dn.2xlarge (1 NVIDIA T4 GPU, 8 CPUs, 32 GiB RAM)
• Network Type Yolo_v4_tiny
• Training spec file
yolov4_tiny.txt (1.8 KB)
• How to reproduce the issue?
Exporting the .hdf5 file to .onnx:
tao model yolo_v4_tiny export -m /tao-experiments/results/yolov4_tiny_cardbox/weights/yolov4_resnet18_epoch_036.hdf5 -o /tao-experiments/models/cardbox_detection_with_yolov4.onnx -e /tao-experiments/specs/yolov4_tiny.txt -k $KEY --gen_ds_config --verbose
Code for the ONNX model inference:
import onnx
import onnxruntime
model_path = "tao-experiments/models/cardbox_detection_with_yolov4_tiny.onnx"
ort_session = onnxruntime.InferenceSession(model_path, None, providers=['CPUExecutionProvider'])
Error received:
File "test_engine.py", line 12, in <module>
ort_session = onnxruntime.InferenceSession(model_path, None, providers=['CPUExecutionProvider'])
File "/home/d.HUMENIUK/.local/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 383, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/home/d.HUMENIUK/.local/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 424, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from tao-experiments/models_28_08_23/cardbox_detection_with_yolov4.onnx failed:This is an invalid model. In Node, ("BatchedNMS_N", BatchedNMSDynamic_TRT, "", -1) : ("box": tensor(float),"cls": tensor(float),) -> ("BatchedNMS": tensor(int32),"BatchedNMS_1": tensor(float),"BatchedNMS_2": tensor(float),"BatchedNMS_3": tensor(float),) , Error No Op registered for BatchedNMSDynamic_TRT with domain_version of 12
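The error names a single node type, BatchedNMSDynamic_TRT, and the _TRT suffix suggests a TensorRT plugin node rather than a standard ONNX operator. As a rough diagnostic, the sketch below lists every operator type in the exported graph that has no registered ONNX schema. The helper names unregistered_ops and report are hypothetical, and the sketch assumes that onnx.defs.has(op_type, domain) is an adequate stand-in for what ONNX Runtime can load, which may not hold exactly.

```python
from collections import Counter


def unregistered_ops(graph, has_schema):
    """Count nodes whose (op_type, domain) is rejected by has_schema.

    graph      -- any object with a .node sequence (e.g. an onnx GraphProto)
    has_schema -- callable (op_type, domain) -> bool, e.g. onnx.defs.has
    """
    missing = Counter()
    for node in graph.node:
        if not has_schema(node.op_type, node.domain):
            missing[(node.op_type, node.domain)] += 1
    return missing


def report(model_path):
    """Print operator types in the model that have no registered ONNX schema."""
    import onnx  # imported lazily so unregistered_ops stays dependency-free

    model = onnx.load(model_path)
    found = unregistered_ops(model.graph, onnx.defs.has)
    for (op_type, domain), count in found.items():
        print(f"{op_type} (domain={domain!r}): {count} node(s)")
    return found
```

Calling report(model_path) with the path from the inference script above would show whether BatchedNMSDynamic_TRT is the only node of this kind or whether other non-standard operators are present as well.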
Code for ONNX model verification:
import onnx
import onnxruntime
model_path = "tao-experiments/models_29_08_23/cardbox_detection_with_FasterRCNN.onnx"
onnx_model = onnx.load(model_path)
try:
    onnx.checker.check_model(onnx_model)
except onnx.checker.ValidationError as e:
    print("The model is invalid: %s" % e)
else:
    print("The model is valid!")
Error received:
The model is invalid: Field 'shape' of 'type' is required but missing.
I would be grateful for your help!
P.S. I have also tried training a Yolov4 model and an RCNN model, and I get the same error for the ONNX files they produce (I used specification files dedicated to these models, different from the one I attached).