ActionRecognitionNet deployment error: Failed to parse ONNX model (IIfConditionalOutputLayer)

Hello,
I’ve recently trained an action recognition model in the TAO launcher kit sample notebook using custom data. I successfully exported the model to ONNX format using this command:

!tao model action_recognition export \
                   -e $SPECS_DIR/test_i3d_export.yaml \
                   -k $KEY \
                   results_dir=$RESULTS_DIR/rgb_3d_ptm \
                   export.checkpoint=$RESULTS_DIR/rgb_3d_ptm/train/rgb_i3d_128_modelv2.tlt \
                   export.onnx_file=$RESULTS_DIR/export/rgb_i3d_128_model.onnx
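
As a quick sanity check before building an engine, the exported ONNX can be inspected with the onnx Python package (a minimal sketch; the local filename is assumed from the command above):

import onnx

# Load and validate the exported model
model = onnx.load("rgb_i3d_128_model.onnx")
onnx.checker.check_model(model)

# Print each graph input and its dims; dynamic dims appear as named params
for inp in model.graph.input:
    dims = [d.dim_param or d.dim_value for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)  # expect input_rgb with a dynamic batch dimension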

I tried to generate an inference engine for the exported model using:

!tao deploy trtexec  --onnx=$RESULTS_DIR/export/rgb_i3d_128_model.onnx \
        --maxShapes=input_rgb:16x3x128x224x224 \
        --minShapes=input_rgb:1x3x128x224x224 \
        --optShapes=input_rgb:4x3x128x224x224 \
        --best \
        --saveEngine=$RESULTS_DIR/export/rgb_i3d_128_modelv2.engine \
        --verbose

but I got the following error while parsing the model layers:

[08/30/2023-09:30:56] [V] [TRT] Registering layer: /If_OutputLayer for ONNX node: /If_OutputLayer
[08/30/2023-09:30:56] [E] Error[4]: /If_OutputLayer: IIfConditionalOutputLayer inputs must have the same shape.

I checked the ONNX model in the Netron app and found that the “then” and “else” branches of the “If” node have different output shapes, which TensorRT does not currently support. I find this error quite strange, since I followed the Action Recognition documentation exactly and still hit it.
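
The same check can be scripted instead of clicking through Netron (a minimal sketch with the onnx package; filename assumed as above). It prints the output shapes recorded for both branches of every “If” node:

import onnx

model = onnx.load("rgb_i3d_128_model.onnx")
# Run shape inference so the branch outputs carry shape info where possible
model = onnx.shape_inference.infer_shapes(model)

for node in model.graph.node:
    if node.op_type == "If":
        for attr in node.attribute:  # the then_branch and else_branch subgraphs
            subgraph = onnx.helper.get_attribute_value(attr)
            for out in subgraph.output:
                dims = [d.dim_param or d.dim_value
                        for d in out.type.tensor_type.shape.dim]
                print(node.name, attr.name, out.name, dims)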
I tried to work around this error by running the model through onnxsim (sketched below), which simplified the “If” node:

[Screenshot from 2023-08-30 13-23-14]

However, the simplified model’s results differ from those of the .tlt version, so it is no longer reliable. Any suggestions on how I could solve this?
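
For reference, this is roughly the simplification step (a sketch using onnxsim’s Python API; the output filename is my choice):

import onnx
from onnxsim import simplify

model = onnx.load("rgb_i3d_128_model.onnx")
# onnxsim folds constants and can drop an If node when it can prove which
# branch is taken; `ok` reports whether the result passed its checker
model_simp, ok = simplify(model)
assert ok, "onnxsim could not validate the simplified model"
onnx.save(model_simp, "rgb_i3d_128_model_sim.onnx")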

• Hardware: RTX 3060, driver version 535.86
• Network Type: ActionRecognitionNet
• TLT Version: TAO 5.0.0-deploy
• TensorRT Version: 8.5.3 + CUDA 12

Could you please share test_i3d_export.yaml?

Here is the spec file you requested:
test_i3d_export.yaml (577 Bytes)
I’m using the i3d pretrained weights specified in this repo.

Do you mean you were training with the “i3d pretrained weights” as the pretrained model? Could you please share a link to these weights? Thanks.

Thank you for your answer. Yes, I used the “i3d kinetics pretrained model” found in the repo I mentioned.
Here is the spec file I used for training:
train_rgb_3d_128_i3d.yaml (856 Bytes).

To narrow down, please follow TRTEXEC with ActionRecognitionNet - NVIDIA Docs to generate an engine in fp16 mode and check whether it works.
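
For example, reusing the shapes from your earlier command with --fp16 in place of --best (the engine filename here is just illustrative):

!tao deploy trtexec --onnx=$RESULTS_DIR/export/rgb_i3d_128_model.onnx \
        --minShapes=input_rgb:1x3x128x224x224 \
        --optShapes=input_rgb:4x3x128x224x224 \
        --maxShapes=input_rgb:16x3x128x224x224 \
        --fp16 \
        --saveEngine=$RESULTS_DIR/export/rgb_i3d_128_model_fp16.engine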

I did try fp16 precision and still got the same error. The problem seems to be with the model parsing: TensorRT doesn’t support an “If” node whose branches have different shapes. So it is related to the model architecture, which differs from the original pretrained model used in the ActionRecognitionNet example.

We will check further. Thanks for the finding.

Based on https://github.com/NVIDIA/tao_pytorch_backend/blob/main/nvidia_tao_pytorch/cv/action_recognition/scripts/export.py, please try the new export.py below.
export.py (8.2 KB) .
You can log in to the docker container and then run the following commands.

$ docker run --runtime=nvidia -it nvcr.io/nvidia/tao/tao-toolkit:5.0.0-pyt /bin/bash
Then, inside the docker:
root@cd9a119e7e02:/opt/nvidia/tools# mv /usr/local/lib/python3.8/dist-packages/nvidia_tao_pytorch/cv/action_recognition/scripts/export.py /usr/local/lib/python3.8/dist-packages/nvidia_tao_pytorch/cv/action_recognition/scripts/export.py.bak
root@cd9a119e7e02:/opt/nvidia/tools# cp export.py /usr/local/lib/python3.8/dist-packages/nvidia_tao_pytorch/cv/action_recognition/scripts/export.py

root@cd9a119e7e02:/opt/nvidia/tools# action_recognition export xxx

Thank you for your reply @Morganh. I tried this solution and successfully generated the engine. It seems the approach I had taken was the right one.

The simplified model had nothing to do with the unreliable results; it was the preprocessing configuration in DeepStream that affected the output. I tried inferencing directly with the engine in a script and the results were similar to those of the .tlt version.
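
For reference, this is roughly how the simplified ONNX can be checked against the original (a sketch with onnxruntime; the filenames are assumptions, and the input name input_rgb comes from the trtexec command above):

import numpy as np
import onnxruntime as ort

# One random clip shaped like the minShapes input (NxCxTxHxW)
x = np.random.randn(1, 3, 128, 224, 224).astype(np.float32)

ref = ort.InferenceSession("rgb_i3d_128_model.onnx",
                           providers=["CPUExecutionProvider"]).run(None, {"input_rgb": x})
sim = ort.InferenceSession("rgb_i3d_128_model_sim.onnx",
                           providers=["CPUExecutionProvider"]).run(None, {"input_rgb": x})

# Largest absolute difference across all outputs; tiny values are normal
# float noise, large ones mean the graphs actually diverge
print(max(float(np.abs(r - s).max()) for r, s in zip(ref, sim)))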
Also, can you confirm that simplifying the ONNX model doesn’t affect its performance?
Thank you for your help.

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

The solution mentioned in my last comment addresses an export regression caused by a PyTorch upgrade: the squeeze operation in torch leads to an If node in the ONNX model, which TensorRT cannot parse correctly.
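
As a minimal illustration (a toy module, not the actual I3D code): when the exporter cannot prove the squeezed dimension equals 1 (for example, because it is marked dynamic), it guards the Squeeze with an If node, while shape-explicit indexing avoids the branch:

import io
import onnx
import torch

class WithSqueeze(torch.nn.Module):
    def forward(self, x):
        # squeeze(1) drops dim 1 only if it equals 1 at runtime, so the
        # exporter emits an If node when that dim is dynamic
        return x.squeeze(1)

class WithIndexing(torch.nn.Module):
    def forward(self, x):
        # selecting index 0 drops dim 1 unconditionally; no branch is emitted
        return x[:, 0]

def count_if_nodes(module):
    buf = io.BytesIO()
    torch.onnx.export(module, torch.randn(2, 1, 8), buf,
                      input_names=["inp"],
                      dynamic_axes={"inp": {1: "dim1"}})
    graph = onnx.load_from_string(buf.getvalue()).graph
    return sum(node.op_type == "If" for node in graph.node)

print(count_if_nodes(WithSqueeze()))   # typically 1 on recent torch versions
print(count_if_nodes(WithIndexing()))  # 0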
For performance, you can run trtexec to check the fps.
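
For example, with the engine generated above (shapes reused from the earlier command; trtexec reports throughput and latency in its summary):

!tao deploy trtexec --loadEngine=$RESULTS_DIR/export/rgb_i3d_128_modelv2.engine \
        --shapes=input_rgb:4x3x128x224x224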
