Hi,
We have confirmed that the ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8 model can be converted to a TensorRT engine on Jetson.
The detailed steps are below for your reference:
1. Environment
$ sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-tensorflow:r32.6.1-tf2.5-py3
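Note: a TensorRT engine only works with the TensorRT version it was built against, so it is worth noting the versions inside the container before you start. A quick optional check from python3 (assuming the image exposes both packages, which r32.6.1-tf2.5-py3 should):

import tensorflow as tf
import tensorrt as trt

# Engines are tied to the TensorRT version they were serialized with.
print("TensorFlow:", tf.__version__)
print("TensorRT:", trt.__version__)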
2. Install Prerequisites
$ apt-get update
$ apt-get install cmake g++ git libprotobuf-dev protobuf-compiler
$ pip3 install onnx tf2onnx pillow
$ git clone https://github.com/NVIDIA/TensorRT.git
$ cd TensorRT/tools/onnx-graphsurgeon/
$ make install
$ python3 -m pip install --force-reinstall dist/*.whl
$ wget https://nvidia.box.com/shared/static/jy7nqva7l88mq9i8bw3g3sklzf4kccn2.whl -O onnxruntime_gpu-1.10.0-cp36-cp36m-linux_aarch64.whl
$ pip3 install onnxruntime_gpu-1.10.0-cp36-cp36m-linux_aarch64.whl
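(Optional) A quick sanity check that the Python prerequisites import correctly, run with python3:

import onnx
import tf2onnx
import onnx_graphsurgeon as gs
import onnxruntime as ort

# Print the installed versions and confirm onnxruntime sees the GPU.
print("onnx:", onnx.__version__)
print("tf2onnx:", tf2onnx.__version__)
print("onnx-graphsurgeon:", gs.__version__)
print("onnxruntime:", ort.__version__, "device:", ort.get_device())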
3. Prepare Source
$ cd ../../samples/python/tensorflow_object_detection_api/
$ git clone https://github.com/tensorflow/models.git
$ cp -r models/research/object_detection .
$ protoc object_detection/protos/*.proto --python_out=.
Apply the following change to create_onnx.py:
diff --git a/samples/python/tensorflow_object_detection_api/create_onnx.py b/samples/python/tensorflow_object_detection_api/create_onnx.py
index b6ac423..7292756 100644
--- a/samples/python/tensorflow_object_detection_api/create_onnx.py
+++ b/samples/python/tensorflow_object_detection_api/create_onnx.py
@@ -254,8 +254,8 @@ class TFODGraphSurgeon:
concat_node.outputs = []
# Disconnect the last node in second preprocessing branch with parent second TensorListStack node.
- tile_node = self.graph.find_node_by_op("Tile")
- tile_node.outputs = []
+ #tile_node = self.graph.find_node_by_op("Tile")
+ #tile_node.outputs = []
# Reshape nodes tend to update the batch dimension to a fixed value of 1, they should use the batch size instead.
for node in [node for node in self.graph.nodes if node.op == "Reshape"]:
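The patch simply skips the step that disconnects a Tile node from the second preprocessing branch, since that step does not apply to this model's graph. If you prefer to keep the script generic instead of commenting the lines out, an untested alternative sketch (assuming find_node_by_op() returns None when no matching node exists) is to guard the call:

# Untested alternative for create_onnx.py: only disconnect the Tile node
# if one is actually present in this model's preprocessing branch.
tile_node = self.graph.find_node_by_op("Tile")
if tile_node is not None:
    tile_node.outputs = []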
4. Convert Model
$ wget http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz
$ tar -xvf ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz
$ python3 create_onnx.py --pipeline_config ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8/pipeline.config --saved_model ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8/saved_model --onnx model.onnx --input_format NCHW
$ python3 build_engine.py --onnx model.onnx --engine engine.trt --precision fp16
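Once create_onnx.py has produced model.onnx, you can optionally inspect it with onnx and onnx-graphsurgeon (installed above) to confirm the export looks right, e.g. a single 1x3x320x320 NCHW input and the expected detection outputs. A minimal sketch:

import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("model.onnx"))

# The exported graph should take one NCHW image tensor and return detection outputs.
for inp in graph.inputs:
    print("input :", inp.name, inp.shape, inp.dtype)
for out in graph.outputs:
    print("output:", out.name, out.shape, out.dtype)

# Count the ops in the graph as a quick structural summary.
ops = {}
for node in graph.nodes:
    ops[node.op] = ops.get(node.op, 0) + 1
print(sorted(ops.items()))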
5. Test with TensorRT
$ /usr/src/tensorrt/bin/trtexec --loadEngine=engine.trt
...
[12/29/2021-08:40:07] [I] === Performance summary ===
[12/29/2021-08:40:07] [I] Throughput: 146.935 qps
[12/29/2021-08:40:07] [I] Latency: min = 6.73804 ms, max = 8.90845 ms, mean = 6.79419 ms, median = 6.77783 ms, percentile(99%) = 7.34839 ms
[12/29/2021-08:40:07] [I] End-to-End Host Latency: min = 6.7522 ms, max = 8.92114 ms, mean = 6.80563 ms, median = 6.78809 ms, percentile(99%) = 7.36255 ms
[12/29/2021-08:40:07] [I] Enqueue Time: min = 4.5791 ms, max = 8.78735 ms, mean = 5.32989 ms, median = 5.07129 ms, percentile(99%) = 6.9624 ms
[12/29/2021-08:40:07] [I] H2D Latency: min = 0.0742188 ms, max = 0.0761719 ms, mean = 0.0748875 ms, median = 0.0749512 ms, percentile(99%) = 0.0759277 ms
[12/29/2021-08:40:07] [I] GPU Compute Time: min = 6.65723 ms, max = 8.8291 ms, mean = 6.71306 ms, median = 6.69604 ms, percentile(99%) = 7.26758 ms
[12/29/2021-08:40:07] [I] D2H Latency: min = 0.00512695 ms, max = 0.00732422 ms, mean = 0.00624302 ms, median = 0.00634766 ms, percentile(99%) = 0.00708008 ms
[12/29/2021-08:40:07] [I] Total Host Walltime: 1.09572 s
[12/29/2021-08:40:07] [I] Total GPU Compute Time: 1.0808 s
[12/29/2021-08:40:07] [I] Explanations of the performance metrics are printed in the verbose logs.
[12/29/2021-08:40:07] [I]
&&&& PASSED TensorRT.trtexec [TensorRT v8001] # /usr/src/tensorrt/bin/trtexec --loadEngine=engine.trt
[12/29/2021-08:40:07] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1377, GPU 16207 (MiB)
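For completeness, below is a rough sketch of running the serialized engine from Python with the TensorRT bindings and PyCUDA (pycuda is assumed to be installed separately, test.jpg is a placeholder image path, and the engine is assumed to use static shapes with batch size 1 as built above). Treat it as a starting point rather than a drop-in script:

import numpy as np
import tensorrt as trt
import pycuda.autoinit  # creates a CUDA context for this process
import pycuda.driver as cuda
from PIL import Image

logger = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(logger, "")  # register the NMS plugin compiled into the engine

with open("engine.trt", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Allocate one pagelocked host buffer and one device buffer per binding.
host_bufs, dev_bufs, bindings = [], [], []
for i in range(engine.num_bindings):
    shape = engine.get_binding_shape(i)
    dtype = trt.nptype(engine.get_binding_dtype(i))
    h = cuda.pagelocked_empty(trt.volume(shape), dtype)
    d = cuda.mem_alloc(h.nbytes)
    host_bufs.append(h)
    dev_bufs.append(d)
    bindings.append(int(d))
    kind = "input" if engine.binding_is_input(i) else "output"
    print(kind, engine.get_binding_name(i), tuple(shape), dtype)

# Fill the input binding with a 320x320 RGB image in NCHW layout.
# (Assumes the exported graph embeds its own preprocessing; adjust if yours does not.)
in_idx = next(i for i in range(engine.num_bindings) if engine.binding_is_input(i))
img = Image.open("test.jpg").convert("RGB").resize((320, 320))
x = np.asarray(img).transpose(2, 0, 1)[None]  # 1x3x320x320
np.copyto(host_bufs[in_idx], x.ravel().astype(host_bufs[in_idx].dtype))

# Copy the input to the GPU, run inference synchronously, copy the outputs back.
cuda.memcpy_htod(dev_bufs[in_idx], host_bufs[in_idx])
context.execute_v2(bindings)
for i in range(engine.num_bindings):
    if not engine.binding_is_input(i):
        cuda.memcpy_dtoh(host_bufs[i], dev_bufs[i])
        print(engine.get_binding_name(i), host_bufs[i][:8])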
Thanks.