I took MobileNet-SSD v2 from the TensorFlow Model Zoo and trained it on a custom dataset, changing the input size to 512x288.
I tried every strategy I know to convert it to TensorRT, but none of them succeeded.
First, I tried frozen graph (.pb file) → ONNX → TensorRT. After converting the model to ONNX with the tf2onnx tool, parsing the ONNX file failed with “Unsupported ONNX data type: UINT8 (2)”.
I searched the forums and found that many users had the same problem, so I tried one of the proposed solutions, which uses graphsurgeon to modify the input type, as described here: Unsupported ONNX data type: UINT8 (2) · Issue #400 · onnx/onnx-tensorrt · GitHub.
After changing the input type to float32, parsing the modified ONNX file gives the error: “In function importUpsample: Assertion failed: scales_input.is_weights()”.
I also tried simplifying the ONNX model with onnx_simplifier, and parsing it and building the engine with trtexec, but the result is always the same.
The second strategy I tried is frozen graph (.pb file) → UFF → TensorRT. For this, I followed the example in GitHub - AastaNV/TRT_object_detection: Python sample for referencing object detection model with TensorRT. I was able to reproduce it with the square input (300x300), but my network has a rectangular input (512x288). I therefore used the GridAnchorRect_TRT plugin available in TensorRT OSS and was able to run my network, but the predicted bounding boxes are wrong and shifted.
Finally, I tried TF-TRT, following Accelerating Inference In TF-TRT User Guide :: NVIDIA Deep Learning Frameworks Documentation. As the documentation reports, there is a simple script that displays which nodes were excluded from the engine: if there are any nodes listed besides the input placeholders, TensorRT engine, and output identity nodes, your engine does not include the entire model. In my case, many nodes are excluded, so I can’t serialize the engine (as described in the documentation) and then deserialize it for use in pure TensorRT.
TensorRT Version: 7.2.1 (tried also with 7.2.2)
GPU Type: RTX 2070
CUDA Version: 10.2
CUDNN Version: 8
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable): 1.14
The script described in Unsupported ONNX data type: UINT8 (2) · Issue #400 · onnx/onnx-tensorrt · GitHub in order to modify input type using graphsurgeon is:
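import onnx
import onnx_graphsurgeon as gs
import numpy as np
# Load the ONNX model into a graphsurgeon graph
graph = gs.import_onnx(onnx.load("model.onnx"))
# Set every graph input to float32 (the parser rejects UINT8 inputs)
for inp in graph.inputs:
    inp.dtype = np.float32
# Write the modified model to a new file (the output name is arbitrary)
onnx.save(gs.export_onnx(graph), "updated_model.onnx")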
Hi, please share the model and script so that we can try to reproduce the issue on our end.
We also recommend checking the sample links below, as they might address your concern.
Since our network is private, I can’t share the model with you, but it is the same as the one in the TensorFlow 1 model zoo (except for the rectangular input shape), so the link to download the network is:
Note: The following methods all start from the frozen graph of the TensorFlow model.
Below are all the steps and scripts I used for the three methods:
In order to obtain ONNX file using tf2onnx tool, the command is:
python3 -m tf2onnx.convert --input model_name.pb --output output.onnx --inputs input_layer_name:0 --outputs output_layer_name1:0,output_layer_name2:0
For simplicity, I report the trtexec command used to build the engine and serialize it to a file (invoking the API functions directly with parser->parseFromFile gives the same result). The command is:
./trtexec --onnx=/home/aitech/tensor_rt/crowd.onnx --shapes=1X288X512X3 --saveEngine=/home/aitech/tensor_rt/trtexec.plan
(where shapes is in NHWC format; the value above is the one for my custom network)
This gives the error: Unsupported ONNX data type: UINT8 (2).
To get rid of this error, I modified the input type with graphsurgeon (code written above).
After that modification, the error is:
In function importUpsample:  Assertion failed: scales_input.is_weights()
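The assertion appears to mean that the scales input of an Upsample/Resize node is computed at runtime instead of being stored as a constant. A minimal sketch for locating such nodes with onnx-graphsurgeon follows. The helper names `find_dynamic_resize_inputs` and `report` are hypothetical, and the sketch assumes onnx and onnx_graphsurgeon are installed.

```python
RESIZE_OPS = {"Upsample", "Resize"}


def find_dynamic_resize_inputs(nodes, is_constant):
    """Return (node name, op, [non-constant input names]) for resize-type nodes.

    The first input is the data tensor, so only the remaining inputs
    (scales, plus roi/sizes in newer Resize opsets) are checked.
    """
    findings = []
    for node in nodes:
        if node.op not in RESIZE_OPS:
            continue
        # Skip empty optional inputs (empty name) and constant tensors.
        dynamic = [t.name for t in node.inputs[1:] if t.name and not is_constant(t)]
        if dynamic:
            findings.append((node.name, node.op, dynamic))
    return findings


def report(onnx_path):
    """Print resize-type nodes whose extra inputs are not constants."""
    import onnx
    import onnx_graphsurgeon as gs

    def is_constant(tensor):
        # gs.Constant covers initializers; a tensor produced by a
        # Constant node is treated as a fixed value as well.
        return isinstance(tensor, gs.Constant) or any(
            producer.op == "Constant" for producer in tensor.inputs
        )

    graph = gs.import_onnx(onnx.load(onnx_path))
    for name, op, inputs in find_dynamic_resize_inputs(graph.nodes, is_constant):
        print(f"{op} node {name!r}: non-constant inputs {inputs}")
```

Calling `report("model.onnx")` lists the affected nodes, if any. If nodes are listed, constant folding (for example `graph.fold_constants()` in onnx-graphsurgeon, which needs onnxruntime) may make the scales static, but that is not verified for this model.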
Following GitHub - AastaNV/TRT_object_detection: Python sample for referencing object detection model with TensorRT, the scripts I wrote for my rectangular network are:
main.py (3.8 KB) config2.py (2.7 KB)
python3 main.py --img_path path_of_image
As I said before, the bounding boxes are wrong and shifted.
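With a rectangular input, one thing that may be worth ruling out is the conversion of the normalized outputs to pixels. This is a sanity check, not a diagnosis of the attached scripts. The sketch below assumes 7 floats per detection (image id, label, confidence, xmin, ymin, xmax, ymax) with coordinates normalized to [0, 1]. The function name `detection_to_pixels` is hypothetical.

```python
def detection_to_pixels(det, img_w, img_h):
    """Convert one detection to pixel coordinates.

    Assumes det = (image_id, label, confidence, xmin, ymin, xmax, ymax)
    with box coordinates normalized to [0, 1].
    """
    _, label, conf, xmin, ymin, xmax, ymax = det
    return {
        "label": int(label),
        "confidence": float(conf),
        # x values are scaled by the width, y values by the height;
        # mixing the two only goes unnoticed on square images.
        "box": (
            round(xmin * img_w),
            round(ymin * img_h),
            round(xmax * img_w),
            round(ymax * img_h),
        ),
    }
```

For example, `detection_to_pixels(det, 512, 288)` maps a box onto a 512x288 image. To draw on the original image, pass its width and height instead.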
I also checked the link you reported. Since the examples are based on a SavedModel file, I wrote the following script, which loads a frozen graph model instead. However, when I try to serialize the optimized model to create the .plan file, adding the lines described here Accelerating Inference In TF-TRT User Guide :: NVIDIA Deep Learning Frameworks Documentation, a lot of nodes are excluded.
Exclude Node: …
tf_trt_prova.py (1.7 KB)
Note: I successfully converted and ran TensorFlow MobileNetV2 classification models in TensorRT via UFF.
Both errors you got with the ONNX method are caused by operations that TensorRT does not support. We don’t currently support the UFF method.
The documentation is correct: “If there are any nodes listed besides the input placeholders, TensorRT engine, and output identity nodes, your engine does not include the entire model”. This means that TF-TRT is not a good way to convert the model. A better converter (the TF2 version of TF-TRT from our latest container or tf-nightly) might have a better conversion rate, but there is no guarantee.
You can run the conversion in the following way:
TF_CPP_VMODULE=convert_graph=2,segment=2 python convert_script.py 2>&1 | grep 'Not a TF-TRT candidate'
This will print the list of excluded nodes together with the reason why they are excluded. Based on that output we can tell whether there is any hope to get the complete model converted by TF-TRT.
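When the list is long, grouping the excluded nodes by op type and reason can make it easier to read. The sketch below assumes each line contains fields of the form `(Op type: ...)` and `(Reason: ...)`. That layout is an assumption about the log format and may differ between TensorFlow versions. The function name `summarize_exclusions` is hypothetical.

```python
import re
from collections import Counter

# Assumed layout: "... (Op type: X), (Op name: Y), (Reason: Z)"
PATTERN = re.compile(r"\(Op type: (?P<op>[^)]*)\).*?\(Reason: (?P<reason>.*)\)\s*$")


def summarize_exclusions(lines):
    """Count excluded nodes per (op type, reason) pair.

    Lines that do not match the assumed layout are ignored.
    """
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group("op"), match.group("reason"))] += 1
    return counts
```

One way to use it is to redirect the grep output to a file and run `summarize_exclusions(open("excluded.txt")).most_common()`, which lists the most frequent exclusion causes first.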