Model input/output names: no problem with TRT 7.1.3, but an error with TRT 8.0.1

Hi there.
I am running a pre-trained, pre-optimized, serialized model on a Jetson Nano in the .uff format. It is a TensorFlow-developed model that was converted from a .pb saved model.
The problem arises when I try to run the model: I do not know the names of the model's input and output. I improvised some names; TensorRT accepted the input name and rejected the output name, yet it fortunately still worked on the Jetson Nano developer kit at hand, which runs:
Python 3.6.9
TensorRT 7.1.3
CUDA 10.2
JetPack 4.5.1 (L4T 32.5.1)

In the end the model runs and gives me the right output, even with the wrong output name provided and despite a clear error about the wrong output name printed before the model runs.
The problem starts when I try to replicate this working setup on another Jetson Nano kit: TensorRT gives me the same error about not finding the model's output name (just like before), but this time it kills the program immediately.
The new Jetson Nano has:

Ubuntu 18.04.5 LTS
JetPack 4.6 (L4T 32.6.1)
TensorRT 8.0.1
CUDA 10.2
python 3.6.9

Thanks for your help in advance.

Hi,

Do you have the correct output name?
If yes, would you mind trying TensorRT 8.0 with the correct output name?

Thanks.

Hi. Well, the whole problem is the output name. I don't have access to the model's names and properties, because I'm deploying a model that was pre-trained and made TRT-compatible for the Jetson Nano by someone else. I haven't found a way to explore the model's characteristics via TensorRT's Python API, so I am stuck at this point!
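The best fallback I could come up with is to scan the raw bytes of the file: a .uff file is a protobuf container, so printable, TF-style identifiers can often be spotted in it. This is purely a guess-work sketch (the file name and the pattern are mine, and nothing guarantees every match is a real node name):

```python
import re

# Scan the raw .uff bytes for printable runs that look like TF node names.
with open("fire_detector.uff", "rb") as f:
    data = f.read()

for name in sorted(set(re.findall(rb"[A-Za-z0-9_/.\-]{4,}", data))):
    print(name.decode())
```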

I have just read about the deprecation of the UFF and Caffe parsers from TensorRT 8.0 onwards! I would appreciate any information on how this change between TRT versions relates to my problem, since the UFF format is exactly what I am using for my model.

Hi,

When running convert_to_uff.py, you can use the -t flag to get the layer names of the whole model.
Would you mind checking whether you can find the output layer with that tool?
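If the original frozen .pb is still available, the same listing can also be done from Python. This is only a sketch under that assumption, and list_nodes is assumed to mirror the converter's node-listing flag:

```python
import uff

# Assumes the original frozen TensorFlow graph (.pb) is at hand.
# list_nodes=True prints every node in the graph, so the real
# input/output names can be read off before conversion.
uff.from_tensorflow_frozen_model("frozen_model.pb",
                                 output_filename="model.uff",
                                 list_nodes=True)
```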

Thanks.

Hi AastaLLL, and thanks for your response.
If you take another look at my problem description, I mentioned that I am not converting anything into the .uff format. The model is ALREADY converted; in fact, that itself is the bottleneck. Otherwise I would have access to the model's properties via TensorFlow and could get the names, but that is not the case.

Hi,

So you only have the .uff file, and it works with TRT 7 but fails with TRT 8.
Is that correct?

If yes, would you mind sharing the file with us?
We need to run it locally before giving a further suggestion.

Thanks.

That is completely correct.
Here are the model's .uff file and the class-labels file.

class_labels.txt (10 Bytes)
fire_detector.uff (56.1 MB)

Hi,

Would you mind also sharing the command used for TensorRT 7.1 and 8.0 with us?

Thanks.

I am not using command-line tools to run TensorRT.
What I am doing to run the model is essentially the same as the samples provided under /usr/src/tensorrt/samples/python/introductory_parser_samples, in particular uff_resnet_50.py: I use TensorRT's Python API together with pycuda.
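For reference, the build step in that style of script looks roughly like the following. The input/output names and the input shape are placeholders (they are exactly the unknowns in my case), so treat this as a sketch of the sample's pattern, not my literal program:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(model_file):
    # Implicit-batch network, as required by the UFF parser.
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network() as network, \
         trt.UffParser() as parser:
        builder.max_workspace_size = 1 << 28   # 256 MiB; tune for the Nano
        parser.register_input("input", (3, 224, 224))  # guessed name and CHW shape
        parser.register_output("output_name")          # the guessed (wrong) name
        parser.parse(model_file, network)
        # build_cuda_engine is the TRT 7-era call; on TRT 8 the equivalent is
        # build_engine(network, config) with a builder config.
        return builder.build_cuda_engine(network)
```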

Hi,

So you use the same Python script for TRT 7.1 and TRT 8.0, is that correct?
We are checking this internally and will share more information with you later.

Thanks.

That's correct. Running the identical python3 program works with TRT 7.1 and does NOT work with TRT 8.0.
And to recap, the problem was the wrong output name: with TRT 7.1 the error is raised but the program goes on and runs the model, whereas with TRT 8.0 the error prevents the engine from being created.

Here is the answer.
When you build the model using builder.build_cuda_engine(network), you get access to the model's bindings, which can be reached by indexing the engine:
engine[0]: the first binding
engine[1]: the second binding
In my case, where I was looking for the model's output name, I could simply fetch it with engine[1], and the output-name error is gone.
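For completeness, here is a short sketch of how the bindings can be enumerated once an engine exists. The only assumption is an already-built (or deserialized) engine object; the calls themselves are from the TensorRT Python API:

```python
# Enumerate all bindings to recover the real tensor names instead of guessing.
for i in range(engine.num_bindings):
    kind = "input" if engine.binding_is_input(i) else "output"
    print(i, kind, engine[i], engine.get_binding_shape(i))

# The lookup also works in reverse: indexing with a name returns its index.
assert engine[engine[1]] == 1
```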

The documentation which helped me was:
https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/Core/Engine.html

The specific part which explained my problem:

The engine can be indexed with []. When indexed in this way with an integer, it will return the corresponding binding name. When indexed with a string, it will return the corresponding binding index.
