Nvinfer input format issue

I am currently using the nvinfer plugin to perform classification on an image and ran into a weird error regarding the input format.

With model-color-format=0 (RGB):

0:00:04.980314545 20682     0x35f94b90 ERROR                nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<dr> NvDsInferContext[UID 418]: Error in NvDsInferContextImpl::preparePreprocess() <nvdsinfer_context_impl.cpp:875> [UID = 418]: RGB/BGR input format specified but network input channels is not 3

Changing it to model-color-format=2 (GRAY):

0:00:05.401674841 21336      0x684f190 ERROR                nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<dr> NvDsInferContext[UID 418]: Error in NvDsInferContextImpl::preparePreprocess() <nvdsinfer_context_impl.cpp:884> [UID = 418]: GRAY input format specified but network input channels is not 1.

The model input is as follows:

Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 299, 299, 3) 0                                            
__________________________________________________________________________________________________

Environment

TensorRT Version: 7.1.3.0
JetPack Version: 4.5
CUDA Version: 10.2.89
cuDNN Version: 8.0.0.180

Based on this thread, the ONNX model has to be converted to NCHW instead of NHWC, so I ran:

python3 -m tf2onnx.convert --input ./dr_model.h5 --inputs input_1:0[1,299,299,3] --inputs-as-nchw input_1:0 --outputs sequential_1/dense_2/sigmoid:0 --opset 13 --fold_const --output dr_test.onnx

This ran into the following error:

AssertionError: sequential_1/dense_2/sigmoid is not in graph

The last layer of the model, as shown by Netron and model.summary(), is the following:

mixed10 (Concatenate)           (None, 8, 8, 2048)   0           activation_86[0][0]              
                                                                 mixed9_1[0][0]                   
                                                                 concatenate_2[0][0]              
                                                                 activation_94[0][0]              
__________________________________________________________________________________________________
sequential (Sequential)         (None, 1)            2099201     mixed10[0][0]

Not exactly sure what the --outputs value should be here.
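One way to check might be to load the model in Keras and print its output names (a quick sketch, not verified on this particular model; the printed names may still differ slightly from the graph node names tf2onnx expects for --outputs):

# Sketch: list the output names of the Keras model
# (assumes the same dr_model.h5 as in the tf2onnx command above)
from tensorflow.keras.models import load_model

model = load_model('dr_model.h5')
print(model.output_names)               # output layer names
print([t.name for t in model.outputs])  # output tensor names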

Hi,

You can visualize the ONNX model on the page below.
It can help you find the correct output node name:

https://netron.app/

Thanks.

I managed to convert the ONNX model by saving the Keras model in SavedModel format first, before using tf2onnx.

0:00:05.620264989 22453     0x18a67390 ERROR                nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<dr> NvDsInferContext[UID 418]: Error in NvDsInferContextImpl::preparePreprocess() <nvdsinfer_context_impl.cpp:875> [UID = 418]: RGB/BGR input format specified but network input channels is not 3
ERROR: Infer Context prepare preprocessing resource failed., nvinfer error:NVDSINFER_CONFIG_FAILED

However, the input format issue is still there, even after using --inputs-as-nchw during conversion.
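For reference, the intermediate step was roughly the following (a sketch; the dr_model.h5 file name and the dr SavedModel directory are taken from the commands elsewhere in this thread, not from the exact script used):

# Sketch: export the Keras .h5 model to SavedModel format, then convert with tf2onnx
from tensorflow.keras.models import load_model

model = load_model('dr_model.h5')
model.save('dr')   # writes a SavedModel directory named "dr"

# followed by, on the command line:
#   python3 -m tf2onnx.convert --saved-model dr --inputs input_1:0[1,299,299,3] \
#       --inputs-as-nchw input_1:0 --opset 13 --fold_const --output dr_test.onnx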

Hi,

Since the DeepStream SDK uses TensorRT as its inference backend, could you try converting the ONNX model with trtexec first?

$ /usr/src/tensorrt/bin/trtexec --onnx=[your/model]

If the issue still occurs, could you share the ONNX file with us?
Thanks.

@AastaLLL

Thanks for your help. I’m working with @kurkur14 on this.

Running the suggested trtexec command gives:

----------------------------------------------------------------
Input filename:   ./model_name_here.onnx
ONNX IR version:  0.0.7
Opset version:    13
Producer name:    tf2onnx
Producer version: 1.8.5
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
[07/06/2021-15:20:20] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
terminate called after throwing an instance of 'std::out_of_range'
  what():  Attribute not found: axes
Aborted (core dumped)

Netron shows that the input shape of that .onnx is 1x3x299x299.

nvinfer fails with

ERROR: ModelImporter.cpp:472 In function importModel:
[4] Assertion failed: !_importer_ctx.network()->hasImplicitBatchDimension() && "This version of the ONNX parser only supports TensorRT INetworkDefinitions with an explicit batch dimension. Please ensure the network was created using the EXPLICIT_BATCH NetworkDefinitionCreationFlag."
ERROR: Failed to parse onnx file
ERROR: failed to build network since parsing model errors.
ERROR: failed to build network.

I will ask whether sharing the model is possible.

Hi,

Could you check if adding the --explicitBatch flag helps?
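For example:

$ /usr/src/tensorrt/bin/trtexec --onnx=[your/model] --explicitBatch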
If not, it would help if we could check the model directly.

Thanks.

Thanks, @AastaLLL

I’ve been doing devops stuff recently so haven’t had much time, but I will try this and get back to you. I have permission to share one of the models if need be.

Hi,

Have you tried this?

Thanks.

Not quite yet. I’m tasked with something else at the moment, but I will get back to it soon, hopefully. Sorry for holding the issue open. I appreciate the help in any case.

Thanks! I tried an older version of the .onnx, converted with keras2onnx, and with this flag trtexec seems satisfied.

----------------------------------------------------------------
Input filename:   qa_edge_latest.onnx
ONNX IR version:  0.0.7
Opset version:    12
Producer name:    keras2onnx
Producer version: 1.7.0
Domain:           onnxmltools
Model version:    0
Doc string:       
----------------------------------------------------------------
[08/19/2021-13:27:10] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[08/19/2021-13:27:10] [W] Dynamic dimensions required for input: input_1, but no shapes were provided. Automatically overriding shape to: 1x299x299x3
[08/19/2021-13:27:34] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[08/19/2021-13:29:00] [I] [TRT] Detected 1 inputs and 1 output network tensors.
[08/19/2021-13:29:03] [I] Starting inference threads
[08/19/2021-13:29:07] [I] Warmup completed 0 queries over 200 ms
[08/19/2021-13:29:07] [I] Timing trace has 0 queries over 3.11609 s
[08/19/2021-13:29:07] [I] Trace averages of 10 runs:
[08/19/2021-13:29:07] [I] Average on 10 runs - GPU latency: 122.669 ms - Host latency: 122.833 ms (end to end 122.853 ms, enqueue 54.9077 ms)
[08/19/2021-13:29:07] [I] Average on 10 runs - GPU latency: 81.9146 ms - Host latency: 82.0219 ms (end to end 82.0319 ms, enqueue 11.799 ms)
[08/19/2021-13:29:07] [I] Average on 10 runs - GPU latency: 81.9658 ms - Host latency: 82.0742 ms (end to end 82.0847 ms, enqueue 10.4944 ms)
[08/19/2021-13:29:07] [I] Host Latency
[08/19/2021-13:29:07] [I] min: 81.8126 ms (end to end 81.8221 ms)
[08/19/2021-13:29:07] [I] max: 457.511 ms (end to end 457.617 ms)
[08/19/2021-13:29:07] [I] mean: 94.4128 ms (end to end 94.4261 ms)
[08/19/2021-13:29:07] [I] median: 82.0251 ms (end to end 82.0359 ms)
[08/19/2021-13:29:07] [I] percentile: 457.511 ms at 99% (end to end 457.617 ms at 99%)
[08/19/2021-13:29:07] [I] throughput: 0 qps
[08/19/2021-13:29:07] [I] walltime: 3.11609 s
[08/19/2021-13:29:07] [I] Enqueue Time
[08/19/2021-13:29:07] [I] min: 4.32751 ms
[08/19/2021-13:29:07] [I] max: 457.194 ms
[08/19/2021-13:29:07] [I] median: 11.614 ms
[08/19/2021-13:29:07] [I] GPU Compute
[08/19/2021-13:29:07] [I] min: 81.7048 ms
[08/19/2021-13:29:07] [I] max: 456.928 ms
[08/19/2021-13:29:07] [I] mean: 94.2882 ms
[08/19/2021-13:29:07] [I] median: 81.9185 ms
[08/19/2021-13:29:07] [I] percentile: 456.928 ms at 99%
[08/19/2021-13:29:07] [I] total compute time: 3.11151 s
&&&& PASSED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --explicitBatch --onnx=qa_edge_latest.onnx

I will try with DeepStream next. Is there an equivalent option for nvinfer?

OK, it seems I’m in the same situation my coworker was in. The model in the post directly above was converted using keras2onnx and is in NHWC format. DeepStream did not like that, so I converted to .onnx with tf2onnx, using more or less the same incantation as my coworker:

python3 -m tf2onnx.convert --saved-model dr --inputs input_1:0[1,299,299,3] --inputs-as-nchw input_1:0 --opset 13 --fold_const --output dr_test.onnx

When I try to trtexec that model, which according to Netron has an input shape of 1x3x299x299, I get:

...
----------------------------------------------------------------
Input filename:   qa_edge_latest.onnx
ONNX IR version:  0.0.7
Opset version:    13
Producer name:    tf2onnx
Producer version: 1.9.1
Domain:
Model version:    0
Doc string:
----------------------------------------------------------------
[08/19/2021-16:31:33] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
terminate called after throwing an instance of 'std::out_of_range'
  what():  Attribute not found: axes
Aborted

In this case, --explicitBatch does not help.
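One difference I can see between the passing keras2onnx model and this one is the opset (12 vs. 13), so pinning the tf2onnx conversion to a lower opset might be worth a try; that is only a guess on my part, not something verified here:

python3 -m tf2onnx.convert --saved-model dr --inputs input_1:0[1,299,299,3] --inputs-as-nchw input_1:0 --opset 12 --fold_const --output dr_test.onnx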

I am authorized to share one of the models for the purpose of resolving this. What’s the best way to do that?

Hi,

Could you share the original Keras model as well as the NCHW and NHWC ONNX models?
You can attach the files directly to this topic or share a Drive link with us.

If the public release is a concern, you can share it via private message.

Thanks.

I have successfully converted a Keras model to a TensorRT engine and executed it using the autonomous driving platform DonkeyCar.
It may be helpful, so I will write up the how-to.

【Convert h5 to engine】
The points:

  • Use tf2onnx
  • Check the inputs of the ONNX model using Netron
  • Use the --shapes option with trtexec
#https://github.com/onnx/tensorflow-onnx
#pip install -U tf2onnx
#python convert_h5_to_engine.py
########################################
# h5 to onnx
########################################
import onnx
import tf2onnx.convert
from tensorflow.keras.models import load_model

# load the trained Keras model
model = load_model('linear.h5')

# convert the in-memory Keras model to ONNX (the NHWC input layout is kept)
onnx_model, _ = tf2onnx.convert.from_keras(model)
onnx.save(onnx_model, 'linear.onnx')


########################################
# onnx to engine (sh command)
########################################
import os

# build a TensorRT engine with trtexec; --shapes pins the dynamic input
# "img_in" (name checked with Netron) to a fixed 1x120x160x3 NHWC shape
command = '/usr/src/tensorrt/bin/trtexec --onnx=linear.onnx --saveEngine=linear.engine --fp16 --shapes=img_in:1x120x160x3 --explicitBatch'
os.system(command)
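The --shapes value spells out the model's NHWC layout explicitly (img_in:1x120x160x3), since the conversion above keeps the Keras NHWC input; that is also why the preprocessing in the next section can stay in HWC order without a transpose.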


【Inference】
The points:

  • I don’t need to convert from HWC to CHW. Use HWC format.
    def preprocess(self, image):
        # image: rgb image

        # RGB convert from [0, 255] to [0.0, 1.0]
        x = image.astype(np.float32) / 255.0
        # HWC to CHW format
        #x = x.transpose((2, 0, 1))  # keras -> ONNX -> TRT8: no HWC-to-CHW transpose needed, the model input uses HWC
        # Flatten it to a 1D array.
        x = x.reshape(-1)
        #x = x.ravel()
        return x



【DEMO】

Requirements:
  • Jetson (AGX Xavier, Nano 4GB, or 2GB)
  • JetPack 4.6
  • PC (Ubuntu, Mac, or Windows)

【PC】
Install and launch the DonkeyCar simulator.
Download the Unity simulator:

Ubuntu 16.04-20.04: DonkeySimLinux.zip
Mac: DonkeySimMac.zip
Windows: DonkeySimWin.zip

Ubuntu example:

wget https://github.com/tawnkramer/gym-donkeycar/releases/download/v21.07.24/DonkeySimLinux.zip
unzip DonkeySimLinux.zip
cd DonkeySimLinux
chmod 755 donkey_sim.x86_64
./donkey_sim.x86_64

【Jetson】

# on Jetson
# Launch donkeycar docker
wget https://raw.githubusercontent.com/naisy/overdrive/master/docker/run-donkeycar-jetson.sh
chmod 755 run-donkeycar-jetson.sh
sudo su
./run-donkeycar-jetson.sh

# on Docker container
# Download donkeycar linear model
gdown https://drive.google.com/uc?id=1e7HMUoUfvDzUxK8pvAKntp_HkkoiEEVa -O linear.h5
cp ~/projects/donkeycar_tools/convert_h5_to_engine.py ./
cp ~/projects/donkeycar_tools/racer.py ./
python convert_h5_to_engine.py
# wait for the model to be created

# use pc address for host. (simulator server's ip address)
python racer.py --host=192.168.0.xxx --name=naisy --model=linear.engine --delay=0.1

If the vehicle crashes into the inner cone, change the delay to 0.11.
If the vehicle overruns, change the delay to 0.09.

【YouTube】
I’m sorry, I haven’t recorded a video of the TensorRT version.
However, autonomous driving works the same way as in this video.

I hope this helps you.

Yes, we do this in Python and it works, but it’s kinda silly in our case since there’s a transpose node from NHWC to NCHW inside the network itself (so two useless transposes).

We’re going Keras → SavedModel format → tf2onnx.convert…

Probably ending up in the same place.

Thanks for your post.

Hi, mdegans

Have you fixed this issue with @naisy’s suggestion?
If not, could you share the original Keras model as well as the NCHW and NHWC ONNX model with us?

Thanks.

Please check your PMs. I sent the model a while ago; perhaps a notification didn’t get sent out. The client does not want to share the Keras model, but I did send the .onnx versions.

Thanks,
Mike

The suggestion is not suitable for a DeepStream pipeline. Also, the input format is not the issue; the transpose issue is resolved. The issue now is the std::out_of_range in TensorRT. Please review the log above and the .onnx model I sent via PM.

Thanks,
Mike

Hi,

Thanks for sharing, and sorry for the delay.

We are checking this internally and will update you with more information later.

Thanks! I appreciate it!