Using NHWC format instead of NCHW for DeepStream

Hi, I'm modifying the deepstream-test1 app to use my own INT8 TensorRT engine. The model takes input in NHWC format, but the app expects input in NCHW format.

INFO: [Implicit Engine Info]: layers num: 5
0 INPUT kFLOAT image_arrays:0 640x640x3
1 OUTPUT kINT32 num_detections 0
2 OUTPUT kFLOAT detection_boxes 1024x4
3 OUTPUT kFLOAT detection_scores 1024
4 OUTPUT kFLOAT detection_classes 1024

0:00:08.703674449 4915 0x558fe32520 ERROR nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::preparePreprocess() <nvdsinfer_context_impl.cpp:874> [UID = 1]: RGB/BGR input format specified but network input channels is not 3

• Hardware Platform Jetson
• DeepStream Version 5.0
• JetPack Version 4.4
• TensorRT Version 7.2
• Issue Type questions
• Requirement details Ability to shift from NCHW input format to NHWC format in deepstream-test1 app

Hi,

In general, DeepStream expects the NCHW format.

Which model format do you use?
For a UFF-based model, there is a configuration option to specify NCHW or NHWC input:
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_plugin_gst-nvinfer.html
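
For example, a gst-nvinfer config for a UFF model can select the input layout with the uff-input-order key (a minimal sketch; the file and blob names below are placeholders, please check the plugin guide above for the full key list):

[property]
uff-file=model.uff
uff-input-blob-name=input_tensor
# 0: NCHW, 1: NHWC, 2: NC
uff-input-order=1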

Also, which Jetson do you use?

Thanks.

I followed this example:

I used the AutoML link to get the saved .pb file, which I then used to generate the ONNX model. The engine I finally got is therefore in NHWC format.

I’m using Jetson NX.

Hi,

Thanks for the feedback.

The sample demonstrates how to convert and run a TensorFlow EfficientDet model with TensorRT.
We will check whether it is possible to integrate it with DeepStream.

Please note that the sample requires the TensorRT 8.0 (JetPack 4.6) package.
However, the DeepStream package for JetPack 4.6 is not available yet.

Thanks.

The sample has an option to use older plugins via the --legacy_plugins flag. I'm using:
• JetPack Version 4.4
• TensorRT Version 7.2

If DeepStream can't use NHWC, can you suggest how I could add a transpose layer at the input stage, either in the ONNX model using GraphSurgeon, etc., or in the TensorRT engine directly? I see that TensorRT has a shuffle/transpose layer, but I'm having difficulty adding it to the existing engine.
https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/Graph/Layers.html#ishufflelayer
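
Something like the following onnx-graphsurgeon sketch is what I have in mind for the ONNX route (untested; the file and tensor names are placeholders for my model): add a new NCHW graph input and a Transpose node that feeds the original NHWC input tensor, so the rest of the graph stays untouched.

import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("model_nhwc.onnx"))

# Assumes a single static NHWC input, e.g. "image_arrays:0" with shape (1, 640, 640, 3).
old_input = graph.inputs[0]
n, h, w, c = old_input.shape

# New NCHW input that DeepStream can feed directly.
new_input = gs.Variable(name="input_nchw", dtype=old_input.dtype, shape=(n, c, h, w))

# Transpose NCHW -> NHWC and write the result into the original input tensor,
# so every existing consumer keeps reading the layout it expects.
transpose = gs.Node(op="Transpose",
                    attrs={"perm": [0, 2, 3, 1]},
                    inputs=[new_input],
                    outputs=[old_input])
graph.nodes.append(transpose)

graph.inputs = [new_input]
graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "model_nchw_input.onnx")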

Hi,

Just want to confirm the environment first.
Do you use TensorRT 7.1.3, the default version in JetPack 4.4?

$ cat /usr/include/aarch64-linux-gnu/NvInferVersion.h

Thanks.

I'm really sorry for the confusion. Yes, the TensorRT version is 7.1.3.
It was a different machine that had 7.2.1.6.

Hi,

You can generate an NCHW model directly with the create_onnx.py script.
Would you mind giving it a try?

$ python create_onnx.py \
    --input_shape '1,3,512,512' \
    --saved_model /path/to/saved_model \
    --onnx /path/to/model.onnx

Thanks.

I’ll do this and post an update.

Hi,

Does it work with the NCHW model?

Thanks.

Hi,
Following your suggestion gave some errors during the engine-building process. Output from trtexec:

[09/16/2021-20:26:49] [E] [TRT] /home/DV_RA/torch_to_trt/onnx-tensorrt/ModelImporter.cpp:685: While parsing node number 2 [Conv -> "efficientnet-lite4/stem/tpu_batch_normalization/FusedBatchNormV3:0"]:
[09/16/2021-20:26:49] [E] [TRT] /home/DV_RA/torch_to_trt/onnx-tensorrt/ModelImporter.cpp:686: --- Begin node ---
[09/16/2021-20:26:49] [E] [TRT] /home/DV_RA/torch_to_trt/onnx-tensorrt/ModelImporter.cpp:687: input: "preprocessor/mean:0_1"
input: "efficientnet-lite4/stem/conv2d/Conv2D_weights_fused_bn"
input: "efficientnet-lite4/stem/conv2d/Conv2D_bias_fused_bn"
output: "efficientnet-lite4/stem/tpu_batch_normalization/FusedBatchNormV3:0"
name: "efficientnet-lite4/stem/conv2d/Conv2D"
op_type: "Conv"
attribute {
  name: "dilations"
  ints: 1
  ints: 1
  type: INTS
}
attribute {
  name: "strides"
  ints: 2
  ints: 2
  type: INTS
}
attribute {
  name: "kernel_shape"
  ints: 3
  ints: 3
  type: INTS
}
attribute {
  name: "auto_pad"
  s: "SAME_UPPER"
  type: STRING
}
attribute {
  name: "group"
  i: 0
  type: INT
}

[09/16/2021-20:26:49] [E] [TRT] /home/DV_RA/torch_to_trt/onnx-tensorrt/ModelImporter.cpp:688: --- End node ---
[09/16/2021-20:26:49] [E] [TRT] /home/DV_RA/torch_to_trt/onnx-tensorrt/ModelImporter.cpp:691: ERROR: /home/DV_RA/torch_to_trt/onnx-tensorrt/builtin_op_importers.cpp:579 In function importConv:
[6] Assertion failed: nchan == -1 || kernelWeights.shape.d[1] * ngroup == nchan
[09/16/2021-20:26:49] [E] Failed to parse onnx file
[09/16/2021-20:26:49] [E] Parsing model failed
[09/16/2021-20:26:49] [E] Engine creation failed
[09/16/2021-20:26:49] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec # trtexec --onnx=efficientdet-lite4_tf_iou05_080921.onnx --fp16 --saveEngine=a

Also, I found this; however, it's not useful for me since I don't have a UFF model.
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_plugin_gst-nvinfer.html

Hi,

Thanks for your testing.

We are checking this issue internally.
We will share more information with you later.

Hi,

Thanks for your patience.

We tested the sample with the efficientdet_d0_coco17_tpu-32 model.
After generating the NCHW model with create_onnx.py, we can successfully build it into a TensorRT engine with the build_engine.py script.

$ python3 create_onnx.py --saved_model efficientdet_d0_coco17_tpu-32/saved_model --onnx model_NCHW.onnx --input_shape '1,3,512,512'
$ python3 build_engine.py --onnx model_NCHW.onnx --engine engine_NCHW.trt --precision fp16 --workspace 1
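
As a quick sanity check, you can print the engine bindings to confirm the input is now 1x3x512x512. A minimal sketch using the TensorRT Python API (the engine file name matches the command above):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)

# Deserialize the engine built by build_engine.py and list all bindings.
with open("engine_NCHW.trt", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

for i in range(engine.num_bindings):
    kind = "INPUT " if engine.binding_is_input(i) else "OUTPUT"
    print(kind, engine.get_binding_name(i), engine.get_binding_shape(i))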

If you keep hitting the error, please upgrade the environment to JetPack 4.6 and try again.

Thanks.
