TensorRT fails to parse bounding boxes and ONNX explicit batch

• Hardware Platform (Jetson / GPU)
Orin AGX
• DeepStream Version
6.0
• JetPack Version
5.0.2

Previously, we converted ONNX files to TensorRT engine files for use in our solution.
Now we are attempting to use those models with NVIDIA's inference plugin.

Attempt 1 (TRT):
Feeding one MP4 video into a DeepStream pipeline, I applied an .engine file and got the following errors.

Error in NvDsInferContextImpl::parseBoundingBox() - Could not find output coverage layer for parsing object
Error in NvDsInferContextImpl::fillDetectionOutput() - Failed to parse bboxes

Attempt 2 (ONNX):
Feeding one MP4 video into a DeepStream pipeline, I applied an .onnx file and got the following error.
This version of the ONNX parser only supports TensorRT INetworkDefinitions with an explicit batch dimension. Please ensure the network was created using the EXPLICIT_BATCH NetworkDefinitionCreationFlag.

Any advice on how to proceed with either of these?

Can you share your pipeline info and DeepStream config file? Thanks.

Can you elaborate on the details of your ONNX model? What are the input and output layers, their dimensions, and the meaning of each layer?

hi @Fiona.Chen

Not sure if this is the info that you’re looking for?

The default postprocessing (bbox parser) inside gst-nvinfer does not support such output layers. Please customize your own output layer parser. There are samples of customized postprocessing in NVIDIA-AI-IOT/deepstream_tao_apps (github.com). You can refer to the SSD model config, deepstream_tao_apps/pgie_ssd_tao_config.txt, where the postprocessing function "NvDsInferParseCustomNMSTLT" is customized to process the SSD model's output layers; its implementation is in deepstream_tao_apps/nvdsinfer_custombboxparser_tao.cpp.

For your model, please consult whoever provided the model to you for the postprocessing algorithm.
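For illustration, here is a minimal sketch of such a custom bbox parsing function. It assumes a hypothetical single FP32 output tensor laid out as [N, 7] with (batch_id, x1, y1, x2, y2, confidence, class_id) per row; the function name, threshold, and layout are placeholders and will almost certainly need to be adapted to your model's actual output.

/* nvdsparsebbox_custom.cpp -- hypothetical sketch, not tied to any specific model */
#include <vector>
#include "nvdsinfer_custom_impl.h"

/* Assumed layout: one FP32 output tensor shaped [N, 7], each row being
 * (batch_id, x1, y1, x2, y2, confidence, class_id) in network input pixels.
 * Adjust the decoding below to whatever your exporter actually produces. */
extern "C" bool NvDsInferParseCustomYoloExample(
    std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
    NvDsInferNetworkInfo const &networkInfo,
    NvDsInferParseDetectionParams const &detectionParams,
    std::vector<NvDsInferObjectDetectionInfo> &objectList)
{
    if (outputLayersInfo.empty())
        return false;

    const NvDsInferLayerInfo &layer = outputLayersInfo[0];
    const float *data = static_cast<const float *>(layer.buffer);
    const unsigned int numDets = layer.inferDims.d[0];

    for (unsigned int i = 0; i < numDets; ++i) {
        const float *row = data + i * 7;
        float conf = row[5];
        if (conf < 0.25f)   /* example confidence threshold */
            continue;

        NvDsInferObjectDetectionInfo obj;
        obj.classId = static_cast<unsigned int>(row[6]);
        obj.left = row[1];
        obj.top = row[2];
        obj.width = row[3] - row[1];
        obj.height = row[4] - row[2];
        obj.detectionConfidence = conf;
        objectList.push_back(obj);
    }
    return true;
}

/* Verify the exported symbol matches the expected parser prototype. */
CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseCustomYoloExample);

Build it into a shared library (g++ -shared -fPIC against the DeepStream headers) and point the nvinfer config at it via the parse-bbox-func-name and custom-lib-path keys that are already present, commented out, in your config.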

Would the custom output layer parser you mentioned be applied to the TRT output or the ONNX one? (i.e. would it help resolve Attempt 1 or Attempt 2 listed above?)

The ONNX model is OK. The problem is that the postprocessing function in the gst-nvinfer plugin does not match the model. Please customize your own postprocessing.

There are three parts inside gst-nvinfer: preprocessing, inferencing, and postprocessing.

Inferencing is based on TensorRT; it can support ONNX, UFF, Caffe model and Caffe prototxt, …

Please read Gst-nvinfer — DeepStream 6.2 Release documentation and the source code in /opt/nvidia/deepstream/deepstream/sources/gst-plugins/gst-nvinfer, /opt/nvidia/deepstream/deepstream/sources/libs/nvdsinfer

To solve Attempt 2, please provide your gst-nvinfer configuration for your model. Also, please upgrade JetPack and DeepStream to the latest version.

TRT attempt:

example pipeline:
gst-launch-1.0 nvstreammux name=m width=1920 height=1080 batch-size=1 buffer-pool-size=4 ! queue ! nvvideoconvert ! nvinfer config-file-path=config_infer_primary-vizgard_engine.txt batch-size=1 interval=0 ! nvmultistreamtiler width=1920 height=1080 rows=1 columns=1 ! nvdsosd ! queue ! nvvideoconvert ! nvv4l2h264enc ! h264parse ! mp4mux ! filesink location="1sintel-engine-1mp4.mp4" filesrc location='sintel_snip.mp4' ! qtdemux ! h264parse ! nvv4l2decoder ! nvvideoconvert ! 'video/x-raw(memory:NVMM), format=(string)I420' ! queue ! m.sink_0

nvinfer config snippet:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-engine-file=v_yolov7_7classes.engine
batch-size=1
process-mode=1
model-color-format=0
##0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=4
interval=0
gie-unique-id=1
output-blob-names=conv2d_bbox;conv2d_cov/Sigmoid
force-implicit-batch-dim=1
#parse-bbox-func-name=NvDsInferParseCustomResnet
#custom-lib-path=/path/to/libnvdsparsebbox.so
##1=DBSCAN, 2=NMS, 3= DBSCAN+NMS Hybrid, 4 = None(No clustering)
cluster-mode=2
#scaling-filter=0
#scaling-compute-hw=0

ONNX attempt:

example pipeline:
gst-launch-1.0 nvstreammux name=m width=1920 height=1080 batch-size=1 buffer-pool-size=4 ! queue ! nvvideoconvert ! nvinfer config-file-path=config_infer_primary-v_onnx.txt batch-size=1 interval=0 ! nvmultistreamtiler width=1920 height=1080 rows=1 columns=1 ! nvdsosd ! queue ! nvvideoconvert ! nvv4l2h264enc ! h264parse ! mp4mux ! filesink location="1sintel-onnx-1mp4.mp4" filesrc location='sintel_snip.mp4' ! qtdemux ! h264parse ! nvv4l2decoder ! nvvideoconvert ! 'video/x-raw(memory:NVMM), format=(string)I420' ! queue ! m.sink_0

nvinfer config snippet:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
onnx-file=v_yolov7_7classes.onnx
model-engine-file=v_yolov7_7classes.onnx_b1_gpu0_fp16.engine
#labelfile-path=labels.txt
#int8-calib-file=cal_trt.bin
batch-size=1
process-mode=1
model-color-format=0
##0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=4
interval=0
gie-unique-id=1
output-blob-names=conv2d_bbox;conv2d_cov/Sigmoid
force-implicit-batch-dim=1
#parse-bbox-func-name=NvDsInferParseCustomResnet
#custom-lib-path=/path/to/libnvdsparsebbox.so
##1=DBSCAN, 2=NMS, 3= DBSCAN+NMS Hybrid, 4 = None(No clustering)
cluster-mode=2
#scaling-filter=0
#scaling-compute-hw=0

From your ONNX graph, the input has an explicit batch dimension with batch size 1.
Please remove “force-implicit-batch-dim=1”.

Removed “force-implicit-batch-dim” and that seems to have helped a bit.

Now the issue is:

Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32
Weights [name=Conv_214 + PWN(PWN(Sigmoid_215), Mul_216).weight] had the following issues when converted to FP16:
WARNING: [TRT]:  - Subnormal FP16 values detected

Please ignore the TensorRT warning if the engine is generated correctly.

Letting it run regardless of the warnings ended up throwing the same errors as Attempt 1 (TRT) in my first post.
I will try to pursue some of the advice given on that.

ERROR: [TRT]: 3: Cannot find binding of given name: conv2d_bbox
0:19:49.046828174 74497 0xaaaaceb02c60 WARN                 nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::checkBackendParams() <nvdsinfer_context_impl.cpp:1868> [UID = 1]: Could not find output layer 'conv2d_bbox' in engine
ERROR: [TRT]: 3: Cannot find binding of given name: conv2d_cov/Sigmoid
0:19:49.046896175 74497 0xaaaaceb02c60 WARN                 nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::checkBackendParams() <nvdsinfer_context_impl.cpp:1868> [UID = 1]: Could not find output layer 'conv2d_cov/Sigmoid' in engine
0:19:49.580717551 74497 0xaaaace9af000 ERROR                nvinfer gstnvinfer.cpp:632:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::parseBoundingBox() <nvdsinfer_context_impl_output_parsing.cpp:59> [UID = 1]: Could not find output coverage layer for parsing objects
0:19:49.580947441 74497 0xaaaace9af000 ERROR                nvinfer gstnvinfer.cpp:632:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::fillDetectionOutput() <nvdsinfer_context_impl_output_parsing.cpp:735> [UID = 1]: Failed to parse bboxes

According to the post from @alex247, your ONNX model’s output layer name is “output”, not “conv2d_bbox”, and the model is not a ResNet-like model. You need to customize your own postprocessing instead of using the default postprocessing inside gst-nvinfer. Your gst-nvinfer configuration is wrong; I already discussed the postprocessing mismatch in the previous post.
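If you want to confirm the binding names yourself, a small standalone tool like the sketch below can list them from a serialized engine (TensorRT 8.x binding API, link with -lnvinfer; the file name is just an example argument):

// list_bindings.cpp -- minimal sketch to print input/output binding names of a TensorRT engine
#include <NvInfer.h>
#include <fstream>
#include <iostream>
#include <vector>

class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
} gLogger;

int main(int argc, char** argv) {
    if (argc < 2) { std::cerr << "usage: list_bindings <engine file>" << std::endl; return 1; }
    std::ifstream f(argv[1], std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(f)), std::istreambuf_iterator<char>());
    auto* runtime = nvinfer1::createInferRuntime(gLogger);
    auto* engine = runtime->deserializeCudaEngine(blob.data(), blob.size());
    if (!engine) { std::cerr << "failed to deserialize engine" << std::endl; return 1; }
    for (int i = 0; i < engine->getNbBindings(); ++i) {
        std::cout << (engine->bindingIsInput(i) ? "input : " : "output: ")
                  << engine->getBindingName(i) << std::endl;
    }
    return 0;
}

Whatever names it prints for the outputs are what output-blob-names in the nvinfer config must match.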

As you are using a YOLOv7 model, you can refer to our YOLOv7 sample: NVIDIA-AI-IOT/yolo_deepstream (yolo model qat and deploy with deepstream&tensorrt, github.com).

Since your YOLOv7 model was generated by yourselves, please make sure you know how to parse and cluster your model’s output layers, and customize your own postprocessing accordingly. Please consult whoever provided the model to you.
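For reference, once a custom parser library is built, the parser-related part of the nvinfer config would look roughly like the snippet below. The layer name “output” comes from the discussion above; the function and library names are placeholders for whatever you actually build:

[property]
onnx-file=v_yolov7_7classes.onnx
model-engine-file=v_yolov7_7classes.onnx_b1_gpu0_fp16.engine
batch-size=1
network-mode=1
output-blob-names=output
parse-bbox-func-name=NvDsInferParseCustomYoloExample
custom-lib-path=/path/to/libnvdsparsebbox_custom.so
## 4 = None (no clustering) if the model already applies NMS; otherwise 2 = NMS
cluster-mode=4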
