Error in Yolov4 engine conversion,

Hi @Morganh

I am getting following error during generation of Yolo engine through tlt-converter.

./tlt-converter -k nvidia_tlt  \
>                     -d 3,544,960 \
>                     -o BatchedNMS \
>                     -e /export/trt.fp16.engine \
>                     -t fp16 \
>                     -i nchw \
>                     -m 8 \
>                      yolov4_resnet18.etlt
[WARNING] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[INFO] ModelImporter.cpp:135: No importer registered for op: BatchedNMSDynamic_TRT. Attempting to import as plugin.
[INFO] builtin_op_importers.cpp:3659: Searching for plugin: BatchedNMSDynamic_TRT, plugin_version: 1, plugin_namespace: 
[ERROR] INVALID_ARGUMENT: getPluginCreator could not find plugin BatchedNMSDynamic_TRT version 1
ERROR: builtin_op_importers.cpp:3661 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
[ERROR] Failed to parse the model, please check the encoding key to make sure it's correct
[INFO] Detected input dimensions from the model: (-1, 3, 544, 960)
[ERROR] Model has dynamic shape but no optimization profile specified.
Aborted (core dumped)

Please help me out thanks.

Where did you generate trt engine, in Jetson devices?

Yes @Morganh I was trying to generate engine file on NX-Xavier.

Can you add β€œ-p” option? See YOLOv4 β€” TAO Toolkit 3.22.05 documentation

More, please build TRT OSS plugin. See https://docs.nvidia.com/tao/tao-toolkit/text/object_detection/yolo_v4.html#generating-an-engine-using-tao-converter

Hi @Morganh
I have follow the same steps mentioned in https://github.com/NVIDIA-AI-IOT/deepstream_tao_apps and able to successfully build TRT OSS for jetson device But got following error while running yolo4 config.

./apps/tao_detection/ds-tao-detection  -c configs/yolov4_tao/pgie_yolov4_tao_config.txt -i $DS_SRC_PATH/samples/streams/sample_720p.h264
Now playing: configs/yolov4_tao/pgie_yolov4_tao_config.txt
Opening in BLOCKING MODE
Opening in BLOCKING MODE 
Opening in BLOCKING MODE
Opening in BLOCKING MODE 
0:00:01.281728859 29425   0x55b94d8030 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1716> [UID = 1]: Trying to create engine from model files
ERROR: [TRT]: UffParser: Unsupported number of graph 0
parseModel: Failed to parse UFF model
ERROR: failed to build network since parsing model errors.
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:00:05.481046611 29425   0x55b94d8030 ERROR                nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1736> [UID = 1]: build engine file failed
Bus error (core dumped)

And the configuration file is :

################################################################################
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
################################################################################

[property]
gpu-id=0
net-scale-factor=1.0
offsets=103.939;116.779;123.68
model-color-format=1
labelfile-path=/home/smarg/Documents/Pritam/Models/yolov4/yolov4_labels.txt
#model-engine-file=/home/smarg/Documents/Pritam/Models/yolov4//yolov4_resnet18.etlt_b1_gpu0_fp16.engine
int8-calib-file=/home/smarg/Documents/Pritam/Models/yolov4/cal.bin
tlt-encoded-model=/home/smarg/Documents/Pritam/Models/yolov4/yolov4_resnet18.etlt
tlt-model-key=nvidia_tlt
infer-dims=3;544;960
maintain-aspect-ratio=1
uff-input-order=0
uff-input-blob-name=Input
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=4
interval=0
gie-unique-id=1
is-classifier=0
#network-type=0
cluster-mode=3
output-blob-names=BatchedNMS
parse-bbox-func-name=NvDsInferParseCustomBatchedNMSTLT
custom-lib-path=../../post_processor/libnvds_infercustomparser_tao.so

[class-attrs-all]
pre-cluster-threshold=0.3
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

Thanks.

Hi @Morganh

I have also tried with -p option but getting please check the encoding key to make sure it’s correct

Command :
./tlt-converter -k nvidia_tlt -d 3,544,960 -p image_input,1x3x544x960,1x3x544x960,1x3x544x960 -o BatchedNMS -e /export/trt.fp16.engine -t fp16 -i nchw -m 8 yolov4_resnet18.etlt

[ERROR] UffParser: Unsupported number of graph 0
[ERROR] Failed to parse the model, please check the encoding key to make sure it's correct
[ERROR] Network must have at least one output
[ERROR] Network validation failed.
[ERROR] Unable to create engine
Segmentation fault (core dumped)

See YOLOv4 β€” TAO Toolkit 3.22.05 documentation
The input name for YOLOv4 is Input.
Could you modify?

Hi @Morganh
With Input name β†’ Input also getting the same issue.

Is the key correct?

Yes I am using Nvidia-Pretrained model.

Hi @Morganh any suggestion ? How can I resolve this issue.?

I am checking. Will update to you if there is any. Thanks.

Okay. Thanks.

Can you rebuild the libnvinfer_plugin.so again ? I can generate the trt engine successfully in NX with this official yolo_v4 etlt model.

Step:

$ git clone -b 21.03 https://github.com/nvidia/TensorRT
$ cd TensorRT/
$ git submodule update --init --recursive
$ export TRT_SOURCE=pwd
$ cd $TRT_SOURCE
$ mkdir -p build && cd build
$ /usr/local/bin/cmake … -DGPU_ARCHS=72 -DTRT_LIB_DIR=/usr/lib/aarch64-linux-gnu/ -DCMAKE_C_COMPILER=/usr/bin/gcc -DTRT_BIN_DIR=pwd/out

$ ll /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so*
Previously My NX is using libnvinfer_plugin.so.7.1.3
$ sudo cp libnvinfer_plugin.so.7.2.2 /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7.1.3
$ sudo ldconfig

$ ./tlt-converter -k nvidia_tlt -d 3,544,960 -e trt.fp16.engine -t fp16 -p Input,1x3x544x960,1x3x544x960,1x3x544x960 yolov4_resnet18.etlt

Okay Thanks I will Try.

Your jetpack version is also β†’ 4.5.1 [L4T 32.5.1] ?

No, my NX is installed 4.4.

$ apt-cache show nvidia-jetpack
Package: nvidia-jetpack
Version: 4.4.1-b50
Architecture: arm64
Maintainer: NVIDIA Corporation
Installed-Size: 194
Depends: nvidia-cuda (= 4.4.1-b50), nvidia-opencv (= 4.4.1-b50), nvidia-cudnn8 (= 4.4.1-b50), nvidia-tensorrt (= 4.4.1-b50), nvidia-visionworks (= 4.4.1-b50), nvidia-container (= 4.4.1-b50), nvidia-vpi (= 4.4.1-b50), nvidia-l4t-jetson-multimedia-api (>> 32.4-0), nvidia-l4t-jetson-multimedia-api (<< 32.5-0)
Homepage: Autonomous Machines | NVIDIA Developer
Priority: standard
Section: metapackages
Filename: pool/main/n/nvidia-jetpack/nvidia-jetpack_4.4.1-b50_arm64.deb
Size: 29412
SHA256: ec502e1e3672c059d8dd49e5673c5b2d8c606584d4173ee514bbc4376547a171
SHA1: 75a405f1ad533bfcd04280d1f9b237b880c39be5
MD5sum: 1267b31d8b8419d9847b0ec4961b15a4
Description: NVIDIA Jetpack Meta Package
Description-md5: ad1462289bdbc54909ae109d1d32c0a8

Okay,

I am using 4.5.

**apt-cache show nvidia-jetpack**

Package: nvidia-jetpack
Version: 4.5.1-b17
Architecture: arm64
Maintainer: NVIDIA Corporation
Installed-Size: 194
Depends: nvidia-cuda (= 4.5.1-b17), nvidia-opencv (= 4.5.1-b17), nvidia-cudnn8 (= 4.5.1-b17), nvidia-tensorrt (= 4.5.1-b17), nvidia-visionworks (= 4.5.1-b17), nvidia-container (= 4.5.1-b17), nvidia-vpi (= 4.5.1-b17), nvidia-l4t-jetson-multimedia-api (>> 32.5-0), nvidia-l4t-jetson-multimedia-api (<< 32.6-0)
Homepage: http://developer.nvidia.com/jetson
Priority: standard
Section: metapackages
Filename: pool/main/n/nvidia-jetpack/nvidia-jetpack_4.5.1-b17_arm64.deb
Size: 29372
SHA256: 378f7588e15c35692eb1bed6f336be74f4f396d88fad45af67c68e22b63be04b
SHA1: e41f26a3d8326e9952915eee12fa37e17de3245f
MD5sum: 31b2bd9d0f214f74acaeb3d8e4279e9d
Description: NVIDIA Jetpack Meta Package
Description-md5: ad1462289bdbc54909ae109d1d32c0a8

Package: nvidia-jetpack
Version: 4.5-b129
Architecture: arm64
Maintainer: NVIDIA Corporation
Installed-Size: 194
Depends: nvidia-cuda (= 4.5-b129), nvidia-opencv (= 4.5-b129), nvidia-cudnn8 (= 4.5-b129), nvidia-tensorrt (= 4.5-b129), nvidia-visionworks (= 4.5-b129), nvidia-container (= 4.5-b129), nvidia-vpi (= 4.5-b129), nvidia-l4t-jetson-multimedia-api (>> 32.5-0), nvidia-l4t-jetson-multimedia-api (<< 32.6-0)
Homepage: http://developer.nvidia.com/jetson
Priority: standard
Section: metapackages
Filename: pool/main/n/nvidia-jetpack/nvidia-jetpack_4.5-b129_arm64.deb
Size: 29360
SHA256: 002646e6d81d13526ade23d7c45180014f3cd9e9f5fb0f8896b77dff85d6b9fe
SHA1: cb17547b902b2793e0df86d561809ecdbf7e401f
MD5sum: 06962c42e462f643455d6194d1a2d641
Description: NVIDIA Jetpack Meta Package
Description-md5: ad1462289bdbc54909ae109d1d32c0a8

Can you share
$ ll /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so*

Sure. My NX is also using libnvinfer_plugin.so.7.1.3.
Will update you after upgrading on TensorRT-7.2.2 as you have mentioned steps above.

Wait for a moment. I am afraid you did not replace the plugin correctly.

The expected is as below.
$ ll /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so*
lrwxrwxrwx 1 root root 26 6月 6 2020 /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so β†’ libnvinfer_plugin.so.7.1.3*
lrwxrwxrwx 1 root root 26 10月 12 15:12 /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7 β†’ libnvinfer_plugin.so.7.1.3*
lrwxrwxrwx 1 root root 26 10月 12 15:12 /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7.0.0 β†’ libnvinfer_plugin.so.7.1.3*
-rwxr-xr-x 1 root root 10009144 10月 12 15:06 /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7.1.3*

Please follow step 4 of YOLOv4 β€” TAO Toolkit 3.22.05 documentation