TLT-deepstream sample app error : Deepstream deployment : Error build engine file failed

deepstream-custom
DeepStreamSDK 5
CUDA Driver Version: 10.2
CUDA Runtime Version: 10.2
TensorRT Version: 7.0.0.11
cuDNN Version: 7.6.5
NVIDIA GTX 960m

I want to integrate a TLT-fasterrcnn-resnet50-model, trained to detect ‘hand’, into the DeepStream SDK for inference.
I got the .etlt file by running the tlt-export command in the docker.
I have built TRT-OSS for the cropAndResizePlugin that frcnn needs for DeepStream deployment, following deepstream-tlt-apps,
and I can run the deepstream-custom sample.
But when I use my .etlt file in pgie_frcnn_tlt_config.txt for deepstream-custom, I get the following error:

command : “deepstream-custom -c pgie_frcnn_tlt_config.txt -i …/…/streams/sample_720p.h264 -d”

Now playing: pgie_frcnn_tlt_config.txt
0:00:00.352753517 21517 0x563527917d80 INFO nvinfer gstnvinfer.cpp:602:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1591> [UID = 1]: Trying to create engine from model files
WARNING: …/nvdsinfer/nvdsinfer_model_builder.cpp:759 FP16 not supported by platform. Using FP32 mode.
ERROR: …/nvdsinfer/nvdsinfer_func_utils.cpp:31 [TRT]: Parameter check failed at: …/builder/Network.cpp::addInput::957, condition: isValidDims(dims, hasImplicitBatchDimension())
ERROR: …/nvdsinfer/nvdsinfer_func_utils.cpp:31 [TRT]: UFFParser: Failed to parseInput for node input_1
ERROR: …/nvdsinfer/nvdsinfer_func_utils.cpp:31 [TRT]: UffParser: Parser error: input_1: Failed to parse node - Invalid Tensor found at node input_1
parseModel: Failed to parse UFF model
ERROR: tlt/tlt_decode.cpp:274 failed to build network since parsing model errors.
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:797 Failed to create network using custom network creation function
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:862 Failed to get cuda engine from custom library API
0:00:00.982060040 21517 0x563527917d80 ERROR nvinfer gstnvinfer.cpp:596:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1611> [UID = 1]: build engine file failed
Segmentation fault

My config file (pgie_frcnn_tlt_config.txt) is as follows:

[property]
gpu-id=0
net-scale-factor=1.0
offsets=103.939;116.779;123.68
model-color-format=1
labelfile-path=./nvdsinfer_customparser_frcnn_tlt/frcnn_labels.txt
#tlt-encoded-model=./models/frcnn/faster_rcnn_resnet10.etlt
tlt-encoded-model=./models/frcnn/frcnn_kitti_epoch8_fp16.etlt
#tlt-model-key=tlt
tlt-model-key=c2NuOGlxOGlxMmhvbW05aG85YjVmbW8xN2Y6N2ZmMzNhZjMtYjdmOS00ZDFmLTk5NGEtY2YyQ5
uff-input-dims=3;640;640;0
uff-input-blob-name=input_image
batch-size=1
# 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=2
interval=0
gie-unique-id=1
is-classifier=0
#network-type=0
output-blob-names=dense_regress_td/BiasAdd;dense_class_td/Softmax;proposal
parse-bbox-func-name=NvDsInferParseCustomFrcnnTLT
custom-lib-path=./nvdsinfer_customparser_frcnn_tlt/libnvds_infercustomparser_frcnn_tlt.so

[class-attrs-all]
#pre-cluster-threshold=0.6
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

What am I missing or doing wrong?
Please point me in the right direction!

Thanks in advance

How many classes did you train? 2 or 1?

Only one: ‘hand’.
The label file is:

Hand
Background

OK.
Also, which TLT docker did you use to train your TLT-fasterrcnn-resnet50-model?
For the 2.0_dp docker, the input blob name should be input_image.

Tlt-1.0

For tlt-1.0, please set:

uff-input-blob-name=input_1

BTW, I suggest using tlt_2.0_dp to train.
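The advice above can be turned into a quick sanity check. The following is a minimal sketch, not part of DeepStream or TLT: it reads the `[property]` section of an nvinfer config and compares `uff-input-blob-name` with the name expected for the TLT release used for training. The version-to-name table contains only the two cases stated in this thread, and the function name `check_input_blob` and the `SAMPLE` text are hypothetical.

```python
import configparser

# Input blob names per TLT release, taken only from the replies in this thread.
EXPECTED_INPUT_BLOB = {
    "1.0": "input_1",
    "2.0_dp": "input_image",
}


def check_input_blob(config_text, tlt_version):
    """Return (configured, expected, matches) for uff-input-blob-name."""
    parser = configparser.ConfigParser(
        strict=False,          # tolerate repeated keys in hand-edited configs
        allow_no_value=True,   # tolerate stray lines without '='
        interpolation=None,    # keys such as tlt-model-key may contain '%'
    )
    parser.read_string(config_text)
    configured = parser.get("property", "uff-input-blob-name")
    expected = EXPECTED_INPUT_BLOB[tlt_version]
    return configured, expected, configured == expected


# Hypothetical excerpt mirroring the config posted in the question.
SAMPLE = """
[property]
uff-input-dims=3;640;640;0
uff-input-blob-name=input_image
"""

configured, expected, ok = check_input_blob(SAMPLE, "1.0")
print(f"configured={configured} expected={expected} ok={ok}")
```

For a model exported with tlt-1.0, the check reports a mismatch for the config in the question, which is consistent with the `Failed to parseInput for node input_1` error in the log.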

When I run the sample YOLOv3 TLT model:
./deepstream-custom -c pgie_yolov3_tlt_config.txt -i $DS_SRC_PATH/samples/streams/sample_720p.h264
the following error occurs:
Warning: ‘input-dims’ parameter has been deprecated. Use ‘infer-dims’ instead.
Now playing: pgie_yolov3_tlt_config.txt
Opening in BLOCKING MODE
Opening in BLOCKING MODE
0:00:00.213974058 11548 0x558b573d00 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1715> [UID = 1]: Trying to create engine from model files
ERROR: [TRT]: UffParser: Validator error: FirstDimTile_2: Unsupported operation _BatchTilePlugin_TRT
parseModel: Failed to parse UFF model
ERROR: failed to build network since parsing model errors.
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:00:01.777201415 11548 0x558b573d00 ERROR nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1735> [UID = 1]: build engine file failed
Segmentation fault (core dumped)

But when I run the frcnn TLT model, everything works normally.

@920086481
Please see https://docs.nvidia.com/metropolis/TLT/tlt-getting-started-guide/index.html#intg_yolov3_model and https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps

TensorRT OSS (release/7.x branch)

This is ONLY needed when running SSD, DSSD, RetinaNet and YOLOV3 models, because the BatchTilePlugin required by these models is not supported by the TensorRT 7.x native package.
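A missing plugin shows up in the log as an `Unsupported operation _<PluginName>` message from the UFF parser. As a rough sketch (the helper name `missing_plugins` is hypothetical, and the regular expression is based only on the single message format seen in this thread), the rejected plugin names can be pulled out of a saved log like this:

```python
import re

# Matches the UffParser message seen in this thread, e.g.
# "Unsupported operation _BatchTilePlugin_TRT".
UNSUPPORTED_OP = re.compile(r"Unsupported operation _(\w+)")


def missing_plugins(log_text):
    """Return the distinct plugin op names the UFF parser rejected, in order."""
    seen = []
    for name in UNSUPPORTED_OP.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


# Two lines copied from the YOLOv3 log above.
LOG = """ERROR: [TRT]: UffParser: Validator error: FirstDimTile_2: Unsupported operation _BatchTilePlugin_TRT
parseModel: Failed to parse UFF model"""

print(missing_plugins(LOG))
```

Any name reported this way points to a plugin that the installed `libnvinfer_plugin.so` does not provide.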

I followed this link (https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps/tree/master/TRT-OSS/Jetson) to build the Jetson TensorRT OSS plugin, but I don’t understand how it relates to the TensorRT 7.x native package.

As described at https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps/tree/master/TRT-OSS/Jetson, it is necessary to replace “libnvinfer_plugin.so*”.

The reason is given in the TLT user guide (https://docs.nvidia.com/metropolis/TLT/tlt-getting-started-guide/index.html#intg_yolov3_model):

Prerequisite for YOLOv3 model

  1. YOLOv3 requires batchTilePlugin, resizeNearestPlugin and batchedNMSPlugin. This plugin is available in the TensorRT open source repo, but not in TensorRT 7.0. Detailed instructions to build TensorRT OSS can be found in TensorRT Open Source Software (OSS).
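Before and after replacing the library, it can help to see which `libnvinfer_plugin.so*` files are present and where each symlink points. The snippet below is only an illustrative sketch: the helper name `list_plugin_libs` is hypothetical, and `/usr/lib/aarch64-linux-gnu` is assumed to be the library directory on a Jetson; adjust it for your system.

```python
import glob
import os


def list_plugin_libs(lib_dir):
    """Return (path, resolved target) pairs for libnvinfer_plugin.so* in lib_dir."""
    pattern = os.path.join(lib_dir, "libnvinfer_plugin.so*")
    return [(p, os.path.realpath(p)) for p in sorted(glob.glob(pattern))]


if __name__ == "__main__":
    # Assumed library directory on Jetson; this path is not given in the thread.
    for path, target in list_plugin_libs("/usr/lib/aarch64-linux-gnu"):
        print(path, "->", target)
```

If the symlinks still resolve to the stock file rather than your OSS build, the replacement did not take effect.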

Thank you for your help. I have replaced “libnvinfer_plugin.so.7.1.3*” with my own build. SSD, DSSD, RetinaNet and Detectnet_v2 now run successfully, but Yolov3 does not:

Trying to create engine from model files
ERROR: [TRT]: IPluginV2DynamicExt requires network without implicit batch dimension
Segmentation fault (core dumped)

Please share your command along with full log.

nvidia@nvidia-desktop:~/Public/deepstream_tlt_apps$ ./deepstream-custom -c pgie_yolov3_tlt_config.txt -i $DS_SRC_PATH/samples/streams/sample_720p.h264 -d
Warning: ‘input-dims’ parameter has been deprecated. Use ‘infer-dims’ instead.
Now playing: pgie_yolov3_tlt_config.txt

Using winsys: x11
Opening in BLOCKING MODE
0:00:00.201417692 29126 0x5593a1b870 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1715> [UID = 1]: Trying to create engine from model files
ERROR: [TRT]: IPluginV2DynamicExt requires network without implicit batch dimension
Segmentation fault (core dumped)
nvidia@nvidia-desktop:~/Public/deepstream_tlt_apps$

I didn’t modify any configuration files.

Please create a new TLT forum topic and paste your command/log/config-file.
Thanks.

OK, I’ve created a new TLT forum topic. (TLT-deepstream sample app problems : I found thatFRCNN, SSD , DSSD , RetinaNet and Detectnet_v2 can run successfully, but Yolov3 can’t)