TLT model issues

lee.hemingway · March 23, 2021, 2:28pm

• Hardware Platform (Jetson / GPU) - GeForce RTX 3090
• DeepStream Version - 5.1
• TensorRT Version - 7.2.2 (libnvinfer_plugin.so built from github 21.02 tag)
• NVIDIA GPU Driver Version (valid for GPU only) - 460.32.03
• Issue Type( questions, new requirements, bugs) - Bugs
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)

I am experiencing a number of issues running the TLT pre-trained models. I have attached a .tar.gz file that can be used to reproduce them. The archive contains:

• docker-compose.yaml: defines the Docker container built from the Dockerfile and mounts the ./models and ./configs directories.
• Dockerfile: builds the environment from nvcr.io/nvidia/deepstream:5.1-21.02-devel. It builds cmake v3.13.5, then libnvinfer_plugin.so from the 21.02 tag of the nvidia/TensorRT GitHub repo, and then libnvds_infercustomparser_tlt.so from the release/tlt3.0 branch of the NVIDIA-AI-IOT/deepstream_tlt_apps GitHub repo.
• configs directory: contains config files modified from the “tlt_pretrained_models” configs included with DeepStream. Incorrect paths have been fixed, the config_infer_primary_*.txt files have been modified to use libnvds_infercustomparser_tlt.so, and a deepstream_app_source1_detection_*.txt file has been added for each model, based on deepstream_app_source1_detection_models.txt.

Ensure nvidia-container-runtime is installed on the host machine and configured as the default Docker runtime. The host machine is running Ubuntu 18.04. Extract the attached archive. The TLT models must then be downloaded and extracted into the models directory. They are available from this link: https://nvidia.box.com/shared/static/i1cer4s3ox4v8svbfkuj5js8yqm3yazo.zip, which is found on this page: NVIDIA DeepStream SDK Developer Guide (DeepStream 6.1.1 Release documentation)

Run the “./start.sh” script to build and run the container. Run each of the deepstream_app_source1_*.txt config files with “deepstream-app -c filename.txt”.
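For anyone reproducing this, the per-config runs described above can be scripted. The following is a minimal sketch, not part of the attached archive: the function names `build_commands` and `run_all` are hypothetical, and it assumes the config files sit in a directory such as `configs` and that `deepstream-app` is on the PATH inside the container.

```python
import subprocess
from pathlib import Path


def build_commands(config_dir, pattern="deepstream_app_source1_*.txt"):
    """Return one `deepstream-app -c <file>` argv list per matching config."""
    configs = sorted(Path(config_dir).glob(pattern))
    return [["deepstream-app", "-c", str(cfg)] for cfg in configs]


def run_all(config_dir):
    """Run every matching config in turn and collect the return codes."""
    results = {}
    for argv in build_commands(config_dir):
        # On Linux a negative return code means the process was killed by
        # a signal, e.g. -11 for a segmentation fault or -6 for an abort.
        completed = subprocess.run(argv)
        results[argv[-1]] = completed.returncode
    return results
```

Calling `run_all("configs")` returns a dict mapping each config path to its exit code, which makes it easier to see at a glance which models crash and which merely run with wrong output.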

Every model/config has issues. Below I list the issues that I observe with each config:

deepstream_app_source1_detection_dssd.txt: works, but misses a lot of detections and the labels are wrong

deepstream_app_source1_detection_frcnn.txt:

NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1716> [UID = 1]: Trying to create engine from model files
ERROR: ../nvdsinfer/nvdsinfer_func_utils.cpp:33 [TRT]: UffParser: Could not read buffer.
parseModel: Failed to parse UFF model
ERROR: tlt/tlt_decode.cpp:274 failed to build network since parsing model errors.
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:797 Failed to create network using custom network creation function
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:862 Failed to get cuda engine from custom library API
0:00:01.095817111    91 0x55d362a84e00 ERROR                nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1736> [UID = 1]: build engine file failed
Segmentation fault (core dumped)

deepstream_app_source1_peoplenet.txt: same errors as deepstream_app_source1_detection_frcnn.txt

deepstream_app_source1_detection_retinanet.txt: runs for a few seconds and then aborts with:

deepstream-app: nvdsinfer_custombboxparser_tlt.cpp:81: bool NvDsInferParseCustomNMSTLT(const std::vector<NvDsInferLayerInfo>&, const NvDsInferNetworkInfo&, const NvDsInferParseDetectionParams&, std::vector<NvDsInferObjectDetectionInfo>&): Assertion `(int) det[1] < out_class_size' failed.
Aborted (core dumped)

deepstream_app_source1_detection_ssd.txt: same error as deepstream_app_source1_detection_retinanet.txt

deepstream_app_source1_detection_yolov3.txt: makes detections but behaves like a classification model instead of outputting detection boxes

deepstream_app_source1_detection_yolov4.txt: makes detections but behaves like a classification model instead of outputting detection boxes

I would appreciate any help in solving these issues. Let me know if you need any further information.

Thanks

nvidia-tlt.tar.gz (8.1 KB)

bcao · March 25, 2021, 2:06am

Hey customer,
you should get all the nvinfer config files and labels (ssd/dssd/retinanet/frcnn/yolov3/yolov4) from https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps/tree/release/tlt3.0/configs
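The assertion `(int) det[1] < out_class_size` reported for retinanet and ssd fails when a detection carries a class index that is not below the configured class count. One plausible cause, assumed here rather than confirmed in this thread, is a config whose `num-detected-classes` or label file does not match the model. A minimal sketch of a consistency check follows; the helper names are hypothetical, and it assumes one label per line and a `labelfile-path` that is relative to the config file's directory.

```python
import configparser
from pathlib import Path


def read_properties(config_path):
    """Parse the [property] section of an nvinfer config file."""
    parser = configparser.ConfigParser(
        delimiters=("=",), interpolation=None, strict=False
    )
    parser.optionxform = str  # keep key case as written in the file
    parser.read_string(Path(config_path).read_text())
    return dict(parser["property"])


def class_counts(config_path):
    """Return (num-detected-classes, number of non-empty label lines)."""
    config_path = Path(config_path)
    props = read_properties(config_path)
    configured = int(props["num-detected-classes"])
    # Assumption: a relative labelfile-path is resolved against the
    # directory that contains the config file.
    label_file = config_path.parent / props["labelfile-path"]
    labels = [ln for ln in label_file.read_text().splitlines() if ln.strip()]
    return configured, len(labels)
```

If the two numbers returned by `class_counts` differ, the config and label file are probably not a matching pair, and taking both from the same repo directory is the simplest fix.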

For yolov3/yolov4, we cannot support them on TRT 7.2; see GitHub - NVIDIA-AI-IOT/deepstream_tao_apps at release/tlt3.0

For peoplenet, will check locally.

bcao · March 25, 2021, 2:26am

For peopleNet, you are using peopleSegNet_resnet50.etlt in your config_infer_primary_peoplenet.txt, which is not correct. You should use resnet34_peoplenet_pruned.etlt instead; refer to /opt/nvidia/deepstream/deepstream-5.1/samples/configs/tlt_pretrained_models/README:

mkdir -p ../../models/tlt_pretrained_models/peoplenet && \
    wget https://api.ngc.nvidia.com/v2/models/nvidia/tlt_peoplenet/versions/pruned_v2.0/files/resnet34_peoplenet_pruned.etlt \
    -O ../../models/tlt_pretrained_models/peoplenet/resnet34_peoplenet_pruned.etlt

#tlt-encoded-model=../../models/tlt_pretrained_models/peoplenet/resnet34_peoplenet_pruned.etlt
tlt-encoded-model=../../models/tlt_pretrained_models/peopleSegNet/peopleSegNet_resnet50.etlt
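In the two config lines above, the peoplenet entry is commented out and the peopleSegNet entry is the active one. A quick way to see which model a config actually loads is sketched below. The function name `active_model` is hypothetical, and the sketch assumes that a relative `tlt-encoded-model` path is resolved against the config file's directory.

```python
import configparser
from pathlib import Path


def active_model(config_path):
    """Return (resolved model path, whether that file exists)."""
    config_path = Path(config_path)
    parser = configparser.ConfigParser(
        delimiters=("=",), interpolation=None, strict=False
    )
    parser.optionxform = str  # keep key case as written in the file
    # Lines starting with '#' are treated as comments, so a commented-out
    # tlt-encoded-model entry is ignored and only the active one is read.
    parser.read_string(config_path.read_text())
    model = parser["property"]["tlt-encoded-model"]
    resolved = (config_path.parent / model).resolve()
    return resolved, resolved.is_file()
```

For the config quoted above, the returned path would end in `peopleSegNet_resnet50.etlt`, which makes the mismatch with the peoplenet config name easy to spot.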

lee.hemingway · March 25, 2021, 4:34pm

Hello,

Thank you for your help. I now have most of them working using those config files; however, I am still having issues with peoplenet and unet:

deepstream_app_source1_peoplenet.txt: I have downloaded the model you specified but get the following errors:

0:00:09.186980123    19 0x5559cb4cae00 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1716> [UID = 1]: Trying to create engine from model files
ERROR: ../nvdsinfer/nvdsinfer_func_utils.cpp:33 [TRT]: UffParser: Unsupported number of graph 0
parseModel: Failed to parse UFF model
ERROR: tlt/tlt_decode.cpp:274 failed to build network since parsing model errors.
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:797 Failed to create network using custom network creation function
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:862 Failed to get cuda engine from custom library API
0:00:09.242426481    19 0x5559cb4cae00 ERROR                nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1736> [UID = 1]: build engine file failed
Segmentation fault (core dumped)

deepstream_app_source1_detection_unet.txt: runs but makes no detections. I ran the following to generate the engine file:

tlt-converter -e models/unet/unet_resnet18.etlt_b1_gpu0_fp16.engine -p input_1,1x3x608x960,1x3x608x960,1x3x608x960 -t fp16 -k tlt_encode -m 1 tlt_encode models/unet/unet_resnet18.etlt

I have attached an updated .tar.gz file with the new configuration files. I would appreciate your help on these remaining issues.

On the subject of yolov3/yolov4, is there a fix in the works for this? When will it be ready?

Thanks again,
Lee

nvidia-tlt.tar.gz (9.0 KB)

bcao · March 26, 2021, 1:42am

Hey customer, good to know most models can work!

For the remaining issues:
1. peoplenet: let me check.
2. Unet: the current deepstream-app cannot support unet; you need to run it using the ds-tlt app. Refer to GitHub - NVIDIA-AI-IOT/deepstream_tao_apps: Sample apps to demonstrate how to deploy models trained with TAO on DeepStream

For segmentation model:
Usage: ds-tlt  config_file <file1> [file2] ... [fileN]

3. For yolov3/4, we will release new etlt models which can run well with TRT 7.2.

bcao · March 26, 2021, 1:56am

For peoplenet, your nvinfer config file is still not correct. Keep in mind that peoplenet is a different model from peopleSegNet. You should refer to /opt/nvidia/deepstream/deepstream-5.1/samples/configs/tlt_pretrained_models/config_infer_primary_peoplenet.txt:


[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
tlt-model-key=tlt_encode
tlt-encoded-model=../../models/tlt_pretrained_models/peoplenet/resnet34_peoplenet_pruned.etlt
labelfile-path=labels_peoplenet.txt
model-engine-file=../../models/tlt_pretrained_models/peoplenet/resnet34_peoplenet_pruned.etlt_b1_gpu0_fp16.engine
input-dims=3;544;960;0
uff-input-blob-name=input_1
batch-size=1
process-mode=1
model-color-format=0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=3
cluster-mode=1
interval=0
gie-unique-id=1
output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid

[class-attrs-all]
pre-cluster-threshold=0.4
## Set eps=0.7 and minBoxes for cluster-mode=1(DBSCAN)
eps=0.7
minBoxes=1
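In the config above, the `model-engine-file` name appears to be built from the model path plus the batch size, GPU id and precision. The sketch below checks that these keys agree with each other. The naming pattern `<model>_b<batch>_gpu<id>_<precision>.engine` is an assumption inferred from this one config, the precision mapping comes from its `## 0=FP32, 1=INT8, 2=FP16 mode` comment, and the function names are hypothetical.

```python
import configparser

# Mapping taken from the "0=FP32, 1=INT8, 2=FP16" comment in the config.
PRECISION = {"0": "fp32", "1": "int8", "2": "fp16"}


def load_properties(config_text):
    """Parse the [property] section from nvinfer config text."""
    parser = configparser.ConfigParser(
        delimiters=("=",), interpolation=None, strict=False
    )
    parser.optionxform = str  # keep key case as written in the file
    parser.read_string(config_text)
    return dict(parser["property"])


def expected_engine_name(props):
    """Build the engine file name implied by the other properties."""
    precision = PRECISION[props["network-mode"]]
    return (
        f"{props['tlt-encoded-model']}"
        f"_b{props['batch-size']}_gpu{props['gpu-id']}_{precision}.engine"
    )
```

Comparing `expected_engine_name(props)` with `props["model-engine-file"]` flags configs where, for example, `network-mode` was changed but the engine file name still points at an engine built for a different precision.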

lee.hemingway · April 7, 2021, 9:30am

Thank you for your help, that solved my issues.
