TAO YOLOv4 transfer learning is not working with DeepStream 5.0

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)
Jetson Xavier NX
• DeepStream Version
5.0
• JetPack Version (valid for Jetson only)
4.5
• TensorRT Version
7.1.3
• Issue Type( questions, new requirements, bugs)
question
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
No Idea
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)
I did transfer learning for YOLOv4 using the TAO Toolkit and obtained two files: trt.engine and yolov4_resnet18_epoch_080.etlt. I installed the TensorRT OSS plugins and cloned and built the deepstream_tlt_apps git repository with its custom plugins and git-lfs. The following is my main config file:

[primary-gie]
enable=1
#gpu-id=0
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;1;1;1
bbox-border-color3=0;1;0;1
nvbuf-memory-type=0
interval=0
gie-unique-id=1
#model-engine-file=../../../../../samples/models/Primary_Detector/resnet10.caffemodel_b4_gpu0_int8.engine
#labelfile-path=../../../../../samples/models/Primary_Detector/labels.txt
config-file=/opt/nvidia/deepstream/deepstream-5.0/sources/apps/sample_apps/deepstream-test5-5.1/DLModel/yolo_v4_config_file.txt
#infer-raw-output-dir=../../../../../samples/primary_detector_raw_output/

and the following is the [property] configuration it points to:

[property]
gpu-id=0
# preprocessing parameters.
net-scale-factor=1
model-color-format=1

# model paths.
#int8-calib-file=<Path to optional INT8 calibration cache>
labelfile-path=./labels.txt
tlt-encoded-model=./resnet10_detector_250Ep.etlt
model-engine-file=./trt.engine
tlt-model-key=nvidia_tao
input-dims=3;384;1248;0 # where c = number of channels, h = height of the model input, w = width of model input, 0: implies CHW format.
infer-dimes=3,544,900
uff-input-blob-name=Input
uff-input-order=0
batch_size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
num-detected-classes=6
interval=0
gie-unique-id=1
is-classifier=0
output-blob-names=BatchedNMS
parse-bbox-func-name=NvDsInferParseCustomBatchedNMSTLT
custom-lib-path=/home/rudi/deepstream_tlt_apps/post_processor/libnvds_infercustomparser_tlt.so
cluster-mode=3
#enable_dbscan=0

[class-attrs-all]
threshold=0.3
group-threshold=1
#minBoxes=3
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

Running the app, I receive the following errors:

sudo /opt/nvidia/deepstream/deepstream-5.0/sources/apps/sample_apps/deepstream-test5-5.1/deepstream-test5-app -c  /opt/nvidia/deepstream/deepstream-5.0/sources/apps/sample_apps/deepstream-test5-5.1/configs/test5_config_file_src_infer_aws_til_yolov4.txt -t --tiledtext
Data read from memory: 0 
Data read from memory: 0 
Data read from memory: 0 

 *** DeepStream: Launched RTSP Streaming at rtsp://localhost:8554/ds-test ***

Warning: 'input-dims' parameter has been deprecated. Use 'infer-dims' instead.
Unknown or legacy key specified 'infer-dimes' for group [property]
Unknown or legacy key specified 'batch_size' for group [property]
Warn: 'threshold' parameter has been deprecated. Use 'pre-cluster-threshold' instead.
Opening in BLOCKING MODE
Opening in BLOCKING MODE 
Opening in BLOCKING MODE
Opening in BLOCKING MODE 

Using winsys: x11 
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_mot_klt.so
gstnvtracker: Optional NvMOT_RemoveStreams not implemented
gstnvtracker: Batch processing is OFF
gstnvtracker: Past frame output is OFF
ERROR: [TRT]: coreReadArchive.cpp (38) - Serialization Error in verifyHeader: 0 (Version tag does not match)
ERROR: [TRT]: INVALID_STATE: std::exception
ERROR: [TRT]: INVALID_CONFIG: Deserialize the cuda engine failed.
ERROR: Deserialize engine failed from file: /opt/nvidia/deepstream/deepstream-5.0/sources/apps/sample_apps/deepstream-test5-5.1/DLModel/trt.engine
0:00:01.668797625 12142   0x559dac2c60 WARN                 nvinfer gstnvinfer.cpp:616:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1690> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-5.0/sources/apps/sample_apps/deepstream-test5-5.1/DLModel/trt.engine failed
0:00:01.668897785 12142   0x559dac2c60 WARN                 nvinfer gstnvinfer.cpp:616:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1797> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-5.0/sources/apps/sample_apps/deepstream-test5-5.1/DLModel/trt.engine failed, try rebuild
0:00:01.668938170 12142   0x559dac2c60 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1715> [UID = 1]: Trying to create engine from model files
ERROR: [TRT]: UffParser: Could not read buffer.
parseModel: Failed to parse UFF model
ERROR: failed to build network since parsing model errors.
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:00:01.786340602 12142   0x559dac2c60 ERROR                nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1735> [UID = 1]: build engine file failed
Segmentation fault

Hi,

ERROR: [TRT]: coreReadArchive.cpp (38) - Serialization Error in verifyHeader: 0 (Version tag does not match)

The above error indicates that the TensorRT version used to serialize the engine differs from the one used to deserialize it.

Please note that TensorRT engine files are not portable.
You will need to recreate the engine whenever the platform or software version changes.
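As a sketch of one way to do this with the files named above: instead of copying trt.engine from the training machine, point nvinfer only at the .etlt file and leave model-engine-file commented out, so DeepStream rebuilds the engine with the TensorRT version installed on the Jetson itself (on the first run nvinfer serializes a new engine next to the model; later runs can reference that file):

```
[property]
tlt-encoded-model=./yolov4_resnet18_epoch_080.etlt
tlt-model-key=nvidia_tao
# Comment this out (or delete the stale engine file) so nvinfer
# rebuilds the engine on this device instead of deserializing one
# built with a different TensorRT version:
#model-engine-file=./trt.engine
```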

Thanks.

Hi,
I am just exporting the model using the TAO Toolkit. I changed my encryption key, then trained, pruned, and retrained the model, and exported it as described in the TLT instructions.
I am now getting the following error:

0:00:00.391122013 13964   0x557acadc60 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1715> [UID = 1]: Trying to create engine from model files
ERROR: [TRT]: Parameter check failed at: ../builder/Network.cpp::addInput::992, condition: isValidDims(dims, hasImplicitBatchDimension())
ERROR: [TRT]: UFFParser: Failed to parseInput for node input_image
ERROR: [TRT]: UffParser: Parser error: input_image: Failed to parse node - Invalid Tensor found at node input_image
parseModel: Failed to parse UFF model
ERROR: failed to build network since parsing model errors.
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:00:02.335178327 13964   0x557acadc60 ERROR                nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1735> [UID = 1]: build engine file failed
Segmentation fault

here is the config file:

[property]
gpu-id=0
# preprocessing parameters.
net-scale-factor=1.0
model-color-format=1

# model paths.
#int8-calib-file=<Path to optional INT8 calibration cache>
labelfile-path=./labeles_fastrcnn.txt
tlt-encoded-model=./frcnn_kitti_resnet18_retrain.etlt
#model-engine-file=./trt.engine
#tlt-model-key=NTI3ZTQ1azE0Yjc0bWFmcW81cHRtaXA1OXE6YzFjNmQzYzgtZDM4Mi00YWIxLWJjOGUtNmJhYjQxYjExZTBl
tlt-model-key=NXVodTI0MXNnZGtzdXBic2o0cTIwbmp0bnA6N2IwZDEyMGYtMGZiOS00MDNlLTllOGMtOGMzOTJiYmRlMzk0
input-dims=3;384;1248;0 # where c = number of channels, h = height of the model input, w = width of model input, 0: implies CHW format.
infer-dimes=3,544,900
uff-input-blob-name=Input
uff-input-order=0
batch_size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
num-detected-classes=3
interval=0
gie-unique-id=1
is-classifier=0
output-blob-names=BatchedNMS
parse-bbox-func-name=NvDsInferParseCustomBatchedNMSTLT
custom-lib-path=/home/rudi/deepstream_tlt_apps/post_processor/libnvds_infercustomparser_tlt.so
cluster-mode=3
#enable_dbscan=0

[class-attrs-all]
threshold=0.3
group-threshold=1
#minBoxes=3
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

Now the question is: do the training image sizes matter for the TLT models and, consequently, for DeepStream? I accepted the default values for the DeepStream config file and the TAO (TLT) training. I do the training on AWS EC2 instances (Ubuntu) and move the .etlt files to a Jetson Xavier AGX. Also, I have just 3 classes (say cat, person, dog).
I also changed the model to Faster R-CNN.

Regards

I got the answer. The
output-blob-names=BatchedNMS
parse-bbox-func-name=NvDsInferParseCustomBatchedNMSTLT
settings must be defined specifically for the Faster R-CNN application.
These three models have the same output layer, named NMS, whose implementation can be found in the TRT OSS nmsPlugin:

  • an output of shape [batchSize, 1, keepTopK, 7], which contains the NMSed box class IDs (1 value), NMSed box scores (1 value), and NMSed box locations (4 values)
  • another output of shape [batchSize, 1, 1, 1], which contains the NMSed box count.
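To make those two output buffers concrete, here is a small, hypothetical NumPy sketch of how a parser walks them; it assumes the 7 values per detection are [image_id, label, confidence, xmin, ymin, xmax, ymax], the layout used by the TensorRT OSS nmsPlugin (the function name and dummy data are illustrative, not part of the DeepStream API):

```python
import numpy as np

def parse_nms_output(nms, keep_count, conf_threshold=0.3):
    """Parse the two NMS plugin outputs into per-image detection lists.

    nms:        shape [batchSize, 1, keepTopK, 7]; each row is assumed to be
                [image_id, label, confidence, xmin, ymin, xmax, ymax]
    keep_count: shape [batchSize, 1, 1, 1]; valid detection count per image
    """
    results = []
    for b in range(nms.shape[0]):
        # Only the first keep_count[b] rows of the keepTopK slots are valid.
        n_valid = int(keep_count[b].reshape(-1)[0])
        dets = []
        for row in nms[b, 0, :n_valid]:
            _image_id, label, conf, xmin, ymin, xmax, ymax = row
            if conf >= conf_threshold:
                dets.append({"class_id": int(label),
                             "confidence": float(conf),
                             "bbox": (float(xmin), float(ymin),
                                      float(xmax), float(ymax))})
        results.append(dets)
    return results

# One image, keepTopK=2: a confident detection and a low-score one
# that the 0.3 threshold (as in [class-attrs-all]) filters out.
nms = np.zeros((1, 1, 2, 7), dtype=np.float32)
nms[0, 0, 0] = [0, 0, 0.9, 0.1, 0.1, 0.5, 0.5]
nms[0, 0, 1] = [0, 1, 0.1, 0.2, 0.2, 0.4, 0.4]
keep_count = np.array([[[[2]]]], dtype=np.int32)
print(parse_nms_output(nms, keep_count))
```

This mirrors what NvDsInferParseCustomNMSTLT does in C++ inside the custom parser library: read the count tensor, then walk that many 7-value rows and apply the per-class threshold.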

So in the config file :
output-blob-names=NMS
parse-bbox-func-name=NvDsInferParseCustomNMSTLT
custom-lib-path=../../post_processor/libnvds_infercustomparser_tao.so
Also, the label file should have one background item after all the classes; if you have just one class (dog), labels.txt should be:
#---------
dog
background
#---------