How to integrate a TAO custom-trained model's tao export and tao deploy files with DeepStream

System spec:
GPU: RTX 3060 12GB

  • I have installed DeepStream on a dGPU, tested the samples, and they worked for me.

  • I have trained a model with TAO on custom data, exported the .etlt model, and used tao deploy, which generated the multiple files mentioned in the title and below.
    Now I am looking at the DeepStream reference apps repo; I followed all the commands mentioned there and ran a TAO pretrained model successfully. So now I am looking at the deepstream_app_source1_detection_models.txt file,
    which has some references to the model weights and other related files:

  model-engine-file: ../../models/tao_pretrained_models/yolov4/yolov4_resnet18_395.etlt_b1_gpu0_int8.engine
  int8-calib-file: ../../models/tao_pretrained_models/yolov4/cal.bin.trt8517
  tlt-encoded-model: ../../models/tao_pretrained_models/yolov4/yolov4_resnet18_395.etlt

But I have the following files, which confuse me because their file extensions differ from the extensions mentioned in the configuration file above (a rough mapping to nvinfer keys follows the list):

  1. cal.bin
  2. cal.tensorfile
  3. labels.txt
  4. nvinfer_config.txt
  5. trt.engine
  6. trt.engine.fp16
  7. trt.engine.int8
  8. yolov4_resnet18_epoch_010.etlt
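
For reference, a rough mapping of these files onto nvinfer configuration keys (an assumption based on the file names and common TAO/DeepStream conventions, not something taken from the sample config):

  labels.txt                     -> labelfile-path
  cal.bin                        -> int8-calib-file (INT8 calibration cache)
  cal.tensorfile                 -> calibration data consumed only by tao-deploy, not read by DeepStream
  nvinfer_config.txt             -> [property] snippet generated by tao export --gen_ds_config
  yolov4_resnet18_epoch_010.etlt -> tlt-encoded-model
  trt.engine / .fp16 / .int8     -> model-engine-file (pick the one matching network-mode)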

The tao export command I used (the --gen_ds_config flag is what generated nvinfer_config.txt and labels.txt above):

!tao yolo_v4 export -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/yolov4_resnet18_epoch_010.tlt \
                    -k $KEY \
                    -o $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_010.etlt \
                    -e $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt \
                    --target_opset 12 \
                    --gen_ds_config

And the tao deploy commands:

!tao-deploy yolo_v4 gen_trt_engine -m $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_010.etlt \
                                   -k $KEY \
                                   -e $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt \
                                   --batch_size 16 \
                                   --min_batch_size 1 \
                                   --opt_batch_size 8 \
                                   --max_batch_size 16 \
                                   --data_type fp32 \
                                   --engine_file $USER_EXPERIMENT_DIR/export/trt.engine
!tao-deploy yolo_v4 gen_trt_engine -m $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_010.etlt \
                                   -k $KEY \
                                   -e $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt \
                                   --batch_size 16 \
                                   --min_batch_size 1 \
                                   --opt_batch_size 8 \
                                   --max_batch_size 16 \
                                   --data_type fp16 \
                                   --engine_file $USER_EXPERIMENT_DIR/export/trt.engine.fp16
!tao-deploy yolo_v4 gen_trt_engine -m $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_010.etlt \
                                   -k $KEY \
                                   -e $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt \
                                   --cal_image_dir $DATA_DOWNLOAD_DIR/testing/image_2 \
                                   --data_type int8 \
                                   --batch_size 16 \
                                   --min_batch_size 1 \
                                   --opt_batch_size 8 \
                                   --max_batch_size 16 \
                                   --batches 10 \
                                   --cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin  \
                                   --cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile \
                                   --engine_file $USER_EXPERIMENT_DIR/export/trt.engine.int8
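
A side note on these three engines (a sketch, not taken from the sample configs): whichever engine is later referenced in nvinfer should be kept consistent with network-mode (0=FP32, 1=INT8, 2=FP16), so that a fallback rebuild, if deserialization ever fails, ends up at the same precision. For example, with the engines built above (paths are placeholders):

  # FP16 engine from the second gen_trt_engine run
  model-engine-file=<export dir>/trt.engine.fp16
  network-mode=2
  # the INT8 engine would instead need network-mode=1 plus the calibration cache
  #model-engine-file=<export dir>/trt.engine.int8
  #network-mode=1
  #int8-calib-file=<export dir>/cal.bin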

Kindly help me out; next I will use these models on a Jetson AGX Xavier kit.
Also, please suggest how to make a config file for DeepStream, or do we just modify a built-in configuration file of DeepStream?

Please refer to the TAO pretrained models DeepStream samples: NVIDIA-AI-IOT/deepstream_tao_apps — Sample apps to demonstrate how to deploy models trained with TAO on DeepStream

For deepstream-app, please refer to DeepStream Reference Application - deepstream-app — DeepStream 6.1.1 Release documentation

For nvinfer settings, please refer to Gst-nvinfer — DeepStream 6.1.1 Release documentation

I looked into the TAO pretrained models DeepStream samples GitHub repo,

where they use tao-converter and save the model engine file in this .etlt_b1_gpu0_fp16.engine format. But I am using TAO 4.0, which uses tao-deploy, and it saves three output files in the following format:

  1. trt.engine (FP32)
  2. trt.engine.fp16
  3. trt.engine.int8

You can check the tao-deploy commands in the posted question.
So what I want to know is: which file should I use instead of the .etlt_b1_gpu0_fp16.engine mentioned in the samples' configuration?
What do you suggest?
Thanks for the fast reply.

You don’t need the tao-converter or tao-deploy command to generate engine files. The DeepStream app can generate the engine file itself from onnx, uff, TAO etlt, …
If you insist on using the engine file generated by tao-deploy, just use it by configuring “model-engine-file” in the nvinfer configuration file. E.g. deepstream_tao_apps/pgie_retina_tao_config.txt at master · NVIDIA-AI-IOT/deepstream_tao_apps
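
A minimal sketch of the relevant [property] keys, using the file names from the question (the paths and key are placeholders; nvinfer tries model-engine-file first and, if deserialization fails, falls back to building from tlt-encoded-model):

  [property]
  ## Option A: let DeepStream build the engine from the etlt itself
  tlt-encoded-model=../../models/tao_pretrained_models/yolov4/yolov4_resnet18_epoch_010.etlt
  tlt-model-key=<your encode key>
  ## Option B: reuse the engine already built by tao-deploy
  model-engine-file=../../models/tao_pretrained_models/yolov4/trt.engine
  ## 0=FP32, 1=INT8, 2=FP16 - keep consistent with the engine chosen above
  network-mode=0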

Thank you, @Fiona.Chen.

Hi, I got this error when I specified the path in config_infer_primary_yolov4.txt and then ran the deepstream_app_source1_detection_models.txt file; it gave me the following error:

/opt/nvidia/deepstream/deepstream-6.1/samples/configs/tao_pretrained_models$ sudo deepstream-app -c deepstrean_app_source1_custom_yolov4.txt 
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
gstnvtracker: Batch processing is ON
gstnvtracker: Past frame output is ON
[NvMultiObjectTracker] Initialized
ERROR: [TRT]: 1: [stdArchiveReader.cpp::StdArchiveReader::40] Error Code 1: Serialization (Serialization assertion stdVersionRead == serializationVersion failed.Version tag does not match. Note: Current Version: 213, Serialized Engine Version: 232)
ERROR: [TRT]: 4: [runtime.cpp::deserializeCudaEngine::50] Error Code 4: Internal Error (Engine deserialization failed.)
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:1528 Deserialize engine failed from file: /opt/nvidia/deepstream/deepstream-6.1/samples/models/tao_pretrained_models/yolov4/n/trt.engine
0:00:01.046042522 2825753 0x5573b6294e60 WARN                 nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1897> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-6.1/samples/models/tao_pretrained_models/yolov4/n/trt.engine failed
0:00:01.144150066 2825753 0x5573b6294e60 WARN                 nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-6.1/samples/models/tao_pretrained_models/yolov4/n/trt.engine failed, try rebuild
0:00:01.144166156 2825753 0x5573b6294e60 INFO                 nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
parseModel: Failed to parse ONNX model
ERROR: tlt/tlt_decode.cpp:389 Failed to build network, error in model parsing.
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:723 Failed to create network using custom network creation function
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:789 Failed to get cuda engine from custom library API
0:00:01.961677566 2825753 0x5573b6294e60 ERROR                nvinfer gstnvinfer.cpp:640:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1943> [UID = 1]: build engine file failed
ERROR: [TRT]: 2: [logging.cpp::decRefCount::61] Error Code 2: Internal Error (Assertion mRefCount > 0 failed. )
corrupted size vs. prev_size while consolidating
Aborted
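
The first [TRT] error above indicates the engine was serialized with a different TensorRT version than the one DeepStream 6.1 links against (tao-deploy runs in its own container with its own TensorRT build). A quick way to check what is installed on the host, assuming a Debian-based setup:

  # TensorRT packages visible to DeepStream on the host
  dpkg -l | grep -i tensorrt
  # or query the Python binding, if it is installed
  python3 -c "import tensorrt as trt; print(trt.__version__)"

If the versions differ, either rebuild the engine with a matching TensorRT or let DeepStream rebuild from the .etlt, which is what the log shows it attempting next.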

config_infer_primary_yolov4.txt (I named it nvinfer_config.txt):

[property]
gpu-id=0
net-scale-factor=1.0
offsets=103.939;116.779;123.68
model-color-format=1
labelfile-path= ../../models/tao_pretrained_models/yolov4/n/labels.txt
model-engine-file=../../models/tao_pretrained_models/yolov4/n/trt.engine
int8-calib-file=../../models/tao_pretrained_models/yolov4/n/cal.bin
tlt-encoded-model=../../models/tao_pretrained_models/yolov4/n/yolov4_resnet18_epoch_010.etlt
tlt-model-key=key=NGpmbHN0ZTNrZHFkOGRxNnFsbW9rbXNxbnU6Yzc5NWM5MjQtZDE1YS00NTYxLTg3YzgtNTU2MWVhNDg1M2M3
infer-dims=3;544;960
maintain-aspect-ratio=1
uff-input-order=0
uff-input-blob-name=Input
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=3
interval=0
gie-unique-id=1
is-classifier=0
#network-type=0
cluster-mode=3
output-blob-names=BatchedNMS
parse-bbox-func-name=NvDsInferParseCustomBatchedNMSTLT
custom-lib-path=/opt/nvidia/deepstream/deepstream/lib/libnvds_infercustomparser.so
layer-device-precision=cls/mul:fp32:gpu;box/mul_6:fp32:gpu;box/add:fp32:gpu;box/mul_4:fp32:gpu;box/add_1:fp32:gpu;cls/Reshape_reshape:fp32:gpu;box/Reshape_reshape:fp32:gpu;encoded_detections:fp32:gpu;bg_leaky_conv1024_lrelu:fp32:gpu;sm_bbox_processor/concat_concat:fp32:gpu;sm_bbox_processor/sub:fp32:gpu;sm_bbox_processor/Exp:fp32:gpu;yolo_conv1_4_lrelu:fp32:gpu;yolo_conv1_3_1_lrelu:fp32:gpu;md_leaky_conv512_lrelu:fp32:gpu;sm_bbox_processor/Reshape_reshape:fp32:gpu;conv_sm_object:fp32:gpu;yolo_conv5_1_lrelu:fp32:gpu;concatenate_6:fp32:gpu;yolo_conv3_1_lrelu:fp32:gpu;concatenate_5:fp32:gpu;yolo_neck_1_lrelu:fp32:gpu

[class-attrs-all]
pre-cluster-threshold=0.3
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

[class-attrs-1]
nms-iou-threshold=0.9

deepstream_app_source1_detection_models.txt:

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=1

[tiled-display]
enable=1
rows=1
columns=1
width=1280
height=720
gpu-id=0

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
num-sources=1
uri=file://../../streams/sample_1080p_h265.mp4
gpu-id=0

[streammux]
gpu-id=0
batch-size=1
batched-push-timeout=40000
## Set muxer output width and height
width=1920
height=1080

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=1
source-id=0
gpu-id=0

[osd]
enable=1
gpu-id=0
border-width=3
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Arial

[primary-gie]
enable=1
gpu-id=0
# Modify as necessary
batch-size=1
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
gie-unique-id=1
# Replace the infer primary config file when you need to
# use other detection models
#config-file=config_infer_primary_frcnn.txt
#config-file=config_infer_primary_ssd.txt
config-file = nvinfer_config.txt
#config-file=config_infer_primary_dssd.txt
#config-file=config_infer_primary_retinanet.txt
#config-file=config_infer_primary_yolov3.txt
#config-file=config_infer_primary_yolov4.txt
#config-file=config_infer_primary_detectnet_v2.txt
#config-file=config_infer_primary_yolov4-tiny.txt
#config-file=config_infer_primary_efficientdet.txt

[sink1]
enable=0
type=3
#1=mp4 2=mkv
container=1
#1=h264 2=h265 3=mpeg4
codec=1
#encoder type 0=Hardware 1=Software
enc-type=0
sync=0
bitrate=2000000
#H264 Profile - 0=Baseline 2=Main 4=High
#H265 Profile - 0=Main 1=Main10
profile=0
output-file=out.mp4
source-id=0

[sink2]
enable=0
#Type - 1=FakeSink 2=EglSink 3=File 4=RTSPStreaming 5=Overlay
type=4
#1=h264 2=h265
codec=1
#encoder type 0=Hardware 1=Software
enc-type=0
sync=0
bitrate=4000000
#H264 Profile - 0=Baseline 2=Main 4=High
#H265 Profile - 0=Main 1=Main10
profile=0
# set below properties in case of RTSPStreaming
rtsp-port=8554
udp-port=5400

[tracker]
enable=1
# For NvDCF and DeepSORT tracker, tracker-width and tracker-height must be a multiple of 32, respectively
tracker-width=640
tracker-height=384
ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
# ll-config-file required to set different tracker types
# ll-config-file=../deepstream-app/config_tracker_IOU.yml
ll-config-file=../deepstream-app/config_tracker_NvDCF_perf.yml
# ll-config-file=../deepstream-app/config_tracker_NvDCF_accuracy.yml
# ll-config-file=../deepstream-app/config_tracker_DeepSORT.yml
gpu-id=0
enable-batch-process=1
enable-past-frame=1
display-tracking-id=1

[tests]
file-loop=0

These are my files:
labels.txt (23 Bytes)
cal.bin (8.1 KB)
nvinfer_config (1).txt (1.7 KB)
deepstrean_app_source1_custom_yolov4.txt (2.6 KB)

trt.engine file link: trt.engine - Google Drive

Was the engine generated on the same machine with the same GPU? Have you put the engine file in the path you set, “/opt/nvidia/deepstream/deepstream-6.1/samples/models/tao_pretrained_models/yolov4/n/trt.engine”? Please fill in the model parameters correctly. It seems your “tlt-model-key” is not correct either.

Thank you so much for your help. Indeed, the key parameter was not set up correctly.
It should be like this:

tlt-model-key=NGpmbHN0ZTNrZHFkOGRxNnFsbW9rbXNxbnU6Yzc5NWM5MjQtZDE1YS00NTYxLTg3YzgtNTU2MWVhNDg1M2M3

I also corrected the inference dimensions, so it all worked out for me. Thank you!
