How to upgrade the TensorRT version for DeepStream

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) NVIDIA RTX 3070
• DeepStream Version 6.1
• JetPack Version (valid for Jetson only)
• TensorRT Version 8.5 (TAO), 8.2 (DeepStream)
• NVIDIA GPU Driver Version (valid for GPU only) 510
• Issue Type( questions, new requirements, bugs) questions
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

Hi, I’m having trouble with the TensorRT version.
I trained YOLOv4 in TAO and am now trying to use the exported trt.engine file in DeepStream.

But these errors occurred.

ERROR: [TRT]: 1: [stdArchiveReader.cpp::StdArchiveReader::40] Error Code 1: Serialization (Serialization assertion stdVersionRead == serializationVersion failed.Version tag does not match. Note: Current Version: 205, Serialized Engine Version: 232)
ERROR: [TRT]: 4: [runtime.cpp::deserializeCudaEngine::50] Error Code 4: Internal Error (Engine deserialization failed.)

The TensorRT version used in TAO is 8.5, and the TensorRT version used in DeepStream is 8.2.

So I thought I would upgrade the TensorRT version and installed it via pip install, but it did not take effect.

Please tell me how to upgrade the tensorrt version.

The TensorRT version for a specific DeepStream release is fixed, so upgrading or downgrading TensorRT will cause unexpected behavior in DeepStream.

DeepStream 6.2 ships with TensorRT 8.5, so you can consider upgrading to DeepStream 6.2.
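
To double-check which versions are actually installed before and after the upgrade, something like the following should work on a default x86 Ubuntu install (the paths are the default install locations and may differ on your system):

  cat /opt/nvidia/deepstream/deepstream/version
  dpkg -l | grep -E "libnvinfer|tensorrt"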

Besides, you can use just the .etlt model from TAO (set “tlt-encoded-model” in the PGIE config); DeepStream will then create the engine for the model itself. A minimal sketch is shown below.
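
For reference, the relevant PGIE config lines for that approach would look roughly like this (the .etlt file name and key are only placeholders for a TAO YOLOv4 export and must match your actual files):

  tlt-encoded-model=../../models/yolov4/yolov4_resnet18_epoch_080.etlt
  tlt-model-key=nvidia_tlt
  # DeepStream builds the engine from the .etlt on the first run
  # if it cannot deserialize an existing engine file.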

Thank you for your help.

I upgraded to DeepStream 6.2 as suggested above, but there is still a problem.

In DeepStream 6.2 the TRT version is 8.5.2.2, and in TAO 4.0.1 the TRT version is 8.5.1.7, so engines exported by TAO cannot be loaded in DeepStream.

I am attaching the error output below, produced when running in DeepStream.

ERROR: [TRT]: 6: The engine plan file is not compatible with this version of TensorRT, expecting library version 8.5.2.2 got 8.5.1.7, please rebuild.
ERROR: [TRT]: 4: [runtime.cpp::deserializeCudaEngine::66] Error Code 4: Internal Error (Engine deserialization failed.)

Could you share the whole log? If loading the engine fails, the app will generate a new engine.

root@soohyeon-HP-Z2-Tower-G5-Workstation:/opt/nvidia/deepstream/deepstream-6.2/sources/project/deepstream_tao_apps# ./apps/tao_detection/ds-tao-detection -c configs/yolov4_tao/pgie_yolov4_tao_config.txt -i file:////opt/nvidia/deepstream/deepstream-6.2/samples/streams/sample_720p.mp4
Request sink_0 pad from streammux
batchSize 1...
WARNING: Overriding infer-config batch-size (4) with number of sources (1)
Now playing: configs/yolov4_tao/pgie_yolov4_tao_config.txt
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
ERROR: [TRT]: 6: The engine plan file is not compatible with this version of TensorRT, expecting library version 8.5.2.2 got 8.5.1.7, please rebuild.
ERROR: [TRT]: 4: [runtime.cpp::deserializeCudaEngine::66] Error Code 4: Internal Error (Engine deserialization failed.)
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:1533 Deserialize engine failed from file: /opt/nvidia/deepstream/deepstream-6.2/sources/project/deepstream_tao_apps/models/yolov4/custom_trt.engine
0:00:03.609540756   141 0x5631127a2000 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1897> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-6.2/sources/project/deepstream_tao_apps/models/yolov4/custom_trt.engine failed
0:00:03.669081614   141 0x5631127a2000 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-6.2/sources/project/deepstream_tao_apps/models/yolov4/custom_trt.engine failed, try rebuild
0:00:03.669096905   141 0x5631127a2000 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:865 failed to build network since there is no model file matched.
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:804 failed to build network.
0:00:05.259398068   141 0x5631127a2000 ERROR                nvinfer gstnvinfer.cpp:674:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1943> [UID = 1]: build engine file failed
0:00:05.320201647   141 0x5631127a2000 ERROR                nvinfer gstnvinfer.cpp:674:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2029> [UID = 1]: build backend context failed
0:00:05.320219611   141 0x5631127a2000 ERROR                nvinfer gstnvinfer.cpp:674:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1266> [UID = 1]: generate backend failed, check config file settings
0:00:05.320652484   141 0x5631127a2000 WARN                 nvinfer gstnvinfer.cpp:888:gst_nvinfer_start:<primary-nvinference-engine> error: Failed to create NvDsInferContext instance
0:00:05.320755341   141 0x5631127a2000 WARN                 nvinfer gstnvinfer.cpp:888:gst_nvinfer_start:<primary-nvinference-engine> error: Config file path: configs/yolov4_tao/pgie_yolov4_tao_config.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
Running...

**PERF:  FPS 0 (Avg)	
Wed Jun 28 00:21:55 2023
**PERF:  0.00(0.00)	
ERROR from element primary-nvinference-engine: Failed to create NvDsInferContext instance
Error details: gstnvinfer.cpp(888): gst_nvinfer_start (): /GstPipeline:ds-custom-pipeline/GstNvInfer:primary-nvinference-engine:
Config file path: configs/yolov4_tao/pgie_yolov4_tao_config.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
Returned, stopping playback
Deleting pipeline

This is the whole log.

The app worked when I cloned it from the GitHub repository above and ran it only with the provided trt.engine file. However, it does not run when I use my own trt.engine file, so I think it is a version issue with my trt.engine file.

Below is my config file. I want to skip the process of converting from .etlt to TRT and run directly from the trt.engine file exported from TAO.

[property]
gpu-id=0
net-scale-factor=1.0
offsets=103.939;116.779;123.68
model-color-format=1
labelfile-path=yolov4_labels.txt
model-engine-file=../../models/yolov4/custom_trt.engine
int8-calib-file=../../models/yolov4/cal.bin.trt8517
#tlt-encoded-model=../../models/yolov4/yolov4_resnet18_epoch_080.etlt
tlt-model-key=nvidia_tlt
infer-dims=3;384;1248
maintain-aspect-ratio=1
uff-input-order=0
uff-input-blob-name=Input
batch-size=4
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=3
interval=0
gie-unique-id=1
is-classifier=0
#network-type=0
cluster-mode=3
output-blob-names=BatchedNMS
parse-bbox-func-name=NvDsInferParseCustomBatchedNMSTLT
custom-lib-path=../../post_processor/libnvds_infercustomparser_tao.so
layer-device-precision=cls/mul:fp32:gpu;box/mul_6:fp32:gpu;box/add:fp32:gpu;box/mul_4:fp32:gpu;box/add_1:fp32:gpu;cls/Reshape_reshape:fp32:gpu;box/Reshape_reshape:fp32:gpu;encoded_detections:fp32:gpu;bg_leaky_conv1024_lrelu:fp32:gpu;sm_bbox_processor/concat_concat:fp32:gpu;sm_bbox_processor/sub:fp32:gpu;sm_bbox_processor/Exp:fp32:gpu;yolo_conv1_4_lrelu:fp32:gpu;yolo_conv1_3_1_lrelu:fp32:gpu;md_leaky_conv512_lrelu:fp32:gpu;sm_bbox_processor/Reshape_reshape:fp32:gpu;conv_sm_object:fp32:gpu;yolo_conv5_1_lrelu:fp32:gpu;concatenate_6:fp32:gpu;yolo_conv3_1_lrelu:fp32:gpu;concatenate_5:fp32:gpu;yolo_neck_1_lrelu:fp32:gpu

[class-attrs-all]
pre-cluster-threshold=0.3
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

And one more question.
Is there a way to connect an RTSP or USB camera as the input?
Could you answer this as well?

From the error log and configuration, the app cannot find the model file; please set tlt-encoded-model.

What I want is to run from the trt.engine file directly, not from the .etlt file.
Please tell me how to run with only the trt.engine file.

As the log shows, the engine was generated with TensorRT 8.5.1.7, but the current version is 8.5.2.2, so the engine needs to be regenerated.

When exporting from TAO, the TensorRT version installed inside it is 8.5.1.7, so the export cannot produce an 8.5.2.2 engine.
How do I convert the TRT engine to 8.5.2.2?

There are two methods:

  1. You can set tlt-encoded-model; the app will then generate a new engine for the new TRT version.
  2. You can use tao-converter to generate a new engine and set the correct engine path in the config. Here is a sample (an adapted sketch for your model follows below the list): ./tao-converter -k nvidia_tlt -t int8 -c models/yolov4-tiny/cal.bin.trt8517 -p Input,1x3x544x960,4x3x544x960,4x3x544x960
    -e models/yolov4-tiny/1/yolov4_cspdarknet_tiny_397.etlt_b4_gpu0_int8.engine
    models/yolov4-tiny/yolov4_cspdarknet_tiny_397.etlt
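
As a sketch only, the same command adapted to the file names and input dimensions from the config you posted (the paths, calibration file, and key are assumptions taken from that config and may need adjusting for your actual directory layout):

  ./tao-converter -k nvidia_tlt -t int8 -c models/yolov4/cal.bin.trt8517 \
    -p Input,1x3x384x1248,4x3x384x1248,4x3x384x1248 \
    -e models/yolov4/custom_trt.engine \
    models/yolov4/yolov4_resnet18_epoch_080.etlt

Run tao-converter against the TensorRT 8.5.2.2 libraries shipped with DeepStream 6.2 so that the regenerated engine matches the runtime version, then point model-engine-file at the new engine.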

Thank you very much. I will try it with tao-converter.
Also, when running it in DeepStream, can you tell me how to run it with a USB camera or an RTSP camera?

As the code shows, ds-tao-detection supports a v4l2 source; here is a sample: "uridecodebin uri=v4l2:///dev/video0 !".

How do I apply the above as a command?

Please try ./apps/tao_detection/ds-tao-detection -c configs/yolov4_tao/pgie_yolov4_tao_config.txt -i v4l2:///dev/video0. You can use v4l2-ctl --list-devices to check the video device number. I suggest using gst-launch-1.0 to debug the v4l2 source first.
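
For an RTSP camera, the app takes the stream URI the same way via the -i option (this is an assumption based on the app's uridecodebin-based source handling; the exact URL depends on your camera), for example:

  ./apps/tao_detection/ds-tao-detection -c configs/yolov4_tao/pgie_yolov4_tao_config.txt -i rtsp://<user>:<password>@<camera-ip>:554/<stream-path>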

An error appears when running the above command.
I think this is a GStreamer problem, as mentioned. How can I solve it?

root@soohyeon-HP-Z2-Tower-G5-Workstation:/opt/nvidia/deepstream/deepstream-6.2/sources/project/deepstream_tao_apps# ./apps/tao_detection/ds-tao-detection -c configs/yolov4_tao/pgie_yolov4_tao_config.txt -i v4l2:///dev/video0
Request sink_0 pad from streammux
batchSize 1...
WARNING: Overriding infer-config batch-size (4) with number of sources (1)
Now playing: configs/yolov4_tao/pgie_yolov4_tao_config.txt
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
0:00:02.749272980   308 0x55c687b1e400 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1909> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-6.2/sources/project/deepstream_tao_apps/models/yolov4/yolov4_resnet18_epoch_080.etlt_b1_gpu0_int8.engine
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:610 [Implicit Engine Info]: layers num: 5
0   INPUT  kFLOAT Input           3x384x1248      
1   OUTPUT kINT32 BatchedNMS      1               
2   OUTPUT kFLOAT BatchedNMS_1    200x4           
3   OUTPUT kFLOAT BatchedNMS_2    200             
4   OUTPUT kFLOAT BatchedNMS_3    200             

0:00:02.809789453   308 0x55c687b1e400 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2012> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-6.2/sources/project/deepstream_tao_apps/models/yolov4/yolov4_resnet18_epoch_080.etlt_b1_gpu0_int8.engine
0:00:02.813818287   308 0x55c687b1e400 INFO                 nvinfer gstnvinfer_impl.cpp:328:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:configs/yolov4_tao/pgie_yolov4_tao_config.txt sucessfully
Decodebin child added: source
Decodebin child added: decodebin0
Running...
ERROR from element source: Could not open device '/dev/video0' for reading and writing.
Error details: v4l2_calls.c(621): gst_v4l2_open (): /GstPipeline:ds-custom-pipeline/GstBin:source-bin-00/GstURIDecodeBin:uri-decode-bin/GstV4l2Src:source:
system error: Operation not permitted
Returned, stopping playback
Deleting pipeline

From the error, it is because the app could not open the device. Please use gst-launch-1.0 to debug the v4l2 source first. Here is a sample: gst-launch-1.0 uridecodebin uri=v4l2:///dev/videoX ! nvvideoconvert ! autovideosink, where X is the device number.
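
Since the error is "Operation not permitted", it is also worth ruling out a device-access problem. Assuming you are running inside the DeepStream Docker container (the root@ prompt suggests so, but this is only a guess), the camera node has to be passed in when the container is started, and its permissions can be checked on the host, for example:

  ls -l /dev/video0
  docker run --gpus all -it --device /dev/video0 <your-deepstream-image>

The docker run line is only a sketch of the idea; keep whatever other options you normally use when starting the container.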

Thank you for your help.
I confirmed that the camera works through the command gst-launch-1.0 v4l2src device=/dev/video0 ! autovideosink.

However, the following error occurred. I think it is a sink problem, but I don't know the solution.
Can you help with this?

root@soohyeon-HP-Z2-Tower-G5-Workstation:/opt/nvidia/deepstream/deepstream-6.2/sources/project/deepstream_tao_apps# ./apps/tao_detection/ds-tao-detection -c configs/yolov4_tao/pgie_yolov4_tao_config.txt -i v4l2:///dev/video0
Request sink_0 pad from streammux
batchSize 1...
Now playing: configs/yolov4_tao/pgie_yolov4_tao_config.txt
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
0:00:02.734664765   284 0x55d1ae0c2ac0 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1909> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-6.2/sources/project/deepstream_tao_apps/models/yolov4/yolov4_resnet18_395.etlt_b1_gpu0_int8.engine
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:610 [Implicit Engine Info]: layers num: 5
0   INPUT  kFLOAT Input           3x544x960       
1   OUTPUT kINT32 BatchedNMS      1               
2   OUTPUT kFLOAT BatchedNMS_1    200x4           
3   OUTPUT kFLOAT BatchedNMS_2    200             
4   OUTPUT kFLOAT BatchedNMS_3    200             

0:00:02.795680112   284 0x55d1ae0c2ac0 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2012> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-6.2/sources/project/deepstream_tao_apps/models/yolov4/yolov4_resnet18_395.etlt_b1_gpu0_int8.engine
0:00:02.799161376   284 0x55d1ae0c2ac0 INFO                 nvinfer gstnvinfer_impl.cpp:328:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:configs/yolov4_tao/pgie_yolov4_tao_config.txt sucessfully
Decodebin child added: source
Decodebin child added: decodebin0
Running...
In cb_newpad
Failed to link decoderbin src pad to converter sink pad
###Decodebin did not pick nvidia decoder plugin.

**PERF:  FPS 0 (Avg)	
Thu Jun 29 00:43:14 2023
**PERF:  0.00(0.00)	
ERROR from element source: Internal data stream error.
Error details: gstbasesrc.c(3072): gst_base_src_loop (): /GstPipeline:ds-custom-pipeline/GstBin:source-bin-00/GstURIDecodeBin:uri-decode-bin/GstV4l2Src:source:
streaming stopped, reason not-linked (-1)
Returned, stopping playback
Deleting pipeline
  1. I can’t reproduce your issue; here is my log:
    log.txt (2.5 KB)
  2. Did you try gst-launch-1.0 uridecodebin uri=v4l2:///dev/video0 ! nvvideoconvert ! autovideosink? Can you see the output video?

gst-launch-1.0 uridecodebin uri=v4l2:///dev/video0 ! nvvideoconvert ! autovideosink

root@soohyeon-HP-Z2-Tower-G5-Workstation:/opt/nvidia/deepstream/deepstream-6.2/sources/project/deepstream_tao_apps# gst-launch-1.0 uridecodebin uri=v4l2:///dev/video0 ! nvvideoconvert ! autovideosink
Setting pipeline to PAUSED ...
Pipeline is live and does not need PREROLL ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
WARNING: from element /GstPipeline:pipeline0/GstURIDecodeBin:uridecodebin0: Delayed linking failed.
Additional debug info:
./grammar.y(506): gst_parse_no_more_pads (): /GstPipeline:pipeline0/GstURIDecodeBin:uridecodebin0:
failed delayed linking some pad of GstURIDecodeBin named uridecodebin0 to some pad of Gstnvvideoconvert named nvvideoconvert0
ERROR: from element /GstPipeline:pipeline0/GstURIDecodeBin:uridecodebin0/GstV4l2Src:source: Internal data stream error.
Additional debug info:
gstbasesrc.c(3072): gst_base_src_loop (): /GstPipeline:pipeline0/GstURIDecodeBin:uridecodebin0/GstV4l2Src:source:
streaming stopped, reason not-linked (-1)
Execution ended after 0:00:02.280950470
Setting pipeline to NULL ...
Freeing pipeline ...

When I run that command, I get the error shown above.

The way I got the camera running was with the command gst-launch-1.0 v4l2src device=/dev/video0 ! autovideosink.

Please share the result of “v4l2-ctl -d /dev/video0 --list-formats-ext”.