Significant slowdown after DeepStream v6.2

Hi,

I’ve noticed a significant slowdown of my YOLOv7-based GStreamer/DeepStream pipeline in almost every version after DeepStream 6.2, affecting both the TensorRT engine build and the nvinfer inference itself.

I can observe similar behavior when running this DeepStream sample app repo in different DeepStream containers: with DeepStream 7.0 the same pipeline takes twice as long to run.

What is the reason for this, and can it somehow be fixed?
Thanks a lot for any advice!

Setup:
GPU: GeForce RTX 3060 12 GB
NVIDIA GPU Driver Version 555.42.02

Steps to reproduce (condensed into a shell sketch after the list):

  1. Pull and run the respective Docker images/containers
  2. Navigate to /opt/nvidia/deepstream/deepstream/sources/apps/
  3. Check out the commit with the respective DeepStream version tag from the sample app repo
  4. Set the correct NVDS_VERSION in the Makefile
  5. Run make
  6. Run ./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
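
For reference, the same steps condensed into shell commands (a sketch only; the image tag and the DS_x.y tag name are examples and must be matched to the DeepStream version under test):

docker run --rm -it --gpus all nvcr.io/nvidia/deepstream:7.0-triton-multiarch
# inside the container:
cd /opt/nvidia/deepstream/deepstream/sources/apps/
git clone https://github.com/NVIDIA-AI-IOT/redaction_with_deepstream.git
cd redaction_with_deepstream
git checkout DS_7.0        # tag name assumed from the repo's DS_x.y tag convention
# set NVDS_VERSION in the Makefile to match the container, then:
make
./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4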

Outputs:
DeepStream 6.1 (nvcr.io/nvidia/deepstream:6.1-triton)

./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1482 Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-6.1/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine open error
0:00:01.310177147   583 0x563fe97ae230 WARN                 nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1888> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-6.1/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed
0:00:01.336521804   583 0x563fe97ae230 WARN                 nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1993> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-6.1/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed, try rebuild
0:00:01.336543205   583 0x563fe97ae230 INFO                 nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1914> [UID = 1]: Trying to create engine from model files
Warning, setting batch size to 1. Update the dimension after parsing due to using explicit batch size.
0:00:30.862798311   583 0x563fe97ae230 INFO                 nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1946> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-6.1/sources/apps/redaction_with_deepstream-master/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine successfully
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:610 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480       
1   OUTPUT kFLOAT output_cov      4x17x30         
2   OUTPUT kFLOAT output_bbox     16x17x30        

0:00:30.901643307   583 0x563fe97ae230 INFO                 nvinfer gstnvinfer_impl.cpp:328:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
End of stream

The program took 4.94 seconds to redact 1443 frames, pref = 292.15 fps 

Returned, stopping playback
Deleting pipeline
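
(For context, the reported figure is simply frame count divided by wall-clock time: 1443 frames / 4.94 s ≈ 292.1 fps. "pref" appears to be a typo for "perf" in the sample app's output.)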

DeepStream 6.2 (nvcr.io/nvidia/deepstream:6.2-triton)

./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1487 Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-6.2/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine open error
0:00:02.629953427   241 0x55ef20e83c10 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1897> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-6.2/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed
0:00:02.734843529   241 0x55ef20e83c10 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-6.2/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed, try rebuild
0:00:02.734880039   241 0x55ef20e83c10 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
Warning, setting batch size to 1. Update the dimension after parsing due to using explicit batch size.
0:00:24.517833343   241 0x55ef20e83c10 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1955> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-6.2/sources/apps/redaction_with_deepstream-master/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine successfully
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:610 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480       
1   OUTPUT kFLOAT output_cov      4x17x30         
2   OUTPUT kFLOAT output_bbox     16x17x30        

0:00:24.656468132   241 0x55ef20e83c10 INFO                 nvinfer gstnvinfer_impl.cpp:328:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 4.25 seconds to redact 1443 frames, pref = 339.63 fps 

Returned, stopping playback
Deleting pipeline

DeepStream 6.3 (nvcr.io/nvidia/deepstream:6.3-triton-multiarch)

./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1487 Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-6.3/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine open error
0:00:02.624725396   880 0x55c783be08f0 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1976> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-6.3/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed
0:00:02.731642010   880 0x55c783be08f0 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2081> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-6.3/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed, try rebuild
0:00:02.731689119   880 0x55c783be08f0 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: Trying to create engine from model files
Warning, setting batch size to 1. Update the dimension after parsing due to using explicit batch size.
0:00:24.484216113   880 0x55c783be08f0 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2034> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-6.3/sources/apps/redaction_with_deepstream-master/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine successfully
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:610 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480       
1   OUTPUT kFLOAT output_cov      4x17x30         
2   OUTPUT kFLOAT output_bbox     16x17x30        

0:00:24.621066403   880 0x55c783be08f0 INFO                 nvinfer gstnvinfer_impl.cpp:328:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 6.10 seconds to redact 1443 frames, pref = 236.51 fps 

Returned, stopping playback
Deleting pipeline

DeepStream 6.4 (nvcr.io/nvidia/deepstream:6.4-triton-multiarch)

./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1487 Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-6.4/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine open error
0:00:06.642388790   487 0x56060aba2d20 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2080> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-6.4/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed
0:00:06.868524923   487 0x56060aba2d20 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2185> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-6.4/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed, try rebuild
0:00:06.868548497   487 0x56060aba2d20 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2106> [UID = 1]: Trying to create engine from model files
Warning, setting batch size to 1. Update the dimension after parsing due to using explicit batch size.
0:00:33.916139450   487 0x56060aba2d20 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2138> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-6.4/sources/apps/redaction_with_deepstream-master/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine successfully
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:610 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480       
1   OUTPUT kFLOAT output_cov      4x17x30         
2   OUTPUT kFLOAT output_bbox     16x17x30        

0:00:34.174575073   487 0x56060aba2d20 INFO                 nvinfer gstnvinfer_impl.cpp:328:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 6.07 seconds to redact 1443 frames, pref = 237.87 fps 

Returned, stopping playback
Deleting pipeline

DeepStream 7.0 (nvcr.io/nvidia/deepstream:7.0-triton-multiarch)

./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1494 Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine open error
0:00:06.644427918    99 0x555db535ea30 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2083> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed
0:00:06.867416394    99 0x555db535ea30 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2188> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed, try rebuild
0:00:06.867436473    99 0x555db535ea30 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2109> [UID = 1]: Trying to create engine from model files
Warning, setting batch size to 1. Update the dimension after parsing due to using explicit batch size.
0:00:33.652104778    99 0x555db535ea30 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2141> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream-master/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine successfully
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:612 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480       
1   OUTPUT kFLOAT output_cov      4x17x30         
2   OUTPUT kFLOAT output_bbox     16x17x30        

0:00:33.902237814    99 0x555db535ea30 INFO                 nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 8.53 seconds to redact 1443 frames, pref = 169.07 fps 

Returned, stopping playback
Deleting pipeline

Did you run the Docker containers on the same host with the same NVIDIA GPU driver and CUDA versions?

Yes, I used the same machine/setup for all runs and executed them several times in different orders to rule out side effects, as sketched below.
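
A minimal sketch of such a repetition loop (illustrative; the exact loop and run count are mine, not a verbatim transcript):

# first run builds the TensorRT engine; subsequent runs reuse the serialized file
for i in 1 2 3; do
  ./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
done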

The apps are not the same, so why are you comparing the different versions?

There is only one difference in the app between the DS_6.1 and DS_6.3 tags (this memory type change). With DeepStream 6.3 I actually have to adopt it, otherwise the pipeline raises an error. With DeepStream 6.4/7.0 I tried both variants; it did not affect the processing times.
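
For readers who don't follow the link, the referenced commit is of this kind (an illustrative sketch in the app's own C/GStreamer style; the element, property, and value below are assumptions, not the exact diff):

/* Illustrative only: newer DeepStream releases on dGPU expect unified CUDA
 * memory for buffers that downstream elements map on the CPU; the value 3
 * corresponds to NVBUF_MEM_CUDA_UNIFIED for the nvbuf-memory-type property. */
g_object_set (G_OBJECT (streammux), "nvbuf-memory-type", 3, NULL);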

I got different FPS on an RTX 3060 with DeepStream 7.0:

xxxxx:/opt/nvidia/deepstream/deepstream/sources/apps/redaction_with_deepstream# ./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
0:00:07.448577124  1002 0x558ed3b78ae0 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2095> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:612 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480
1   OUTPUT kFLOAT output_cov      4x17x30
2   OUTPUT kFLOAT output_bbox     16x17x30

0:00:07.640886672  1002 0x558ed3b78ae0 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2198> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine
0:00:07.646275478  1002 0x558ed3b78ae0 INFO                 nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 4.53 seconds to redact 1443 frames, pref = 318.67 fps

Returned, stopping playback
Deleting pipeline
xxxxxx:/opt/nvidia/deepstream/deepstream/sources/apps/redaction_with_deepstream# ./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
0:00:07.454428025  1026 0x55d541d27ac0 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2095> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:612 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480
1   OUTPUT kFLOAT output_cov      4x17x30
2   OUTPUT kFLOAT output_bbox     16x17x30

0:00:07.646741751  1026 0x55d541d27ac0 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2198> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine
0:00:07.651932847  1026 0x55d541d27ac0 INFO                 nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 4.44 seconds to redact 1443 frames, pref = 324.95 fps

Returned, stopping playback
Deleting pipeline
xxxxxxxx:/opt/nvidia/deepstream/deepstream/sources/apps/redaction_with_deepstream# ./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
0:00:07.444018332  1050 0x5574111059c0 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2095> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:612 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480
1   OUTPUT kFLOAT output_cov      4x17x30
2   OUTPUT kFLOAT output_bbox     16x17x30

0:00:07.637306610  1050 0x5574111059c0 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2198> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine
0:00:07.642429184  1050 0x5574111059c0 INFO                 nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 4.58 seconds to redact 1443 frames, pref = 314.96 fps

Returned, stopping playback
Deleting pipeline
xxxxxxxx:/opt/nvidia/deepstream/deepstream/sources/apps/redaction_with_deepstream# nvidia-smi
Wed Jun 19 08:28:25 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060        Off | 00000000:65:00.0 Off |                  N/A |
| 30%   46C    P0              43W / 170W |     18MiB / 12288MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

Did you also run it in the nvcr.io/nvidia/deepstream:7.0-triton-multiarch Docker image? If so, I’ll try to get the exact same setup (driver version, etc.). Thanks a lot for your effort!

Yes. The Docker image is nvcr.io/nvidia/deepstream:7.0-triton-multiarch.

Hi,
I unfortunately could not install a different driver version on the RTX 3060 machine, as this would have affected other users. Instead, I have now used a cloud instance with a Tesla T4 GPU:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla T4                       Off | 00000000:00:1E.0 Off |                    0 |
| N/A   34C    P0              25W /  70W |      2MiB / 15360MiB |      6%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

I can still observe a significant performance difference between DeepStream 6.2 and DeepStream 7.0:

DeepStream 6.2

./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1487 Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-6.2/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine open error
0:00:04.718521502   494 0x60752abb2610 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1897> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-6.2/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed
0:00:04.770671272   494 0x60752abb2610 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-6.2/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed, try rebuild
0:00:04.770695496   494 0x60752abb2610 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
Warning, setting batch size to 1. Update the dimension after parsing due to using explicit batch size.
0:00:22.766180031   494 0x60752abb2610 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1955> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-6.2/sources/apps/redaction_with_deepstream-master/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine successfully
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:610 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480
1   OUTPUT kFLOAT output_cov      4x17x30
2   OUTPUT kFLOAT output_bbox     16x17x30

0:00:22.827191155   494 0x60752abb2610 INFO                 nvinfer gstnvinfer_impl.cpp:328:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 2.36 seconds to redact 1443 frames, pref = 612.60 fps

Returned, stopping playback
Deleting pipeline

DeepStream 7.0

./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1494 Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine open error
0:00:09.065868075   465 0x5d2f191e7a20 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2083> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed
0:00:09.198186190   465 0x5d2f191e7a20 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2188> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed, try rebuild
0:00:09.198212744   465 0x5d2f191e7a20 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2109> [UID = 1]: Trying to create engine from model files
Warning, setting batch size to 1. Update the dimension after parsing due to using explicit batch size.
0:00:33.105730358   465 0x5d2f191e7a20 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2141> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream-master/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine successfully
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:612 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480
1   OUTPUT kFLOAT output_cov      4x17x30
2   OUTPUT kFLOAT output_bbox     16x17x30

0:00:33.237763439   465 0x5d2f191e7a20 INFO                 nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 5.99 seconds to redact 1443 frames, pref = 240.97 fps

Returned, stopping playback
Deleting pipeline

These are the exact commands that I run in the respective containers:

wget https://github.com/NVIDIA-AI-IOT/redaction_with_deepstream/archive/refs/heads/master.zip
apt update && apt install unzip && unzip master.zip && cd redaction_with_deepstream-master
make
./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4

Do you have any idea what else could be responsible for this?
Thanks a lot in advance!

The CUDA and TensorRT versions are all different across these containers. And the model in this sample is quite old for the new CUDA and TensorRT.
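
For anyone comparing setups, the exact TensorRT and CUDA package versions inside each container can be listed with standard commands, for example:

# list the TensorRT packages shipped in the container
dpkg -l | grep -E 'tensorrt|nvinfer'
# list the installed CUDA packages
dpkg -l | grep cuda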

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks
