Significant slowdown after DeepStream v6.2

Hi,

I’ve noticed a significant slowdown of my YOLOv7-based GStreamer/DeepStream pipeline in almost every version after DeepStream 6.2, affecting both the TensorRT engine build and the nvinfer inference itself.

I can observe similar behavior when running this DeepStream sample app repo in different DeepStream containers: with DeepStream 7.0 the same pipeline takes twice as long to run.

What is the reason for this, and can it somehow be fixed?
Thanks a lot for any advice!

Setup:
GPU: GeForce RTX 3060 12 GB
NVIDIA GPU Driver Version 555.42.02

Steps to reproduce (condensed into a shell sketch after the list):

  1. Pull and run the respective Docker images/containers
  2. Navigate to /opt/nvidia/deepstream/deepstream/sources/apps/
  3. Check out the commit with the respective DeepStream version tag from the sample app repo
  4. Set the correct NVDS_VERSION in the Makefile
  5. Run make
  6. Run ./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
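
For reference, the same steps condensed into shell commands (a sketch only; the image tag and the DS_x.y tag name are examples and must be matched to the DeepStream version under test):

docker run --rm -it --gpus all nvcr.io/nvidia/deepstream:7.0-triton-multiarch
# inside the container:
cd /opt/nvidia/deepstream/deepstream/sources/apps/
git clone https://github.com/NVIDIA-AI-IOT/redaction_with_deepstream.git
cd redaction_with_deepstream
git checkout DS_7.0        # tag name assumed from the repo's DS_x.y tag convention
# set NVDS_VERSION in the Makefile to match the container, then:
make
./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4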

Outputs:
DeepStream 6.1 (nvcr.io/nvidia/deepstream:6.1-triton)

./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1482 Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-6.1/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine open error
0:00:01.310177147   583 0x563fe97ae230 WARN                 nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1888> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-6.1/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed
0:00:01.336521804   583 0x563fe97ae230 WARN                 nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1993> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-6.1/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed, try rebuild
0:00:01.336543205   583 0x563fe97ae230 INFO                 nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1914> [UID = 1]: Trying to create engine from model files
Warning, setting batch size to 1. Update the dimension after parsing due to using explicit batch size.
0:00:30.862798311   583 0x563fe97ae230 INFO                 nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1946> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-6.1/sources/apps/redaction_with_deepstream-master/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine successfully
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:610 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480       
1   OUTPUT kFLOAT output_cov      4x17x30         
2   OUTPUT kFLOAT output_bbox     16x17x30        

0:00:30.901643307   583 0x563fe97ae230 INFO                 nvinfer gstnvinfer_impl.cpp:328:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
End of stream

The program took 4.94 seconds to redact 1443 frames, pref = 292.15 fps 

Returned, stopping playback
Deleting pipeline
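
(For context, the reported figure is simply frame count divided by wall-clock time: 1443 frames / 4.94 s ≈ 292.1 fps. "pref" appears to be a typo for "perf" in the sample app's output.)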

DeepStream 6.2 (nvcr.io/nvidia/deepstream:6.2-triton)

./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1487 Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-6.2/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine open error
0:00:02.629953427   241 0x55ef20e83c10 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1897> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-6.2/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed
0:00:02.734843529   241 0x55ef20e83c10 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-6.2/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed, try rebuild
0:00:02.734880039   241 0x55ef20e83c10 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
Warning, setting batch size to 1. Update the dimension after parsing due to using explicit batch size.
0:00:24.517833343   241 0x55ef20e83c10 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1955> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-6.2/sources/apps/redaction_with_deepstream-master/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine successfully
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:610 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480       
1   OUTPUT kFLOAT output_cov      4x17x30         
2   OUTPUT kFLOAT output_bbox     16x17x30        

0:00:24.656468132   241 0x55ef20e83c10 INFO                 nvinfer gstnvinfer_impl.cpp:328:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 4.25 seconds to redact 1443 frames, pref = 339.63 fps 

Returned, stopping playback
Deleting pipeline

DeepStream 6.3 (nvcr.io/nvidia/deepstream:6.3-triton-multiarch)

./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1487 Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-6.3/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine open error
0:00:02.624725396   880 0x55c783be08f0 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1976> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-6.3/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed
0:00:02.731642010   880 0x55c783be08f0 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2081> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-6.3/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed, try rebuild
0:00:02.731689119   880 0x55c783be08f0 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: Trying to create engine from model files
Warning, setting batch size to 1. Update the dimension after parsing due to using explicit batch size.
0:00:24.484216113   880 0x55c783be08f0 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2034> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-6.3/sources/apps/redaction_with_deepstream-master/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine successfully
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:610 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480       
1   OUTPUT kFLOAT output_cov      4x17x30         
2   OUTPUT kFLOAT output_bbox     16x17x30        

0:00:24.621066403   880 0x55c783be08f0 INFO                 nvinfer gstnvinfer_impl.cpp:328:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 6.10 seconds to redact 1443 frames, pref = 236.51 fps 

Returned, stopping playback
Deleting pipeline

DeepStream 6.4 (nvcr.io/nvidia/deepstream:6.4-triton-multiarch)

./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1487 Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-6.4/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine open error
0:00:06.642388790   487 0x56060aba2d20 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2080> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-6.4/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed
0:00:06.868524923   487 0x56060aba2d20 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2185> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-6.4/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed, try rebuild
0:00:06.868548497   487 0x56060aba2d20 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2106> [UID = 1]: Trying to create engine from model files
Warning, setting batch size to 1. Update the dimension after parsing due to using explicit batch size.
0:00:33.916139450   487 0x56060aba2d20 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2138> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-6.4/sources/apps/redaction_with_deepstream-master/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine successfully
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:610 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480       
1   OUTPUT kFLOAT output_cov      4x17x30         
2   OUTPUT kFLOAT output_bbox     16x17x30        

0:00:34.174575073   487 0x56060aba2d20 INFO                 nvinfer gstnvinfer_impl.cpp:328:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 6.07 seconds to redact 1443 frames, pref = 237.87 fps 

Returned, stopping playback
Deleting pipeline

DeepStream 7.0 (nvcr.io/nvidia/deepstream:7.0-triton-multiarch)

./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1494 Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine open error
0:00:06.644427918    99 0x555db535ea30 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2083> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed
0:00:06.867416394    99 0x555db535ea30 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2188> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed, try rebuild
0:00:06.867436473    99 0x555db535ea30 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2109> [UID = 1]: Trying to create engine from model files
Warning, setting batch size to 1. Update the dimension after parsing due to using explicit batch size.
0:00:33.652104778    99 0x555db535ea30 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2141> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream-master/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine successfully
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:612 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480       
1   OUTPUT kFLOAT output_cov      4x17x30         
2   OUTPUT kFLOAT output_bbox     16x17x30        

0:00:33.902237814    99 0x555db535ea30 INFO                 nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 8.53 seconds to redact 1443 frames, pref = 169.07 fps 

Returned, stopping playback
Deleting pipeline

Did you run the Docker containers on the same host with the same NVIDIA GPU driver and CUDA versions?

Yes, I used the same machine/setup for all runs and executed them several times in different orders to rule out side effects, as sketched below.
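
A minimal sketch of such a repetition loop (illustrative; the exact loop and run count are mine, not a verbatim transcript):

# first run builds the TensorRT engine; subsequent runs reuse the serialized file
for i in 1 2 3; do
  ./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
done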

The apps are not the same, so why are you comparing the different versions?

There is only one difference in the app between the DS_6.1 and DS_6.3 tags (this memory type change). With DeepStream 6.3 I actually have to adopt it, otherwise the pipeline raises an error. With DeepStream 6.4/7.0 I tried both variants; it did not affect the processing times.
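
For readers who don't follow the link, the referenced commit is of this kind (an illustrative sketch in the app's own C/GStreamer style; the element, property, and value below are assumptions, not the exact diff):

/* Illustrative only: newer DeepStream releases on dGPU expect unified CUDA
 * memory for buffers that downstream elements map on the CPU; the value 3
 * corresponds to NVBUF_MEM_CUDA_UNIFIED for the nvbuf-memory-type property. */
g_object_set (G_OBJECT (streammux), "nvbuf-memory-type", 3, NULL);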

I got different FPS on an RTX 3060 with DeepStream 7.0:

xxxxx:/opt/nvidia/deepstream/deepstream/sources/apps/redaction_with_deepstream# ./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
0:00:07.448577124  1002 0x558ed3b78ae0 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2095> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:612 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480
1   OUTPUT kFLOAT output_cov      4x17x30
2   OUTPUT kFLOAT output_bbox     16x17x30

0:00:07.640886672  1002 0x558ed3b78ae0 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2198> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine
0:00:07.646275478  1002 0x558ed3b78ae0 INFO                 nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 4.53 seconds to redact 1443 frames, pref = 318.67 fps

Returned, stopping playback
Deleting pipeline
xxxxxx:/opt/nvidia/deepstream/deepstream/sources/apps/redaction_with_deepstream# ./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
0:00:07.454428025  1026 0x55d541d27ac0 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2095> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:612 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480
1   OUTPUT kFLOAT output_cov      4x17x30
2   OUTPUT kFLOAT output_bbox     16x17x30

0:00:07.646741751  1026 0x55d541d27ac0 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2198> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine
0:00:07.651932847  1026 0x55d541d27ac0 INFO                 nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 4.44 seconds to redact 1443 frames, pref = 324.95 fps

Returned, stopping playback
Deleting pipeline
xxxxxxxx:/opt/nvidia/deepstream/deepstream/sources/apps/redaction_with_deepstream# ./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
0:00:07.444018332  1050 0x5574111059c0 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2095> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:612 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480
1   OUTPUT kFLOAT output_cov      4x17x30
2   OUTPUT kFLOAT output_bbox     16x17x30

0:00:07.637306610  1050 0x5574111059c0 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2198> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine
0:00:07.642429184  1050 0x5574111059c0 INFO                 nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 4.58 seconds to redact 1443 frames, pref = 314.96 fps

Returned, stopping playback
Deleting pipeline
xxxxxxxx:/opt/nvidia/deepstream/deepstream/sources/apps/redaction_with_deepstream# nvidia-smi
Wed Jun 19 08:28:25 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060        Off | 00000000:65:00.0 Off |                  N/A |
| 30%   46C    P0              43W / 170W |     18MiB / 12288MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

Did you also run it in the nvcr.io/nvidia/deepstream:7.0-triton-multiarch Docker image? If so, I’ll try to get the exact same setup (driver version, etc.). Thanks a lot for your effort!

Yes. The Docker image is nvcr.io/nvidia/deepstream:7.0-triton-multiarch.

Hi,
I unfortunately could not install a different driver version on the RTX 3060 machine, as this would have affected other users. Instead, I have now used a cloud instance with a Tesla T4 GPU:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla T4                       Off | 00000000:00:1E.0 Off |                    0 |
| N/A   34C    P0              25W /  70W |      2MiB / 15360MiB |      6%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

I can still observe a significant performance difference between DeepStream 6.2 and DeepStream 7.0:

DeepStream 6.2

./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1487 Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-6.2/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine open error
0:00:04.718521502   494 0x60752abb2610 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1897> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-6.2/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed
0:00:04.770671272   494 0x60752abb2610 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-6.2/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed, try rebuild
0:00:04.770695496   494 0x60752abb2610 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
Warning, setting batch size to 1. Update the dimension after parsing due to using explicit batch size.
0:00:22.766180031   494 0x60752abb2610 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1955> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-6.2/sources/apps/redaction_with_deepstream-master/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine successfully
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:610 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480
1   OUTPUT kFLOAT output_cov      4x17x30
2   OUTPUT kFLOAT output_bbox     16x17x30

0:00:22.827191155   494 0x60752abb2610 INFO                 nvinfer gstnvinfer_impl.cpp:328:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 2.36 seconds to redact 1443 frames, pref = 612.60 fps

Returned, stopping playback
Deleting pipeline

DeepStream 7.0

./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4
Now playing: /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1494 Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine open error
0:00:09.065868075   465 0x5d2f191e7a20 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2083> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed
0:00:09.198186190   465 0x5d2f191e7a20 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2188> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream-master/configs/../fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine failed, try rebuild
0:00:09.198212744   465 0x5d2f191e7a20 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2109> [UID = 1]: Trying to create engine from model files
Warning, setting batch size to 1. Update the dimension after parsing due to using explicit batch size.
0:00:33.105730358   465 0x5d2f191e7a20 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2141> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-7.0/sources/apps/redaction_with_deepstream-master/fd_lpd_model/fd_lpd.caffemodel_b1_gpu0_fp32.engine successfully
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:612 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT data            3x270x480
1   OUTPUT kFLOAT output_cov      4x17x30
2   OUTPUT kFLOAT output_bbox     16x17x30

0:00:33.237763439   465 0x5d2f191e7a20 INFO                 nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:./configs/pgie_config_fd_lpd.txt sucessfully
Pipeline ready
Pipeline running
nvstreammux: Successfully handled EOS for source_id=0
End of stream

The program took 5.99 seconds to redact 1443 frames, pref = 240.97 fps

Returned, stopping playback
Deleting pipeline

These are the exact commands that I run in the respective containers:

wget https://github.com/NVIDIA-AI-IOT/redaction_with_deepstream/archive/refs/heads/master.zip
apt update && apt install unzip && unzip master.zip && cd redaction_with_deepstream-master
make
./deepstream-redaction-app -c ./configs/pgie_config_fd_lpd.txt -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -o out.mp4

Do you have any idea what else could be responsible for this?
Thanks a lot in advance!

The CUDA and TensorRT versions are all different across these containers. And the model in this sample is quite old for the new CUDA and TensorRT.
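
For anyone comparing setups, the exact TensorRT and CUDA package versions inside each container can be listed with standard commands, for example:

# list the TensorRT packages shipped in the container
dpkg -l | grep -E 'tensorrt|nvinfer'
# list the installed CUDA packages
dpkg -l | grep cuda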

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks
