Low FPS and low volatile GPU-util when running multiple streams

Please provide complete information as applicable to your setup.

• Hardware Platform (GPU): RTX 2060
• DeepStream Version: 6.1
• NVIDIA GPU Driver Version (valid for GPU only): 510.47.03
• Issue Type: Questions

Hi, I’m trying to run multiple streams with DeepStream. However, with only 6 streams added, the FPS dropped to about 2.

2023-04-28 02:10:53.471 | DEBUG    | kbds.util.FPS:perf_print_callback:64 - **PERF: 
 {'stream0': 1.84, 'stream1': 1.84, 'stream2': 1.84, 'stream3': 1.84, 'stream4': 1.84, 'stream5': 1.84, 'stream6': 0.0, 'stream7': 0.0, 'stream8': 0.0, 'stream9': 0.0, 'stream10': 0.0, 'stream11': 0.0, 'stream12': 0.0, 'stream13': 0.0, 'stream14': 0.0, 'stream15': 0.0}
2023-04-28 02:11:01.093 | DEBUG    | kbds.util.FPS:perf_print_callback:64 - **PERF: 
 {'stream0': 1.44, 'stream1': 1.44, 'stream2': 1.44, 'stream3': 1.44, 'stream4': 1.44, 'stream5': 1.44, 'stream6': 0.0, 'stream7': 0.0, 'stream8': 0.0, 'stream9': 0.0, 'stream10': 0.0, 'stream11': 0.0, 'stream12': 0.0, 'stream13': 0.0, 'stream14': 0.0, 'stream15': 0.0}
2023-04-28 02:11:09.091 | DEBUG    | kbds.util.FPS:perf_print_callback:64 - **PERF: 
 {'stream0': 1.5, 'stream1': 1.5, 'stream2': 1.5, 'stream3': 1.5, 'stream4': 1.5, 'stream5': 1.5, 'stream6': 0.0, 'stream7': 0.0, 'stream8': 0.0, 'stream9': 0.0, 'stream10': 0.0, 'stream11': 0.0, 'stream12': 0.0, 'stream13': 0.0, 'stream14': 0.0, 'stream15': 0.0}
2023-04-28 02:11:17.220 | DEBUG    | kbds.util.FPS:perf_print_callback:64 - **PERF: 
 {'stream0': 2.09, 'stream1': 2.09, 'stream2': 2.09, 'stream3': 2.09, 'stream4': 2.09, 'stream5': 2.09, 'stream6': 0.0, 'stream7': 0.0, 'stream8': 0.0, 'stream9': 0.0, 'stream10': 0.0, 'stream11': 0.0, 'stream12': 0.0, 'stream13': 0.0, 'stream14': 0.0, 'stream15': 0.0}

I tried to find out why, and I’m confused to see that my GPU is barely being used:

hx@hx-System-Product-Name:~$ nvidia-smi
Fri Apr 28 10:05:36 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   45C    P2    29W / 190W |   3053MiB /  6144MiB |     12%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       913      G   /usr/lib/xorg/Xorg                 36MiB |
|    0   N/A  N/A      1131      G   /usr/bin/gnome-shell                6MiB |
|    0   N/A  N/A     46658      C   python3                          1853MiB |
+-----------------------------------------------------------------------------+


I continued to dig in using the nvidia-smi dmon command; here is the output:

hx@hx-System-Product-Name:~$ nvidia-smi dmon
# gpu   pwr gtemp mtemp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     C     %     %     %     %   MHz   MHz
    0    29    46     -     8     3     0     1  6801  1365
    0    47    47     -     5     2     0     1  6801  1365
    0    29    46     -    16     6     0     1  6801  1365
    0    46    46     -     8     3     0     2  6801  1365
    0    29    46     -     9     3     0     1  6801  1365
    0    29    46     -     9     3     0     1  6801  1365
    0    29    46     -     9     3     0     1  6801  1365
    0    29    46     -     0     0     0     0  6801  1365
    0    29    46     -     9     3     0     1  6801  1365
    0    29    47     -     8     3     0     1  6801  1365
    0    43    47     -     4     2     0     1  6801  1365
    0    29    47     -    15     6     0     2  6801  1365
    0    44    47     -     7     3     0     1  6801  1365
    0    29    47     -     9     3     0     1  6801  1365
    0    29    47     -    17     6     0     2  6801  1365
    0    29    47     -     5     2     0     1  6801  1365
    0    29    47     -     0     0     0     0  6801  1365
    0    42    47     -    13     5     0     2  6801  1470
    0    31    47     -    17     7     0     2  6801  1470
    0    46    48     -     8     3     0     1  6801  1365
    0    29    47     -     9     3     0     1  6801  1365
    0    45    47     -     9     3     0     1  6801  1365
    0    46    47     -     9     3     0     1  6801  1365
    0    46    47     -     4     2     0     1  6801  1365
    0    29    47     -     9     3     0     1  6801  1365
    0    29    47     -     4     2     0     1  6801  1365
    0    29    48     -     5     2     0     1  6801  1365
    0    29    47     -    12     5     0     1  6801  1365
    0    29    47     -     4     2     0     0  6801  1365
    0    29    48     -     5     2     0     0  6801  1365
    0    32    48     -     4     2     0     1  6801  1365
    0    29    48     -     8     3     0     1  6801  1365
    0    56    48     -     5     2     0     1  6801  1365
    0    29    48     -    13     5     0     2  6801  1365
    0    47    48     -    13     5     0     2  6801  1365
    0    29    48     -    13     5     0     2  6801  1365
    0    29    48     -    12     5     0     2  6801  1365
    0    40    49     -    13     5     0     2  6801  1830
    0    40    49     -     7     3     0     1  6801  1830
    0    29    48     -    12     5     0     1  6801  1365
    0    29    48     -     8     3     0     1  6801  1365
    0    29    48     -     5     2     0     1  6801  1365
    0    30    49     -    12     5     0     2  6801  1365
    0    47    49     -     4     2     0     1  6801  1365
# gpu   pwr gtemp mtemp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     C     %     %     %     %   MHz   MHz
    0    29    48     -    13     5     0     2  6801  1365
    0    30    49     -     9     3     0     1  6801  1365
    0    30    49     -     9     3     0     1  6801  1365
    0    47    49     -    17     6     0     2  6801  1395

It seems the decoder is barely being used, right? Why is that, and how can I fix it?

By the way, I’ve set the following properties on the fakesink, as suggested in posts I found on Google:

        self.sink.set_property("sync", 0)
        self.sink.set_property("qos", 0)

Thanks in advance.

Are you using deepstream-app or another sample app from the SDK? If not, can you show us the complete pipeline? If yes, can you share your configurations with us?

@Fiona.Chen Thanks for the quick reply.

I’m not using any samples from the SDK. The pipeline graph is too large to upload here, so I uploaded it to Google Drive: picture link

All streams are added one by one, in the same way the runtime source add/delete example does; a simplified sketch of that logic is below.
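
For reference, this is roughly how each source is attached at runtime (not my exact code; the helper name and the video-caps check are only illustrative):

    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    def add_source(pipeline, streammux, uri, index):
        """Create a uridecodebin for one stream and attach it to nvstreammux at runtime."""
        uri_decode_bin = Gst.ElementFactory.make("uridecodebin", f"source-bin-{index}")
        uri_decode_bin.set_property("uri", uri)

        def on_pad_added(decodebin, pad, user_data):
            # Link only the decoded video pad to the muxer's request sink pad.
            caps = pad.get_current_caps() or pad.query_caps(None)
            if not caps.to_string().startswith("video/"):
                return
            sinkpad = streammux.get_request_pad(f"sink_{index}")
            pad.link(sinkpad)

        uri_decode_bin.connect("pad-added", on_pad_added, None)
        pipeline.add(uri_decode_bin)
        # The pipeline is already PLAYING, so bring the new bin up to the same state.
        uri_decode_bin.sync_state_with_parent()
        return uri_decode_bin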

If you need any more information, please let me know.

Thanks a lot.

So it is the dynamic source add/delete case. Are the sources live streams? What is the configuration of the nvstreammux plugin in your code?

@Fiona.Chen Yes, they are all live streams.

The config of nvstreammux is:

        # configure nvstreammux
        streammux.set_property("batched_push_timeout", 25000)  # microseconds
        streammux.set_property("batch_size", 16)
        streammux.set_property("gpu_id", 0)
        streammux.set_property("live-source", 1)
        streammux.set_property("nvbuf-memory-type", mem_type)
        streammux.set_property("width", 1920)
        streammux.set_property("height", 1080)

To be more specific, the streams have different frame rates. The first and second streams perform normally. When the third one is added to the pipeline, all three streams drop to the same rate of about 10 FPS. As I continue to add streams, all streams stay at the same FPS, which drops little by little.

Since your sources have different FPS and you are doing dynamic source adding/removing, nvstreammux will try to synchronize the streams within a batch. If one source is removed, the batch will wait for the missing frame from the removed source until the timeout expires. So please use “export NVSTREAMMUX_ADAPTIVE_BATCHING=yes” and set an appropriate batched_push_timeout value to balance the batch synchronization across all sources.
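
For example, a rough sketch of those two settings in a Python app (the helper name and the 40000 value are only illustrative; batched-push-timeout is in microseconds and should roughly match the frame interval of your slowest source):

    import os

    def configure_adaptive_batching(streammux, timeout_us=40000):
        # Same effect as running `export NVSTREAMMUX_ADAPTIVE_BATCHING=yes` in the
        # shell; set it before the pipeline starts.
        os.environ["NVSTREAMMUX_ADAPTIVE_BATCHING"] = "yes"
        # batched-push-timeout is in microseconds, e.g. ~40000 for ~25 fps sources.
        streammux.set_property("batched-push-timeout", timeout_us)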

@Fiona.Chen Thanks for your advice. It does help increase the frame rate after I set “export NVSTREAMMUX_ADAPTIVE_BATCHING=yes” and

streammux.set_property("batched-push-timeout", 25)

However, the FPS is not stable:

2023-05-05 09:47:20.337 | DEBUG    | kbds.util.FPS:perf_print_callback:75 - {0: 12.79, 1: 13.18, 2: 12.79, 3: 9.88, 4: 9.69, 5: 10.08, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0, 10: 0.0, 11: 0.0, 12: 0.0, 13: 0.0, 14: 0.0, 15: 0.0}
2023-05-05 09:47:25.201 | DEBUG    | kbds.util.FPS:perf_print_callback:75 - {0: 25.88, 1: 26.49, 2: 23.41, 3: 10.47, 4: 10.68, 5: 10.68, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0, 10: 0.0, 11: 0.0, 12: 0.0, 13: 0.0, 14: 0.0, 15: 0.0}
2023-05-05 09:47:30.252 | DEBUG    | kbds.util.FPS:perf_print_callback:75 - {0: 20.81, 1: 20.81, 2: 15.86, 3: 9.91, 4: 9.71, 5: 9.71, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0, 10: 0.0, 11: 0.0, 12: 0.0, 13: 0.0, 14: 0.0, 15: 0.0}
2023-05-05 09:47:35.284 | DEBUG    | kbds.util.FPS:perf_print_callback:75 - {0: 12.54, 1: 12.74, 2: 12.35, 3: 9.96, 4: 10.15, 5: 10.15, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0, 10: 0.0, 11: 0.0, 12: 0.0, 13: 0.0, 14: 0.0, 15: 0.0}

and the SM, memory, and decoder utilization values are not stable either:

    0   150    66     -    45    28     0     7  6801  1920
    0    99    63     -    51    31     0     8  6801  1950
    0    50    65     -    18    11     0     4  6801  1845
    0   168    66     -    34    22     0     5  6801  1890
    0    51    62     -    37    23     0     6  6801  1950
    0    59    62     -    38    24     0     5  6801  1950
    0   108    66     -    47    29     0     7  6801  1905
    0   130    63     -    44    27     0     7  6801  1950
    0    67    66     -    64    38     0    10  6801  1845
    0    51    64     -    43    26     0     7  6801  1950
    0    95    64     -    33    21     0     5  6801  1950
    0   108    63     -    24    15     0     5  6801  1935
    0   140    63     -    42    27     0     6  6801  1935
    0    60    63     -    32    21     0     5  6801  1950
    0    67    63     -    48    30     0     8  6801  1950
    0    51    64     -    20    13     0     4  6801  1950
    0    35    61     -    22    15     0     4  6801  1365
    0   125    64     -    30    18     0     5  6801  1920
    0   165    63     -    43    27     0     6  6801  1950
    0    51    65     -    38    24     0     6  6801  1800
# gpu   pwr gtemp mtemp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     C     %     %     %     %   MHz   MHz
    0    69    66     -    45    28     0     6  6801  1875
    0   100    62     -    48    29     0     8  6801  1950
    0    52    62     -    31    19     0     5  6801  1950
    0    98    63     -    38    24     0     6  6801  1950
    0    56    62     -    32    21     0     5  6801  1950
    0    52    62     -    29    18     0     5  6801  1950
    0   163    63     -    33    21     0     5  6801  1950
    0    72    66     -    40    25     0     6  6801  1845
    0   139    66     -    26    17     0     4  6801  1860
    0    51    62     -    45    28     0     7  6801  1950

Is there a way to check the NvDecoder status? Or is there a way to tune it?

Thanks.

This dmon log already shows the decoder performance (the “dec” column).

The FPS will not be stable if you add/remove sources frequently; the batch timing algorithm needs some time to converge.

You can try to increase the rtspsrc latency to make the received RTSP streams more stable, and you can set the “sync” property to TRUE on your sink. These may help make the pipeline FPS more stable, but they will increase the pipeline latency. A rough sketch is below.
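
For example (a sketch only; uri_decode_bin and sink are placeholders for the per-stream uridecodebin and the sink in your pipeline, and the latency value is just illustrative):

    def on_source_setup(uridecodebin, source):
        # "source-setup" fires when uridecodebin creates its internal source
        # element; for rtsp:// URIs that element is rtspsrc.
        if source.find_property("latency") is not None:
            source.set_property("latency", 5000)  # milliseconds; larger than the 2000 ms default

    uri_decode_bin.connect("source-setup", on_source_setup)

    # Render according to the stream clock instead of as fast as buffers arrive.
    sink.set_property("sync", True)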

Thanks a lot!

I guess this is the key to making runtime source add/delete more stable. Anyway, thanks.
