Low FPS and low volatile GPU-util when running multiple streams

Please provide complete information as applicable to your setup.

• Hardware Platform (GPU): RTX 2060
• DeepStream Version: 6.1
• NVIDIA GPU Driver Version (valid for GPU only): 510.47.03
• Issue Type: Questions

Hi, I’m trying to run multiple streams with DeepStream. However, with only 6 streams added, the FPS dropped to about 2.

2023-04-28 02:10:53.471 | DEBUG    | kbds.util.FPS:perf_print_callback:64 - **PERF: 
 {'stream0': 1.84, 'stream1': 1.84, 'stream2': 1.84, 'stream3': 1.84, 'stream4': 1.84, 'stream5': 1.84, 'stream6': 0.0, 'stream7': 0.0, 'stream8': 0.0, 'stream9': 0.0, 'stream10': 0.0, 'stream11': 0.0, 'stream12': 0.0, 'stream13': 0.0, 'stream14': 0.0, 'stream15': 0.0}
2023-04-28 02:11:01.093 | DEBUG    | kbds.util.FPS:perf_print_callback:64 - **PERF: 
 {'stream0': 1.44, 'stream1': 1.44, 'stream2': 1.44, 'stream3': 1.44, 'stream4': 1.44, 'stream5': 1.44, 'stream6': 0.0, 'stream7': 0.0, 'stream8': 0.0, 'stream9': 0.0, 'stream10': 0.0, 'stream11': 0.0, 'stream12': 0.0, 'stream13': 0.0, 'stream14': 0.0, 'stream15': 0.0}
2023-04-28 02:11:09.091 | DEBUG    | kbds.util.FPS:perf_print_callback:64 - **PERF: 
 {'stream0': 1.5, 'stream1': 1.5, 'stream2': 1.5, 'stream3': 1.5, 'stream4': 1.5, 'stream5': 1.5, 'stream6': 0.0, 'stream7': 0.0, 'stream8': 0.0, 'stream9': 0.0, 'stream10': 0.0, 'stream11': 0.0, 'stream12': 0.0, 'stream13': 0.0, 'stream14': 0.0, 'stream15': 0.0}
2023-04-28 02:11:17.220 | DEBUG    | kbds.util.FPS:perf_print_callback:64 - **PERF: 
 {'stream0': 2.09, 'stream1': 2.09, 'stream2': 2.09, 'stream3': 2.09, 'stream4': 2.09, 'stream5': 2.09, 'stream6': 0.0, 'stream7': 0.0, 'stream8': 0.0, 'stream9': 0.0, 'stream10': 0.0, 'stream11': 0.0, 'stream12': 0.0, 'stream13': 0.0, 'stream14': 0.0, 'stream15': 0.0}

I tried to find out why, and I’m confused to see that my GPU is barely being used:

hx@hx-System-Product-Name:~$ nvidia-smi
Fri Apr 28 10:05:36 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   45C    P2    29W / 190W |   3053MiB /  6144MiB |     12%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       913      G   /usr/lib/xorg/Xorg                 36MiB |
|    0   N/A  N/A      1131      G   /usr/bin/gnome-shell                6MiB |
|    0   N/A  N/A     46658      C   python3                          1853MiB |
+-----------------------------------------------------------------------------+


I continued to dig in using the nvidia-smi dmon command; here is the output:

hx@hx-System-Product-Name:~$ nvidia-smi dmon
# gpu   pwr gtemp mtemp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     C     %     %     %     %   MHz   MHz
    0    29    46     -     8     3     0     1  6801  1365
    0    47    47     -     5     2     0     1  6801  1365
    0    29    46     -    16     6     0     1  6801  1365
    0    46    46     -     8     3     0     2  6801  1365
    0    29    46     -     9     3     0     1  6801  1365
    0    29    46     -     9     3     0     1  6801  1365
    0    29    46     -     9     3     0     1  6801  1365
    0    29    46     -     0     0     0     0  6801  1365
    0    29    46     -     9     3     0     1  6801  1365
    0    29    47     -     8     3     0     1  6801  1365
    0    43    47     -     4     2     0     1  6801  1365
    0    29    47     -    15     6     0     2  6801  1365
    0    44    47     -     7     3     0     1  6801  1365
    0    29    47     -     9     3     0     1  6801  1365
    0    29    47     -    17     6     0     2  6801  1365
    0    29    47     -     5     2     0     1  6801  1365
    0    29    47     -     0     0     0     0  6801  1365
    0    42    47     -    13     5     0     2  6801  1470
    0    31    47     -    17     7     0     2  6801  1470
    0    46    48     -     8     3     0     1  6801  1365
    0    29    47     -     9     3     0     1  6801  1365
    0    45    47     -     9     3     0     1  6801  1365
    0    46    47     -     9     3     0     1  6801  1365
    0    46    47     -     4     2     0     1  6801  1365
    0    29    47     -     9     3     0     1  6801  1365
    0    29    47     -     4     2     0     1  6801  1365
    0    29    48     -     5     2     0     1  6801  1365
    0    29    47     -    12     5     0     1  6801  1365
    0    29    47     -     4     2     0     0  6801  1365
    0    29    48     -     5     2     0     0  6801  1365
    0    32    48     -     4     2     0     1  6801  1365
    0    29    48     -     8     3     0     1  6801  1365
    0    56    48     -     5     2     0     1  6801  1365
    0    29    48     -    13     5     0     2  6801  1365
    0    47    48     -    13     5     0     2  6801  1365
    0    29    48     -    13     5     0     2  6801  1365
    0    29    48     -    12     5     0     2  6801  1365
    0    40    49     -    13     5     0     2  6801  1830
    0    40    49     -     7     3     0     1  6801  1830
    0    29    48     -    12     5     0     1  6801  1365
    0    29    48     -     8     3     0     1  6801  1365
    0    29    48     -     5     2     0     1  6801  1365
    0    30    49     -    12     5     0     2  6801  1365
    0    47    49     -     4     2     0     1  6801  1365
# gpu   pwr gtemp mtemp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     C     %     %     %     %   MHz   MHz
    0    29    48     -    13     5     0     2  6801  1365
    0    30    49     -     9     3     0     1  6801  1365
    0    30    49     -     9     3     0     1  6801  1365
    0    47    49     -    17     6     0     2  6801  1395

It seems the decoder is barely being used, right? Why is that, and how can I fix it?

By the way, I’ve set the following properties on the fakesink, as suggested in posts I found on Google:

        self.sink.set_property("sync", 0)
        self.sink.set_property("qos", 0)

Thanks in advance.

Are you using deepstream-app or another sample app from the SDK? If not, can you show us the complete pipeline? If yes, can you share your configurations with us?

@Fiona.Chen Thanks for the quick reply.

I’m not using any samples from the SDK. The pipeline graph is too large to upload here, so I uploaded it to Google Drive: picture link

All streams are added one by one, in the same way the runtime source add/delete example does; a simplified sketch of that logic is below.
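
For reference, this is roughly how each source is attached at runtime (not my exact code; the helper name and the video-caps check are only illustrative):

    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    def add_source(pipeline, streammux, uri, index):
        """Create a uridecodebin for one stream and attach it to nvstreammux at runtime."""
        uri_decode_bin = Gst.ElementFactory.make("uridecodebin", f"source-bin-{index}")
        uri_decode_bin.set_property("uri", uri)

        def on_pad_added(decodebin, pad, user_data):
            # Link only the decoded video pad to the muxer's request sink pad.
            caps = pad.get_current_caps() or pad.query_caps(None)
            if not caps.to_string().startswith("video/"):
                return
            sinkpad = streammux.get_request_pad(f"sink_{index}")
            pad.link(sinkpad)

        uri_decode_bin.connect("pad-added", on_pad_added, None)
        pipeline.add(uri_decode_bin)
        # The pipeline is already PLAYING, so bring the new bin up to the same state.
        uri_decode_bin.sync_state_with_parent()
        return uri_decode_bin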

If you need any more information, please let me know.

Thanks a lot.

So it is the dynamic source add/delete case. Are the sources live streams? What is the configuration of the nvstreammux plugin in your code?

@Fiona.Chen Yes, they are all live streams.

The config of nvstreammux is:

        # configure nvstreammux
        streammux.set_property("batched_push_timeout", 25000)  # microseconds
        streammux.set_property("batch_size", 16)
        streammux.set_property("gpu_id", 0)
        streammux.set_property("live-source", 1)
        streammux.set_property("nvbuf-memory-type", mem_type)
        streammux.set_property("width", 1920)
        streammux.set_property("height", 1080)

To be more specific, the streams have different frame rates. The first and second streams perform normally. When the third one is added to the pipeline, all three streams drop to the same rate of about 10 FPS. As I continue to add streams, all streams stay at the same FPS, which drops little by little.

Since your sources have different FPS and you are doing dynamic source adding/removing, nvstreammux will try to synchronize the streams within a batch. If one source is removed, the batch will wait for the missing frame from the removed source until the timeout expires. So please use “export NVSTREAMMUX_ADAPTIVE_BATCHING=yes” and set an appropriate batched_push_timeout value to balance the batch synchronization across all sources.
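
For example, a rough sketch of those two settings in a Python app (the helper name and the 40000 value are only illustrative; batched-push-timeout is in microseconds and should roughly match the frame interval of your slowest source):

    import os

    def configure_adaptive_batching(streammux, timeout_us=40000):
        # Same effect as running `export NVSTREAMMUX_ADAPTIVE_BATCHING=yes` in the
        # shell; set it before the pipeline starts.
        os.environ["NVSTREAMMUX_ADAPTIVE_BATCHING"] = "yes"
        # batched-push-timeout is in microseconds, e.g. ~40000 for ~25 fps sources.
        streammux.set_property("batched-push-timeout", timeout_us)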

@Fiona.Chen Thanks for your advice. It does help increase the frame rate after I set “export NVSTREAMMUX_ADAPTIVE_BATCHING=yes” and

streammux.set_property("batched-push-timeout", 25)

However, the FPS is not stable:

2023-05-05 09:47:20.337 | DEBUG    | kbds.util.FPS:perf_print_callback:75 - {0: 12.79, 1: 13.18, 2: 12.79, 3: 9.88, 4: 9.69, 5: 10.08, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0, 10: 0.0, 11: 0.0, 12: 0.0, 13: 0.0, 14: 0.0, 15: 0.0}
2023-05-05 09:47:25.201 | DEBUG    | kbds.util.FPS:perf_print_callback:75 - {0: 25.88, 1: 26.49, 2: 23.41, 3: 10.47, 4: 10.68, 5: 10.68, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0, 10: 0.0, 11: 0.0, 12: 0.0, 13: 0.0, 14: 0.0, 15: 0.0}
2023-05-05 09:47:30.252 | DEBUG    | kbds.util.FPS:perf_print_callback:75 - {0: 20.81, 1: 20.81, 2: 15.86, 3: 9.91, 4: 9.71, 5: 9.71, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0, 10: 0.0, 11: 0.0, 12: 0.0, 13: 0.0, 14: 0.0, 15: 0.0}
2023-05-05 09:47:35.284 | DEBUG    | kbds.util.FPS:perf_print_callback:75 - {0: 12.54, 1: 12.74, 2: 12.35, 3: 9.96, 4: 10.15, 5: 10.15, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0, 10: 0.0, 11: 0.0, 12: 0.0, 13: 0.0, 14: 0.0, 15: 0.0}

and the SM, memory, and decoder utilization values are not stable either:

    0   150    66     -    45    28     0     7  6801  1920
    0    99    63     -    51    31     0     8  6801  1950
    0    50    65     -    18    11     0     4  6801  1845
    0   168    66     -    34    22     0     5  6801  1890
    0    51    62     -    37    23     0     6  6801  1950
    0    59    62     -    38    24     0     5  6801  1950
    0   108    66     -    47    29     0     7  6801  1905
    0   130    63     -    44    27     0     7  6801  1950
    0    67    66     -    64    38     0    10  6801  1845
    0    51    64     -    43    26     0     7  6801  1950
    0    95    64     -    33    21     0     5  6801  1950
    0   108    63     -    24    15     0     5  6801  1935
    0   140    63     -    42    27     0     6  6801  1935
    0    60    63     -    32    21     0     5  6801  1950
    0    67    63     -    48    30     0     8  6801  1950
    0    51    64     -    20    13     0     4  6801  1950
    0    35    61     -    22    15     0     4  6801  1365
    0   125    64     -    30    18     0     5  6801  1920
    0   165    63     -    43    27     0     6  6801  1950
    0    51    65     -    38    24     0     6  6801  1800
# gpu   pwr gtemp mtemp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     C     %     %     %     %   MHz   MHz
    0    69    66     -    45    28     0     6  6801  1875
    0   100    62     -    48    29     0     8  6801  1950
    0    52    62     -    31    19     0     5  6801  1950
    0    98    63     -    38    24     0     6  6801  1950
    0    56    62     -    32    21     0     5  6801  1950
    0    52    62     -    29    18     0     5  6801  1950
    0   163    63     -    33    21     0     5  6801  1950
    0    72    66     -    40    25     0     6  6801  1845
    0   139    66     -    26    17     0     4  6801  1860
    0    51    62     -    45    28     0     7  6801  1950

Is there a way to check the NvDecoder status? Or is there a way to tune it?

Thanks.

This dmon log already shows the decoder performance (the “dec” column).

The FPS will not be stable if you add/remove sources frequently; the batch timing algorithm needs some time to converge.

You can try to increase the rtspsrc latency to make the received RTSP streams more stable, and you can set the “sync” property to TRUE on your sink. These may help make the pipeline FPS more stable, but they will increase the pipeline latency. A rough sketch is below.
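
For example (a sketch only; uri_decode_bin and sink are placeholders for the per-stream uridecodebin and the sink in your pipeline, and the latency value is just illustrative):

    def on_source_setup(uridecodebin, source):
        # "source-setup" fires when uridecodebin creates its internal source
        # element; for rtsp:// URIs that element is rtspsrc.
        if source.find_property("latency") is not None:
            source.set_property("latency", 5000)  # milliseconds; larger than the 2000 ms default

    uri_decode_bin.connect("source-setup", on_source_setup)

    # Render according to the stream clock instead of as fast as buffers arrive.
    sink.set_property("sync", True)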

Thanks a lot!

I guess this is the key to making runtime source add/delete more stable. Anyway, thanks.
