Deepstream_parallel_inference_app fps abnormal

• Hardware Platform (GPU): RTX 3060
• DeepStream Version: 6.3
• TensorRT Version:
• NVIDIA GPU Driver Version (valid for GPU only): CUDA 12.0
• Issue Type: questions

In the following example, when using multiple different RTSP sources, if each source has a different FPS, using perf_cb shows that the FPS for each source is exactly the same. Does this mean that in this example, the FPS for each source is unified? Additionally, if one source disconnects, none of the RTSP streams are displayed. Does this indicate that the example lacks a reconnection mechanism for disconnected streams?

  1. In deepstream_parallel_inference_app, the source of each inference branch comes from the same nvstreammux, so the FPS of each inference branch is the same. Please refer to the design graph in the README.
  2. This sample does not support RTSP reconnection. Please refer to the deepstream-app open-source code, which already supports RTSP reconnection. deepstream-app uses rtspsrc_monitor_probe_func to monitor whether the source is receiving data; it reconnects the RTSP source in watch_source_status when no data has been received for a specific time.
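The watchdog idea behind that reconnection logic can be sketched without any GStreamer code: the probe records the time of the last received buffer, and a periodic check reconnects when that timestamp goes stale. This is a minimal illustrative model; the struct, function names, and timeout value below are assumptions, not the actual deepstream-app API:

```c
#include <stdbool.h>
#include <time.h>

/* Illustrative per-source state; not the actual deepstream-app struct. */
typedef struct {
  time_t last_buffer_time; /* updated from the pad probe on every buffer */
  int reconnect_count;
} SourceMonitor;

#define RTSP_TIMEOUT_SEC 10 /* assumed value; configurable in deepstream-app */

/* What rtspsrc_monitor_probe_func conceptually does: note that data arrived. */
static void on_buffer_received(SourceMonitor *m) {
  m->last_buffer_time = time(NULL);
}

/* What watch_source_status conceptually does on a periodic timer:
   if no buffer arrived within the timeout, tear down and relink the source. */
static bool check_and_reconnect(SourceMonitor *m, time_t now) {
  if (now - m->last_buffer_time > RTSP_TIMEOUT_SEC) {
    m->reconnect_count++;      /* the real code rebuilds the rtspsrc here */
    m->last_buffer_time = now; /* restart the timeout window */
    return true;
  }
  return false;
}
```

In deepstream-app the periodic check runs as a GLib timeout and the "reconnect" step actually resets the source bin; the sketch only shows the timing decision.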

I discovered that when using 3 RTSP streams and setting the following configuration:

streammux:
  gpu-id: 0
  ## Boolean property to inform muxer that sources are live
  live-source: 1
  buffer-pool-size: 4
  batch-size: 2

there is an FPS anomaly that causes stuttering. The batch-size must be set to 3 for normal operation. However, if I set it to 3 and one RTSP stream disconnects, leaving only 2 streams, it again causes FPS anomalies and stuttering. Is this normal? Is there a way to resolve this issue?

Please refer to this topic. Please use "export NVSTREAMMUX_ADAPTIVE_BATCHING=yes" and set an appropriate batched_push_timeout value; batched_push_timeout should be 1/max_fps. Please refer to this doc.
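Note that nvstreammux's batched-push-timeout property is specified in microseconds, so "1/max_fps" means one frame interval of the fastest source, converted to microseconds. A tiny illustrative helper (my own, not part of DeepStream):

```c
/* batched-push-timeout is in microseconds; "1/max_fps" is one frame
   interval of the fastest source. Integer division truncates, which is
   fine for a timeout. */
static int batched_push_timeout_us(int max_fps) {
  return 1000000 / max_fps;
}
```

For example, with sources at 5, 15, and 30 fps, max_fps is 30, giving a batched-push-timeout of 33333 µs.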

Is there a way to maintain the original FPS for each source? Otherwise, in a multi-RTSP scenario, the FPS would become that of the slowest source.

FPS measuring is open source in deepstream-app; please refer to enable_perf_measurement and perf_measurement_callback. If you want to get the original FPS of each source, you can do the FPS measuring on nvstreammux's sink pads. For example, in create_pipeline of deepstream-app, FPS measuring is done on the sink of the demuxer or nvmultistreamtiler.
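The per-interval arithmetic behind that measurement can be sketched without GStreamer: a pad probe bumps a frame counter per stream, and a periodic callback divides by the interval. The struct and function names below are illustrative, not the actual deepstream-app perf code:

```c
/* Per-stream counters; frames_in_interval is incremented by a buffer
   pad probe on the pad being measured. */
typedef struct {
  unsigned frames_in_interval;
  unsigned total_frames;
  double total_time_sec;
} StreamPerf;

/* Called once per measurement interval (perf_measurement_interval_sec):
   returns the instantaneous fps and folds the interval into the average. */
static double interval_fps(StreamPerf *p, double interval_sec) {
  double fps = p->frames_in_interval / interval_sec;
  p->total_frames += p->frames_in_interval;
  p->total_time_sec += interval_sec;
  p->frames_in_interval = 0;
  return fps;
}

/* Running average since measurement started. */
static double avg_fps(const StreamPerf *p) {
  return p->total_time_sec > 0.0 ? p->total_frames / p->total_time_sec : 0.0;
}
```

The point of measuring on the mux's per-source sink pads rather than downstream is that each StreamPerf then counts only that source's own buffers, so the sources' original rates stay distinguishable.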

As you mentioned, the source of each inference branch is from the same nvstreammux, so the FPS of each inference branch is the same. During my testing, I found that if there are multiple RTSP streams, the FPS becomes the same as the slowest one. Is there a way to change this so that each inference branch uses the original FPS of its respective source?

deepstream-parallel-inference does not support FPS measuring, so I don't know how you added the code. For example, suppose there are three RTSP sources with fps 10, fps 20 and fps 30 respectively. If you add FPS measuring on the first nvstreammux's src pad, the three FPS values should be 10, 20, 30. Assuming the two inference branches use the three sources respectively, if you add FPS measuring on each pgie's src pad, the FPS values of the two branches should both be 10, 20, 30. Did you get a different result? If yes, please share the detailed results.

I added a perf_cb function to detect the FPS. The result shows that each source has the same FPS, which is the lowest one. Could it be that my method is incorrect?
If my sources are FPS 10, FPS 20, and FPS 30, they all display as FPS 10.

int main(int argc, char *argv[])
{
  .....
  if (config->enable_perf_measurement)
  {
    GstPad *fps_pad = NULL;
    if (config->tiled_display_config.enable == NV_DS_TILED_DISPLAY_DISABLE)
    {
      fps_pad = gst_element_get_static_pad(pipeline->demuxer, "sink");
    }
    else
    {
      fps_pad = gst_element_get_static_pad(pipeline->tiled_display_bin.bin, "sink");
    }
    appCtx->perf_struct.context = appCtx;
    enable_perf_measurement(&appCtx->perf_struct, fps_pad,
                            pipeline->multi_src_bin.num_bins,
                            config->perf_measurement_interval_sec,
                            config->multi_source_config[0].dewarper_config.num_surfaces_per_frame,
                            perf_cb);
    g_print("Performance measurement enabled for %d streams\n", pipeline->multi_src_bin.num_bins);
  }
  ....
}
static void
perf_cb(gpointer context, NvDsAppPerfStruct *str)
{
  static guint header_print_cnt = 0;
  guint i;
  AppCtx *appCtx = (AppCtx *)context;
  guint numf = str->num_instances;

  g_mutex_lock(&fps_lock);
  for (i = 0; i < numf; i++)
  {
    fps[i] = str->fps[i];
    fps_avg[i] = str->fps_avg[i];
  }

  if (header_print_cnt % 20 == 0)
  {
    g_print("\n**PERF:  ");
    for (i = 0; i < numf; i++)
    {
      g_print("FPS %d (Avg)\t", i);
    }
    g_print("\n");
    header_print_cnt = 0;
  }
  header_print_cnt++;
  if (num_instances > 1)
    g_print("PERF(%d): ", appCtx->index);
  else
    g_print("**PERF:  ");

  for (i = 0; i < numf; i++)
  {
    g_print("%.2f (%.2f)\t", fps[i], fps_avg[i]);
  }
  g_print("\n");
  g_mutex_unlock(&fps_lock);

  time_t timep;
  time(&timep);
  char tmp[64];
  strftime(tmp, sizeof(tmp), "%Y-%m-%d %H:%M:%S", localtime(&timep));
  std::cout << tmp << std::endl;
}

I later found through testing that when I set the following configuration:

streammux:
  gpu-id: 0
  ## Boolean property to inform muxer that sources are live
  live-source: 1
  buffer-pool-size: 4
  batch-size: 1

the detected source FPS can maintain its original FPS.

However, when I set the following configuration:

streammux:
  gpu-id: 0
  ## Boolean property to inform muxer that sources are live
  live-source: 1
  buffer-pool-size: 4
  batch-size: 3

the detected FPS for all sources becomes the same as the slowest source's FPS. Why is that?
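One way to see why a full-batch requirement pins every measured stream to the slowest source: if the mux only pushes once all batch-size sources have contributed a frame (live batching with a generous timeout), the batch rate equals the slowest source's rate, and each source appears in every batch exactly once, so a downstream FPS probe reads the slowest rate for all of them. A toy simulation of that behavior (my own illustrative model, not nvstreammux internals; the real mux buffers and back-pressures faster sources rather than dropping, with the same steady-state rate):

```c
/* Simulate 1 second of full-batch muxing for NSRC live sources.
   Source i produces a frame every 1000/fps[i] ms; a batch is pushed only
   when every source has a frame queued. Surplus frames from faster
   sources are simply discarded in this toy model. */
#define NSRC 3

static void simulate_full_batch(const int fps[NSRC], int pushed[NSRC]) {
  int queued[NSRC] = {0};
  int next_ms[NSRC];
  for (int i = 0; i < NSRC; i++) {
    next_ms[i] = 1000 / fps[i]; /* time of each source's next frame */
    pushed[i] = 0;
  }
  for (int t = 1; t <= 1000; t++) {
    for (int i = 0; i < NSRC; i++)
      if (t >= next_ms[i]) {
        queued[i] = 1;
        next_ms[i] += 1000 / fps[i];
      }
    int full = 1;
    for (int i = 0; i < NSRC; i++)
      full &= queued[i];
    if (full) /* batch complete: one frame of every source goes out */
      for (int i = 0; i < NSRC; i++) {
        pushed[i]++;
        queued[i] = 0;
      }
  }
}
```

With sources at 10, 20, and 30 fps, every source's pushed-frame count comes out as 10 per second, which matches the "all FPS equal the slowest" symptom when batch-size equals the number of live sources.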

  1. could you share the whole configuration file?
  2. I tested deepstream-app with this cfg source2_1080p_dec_infer-resnet_demux_int8_fan.txt (4.2 KB) using two sources with fps 15 and fps 5. The output FPS values are not the same. Here is the log: log.txt (2.0 KB). Could you narrow down this issue? Here are some methods.
  3. Add a probe function on multi_src_bin.streammux (the first streammux) to check whether the number of frames in the batch is the same every time.
  4. Use pipeline->multi_src_bin.streammux's src pad as fps_pad to check whether the FPS values are the same.
    fps_pad = gst_element_get_static_pad(pipeline->multi_src_bin.streammux, "src");
  1. When I use deepstream-app, the FPS is normal, but when I use deepstream-parallel-inference and add FPS detection, the FPS readings are all the same.
  2. I tested deepstream-parallel-inference with this cfg
    source4_1080p_dec_parallel_infer.txt (8.1 KB)

I want to maintain the FPS of each source when using deepstream-parallel-inference. Currently, it seems that the FPS becomes the same as the slowest one.

Please refer to my last comment. I tested deepstream-parallel-inference using fps_pad = gst_element_get_static_pad(pipeline->multi_src_bin.streammux, "src"); and cannot reproduce the issue (FPS becomes the same as the slowest one). Here are the test details.
source: fps5, fps15, fps30, fps30. sources_4_different_source.csv (202 Bytes)
cfg: vehicle_lpr_analytic/source4_1080p_dec_parallel_infer.yml
some log:
Processing frame number = 1206
**PERF: 5.00 (4.97) 15.00 (14.97) 29.40 (27.46) 29.60 (27.51)
2024-06-15 14:27:57