DeepStream SDK FAQ

17.[DeepStream_dGPU_App] Using OpenCV to run deepstream pipeline

Sometimes a GStreamer pipeline launched through OpenCV will fail. Please refer to the following topic to resolve this problem.

How to compile OpenCV with Gstreamer [Ubuntu&Windows] | by Galaktyk 01 | Medium

18. Open model deployment on DeepStream (Thanks for sharing!)
Yolo2/3/4/5/OR : Improved DeepStream for YOLO models (Thanks @marcoslucianops )
YoloV4 : GitHub - NVIDIA-AI-IOT/yolo_deepstream: yolo model qat and deploy with deepstream&tensorrt + deepstream_yolov4.tgz - Google Drive
YoloV4+dspreprocess : deepstream_yolov4_with_nvdspreprocess.tgz - Google Drive
YoloV5 + nvinfer : GitHub - beyondli/Yolo_on_Jetson
Yolov5-small : Custom Yolov5 on Deepstream 6.0 (Thanks @raghavendra.ramya)
YoloV5+Triton : Triton Inference through docker - #7 by mchi
YoloV5_gpu_optimization: GitHub - NVIDIA-AI-IOT/yolov5_gpu_optimization: This repository provides YOLOV5 GPU optimization sample
YoloV7: GitHub - NVIDIA-AI-IOT/yolo_deepstream: yolo model qat and deploy with deepstream&tensorrt
YoloV7+Triton: Deepstream / Triton Server - YOLOv7(Thanks @Levi_Pereira )
YoloV7+nvinfer: Tutorial: How to run YOLOv7 on Deepstream(Thanks @vcmike )
YoloV8+nvinfer: Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK | Seeed Studio Wiki

19. [DSx_All_App] How to use classification model as pgie?
The input is a picture of a blue car and we want to get the “blue” label. Here are the test files and command:
blueCar.zip (37.6 KB)
dstest_appsrc_config.txt (3.7 KB)
Secondary_CarColor.zip (8.2 MB)
1. Copy the Secondary_CarColor model if needed.
If there is no Secondary_CarColor directory, execute: unzip -o Secondary_CarColor.zip -d /opt/nvidia/deepstream/deepstream/samples/models
2. Run the test:
gst-launch-1.0 filesrc location=blueCar.jpg ! jpegdec ! videoconvert ! video/x-raw,format=I420 ! nvvideoconvert ! video/x-raw\(memory:NVMM\),format=NV12 ! mux.sink_0 nvstreammux name=mux batch-size=1 width=1280 height=720 ! nvinfer config-file-path=./dstest_appsrc_config.txt ! nvvideoconvert ! video/x-raw\(memory:NVMM\),format=RGBA ! nvdsosd ! nvvideoconvert ! video/x-raw,format=I420 ! jpegenc ! filesink location=out.jpg

[Access output of Primary Classifier]
[Resnet50 with imagenet dataset image classification using deepstream sdk]

20. How to troubleshoot the error: cuGraphicsGLRegisterBuffer failed with error(219) gst_eglglessink_cuda_init texture = 1

CUDA_ERROR_INVALID_GRAPHICS_CONTEXT = 219

This indicates an error with OpenGL or DirectX context.

Make sure you are using the NVIDIA X driver.
Please follow this guide to set up the NVIDIA X server: Chapter 6. Configuring X for the NVIDIA Driver
Chapter 8. Common Problems (nvidia.com) describes some common driver-related problems you may meet.

https://forums.developer.nvidia.com/t/issue-runnung-deepstream-app-docker-container-5-0-6-0-in-rtx-3080-and-a5000-laptop/213783
cuGraphicsGLRegisterBuffer failed with error(219) gst_eglglessink_cuda_init texture = 1 - Intelligent Video Analytics / DeepStream SDK - NVIDIA Developer Forums

21.[Jetson] A TensorRT version mismatch between the DeepStream 6.1 docker and the device can be fixed by an APT update for JetPack 5.0.1 DP

1. Run the docker: docker run --rm -it --runtime=nvidia REPOSITORY:TAG
2. Remove the previous TRT packages:
   apt-get purge --remove libnvinfer8 libnvinfer-plugin8 libnvinfer-bin python3-libnvinfer
3. apt-get update
4. Install the TRT 8.4.0.11 packages:
   apt-get install libnvinfer8 libnvinfer-plugin8 libnvinfer-bin python3-libnvinfer
5. Verify the TRT version:
   nm -D /usr/lib/aarch64-linux-gnu/libnvinfer.so.8.4.0 | grep version

related topic 218888

22. [Jetson] VIC Configuration failed image scale factor exceeds 16
This issue is a limitation of Jetson VIC processing and can be fixed by modifying the configuration, for example:

# model's dimensions: height is 1168, width is 720.
uff-input-dims=3;1168;720;0  
#if scaling-compute-hw = VIC, input-object-min-height needs to be even and greater than or equal to (model height)/16
input-object-min-height=74
#if scaling-compute-hw = VIC, input-object-min-width needs to be even and greater than or equal to (model width)/16
input-object-min-width=46
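As a quick cross-check, the minimum values above can be computed directly (a small illustrative Python helper, not part of DeepStream):

```python
import math

def vic_min_dims(model_height, model_width):
    """Smallest even input-object dimensions allowed when scaling-compute-hw
    is VIC, given that the VIC scale factor cannot exceed 16."""
    def min_even(v):
        m = math.ceil(v / 16)
        return m if m % 2 == 0 else m + 1  # round up to an even value
    return min_even(model_height), min_even(model_width)

print(vic_min_dims(1168, 720))  # (74, 46), matching the config above
```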

related topic [VIC Configuration failed image scale factor exceeds 16, use GPU for Transformation - #3 by Amycao]

23. How to change the Python sample apps from display output to a file or fakesink, for users who do not have a monitor on their device. The patch below is based on the test1 sample.

Usage: python3 deepstream_test_1.py <media file or uri> <sink type: 1-filesink; 2-fakesink; 3-display sink>

nvidia@ubuntu:/opt/nvidia/deepstream/deepstream/sources/deepstream_python_apps/apps/deepstream-test1$ diff -Naur deepstream_test_1.py.orig deepstream_test_1.py
--- deepstream_test_1.py.orig	2022-08-15 20:12:39.809775283 +0800
+++ deepstream_test_1.py	2022-08-15 22:06:27.052250778 +0800
@@ -123,8 +123,8 @@
 
 def main(args):
     # Check input arguments
-    if len(args) != 2:
-        sys.stderr.write("usage: %s <media file or uri>\n" % args[0])
+    if len(args) != 3:
+        sys.stderr.write("usage: %s <media file or uri> <sink type: 1-filesink; 2-fakesink; 3-display sink>\n" % args[0])
         sys.exit(1)
 
     # Standard GStreamer initialization
@@ -179,14 +179,46 @@
     if not nvosd:
         sys.stderr.write(" Unable to create nvosd \n")
 
-    # Finally render the osd output
-    if is_aarch64():
-        transform = Gst.ElementFactory.make("nvegltransform", "nvegl-transform")
-
-    print("Creating EGLSink \n")
-    sink = Gst.ElementFactory.make("nveglglessink", "nvvideo-renderer")
-    if not sink:
-        sys.stderr.write(" Unable to create egl sink \n")
+    if args[2] == '1':
+
+        nvvidconv1 = Gst.ElementFactory.make ("nvvideoconvert", "nvvid-converter1")
+        if not nvvidconv1:
+            sys.stderr.write("Unable to create nvvidconv1")
+        capfilt = Gst.ElementFactory.make ("capsfilter", "nvvideo-caps")
+        if not capfilt:
+            sys.stderr.write("Unable to create capfilt")
+        caps = Gst.caps_from_string ('video/x-raw(memory:NVMM), format=I420')
+#        feature = gst_caps_features_new ("memory:NVMM", NULL)
+#        gst_caps_set_features (caps, 0, feature)
+        capfilt.set_property('caps', caps)
+        print("Creating nvv4l2h264enc \n")
+        nvh264enc = Gst.ElementFactory.make ("nvv4l2h264enc" ,"nvvideo-h264enc")
+        if not nvh264enc:
+            sys.stderr.write("Unable to create nvh264enc")
+        print("Creating filesink \n")    
+        sink = Gst.ElementFactory.make ("filesink", "nvvideo-renderer")
+        sink.set_property('location', './out.h264')
+        if not sink:
+            sys.stderr.write("Unable to create filesink")
+
+    elif args[2] == '2':
+
+        print("Creating fakesink \n")
+        sink = Gst.ElementFactory.make ("fakesink", "fake-renderer")
+        if not sink:
+            sys.stderr.write("Unable to create fakesink")
+
+    elif args[2] == '3':
+
+        print("Creating EGLSink \n")
+        sink = Gst.ElementFactory.make("nveglglessink", "nvvideo-renderer")
+        if not sink:
+            sys.stderr.write(" Unable to create egl sink \n")
+        if is_aarch64():
+            transform = Gst.ElementFactory.make("nvegltransform", "nvegl-transform")
+            if not transform:
+                sys.stderr.write(" Unable to create egl transform \n")
 
     print("Playing file %s " %args[1])
     source.set_property('location', args[1])
@@ -204,9 +236,17 @@
     pipeline.add(pgie)
     pipeline.add(nvvidconv)
     pipeline.add(nvosd)
-    pipeline.add(sink)
-    if is_aarch64():
-        pipeline.add(transform)
+    if args[2] == '1':
+        pipeline.add(nvvidconv1)
+        pipeline.add(capfilt)
+        pipeline.add(nvh264enc)
+        pipeline.add(sink)
+    elif args[2] == '2':
+        pipeline.add(sink)
+    elif args[2] == '3':
+        pipeline.add(sink)
+        if is_aarch64():
+            pipeline.add(transform)
 
     # we link the elements together
     # file-source -> h264-parser -> nvh264-decoder ->
@@ -225,11 +265,19 @@
     streammux.link(pgie)
     pgie.link(nvvidconv)
     nvvidconv.link(nvosd)
-    if is_aarch64():
-        nvosd.link(transform)
-        transform.link(sink)
-    else:
+    if args[2] == '1':
+        nvosd.link(nvvidconv1)
+        nvvidconv1.link(capfilt)
+        capfilt.link(nvh264enc)
+        nvh264enc.link(sink)
+    elif args[2] == '2':
         nvosd.link(sink)
+    elif args[2] == '3':
+        if is_aarch64():
+            nvosd.link(transform)
+            transform.link(sink)
+        else:
+            nvosd.link(sink)
 
     # create an event loop and feed gstreamer bus mesages to it

24. [DeepStream 6.1.1 GA] simple demo for adding dewarper support to deepstream-app

Usage: deepstream-app -c source1_dewarper_test.txt

source1_dewarper_test.txt (3.6 KB)

---
 .../src/deepstream_config_file_parser.c       |  15 ++-
 .../common/src/deepstream_source_bin.c        |   5 -
 .../common/src/deepstream_streammux.c         |   5 +-
 .../deepstream_app_config_parser.c            |   7 +-
 .../deepstream_app_config_parser_yaml.cpp     |   4 +

diff --git a/apps/deepstream/common/src/deepstream_config_file_parser.c b/apps/deepstream/common/src/deepstream_config_file_parser.c
--- a/apps/deepstream/common/src/deepstream_config_file_parser.c
+++ b/apps/deepstream/common/src/deepstream_config_file_parser.c
@@ -76,6 +76,8 @@ GST_DEBUG_CATEGORY (APP_CFG_PARSER_CAT);
 #define CONFIG_GROUP_STREAMMUX_FRAME_NUM_RESET_ON_STREAM_RESET "frame-num-reset-on-stream-reset"
 #define CONFIG_GROUP_STREAMMUX_FRAME_NUM_RESET_ON_EOS "frame-num-reset-on-eos"
 #define CONFIG_GROUP_STREAMMUX_FRAME_DURATION "frame-duration"
+#define CONFIG_GROUP_STREAMMUX_NUM_SURFACES_PER_FRAME "num-surfaces-per-frame"
+
 #define CONFIG_GROUP_STREAMMUX_CONFIG_FILE_PATH "config-file"
 #define CONFIG_GROUP_STREAMMUX_SYNC_INPUTS "sync-inputs"
 #define CONFIG_GROUP_STREAMMUX_MAX_LATENCY "max-latency"
@@ -742,6 +744,11 @@ parse_streammux (NvDsStreammuxConfig *config, GKeyFile *key_file, gchar *cfg_fil
           g_key_file_get_boolean(key_file, CONFIG_GROUP_STREAMMUX,
           CONFIG_GROUP_STREAMMUX_ASYNC_PROCESS, &error);
       CHECK_ERROR(error);
+    } else if (!g_strcmp0(*key, CONFIG_GROUP_STREAMMUX_NUM_SURFACES_PER_FRAME)) {
+        config->num_surface_per_frame =
+            g_key_file_get_integer(key_file, CONFIG_GROUP_STREAMMUX,
+            CONFIG_GROUP_STREAMMUX_NUM_SURFACES_PER_FRAME, &error);
+        CHECK_ERROR(error);
     } else {
       NVGSTDS_WARN_MSG_V ("Unknown key '%s' for group [%s]", *key,
           CONFIG_GROUP_STREAMMUX);
@@ -1070,8 +1077,12 @@ parse_dewarper (NvDsDewarperConfig * config, GKeyFile * key_file, gchar *cfg_fil
         g_key_file_get_integer (key_file, CONFIG_GROUP_DEWARPER,
             CONFIG_GROUP_DEWARPER_NUM_SURFACES_PER_FRAME, &error);
       CHECK_ERROR (error);
-    }
-    else {
+    } else if (!g_strcmp0 (*key, CONFIG_GROUP_DEWARPER_SOURCE_ID)) {
+      config->source_id =
+          g_key_file_get_integer (key_file, CONFIG_GROUP_DEWARPER,
+          CONFIG_GROUP_DEWARPER_SOURCE_ID, &error);
+      CHECK_ERROR (error);
+    } else {
       NVGSTDS_WARN_MSG_V ("Unknown key '%s' for group [%s]", *key,
           CONFIG_GROUP_DEWARPER);
     }
diff --git a/apps/deepstream/common/src/deepstream_source_bin.c b/apps/deepstream/common/src/deepstream_source_bin.c
--- a/apps/deepstream/common/src/deepstream_source_bin.c
+++ b/apps/deepstream/common/src/deepstream_source_bin.c
@@ -1527,11 +1527,6 @@ create_multi_source_bin (guint num_sub_bins, NvDsSourceConfig * configs,
       goto done;
     }
 
-    if(configs->dewarper_config.enable) {
-        g_object_set(G_OBJECT(bin->sub_bins[i].dewarper_bin.nvdewarper), "source-id",
-                configs[i].source_id, NULL);
-    }
-
     bin->num_bins++;
   }
   NVGSTDS_BIN_ADD_GHOST_PAD (bin->bin, bin->streammux, "src");
diff --git a/apps/deepstream/common/src/deepstream_streammux.c b/apps/deepstream/common/src/deepstream_streammux.c
--- a/apps/deepstream/common/src/deepstream_streammux.c
+++ b/apps/deepstream/common/src/deepstream_streammux.c
@@ -92,7 +92,10 @@ set_streammux_properties (NvDsStreammuxConfig *config, GstElement *element)
                config->max_latency, NULL);
   g_object_set (G_OBJECT (element), "frame-num-reset-on-eos",
       config->frame_num_reset_on_eos, NULL);
-
+  if (config->num_surface_per_frame > 1) {
+      g_object_set (G_OBJECT (element), "num-surfaces-per-frame",
+          config->num_surface_per_frame, NULL);
+  }
   ret= TRUE;
 
   return ret;
diff --git a/apps/deepstream/sample_apps/deepstream-app/deepstream_app_config_parser.c b/apps/deepstream/sample_apps/deepstream-app/deepstream_app_config_parser.c
--- a/apps/deepstream/sample_apps/deepstream-app/deepstream_app_config_parser.c
+++ b/apps/deepstream/sample_apps/deepstream-app/deepstream_app_config_parser.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2018-2021, NVIDIA CORPORATION. All rights reserved.
+ * Copyright (c) 2018-2022, NVIDIA CORPORATION. All rights reserved.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -373,6 +373,11 @@ parse_config_file (NvDsConfig *config, gchar *cfg_file_path)
       parse_err = !parse_osd (&config->osd_config, cfg_file);
     }
 
+    if (!g_strcmp0 (*group, CONFIG_GROUP_DEWARPER)) {
+      parse_err = !parse_dewarper (&config->multi_source_config[0].dewarper_config,
+          cfg_file, cfg_file_path);
+    }
+
     if (!g_strcmp0 (*group, CONFIG_GROUP_PREPROCESS)) {
         parse_err =
             !parse_preprocess (&config->preprocess_config, cfg_file,
diff --git a/apps/deepstream/sample_apps/deepstream-app/deepstream_app_config_parser_yaml.cpp b/apps/deepstream/sample_apps/deepstream-app/deepstream_app_config_parser_yaml.cpp
--- a/apps/deepstream/sample_apps/deepstream-app/deepstream_app_config_parser_yaml.cpp
+++ b/apps/deepstream/sample_apps/deepstream-app/deepstream_app_config_parser_yaml.cpp
@@ -129,6 +129,7 @@ parse_config_file_yaml (NvDsConfig *config, gchar *cfg_file_path)
   std::string sink_str = "sink";
   std::string sgie_str = "secondary-gie";
   std::string msgcons_str = "message-consumer";
+  std::string dewarper_str = "dewarper";
 
   config->source_list_enabled = FALSE;
 
@@ -183,6 +184,9 @@ parse_config_file_yaml (NvDsConfig *config, gchar *cfg_file_path)
     else if (paramKey == "osd") {
       parse_err = !parse_osd_yaml(&config->osd_config, cfg_file_path);
     }
+    else if (paramKey.compare(0, dewarper_str.size(), dewarper_str) == 0) {
+      parse_err = !parse_dewarper_yaml (&config->multi_source_config[0].dewarper_config, cfg_file_path);
+    }
     else if (paramKey == "pre-process") {
       parse_err = !parse_preprocess_yaml(&config->preprocess_config, cfg_file_path);
     }

25. [ALL_ALL_nvdsinfer] Add TensorRT Verbose log

To debug nvinfer-related issues inside gst-nvinfer, we can enable the nvinfer log by setting the environment variable “NVDSINFER_LOG_LEVEL”.

The value can be set to one of the following numbers for different log levels:

0: NVDSINFER_LOG_ERROR
1: NVDSINFER_LOG_WARNING
2: NVDSINFER_LOG_INFO
3: NVDSINFER_LOG_DEBUG

Example for enabling debug log:
export NVDSINFER_LOG_LEVEL=3

When the NVDSINFER_LOG_LEVEL environment variable is not set, the default log level is NVDSINFER_LOG_ERROR.

26.[DeepStream 6.2 Gst-nvstreammux & Gst-nvstreammux New] How to set parameters reasonably to improve the efficiency of nvstreammux in live mode

Gst-nvstreammux:

  • Set the batch-size to the number of sources
  • export NVSTREAMMUX_ADAPTIVE_BATCHING=yes
  • If you want to get high fps, set batched-push-timeout to (1000000 us / maximum fps among the videos)

Gst-nvstreammux New(export USE_NEW_NVSTREAMMUX=yes):

  • Set the batch-size to the number of sources
  • Do not turn off the adaptive-batching parameter
  • Please refer to the basic tuning principles first: Gst-nvstreammux Tuning parameters
  • Set max-same-source-frames to ceil(maximum fps / minimum fps)
  • Set max-num-frames-per-batch for each source to ceil(current fps / minimum fps)
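The tuning rules above can be sketched as a small helper (illustrative Python only; the dictionary keys mirror the plugin property names, and the function name is ours):

```python
import math

def mux_tuning(fps_list):
    """Suggested nvstreammux settings for live sources with mixed frame
    rates, following the tuning rules above."""
    min_fps, max_fps = min(fps_list), max(fps_list)
    return {
        "batch-size": len(fps_list),
        "batched-push-timeout": 1000000 // max_fps,  # microseconds
        "max-same-source-frames": math.ceil(max_fps / min_fps),
        # per-source max-num-frames-per-batch, keyed by each source's fps
        "max-num-frames-per-batch": {fps: math.ceil(fps / min_fps) for fps in fps_list},
    }

print(mux_tuning([15, 25, 30]))
```

For the 15/25/30 fps experiment below, this yields batch-size 3, max-same-source-frames 2, and per-source max-num-frames-per-batch of 1, 2 and 2, matching settings B.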

Experiment: there are 3 live sources, at 15, 25 and 30 fps respectively, in a runtime source addition/deletion scenario. We’ll delete the 25 fps video while the pipeline is running.

Build rtsp server:

  • Install FFmpeg $ sudo apt-get install ffmpeg
  • $docker run --rm -it --network=host aler9/rtsp-simple-server
  • Prepare videos with different frame rates: 15fps.mp4 25fps.mp4 30fps.mp4
  • Open a new terminal and run: $ffmpeg -re -stream_loop -1 -i 15fps.mp4 -c:v copy -an -f rtsp -rtsp_transport tcp rtsp://127.0.0.1:8554/stream0
  • Open a new terminal and run: $ffmpeg -re -stream_loop -1 -i 25fps.mp4 -c:v copy -an -f rtsp -rtsp_transport tcp rtsp://127.0.0.1:8554/stream1
  • Open a new terminal and run: $ffmpeg -re -stream_loop -1 -i 30fps.mp4 -c:v copy -an -f rtsp -rtsp_transport tcp rtsp://127.0.0.1:8554/stream2

Build demo code:
config_mux_source3.txt (2.9 KB)
deepstream_test_rt_src_add_del.c (26.1 KB)
Makefile (2.0 KB)

 $cd /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/
 $git clone https://github.com/NVIDIA-AI-IOT/deepstream_reference_apps.git
 $cd deepstream_reference_apps/runtime_source_add_delete/

Replace files in the directory with attached files
 $cp deepstream_test_rt_src_add_del.c ./deepstream_test_rt_src_add_del.c
 $cp Makefile ./Makefile
 $cp config_mux_source3.txt ./
 $export CUDA_VER=11.8
 $make

Run:

  • ./deepstream-test-rt-src-add-del 0 filesink 1
  • You can add and delete streams by pressing “a” and “d” on the keyboard.

FPS For Streams with Gst-nvstreammux :

  • Set batched-push-timeout to 70000 us > 1000000 us / 15 fps
    1. Start with all streams
     **PERF: 14.03 (14.95)   25.05 (24.98)   25.05 (24.92)
     **PERF: 15.26 (14.95)   24.42 (24.98)   24.42 (24.92)
     **PERF: 14.71 (14.95)   25.51 (24.98)   25.51 (24.92)
     **PERF: 15.04 (14.95)   25.07 (24.98)   25.07 (24.92)
    
    2. Delete the stream source of 25 fps
     **PERF: 14.52 (14.96)   16.59 (23.21)   0.00 (19.20)
     **PERF: 15.53 (14.96)   16.66 (23.19)   0.00 (19.13)
     **PERF: 14.73 (14.96)   16.49 (23.17)   0.00 (19.07)
     **PERF: 14.51 (14.96)   16.58 (23.14)   0.00 (19.00)
    
  • Set batched-push-timeout to 40000 us = 1000000 us / 25 fps
    1. Start with all streams
     **PERF: 15.12 (13.89)   30.24 (28.01)   25.20 (23.92)
     **PERF: 14.94 (13.91)   29.88 (28.03)   24.90 (23.93)
     **PERF: 14.98 (13.91)   29.95 (28.03)   24.32 (23.93)
     **PERF: 15.01 (13.93)   10.32 (27.84)   25.63 (23.95)
    
    2. Delete the stream source of 25 fps
     **PERF: 14.47 (14.27)   24.80 (28.13)   0.00 (21.76)
     **PERF: 15.51 (14.28)   24.84 (28.11)   0.00 (21.61)
     **PERF: 14.95 (14.28)   24.92 (28.09)   0.00 (21.46)
     **PERF: 14.83 (14.29)   24.83 (28.07)   0.00 (21.31)
    
  • Set batched-push-timeout to 20000 us < 1000000 us / 30 fps
    1. Start with all streams
     **PERF: 15.00 (14.82)   29.58 (36.49)   24.81 (25.10)
     **PERF: 14.94 (14.85)   30.09 (35.51)   25.08 (25.09)
     **PERF: 15.07 (14.88)   30.38 (34.65)   25.14 (24.98)
     **PERF: 15.00 (14.89)   29.54 (34.22)   24.78 (25.07)
    
    2. Delete the stream source of 25 fps
     **PERF: 14.97 (14.83)   30.11 (30.67)   0.00 (23.22)
     **PERF: 15.07 (14.83)   30.11 (30.65)   0.00 (22.71)
     **PERF: 14.92 (14.83)   29.71 (30.64)   0.00 (22.21)
     **PERF: 14.95 (14.84)   30.22 (30.62)   0.00 (21.73)
    

FPS For Streams with Gst-nvstreammux New:
A. Plugin Parameters Settings:

  • max-same-source-frames=1
  • 15 fps video: max-num-frames-per-batch=1
  • 25 fps video: max-num-frames-per-batch=1
  • 30 fps video: max-num-frames-per-batch=1
    1. Start with all streams
     **PERF: 15.22 (14.54)   14.00 (14.88)   14.42 (13.84)
     **PERF: 13.89 (14.55)   14.58 (14.86)   14.13 (13.82)
     **PERF: 14.00 (14.54)   14.00 (14.84)   14.42 (13.84)
     **PERF: 15.00 (14.55)   14.59 (14.82)   13.46 (13.85)
    
    2. Delete the stream source of 25 fps
     **PERF: 13.46 (14.52)   15.00 (14.81)   0.00 (13.35)
     **PERF: 15.22 (14.51)   14.00 (14.79)   0.00 (13.07)
     **PERF: 13.89 (14.51)   14.58 (14.77)   0.00 (12.80)
     **PERF: 14.00 (14.50)   14.00 (14.75)   0.00 (12.54)
    

B. Plugin Parameters Settings:

  • max-same-source-frames=2
  • 15 fps video: max-num-frames-per-batch=1
  • 25 fps video: max-num-frames-per-batch=2
  • 30 fps video: max-num-frames-per-batch=2
    1. Start with all streams
     **PERF: 14.79 (14.90)   29.68 (29.84)   25.58 (24.86)
     **PERF: 15.16 (14.90)   30.09 (29.86)   24.32 (24.88)
     **PERF: 14.89 (14.90)   29.87 (29.87)   25.88 (24.90)
     **PERF: 15.09 (14.90)   30.38 (29.87)   24.30 (24.88)
    
    2. Delete the stream source of 25 fps
     **PERF: 15.04 (14.94)   30.08 (29.92)   0.00 (17.51)
     **PERF: 14.94 (14.94)   29.86 (29.92)   0.00 (17.32)
     **PERF: 15.00 (14.94)   29.85 (29.92)   0.00 (17.15)
     **PERF: 15.08 (14.95)   30.54 (29.93)   0.00 (16.97)
    

27.[ALL_ALL_common] Historical DeepStream documents and package links

For the users who are still working on old DeepStream versions, the historical DeepStream documents and packages can be found in the following links:

28. [DSx_All_App] How to connect a USB camera in DeepStream?

28.1 Query the device number by the v4l2-ctl tool
E.g.
$ sudo apt install v4l-utils && v4l2-ctl --list-devices
28.2 Query the supported formats and capabilities of the camera with the v4l2-ctl tool.
E.g. Query the formats and capabilities of the camera whose device number is 2.
$ v4l2-ctl -d /dev/video2 --list-formats-ext
The information may be displayed in the following format
ioctl: VIDIOC_ENUM_FMT
	Type: Video Capture

[0]: 'YUYV' (YUYV 4:2:2)
	Size: Discrete 640x480
		Interval: Discrete 0.033s (30.000 fps)
		Interval: Discrete 0.042s (24.000 fps)
		Interval: Discrete 0.050s (20.000 fps)
		Interval: Discrete 0.067s (15.000 fps)
		Interval: Discrete 0.100s (10.000 fps)
		Interval: Discrete 0.133s (7.500 fps)
		Interval: Discrete 0.200s (5.000 fps)
        ......
[1]: 'H264' (H.264, compressed)
	Size: Discrete 640x480
		Interval: Discrete 0.033s (30.000 fps)
		Interval: Discrete 0.042s (24.000 fps)
		Interval: Discrete 0.050s (20.000 fps)
		Interval: Discrete 0.067s (15.000 fps)
		Interval: Discrete 0.100s (10.000 fps)
		Interval: Discrete 0.133s (7.500 fps)
		Interval: Discrete 0.200s (5.000 fps)
        ......
[2]: 'MJPG' (Motion-JPEG, compressed)
    Size: Discrete 640x480
		Interval: Discrete 0.033s (30.000 fps)
		Interval: Discrete 0.042s (24.000 fps)
		Interval: Discrete 0.050s (20.000 fps)
		Interval: Discrete 0.067s (15.000 fps)
		Interval: Discrete 0.100s (10.000 fps)
		Interval: Discrete 0.133s (7.500 fps)
		Interval: Discrete 0.200s (5.000 fps)
        ......

28.3 Choose one format and capabilities from the step 2 query results.
A capsfilter after v4l2src is necessary if your camera can output videos in different formats, or in different resolutions/framerates with the same format. Set the corresponding capsfilter properties to make the camera output the chosen format and capabilities, and use gst-launch to construct a working pipeline to test the function.
E.g. The following pipeline makes the camera output video at resolution 640x480, format YUY2, 30 fps.

$ gst-launch-1.0 v4l2src device=/dev/video2 ! 'video/x-raw, format=YUY2, width=640, height=480, framerate=30/1'  ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=NV12' ! mux.sink_0  nvstreammux name=mux width=1280 height=720 batch-size=1  ! fakesink

28.3.1 If the camera can output compressed video formats such as “MJPG”, “H264”, etc., and you choose to use a compressed format in the DeepStream (GStreamer) pipeline, you need to add the corresponding video decoder after v4l2src.

E.g. The following pipeline lets the camera output video at resolution 640x480, format jpeg, fps30.
For Jetson

$ gst-launch-1.0 v4l2src device=/dev/video2 ! 'image/jpeg,  width=640, height=480, framerate=30/1' ! nvv4l2decoder mjpeg=true ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=NV12' ! mux.sink_0  nvstreammux name=mux width=1280 height=720 batch-size=1  ! fakesink

For dGPU

$ gst-launch-1.0 v4l2src device=/dev/video2 ! 'image/jpeg,  width=640, height=480, framerate=30/1' ! nvv4l2decoder ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=NV12' ! mux.sink_0  nvstreammux name=mux width=1280 height=720 batch-size=1  ! fakesink

E.g. The following pipeline lets the camera output video at resolution 640x480, format h264, fps30.

$ gst-launch-1.0 v4l2src device=/dev/video2 ! 'video/x-h264, format=avc, width=640, height=480, framerate=30/1' ! nvv4l2decoder ! fakesink 

28.3.2 If none of the camera’s formats and capabilities are supported by nvvideoconvert, there will be a “linking failed” error. You can add a videoconvert before nvvideoconvert to convert the raw data to a format nvvideoconvert supports.

E.g. The following pipeline makes the camera output video at resolution 640x480, format YUY2, 30 fps. videoconvert is used to convert YUY2 to NV12, which is supported by nvvideoconvert.

gst-launch-1.0  v4l2src device=/dev/video2 ! 'video/x-raw, format=YUY2, width=640, height=480, framerate=30/1' ! videoconvert ! 'video/x-raw, format=NV12' ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=NV12'  ! fakesink

28.3.3 Using uridecodebin to connect the camera.
uridecodebin is a GStreamer bin that includes v4l2src and a decoder. A capsfilter can’t be added manually after v4l2src inside it, so the negotiated format and capabilities are unpredictable. We therefore only recommend using v4l2src to connect USB cameras.
E.g. The following pipeline uses uridecodebin to connect the camera; the camera’s output format and capabilities are unknown.

gst-launch-1.0 uridecodebin uri=v4l2:///dev/video2 ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=NV12' ! mux.sink_0 nvstreammux name=mux width=1280 height=720 batch-size=1 ! fakesink

29.[DSx_All_App] Debug Tips for DeepStream Nvinferserver Accuracy Issue
This corresponds to the Debug Tips for DeepStream Accuracy Issue and shows how to set the relevant parameters in nvinferserver.

29.1 Input scale & offset

infer_config {
   preprocess {
     normalize {
        scale_factor: 0.0039215697906911373
        channel_offsets: [0, 0, 0]
     }
   }
}

29.2 Input Order

infer_config { 
  preprocess {
    tensor_order: TENSOR_ORDER_LINEAR
    network_format: IMAGE_FORMAT_RGB
  } 
}

29.3 Dims
Each model also needs a specific config.pbtxt file in its subdirectory for nvinferserver. We set the input-layer and output-layer dims in that file.

input [
  {
    dims: [3, 224, 224]
  }
]
output [
  {
    dims: [6, 1, 1]
  }
]

For the nvinferserver config file:

infer_config { 
  backend {
    inputs: [ {
      dims: [3, 224, 224]
    }]
  }
}

29.4 scale and padding

infer_config { 
  preprocess {
    maintain_aspect_ratio: 1
    symmetric_padding: 1
  } 
}

29.5 inference precision
For the config.pbtxt file:

input [
  {
    data_type: TYPE_FP16
  }
]

For the nvinferserver config file:

infer_config { 
  backend {
    inputs: [ {
      data_type: TYPE_FP16
    }]
  }
}

29.6 threshold

postprocess {
  detection {
    nms {
      confidence_threshold:0.2
      topk:20
      iou_threshold:0.5
    }
  }
}
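To make the three parameters concrete, here is a minimal pure-Python NMS sketch (an illustration of standard non-maximum suppression, not the DeepStream implementation): detections below confidence_threshold are dropped, at most topk survive after sorting by score, and any box overlapping an already-kept box by more than iou_threshold is suppressed.

```python
def iou(a, b):
    # boxes are (x1, y1, x2, y2)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def nms(dets, confidence_threshold=0.2, topk=20, iou_threshold=0.5):
    # dets: list of (box, score); returns kept detections, highest score first
    dets = sorted((d for d in dets if d[1] >= confidence_threshold),
                  key=lambda d: d[1], reverse=True)[:topk]
    kept = []
    for box, score in dets:
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score))
    return kept

dets = [((0, 0, 10, 10), 0.9), ((1, 1, 10, 10), 0.8),
        ((20, 20, 30, 30), 0.7), ((0, 0, 5, 5), 0.1)]
print(nms(dets))  # keeps the 0.9 and 0.7 boxes
```

Here the 0.8 box is suppressed (IoU 0.81 with the kept 0.9 box) and the 0.1 box falls below confidence_threshold.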

30.[DSx_All_App] How to parse tensor output layers in the customized post-processing for nvinfer and nvinferserver?
In nvinferserver gRPC mode, the output layers may arrive in a random order. We suggest using the following method, which looks layers up by name, to parse tensor output layers.

bool NvDsInferParseCustom(std::vector<NvDsInferLayerInfo> const &outputLayersInfo, 
NvDsInferNetworkInfo  const &networkInfo,
NvDsInferParseDetectionParams const &detectionParams, 
std::vector<NvDsInferInstanceMaskInfo> &objectList) {
    auto layerFinder = [&outputLayersInfo](const std::string &name)
        -> const NvDsInferLayerInfo *{
        for (auto &layer : outputLayersInfo) {
            if (layer.layerName && name == layer.layerName) {
                return &layer;
            }
        }
        return nullptr;
    };

    /* take layer names generate_detections and mask_fcn_logits/BiasAdd for example. */

    const NvDsInferLayerInfo *detectionLayer = layerFinder("generate_detections");
    const NvDsInferLayerInfo *maskLayer = layerFinder("mask_fcn_logits/BiasAdd");

    if (!detectionLayer || !maskLayer) {
        std::cerr << "ERROR: some layers missing or unsupported data types "
                << "in output tensors" << std::endl;
        return false;
    }
    ......
}

related topics
[Nvinfer's results are different from nvinferserver]
[Running Yolov5 Model in triton inference server with GRPC mode to work with Deepstream]

31.[DSx_All_App] Use nvurisrcbin plugin to do smart record in Python
The nvurisrcbin plugin encapsulates the smart recording function. Please refer to the complete Python sample test.py.diff (1.9 KB) based on deepstream-test3.py. Below are the three main steps.

31.1 Set parameters for smart record
After creating the nvurisrcbin element, you need to set the smart record parameters before starting the pipeline. The parameters are explained in nvurisrcbin.html.

ele.set_property("smart-record", 2) # enable smart record.
ele.set_property("smart-rec-dir-path", ".") # set record path.
#For more parameters, please run gst-inspect-1.0 nvurisrcbin to check.

31.2 Send signal to start record
The "start-sr" signal has four parameters: gpointer sessionId, guint startTime, guint duration, and gpointer userData. Python can use a capsule object to pass a C gpointer. Here is a patch gpointer_bind.diff (1.4 KB) that binds the function alloc_buffer1, which returns a capsule object.

a=pyds.alloc_buffer1(4)
ele.emit('start-sr', a, 2, 7, None)

31.3 Send signal to stop record
The "stop-sr" signal has one parameter, guint sessionId.

ele.emit('stop-sr', 0)

related topic:
[Smart recording on Azure IOTEDGE]
[Smart record in Python]
[Smart Record in python]
[Using smart-record in python]

32.[DSx_dGPU_App] How to use a display window in the docker on dGPU
Precondition:
1. Make sure that a monitor is connected to your host.
2. Make sure that the driver version on the host is consistent with the version required by DeepStream. You can refer to the dGPU model Platform and OS Compatibility table to get the matching version information.

Steps:
The requirements below shall be met before starting the docker:
1. Set an appropriate value for the DISPLAY variable; you can use the “xdpyinfo | grep display” command on the host to get the id of the display.
2. Execute the command xhost + from the host terminal to allow the docker to launch a display window.

Ex:
$ export DISPLAY=:0
$ xhost +

Note:

If you have multiple cards on your host, make sure to set NVIDIA’s card as the default one to use. There are two ways to configure the graphics card.

1.cli command:
$sudo apt install nvidia-settings
$sudo apt install nvidia-prime
$sudo prime-select nvidia

2.You can refer to the link switch-intel-nvidia-graphics-card-ubuntu to switch the card. Please choose the NVIDIA (Performance Mode) as shown in the following image.

33.[DSx_Jetson_dGPU_Plugin] Dump the inference Inputs in nvinferserver plugin
The nvinferserver plugin and its low-level library are open source since DeepStream 6.2. Here is a method to dump the inference inputs.
33.1 Make sure the nvinferserver low-level library can be built successfully.
The code path is /opt/nvidia/deepstream/deepstream/sources/libs/nvdsinferserver. Please refer to the README for how to build it.

33.2 Install dependencies.

sudo apt-get update
sudo apt-get install libopencv-dev

33.3 Apply patch and build
Please apply patch infer_preprocess.cpp.diff (2.9 KB) to infer_preprocess.cpp and patch Makefile.diff (154 Bytes) to Makefile, then build with the following command line:
export WITH_OPENCV=1 && make
Use the following commands to replace the old library with the new one.

mv /opt/nvidia/deepstream/deepstream/lib/libnvds_infer_server.so /opt/nvidia/deepstream/deepstream/lib/libnvds_infer_server.so_ori
mv libnvds_infer_server.so /opt/nvidia/deepstream/deepstream/lib/libnvds_infer_server.so

After restarting the pipeline, the nvinferserver low-level library will dump the inference inputs to JPEG files.

34.[DS6.4_GLib2.0] How to update the GLib 2.0 manually for DS 6.4
As noted in the Quickstart Guide, there is a bug in the GLib 2.72 version that comes with Ubuntu 22.04 by default. If you want to update it yourself, you can follow the steps below.
34.1 Install dependencies
Docker: nvcr.io/nvidia/deepstream:6.4-triton-multiarch

apt-get update && apt-get install -y \
    pkg-config \
    python3-dev \
    libffi-dev \
    libmount-dev \
    ninja-build \
    wget \
    libgirepository1.0-dev

pip3 install packaging
pip3 install meson==1.2.0

34.2 Get the GLib 2.0 source code and install it
Note: If you are using Python, it is recommended to install version 2.79 or later. It will install the corresponding gobject-introspection together.

wget https://github.com/GNOME/glib/archive/refs/tags/2.79.1.tar.gz
tar -xzvf 2.79.1.tar.gz
cd glib-2.79.1/
rm -r subprojects/gvdb
meson _build
ninja -C _build
ninja -C _build install

34.3 Tell the system to load shared libraries from the new locations

ldconfig 

If that doesn’t work, try export LD_LIBRARY_PATH=/usr/local/lib/x86_64-linux-gnu/

34.4 Run the test demo
glib_version.py (143 Bytes)
Note: First you need to set the GI_TYPELIB_PATH variable to the gobject-introspection you just installed.

export GI_TYPELIB_PATH=/usr/local/lib/x86_64-linux-gnu/girepository-1.0
python3 glib_version.py
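As a quick sanity check before and after the upgrade, here is a small hedged helper (plain Python, no gi dependency; 2.79 is simply the release the steps above install) that compares a dotted GLib version string against that recommendation:

```python
RECOMMENDED = (2, 79)  # the steps above install GLib 2.79.1

def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '2.72.4' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))

def needs_upgrade(installed: str) -> bool:
    """True if the installed GLib is older than the recommended release."""
    return parse_version(installed)[:2] < RECOMMENDED

print(needs_upgrade("2.72.4"))  # stock Ubuntu 22.04 build -> True
print(needs_upgrade("2.79.1"))  # freshly built GLib       -> False
```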

35.How to add custom metadata in a GStreamer native plugin and access it in Python.

Here is an example based on deepstream-test1.py and dsexample.

35.1. Add custom metadata in the dsexample plugin.

diff --git a/sources/gst-plugins/gst-dsexample/gstdsexample.cpp b/sources/gst-plugins/gst-dsexample/gstdsexample.cpp
index d5399c7..764ec0a 100644
--- a/sources/gst-plugins/gst-dsexample/gstdsexample.cpp
+++ b/sources/gst-plugins/gst-dsexample/gstdsexample.cpp
@@ -318,6 +318,76 @@ gst_dsexample_get_property (GObject * object, guint prop_id,
   }
 }
 
+#define NVDS_GST_META_DSEXAMPLE (nvds_get_user_meta_type((char *)"NVIDIA.NVDS_GST_META_DSEXAMPLE"))
+
+typedef struct _DsExampleMeta
+{
+  gchar *timestamp;
+} DsExampleMeta;
+
+/* gst meta copy function set by user */
+static gpointer dsexample_meta_copy_func(gpointer data, gpointer user_data)
+{
+  NvDsUserMeta *user_meta = (NvDsUserMeta *) data;
+  DsExampleMeta *src_meta = (DsExampleMeta *)user_meta->user_meta_data;
+  DsExampleMeta *dst_meta = (DsExampleMeta*)g_malloc0(sizeof(DsExampleMeta));
+  if (src_meta->timestamp == NULL)
+  {
+    g_print("no buffer abort !!!! \n");
+    abort();
+  }
+  memcpy(dst_meta, src_meta, sizeof(DsExampleMeta));
+  dst_meta->timestamp = g_strdup(src_meta->timestamp);
+  return (gpointer)dst_meta;
+}
+
+/* gst meta release function set by user */
+static void dsexample_meta_release_func(gpointer data, gpointer user_data)
+{
+  NvDsUserMeta *user_meta = (NvDsUserMeta *) data;
+  DsExampleMeta *meta = (DsExampleMeta *)user_meta->user_meta_data;
+  if (meta) {
+    if (meta->timestamp) {
+      g_free(meta->timestamp);
+      meta->timestamp = NULL;
+    }
+    g_free(meta);
+    meta = NULL;
+  }
+  user_meta->user_meta_data = NULL;
+}
+
+void attach_frame_custom_metadata (NvDsBatchMeta *batch_meta, NvDsFrameMeta *frame_meta)
+{
+  DsExampleMeta *meta = (DsExampleMeta *)g_malloc0(sizeof(DsExampleMeta));
+  if (meta == NULL)
+  {
+    g_print("no buffer abort !!!! \n");
+    abort();
+  }
+  struct timeval time;
+  gettimeofday(&time, NULL);
+  double cur = (time.tv_sec) * 1000.0;
+  cur += (time.tv_usec) / 1000.0;
+  /* Add dummy metadata */
+  meta->timestamp = (gchar *)g_malloc0(128);
+  snprintf(meta->timestamp, 128, "cur %.3f ms", cur);
+
+  NvDsUserMeta *user_meta =
+            nvds_acquire_user_meta_from_pool (batch_meta);
+  if (user_meta) {
+    user_meta->user_meta_data = (void *) meta;
+    user_meta->base_meta.meta_type = NVDS_GST_META_DSEXAMPLE;
+    user_meta->base_meta.copy_func =
+        (NvDsMetaCopyFunc) dsexample_meta_copy_func;
+    user_meta->base_meta.release_func =
+        (NvDsMetaReleaseFunc) dsexample_meta_release_func;
+    nvds_add_user_meta_to_frame (frame_meta, user_meta);
+  }
+
+  g_print("Attached DsExampleMeta from native ==> %s\n", meta->timestamp);
+}
+
 /**
  * Initialize all resources and start the output thread
  */
@@ -759,6 +829,7 @@ gst_dsexample_transform_ip (GstBaseTransform * btrans, GstBuffer * inbuf)
 #endif
       /* Attach the metadata for the full frame */
       attach_metadata_full_frame (dsexample, frame_meta, scale_ratio, output, i);
+      attach_frame_custom_metadata(batch_meta, frame_meta);
       i++;
       free (output);
     }

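The key point in the copy/release pair above is ownership: the copy function must duplicate the timestamp with g_strdup() so that each buffer's metadata owns its own string. A pure-Python analogy of that discipline (hypothetical names, not the pyds API):

```python
from dataclasses import dataclass

@dataclass
class DsExampleMeta:
    """Python stand-in for the C struct above."""
    timestamp: str

def copy_meta(src: DsExampleMeta) -> DsExampleMeta:
    # Hand out an independent copy of the whole payload. In C the string is a
    # heap buffer, so the copy function must g_strdup() it; otherwise the two
    # metas would share one pointer and the release function would free it twice.
    return DsExampleMeta(timestamp=src.timestamp)

src = DsExampleMeta(timestamp="cur 1712056757649.727 ms")
dst = copy_meta(src)
print(dst == src)      # -> True  (same content)
print(dst is src)      # -> False (independent meta objects)
```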

35.2. Modify the corresponding Python bindings.

diff --git a/bindings/src/bindnvdsmeta.cpp b/bindings/src/bindnvdsmeta.cpp
index 95b0585..2aec39f 100644
--- a/bindings/src/bindnvdsmeta.cpp
+++ b/bindings/src/bindnvdsmeta.cpp
@@ -22,6 +22,13 @@
 
 namespace py = pybind11;
 
+#define NVDS_GST_META_DSEXAMPLE (nvds_get_user_meta_type((char *)"NVIDIA.NVDS_GST_META_DSEXAMPLE"))
+
+typedef struct _DsExampleMeta
+{
+  gchar *timestamp;
+} DsExampleMeta;
+
 namespace pydeepstream {
 
     void bindnvdsmeta(py::module &m) {
@@ -77,8 +84,25 @@ namespace pydeepstream {
                        pydsdoc::nvmeta::MetaTypeDoc::NVDS_START_USER_META)
                 .value("NVDS_FORCE32_META", NVDS_FORCE32_META,
                        pydsdoc::nvmeta::MetaTypeDoc::NVDS_FORCE32_META)
+                .value("NVDS_GST_META_DSEXAMPLE", NVDS_GST_META_DSEXAMPLE)
                 .export_values();
 
+        py::class_<DsExampleMeta>(m, "DsExampleMeta")
+                .def(py::init<>())
+                .def_property("timestamp",
+                              STRING_PROPERTY(DsExampleMeta,
+                                              timestamp))
+                .def("cast",
+                     [](void *data) {
+                         return (DsExampleMeta *) data;
+                     },
+                     py::return_value_policy::reference)
+
+                .def("cast",
+                     [](size_t data) {
+                         return (DsExampleMeta *) data;
+                     },
+                     py::return_value_policy::reference);
 
         py::class_<NvDsComp_BboxInfo>(m, "NvDsComp_BboxInfo",
                                       pydsdoc::nvmeta::NvDsComp_BboxInfoDoc::descr)

35.3.Access the custom metadata in Python code.

diff --git a/apps/deepstream-test1/deepstream_test_1.py b/apps/deepstream-test1/deepstream_test_1.py
index 861cefc..a3e2809 100755
--- a/apps/deepstream-test1/deepstream_test_1.py
+++ b/apps/deepstream-test1/deepstream_test_1.py
@@ -47,6 +47,7 @@ def osd_sink_pad_buffer_probe(pad,info,u_data):
     # Note that pyds.gst_buffer_get_nvds_batch_meta() expects the
     # C address of gst_buffer as input, which is obtained with hash(gst_buffer)
     batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
+
     l_frame = batch_meta.frame_meta_list
     while l_frame is not None:
         try:
@@ -82,6 +83,19 @@ def osd_sink_pad_buffer_probe(pad,info,u_data):
             except StopIteration:
                 break
 
+        l_user=frame_meta.frame_user_meta_list
+        while l_user is not None:
+            try:
+                user_meta = pyds.NvDsUserMeta.cast(l_user.data)
+                if (user_meta and user_meta.base_meta.meta_type == pyds.NvDsMetaType.NVDS_GST_META_DSEXAMPLE):
+                    dsexample_meta = pyds.DsExampleMeta.cast(user_meta.user_meta_data)
+                    print(f"access dsmeta from python {pyds.get_string(dsexample_meta.timestamp)}")
+            except StopIteration:
+                break
+            try: 
+                l_user=l_user.next
+            except StopIteration:
+                break
         # Acquiring a display meta object. The memory ownership remains in
         # the C code so downstream plugins can still access it. Otherwise
         # the garbage collector will claim it when this probe function exits.
@@ -167,6 +181,10 @@ def main(args):
     if not pgie:
         sys.stderr.write(" Unable to create pgie \n")
 
+    dsexample = Gst.ElementFactory.make("dsexample", "dsexample")
+    if not dsexample:
+        sys.stderr.write(" Unable to create dsexample \n")
+
     # Use convertor to convert from NV12 to RGBA as required by nvosd
     nvvidconv = Gst.ElementFactory.make("nvvideoconvert", "convertor")
     if not nvvidconv:
@@ -186,7 +204,8 @@ def main(args):
             sys.stderr.write(" Unable to create nv3dsink \n")
     else:
         print("Creating EGLSink \n")
-        sink = Gst.ElementFactory.make("nveglglessink", "nvvideo-renderer")
+        # sink = Gst.ElementFactory.make("nveglglessink", "nvvideo-renderer")
+        sink = Gst.ElementFactory.make("fakesink", "nvvideo-renderer")
         if not sink:
             sys.stderr.write(" Unable to create egl sink \n")
 
@@ -206,6 +225,7 @@ def main(args):
     pipeline.add(decoder)
     pipeline.add(streammux)
     pipeline.add(pgie)
+    pipeline.add(dsexample)
     pipeline.add(nvvidconv)
     pipeline.add(nvosd)
     pipeline.add(sink)
@@ -225,7 +245,8 @@ def main(args):
         sys.stderr.write(" Unable to get source pad of decoder \n")
     srcpad.link(sinkpad)
     streammux.link(pgie)
-    pgie.link(nvvidconv)
+    pgie.link(dsexample)
+    dsexample.link(nvvidconv)
     nvvidconv.link(nvosd)
     nvosd.link(sink)

35.4.Recompile and install dsexample and pyds, and then run deepstream_test1.py. You will see the following log in the output:

Attached DsExampleMeta from native ==> cur 1712056757649.727 ms
access dsmeta from python cur 1712056757649.727 ms
Frame Number=303 Number of Objects=31 Vehicle_count=26 Person_count=5
Attached DsExampleMeta from native ==> cur 1712056757651.504 ms
access dsmeta from python cur 1712056757651.504 ms
Frame Number=304 Number of Objects=27 Vehicle_count=22 Person_count=5
Attached DsExampleMeta from native ==> cur 1712056757653.223 ms
access dsmeta from python cur 1712056757653.223 ms
Frame Number=305 Number of Objects=29 Vehicle_count=23 Person_count=6
Attached DsExampleMeta from native ==> cur 1712056757654.953 ms
access dsmeta from python cur 1712056757654.953 ms
Frame Number=306 Number of Objects=29 Vehicle_count=24 Person_count=5
Attached DsExampleMeta from native ==> cur 1712056757656.722 ms
access dsmeta from python cur 1712056757656.722 ms
Frame Number=307 Number of Objects=29 Vehicle_count=25 Person_count=4
Attached DsExampleMeta from native ==> cur 1712056757658.446 ms
access dsmeta from python cur 1712056757658.446 ms

36.[DSx_All_App] How to use the Nsight Systems CLI to profile a DeepStream app from the command line.

DeepStream plugins that can be profiled with this tool:

Plugin              Tags
nvv4l2-decoder      Call to NV12_to_NV12_cutex, Call to cudaEventCreateWithFlags, …
nvstreammux         stream-muxer_acquireBufferFromPool(Batch=0), stream-muxer_collectingBuffers(Batch=0), …
nvinfer             buffer_process batch_num=1, …
nvinferserver       buffer_process batch_num=1, …
nvdewarper          nvdewarper0_(Frame=1)_Scale, …
nvdsosd             nv-onscreendisplay_(Frame=0), …
nvmultistreamtiler  tiled_display_tiler_(Frame=0), …
nvtracker           (null)_nvtracker_convert_buffer(Frame=0), …

How to use the tool (taking deepstream-test1 as an example):

  1. Make sure you have Nsight Systems installed on your system. You can download and install it from the following page, Nsight Systems - Get Started | NVIDIA Developer, according to your platform.

  2. Use NVTX to trace any CUDA kernel functions you wrote yourself, if necessary.

  • Add #include “nvtx3/nvToolsExt.h” in the source code.
  • Add the compiler flag “-ldl” in the Makefile
  • Add calls to the NVTX API functions below.
nvtxRangePush("my_cuda_kernel");
//your code
nvtxRangePop();
  3. Run the command below; it will generate a report file called nsys_report.nsys-rep.
$nsys profile -w true -t "cuda,cudnn,osrt,nvtx" -o ./nsys_report --cuda-memory-usage=true ./deepstream-test1-app dstest1_config.yml
  4. Use the Nsight Systems app to open nsys_report on your Windows system. As shown in the image below, right-click the GstNvinfer row and choose “Show in Events View”; the detailed information is displayed at the bottom.