Nsight Systems output for DeepStream

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) GPU
• DeepStream Version 6.2
• TensorRT Version 8.5.2.2
• NVIDIA GPU Driver Version (valid for GPU only) 525.125.06
• Issue Type( questions, new requirements, bugs)

I have a question regarding the Nsight Systems output that I used to inspect DeepStream. In the rows for the GstNvInfer components, I noticed that DeepStream applies some form of concurrency: a second batch starts processing before the first batch is done. For example, batch_num 2480, 2481 and 2482 are being processed concurrently:



However, zooming out of the output I noticed some gaps in the third line of GstNvInfer: UID=1:


which means that less concurrency is applied, I assume.

My questions:

  1. Why are there sometimes 3 concurrent processing lines for the GstNvInfer: UID=1 component and other times 2 lines? Does this mean that other parts of the pipeline need more intense processing, so the GstNvInfer: UID=1 component gets less GPU time?
  2. Does DeepStream run as one process with multiple threads, or as multiple processes with multiple threads each?
  3. How would you recommend looking for bottlenecks in the Nsight Systems output? In one tutorial, I found that if the CUDA HW row is a full line without any gaps, the GPU is fully utilized and no further optimizations can be made, which is my case:

Thank you in advance!
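
Questions 1 and 3 can be put in numbers by taking the start and end timestamps of the ranges on a timeline row (for example the batch_num ranges, or the kernels on the CUDA HW row) and computing how many overlap at any instant and where idle gaps occur. The following is a minimal sketch, not part of DeepStream or Nsight Systems. The function names `max_concurrency` and `idle_gaps` and the sample timestamps are hypothetical, and it assumes the intervals have already been exported from the report by some other means.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in any consistent time unit


def max_concurrency(intervals: List[Interval]) -> int:
    """Return the largest number of intervals that overlap at any instant."""
    events = []
    for start, end in intervals:
        events.append((start, 1))   # a range opens
        events.append((end, -1))    # a range closes
    # Sort by time; at equal timestamps, closes (-1) come before opens (+1),
    # so back-to-back ranges are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


def idle_gaps(intervals: List[Interval], min_gap: float = 0.0) -> List[Interval]:
    """Return the periods longer than min_gap in which no interval is active."""
    gaps = []
    covered_until = None
    for start, end in sorted(intervals):
        if covered_until is not None and start - covered_until > min_gap:
            gaps.append((covered_until, start))
        covered_until = end if covered_until is None else max(covered_until, end)
    return gaps


# Hypothetical batch ranges in milliseconds, made up for illustration only.
batches = [(0.0, 10.0), (4.0, 14.0), (8.0, 18.0), (25.0, 35.0)]
print(max_concurrency(batches))  # 3
print(idle_gaps(batches))        # [(18.0, 25.0)]
```

Running the same two functions over the CUDA HW row would show whether it really has no gaps, and running them over the GstNvInfer ranges would show when the concurrency drops from 3 to 2.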

1/3. Which sample are you testing? What is your nsys command line?
2. DeepStream runs as one process with multiple threads. The nvinfer plugin and its low-level library are open source; please refer to these design references.
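
The one-process, multiple-threads answer can be checked on a Linux host by reading the `Threads:` field of `/proc/<pid>/status` for the running deepstream-app process. This is a minimal sketch assuming a Linux `/proc` filesystem. The helper names `parse_thread_count` and `thread_count` are hypothetical, and the sample status text (including the thread count) is made up so the snippet runs without a live pipeline.

```python
import re


def parse_thread_count(status_text: str) -> int:
    """Extract the value of the 'Threads:' field from /proc/<pid>/status text."""
    match = re.search(r"^Threads:\s*(\d+)\s*$", status_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'Threads:' field found")
    return int(match.group(1))


def thread_count(pid: int) -> int:
    """Return the number of threads of a running process (Linux only)."""
    with open(f"/proc/{pid}/status", encoding="utf-8") as f:
        return parse_thread_count(f.read())


# Hypothetical excerpt of a status file; the values are invented for illustration.
sample = "Name:\tdeepstream-app\nPid:\t4242\nThreads:\t37\n"
print(parse_thread_count(sample))  # 37
```

On a real system, `thread_count(pid)` with the PID of deepstream-app would return the live thread count of that single process.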

1/3. I run DeepStream in a Docker container, so inside the container I ran /nsys/bin/nsys profile deepstream-app -c config.txt, then copied the generated report.nsys-rep back to the host and opened it with Nsight Systems.
I tested with deepstream-app, and I also have the dsexample plugin in my pipeline.
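
For reference, the command above can also be assembled with the trace sources and output name spelled out explicitly, so the report always contains the NVTX ranges alongside the CUDA activity. This is only a sketch: the helper `build_nsys_command` is hypothetical, the `--trace` and `--output` values are assumptions rather than settings used in this thread, and the nsys path is the one quoted in the post.

```python
import shlex
from typing import List


def build_nsys_command(app_args: List[str],
                       nsys_bin: str = "/nsys/bin/nsys",
                       traces: str = "cuda,nvtx,osrt",
                       output: str = "report") -> List[str]:
    """Build an 'nsys profile' argv list for the given application command."""
    argv = [
        nsys_bin,
        "profile",
        f"--trace={traces}",   # assumed selection: CUDA API, NVTX ranges, OS runtime
        f"--output={output}",  # assumed base name of the generated .nsys-rep file
    ]
    return argv + app_args


cmd = build_nsys_command(["deepstream-app", "-c", "config.txt"])
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` inside the container.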

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

Are there two nvinfer instances in one thread? What is the whole media pipeline? Could you provide a complete screenshot or a short nsys-rep file? Thanks!
After testing deepstream-test1, I did not see the problem above.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.