• Hardware Platform (Jetson / GPU)
• DeepStream Version
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type (questions, new requirements, bugs)
I’m trying to profile a pipeline that uses the gst-nvtracker plugin with the nvds_nvmultiobjecttracker.so library, using Nsight Systems (2023.1.1). It seems that the tracker library includes calls to cudaProfilerStart/Stop.
This unfortunately results in significant profiler overhead: on the timeline, a “CUDA profiling data flush” is triggered in each nvtracker batch.
Is there a way to remove the calls to cudaProfilerStart/Stop, perhaps by some argument/ENV variable?
I have also asked on the forum whether it is possible to bypass this on the nsys side (thread: “Excessive CUDA profiling data flush”).
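One workaround that might apply here, as an untested sketch: interpose no-op versions of cudaProfilerStart/Stop with LD_PRELOAD so the calls never reach the CUDA runtime. This assumes the tracker library resolves these symbols dynamically from a shared libcudart; if the CUDA runtime is statically linked into the library, the shim has no effect. The file and library names below are hypothetical, and `int` stands in for `cudaError_t` (0 is `cudaSuccess`) to keep the file free of CUDA headers.

```c
/* noprofiler_shim.c (hypothetical name): LD_PRELOAD shim that turns the
 * CUDA profiler control calls into no-ops.
 *
 * Assumption: the calling library looks up cudaProfilerStart/Stop through
 * the dynamic linker. With a statically linked CUDA runtime this shim is
 * never reached.
 *
 * Build:
 *   gcc -shared -fPIC -o libnoprofiler.so noprofiler_shim.c
 * Run (same nsys command as usual, with the shim preloaded):
 *   LD_PRELOAD=$PWD/libnoprofiler.so nsys profile -t cuda,nvtx python <app> <uri>
 */

/* The real functions return cudaError_t, an int-sized enum; 0 is cudaSuccess. */
int cudaProfilerStart(void)
{
    return 0; /* pretend the call succeeded, do nothing */
}

int cudaProfilerStop(void)
{
    return 0; /* pretend the call succeeded, do nothing */
}
```

If the flush still appears on the timeline with the shim preloaded, the calls are most likely bound statically and would have to be removed inside the library itself.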
• How to reproduce the issue?
Running an example from the NVIDIA-AI-IOT/deepstream_python_apps repo on GitHub (DeepStream SDK Python bindings and sample applications) shows this to a lesser extent. Since this demo app includes a graphical output, the flush isn’t blocking anything, but it can still be observed on the timeline:
nsys profile -t cuda,nvtx python deepstream_nvdsanalytics.py file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4