• Hardware Platform (Jetson / GPU)
Jetson Orin Nano
• DeepStream Version
7.1
• JetPack Version (valid for Jetson only)
6.2
• TensorRT Version
latest
• NVIDIA GPU Driver Version (valid for GPU only)
Latest installed by SDK Manager
• Issue Type( questions, new requirements, bugs)
Questions, Performance Issue
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
This pertains to the performance issue (#1 below):
- Setup: Create a primary Python application that manages child processes with the multiprocessing module. Set the start method early with multiprocessing.set_start_method('spawn', force=True).
- DeepStream code: Package a separate DeepStream Python project (which runs correctly standalone) as an installable package (pip install -e). The package contains a class (DeepStreamPipeline) with a .run() method that initializes and runs a standard DeepStream pipeline (e.g., based on deepstream-test1, or a custom pipeline that reads from a source, performs inference and tracking, and produces output).
- Launch: From the primary (inference) application, import the DeepStreamPipeline class. Define a simple wrapper function that instantiates the class and calls its .run() method, then launch that wrapper as the target of a multiprocessing.Process.
- (Optional) Sibling process: Launch another simple multiprocessing.Process from the inference app (e.g., simulating camera management; it can be mostly idle or do minimal work).
- Observation: Monitor the FPS or visual output of the DeepStream pipeline running in its child process, and compare it against running the same DeepStream Python code/config directly (standalone).
- Expected result (standalone): Smooth, consistent FPS.
- Actual result (via multiprocessing spawn): Intermittent performance “hiccups”. The pipeline runs at the expected speed for a period, then slows down significantly, then recovers, and the cycle repeats. nvidia-smi shows fluctuating GPU utilization during these periods.
(The configuration files are standard DeepStream configs: a primary detector (e.g., a TensorRT engine built from ONNX), NvTracker, and a sink. Let me know if specific config snippets are needed.)
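For reference, the launch structure described above can be sketched in plain Python. DeepStreamPipeline here is a dummy stand-in for the real class from my package (the names mirror the description above but the body is illustrative only), so the process structure can be exercised without DeepStream installed:

```python
import multiprocessing as mp

class DeepStreamPipeline:
    """Stand-in for the real DeepStream pipeline class described above."""
    def run(self):
        # In the real project this builds and runs the GStreamer/DeepStream
        # pipeline; here it just returns a marker so the process structure
        # can be tested in isolation.
        return "pipeline finished"

def run_pipeline(result_queue):
    # Wrapper executed in the child process: instantiate the pipeline,
    # run it, and report the result back to the parent.
    pipeline = DeepStreamPipeline()
    result_queue.put(pipeline.run())

def main():
    # 'spawn' must be set before any Process or Queue is created.
    mp.set_start_method("spawn", force=True)
    q = mp.Queue()
    p = mp.Process(target=run_pipeline, args=(q,))
    p.start()
    result = q.get()
    p.join()
    return result

if __name__ == "__main__":
    print(main())
```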
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)
N/A - Seeking advice/understanding for existing functionality and packaging.
Detailed Questions:
I’m developing an application where a primary Python “inference” process manages a separate DeepStream pipeline, also written in Python. I’ve successfully integrated the DeepStream codebase as an editable package into the inference environment.
I’m encountering an issue and I’d appreciate help with:
1. Performance Hiccups/Fluctuations (as described in “How to reproduce”):
- Background: I switched to multiprocessing.set_start_method('spawn', force=True) to resolve malloc(): unaligned tcache chunk detected errors, likely related to fork() and GPU resource inheritance. This resolved the crashes.
- Problem: When launched via multiprocessing.Process with the spawn method, the DeepStream pipeline exhibits the intermittent slowdowns described above, which do not occur when the same DeepStream Python code runs standalone.
- Question (hiccups): What are common causes of this kind of intermittent slowdown when running a GPU-intensive DeepStream Python application in a multiprocessing (spawn) environment alongside other Python processes? Could it be related to spawn overhead at runtime, inefficient scheduling between the Python processes contending for the GIL or other system resources, subtle GPU context-switching issues triggered by the parent/sibling processes, or IPC bottlenecks (though current data transfer is minimal)? Any suggestions on how to further diagnose or mitigate this hiccup behavior?
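To help narrow down where the stalls occur, one diagnostic I have been considering is logging inter-frame gaps inside the child process, e.g. calling a small detector once per buffer from a GStreamer pad probe. This is a minimal pure-Python sketch (the class and its parameters are my own illustration, not any DeepStream API); if the recorded gaps spike while nvidia-smi utilization drops, the stall is upstream of the GPU (scheduling/IPC); if gaps spike while utilization stays high, it points at GPU contention:

```python
import time
from collections import deque

class HiccupDetector:
    """Call tick() once per frame; flags frames whose inter-arrival gap is
    far above the recent average. Illustrative sketch only."""

    def __init__(self, window=120, factor=3.0):
        self.gaps = deque(maxlen=window)  # recent inter-frame gaps (seconds)
        self.factor = factor              # gap must exceed factor * average
        self.last = None                  # timestamp of the previous frame
        self.hiccups = []                 # (timestamp, gap) of detected stalls

    def tick(self, now=None):
        now = time.monotonic() if now is None else now
        if self.last is not None:
            gap = now - self.last
            if self.gaps and gap > self.factor * (sum(self.gaps) / len(self.gaps)):
                self.hiccups.append((now, gap))
            self.gaps.append(gap)
        self.last = now
```

In the real pipeline, tick() would be invoked from a buffer probe on, say, the tracker's src pad, and the hiccups list dumped periodically to correlate with nvidia-smi output.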