• Hardware Platform: Jetson Orin NX 8GB
• DeepStream Version: 7.0
• JetPack Version: 6.0
• TensorRT Version: 8.6.2.3
• Issue Type: Bug
I have developed a C++ program that executes a GStreamer pipeline that performs object detection from multiple camera sources using the gst-nvinfer plugin and tracks detected objects using the gst-nvtracker plugin.
Occasionally, when memory usage on the system is very high, the program hangs while starting the pipeline.
After attaching GDB to the running process, I found that it was stuck in the call to gst_element_set_state() that sets the pipeline to the PLAYING state.
Inspecting the call stack revealed that the actual blocking point is a std::condition_variable::wait() call inside the libnvdsgst_tracker.so plugin.
After rebuilding this plugin with debug symbols, I traced the issue to the following call chain in the gst-nvtracker sources (DeepStream 7.0):
The program calls NvTrackerProc::deInit() at nvtracker_proc.cpp:313 because initConvBufPool() returns false:
    /** Initialize the buffer pool based on config */
    if (m_Config.inputTensorMeta == false){
      ret = initConvBufPool();
      if (!ret) {
        LOG_ERROR("gstnvtracker: Failed to initilaize surface transform buffer pool.\n");
        deInit();
        return false;
      }
    }
The hang occurs in NvTrackerProc::deInit() at nvtracker_proc.cpp:424, where the caller waits indefinitely on the m_BufQueueCond condition variable:
    void NvTrackerProc::deInit()
    {
      /** Clear out all pending process requests and return surface buffer
       * and notify all threads waiting for these requests */
      unique_lock<mutex> lkProc(m_ProcQueueLock);
      if (m_Config.numTransforms > 0 && m_Config.inputTensorMeta == false) {
        while (!m_ConvBufMgr.isQueueFull())
        {
          /** printf("m_ConvBufMgr.getFreeQueueSize() %d, m_ConvBufMgr.getActualPoolSize() %d\n",
           * m_ConvBufMgr.getFreeQueueSize(), m_ConvBufMgr.getActualPoolSize()); */
          m_BufQueueCond.wait(lkProc);
        }
      }
      /* OMITTED */
    }
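To illustrate my reading of the hang, here is a minimal standalone sketch of the same wait pattern. It is not the DeepStream code: `FakeBufPool`, `DrainResult`, and `drainWithTimeout` are hypothetical names, and the sketch assumes that a pool which was never created never reports a full queue (which is consistent with the observed behavior, but I have not confirmed it in the real `m_ConvBufMgr`). It shows two possible mitigations: skipping the wait when the pool was never created, and bounding the wait with `wait_for` so a missed notification cannot block forever.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for m_ConvBufMgr; NOT the real DeepStream class.
struct FakeBufPool {
    bool created = false;       // false models a failed initConvBufPool()
    std::size_t poolSize = 0;   // buffers allocated in the pool
    std::size_t freeCount = 0;  // buffers currently back in the free queue

    // Assumption: a pool that was never created never reports "full".
    bool isQueueFull() const { return created && freeCount == poolSize; }
};

enum class DrainResult { Skipped, Drained, TimedOut };

// Waits until every buffer has been returned to the pool, but gives up
// after `timeout` instead of blocking forever.
DrainResult drainWithTimeout(FakeBufPool& pool,
                             std::mutex& m,
                             std::condition_variable& cv,
                             std::chrono::milliseconds timeout,
                             bool skipIfNotCreated)
{
    assert(timeout.count() >= 0);
    std::unique_lock<std::mutex> lk(m);

    // Mitigation 1: nothing was allocated, so there is nothing to wait for.
    if (skipIfNotCreated && !pool.created) {
        return DrainResult::Skipped;
    }

    // Mitigation 2: bounded wait. wait_for with a predicate returns the
    // predicate's value at the time it stops waiting.
    const bool full = cv.wait_for(lk, timeout,
                                  [&pool] { return pool.isQueueFull(); });
    return full ? DrainResult::Drained : DrainResult::TimedOut;
}
```

With a default-constructed `FakeBufPool` (pool never created) and no thread ever notifying `cv`, an unbounded `wait()` in this model would block forever, which matches what I see in GDB. In the sketch, the same situation yields `TimedOut` when the guard is disabled and `Skipped` when it is enabled.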
You can reproduce the issue by commenting out line nvtracker_proc.cpp:310 and setting ret=false, which skips the call to initConvBufPool() and simulates a failure.
What factors could cause the initConvBufPool() function to fail?
What solution do you suggest for fixing this issue?