[DeepStream 7.0] Program stuck when starting pipeline using gst-nvtracker

tognirob · May 30, 2025, 1:32pm

• Hardware Platform: Jetson Orin NX 8GB
• DeepStream Version: 7.0
• JetPack Version: 6.0
• TensorRT Version: 8.6.2.3
• Issue Type: Bug

I have developed a C++ program that executes a GStreamer pipeline that performs object detection from multiple camera sources using the gst-nvinfer plugin and tracks detected objects using the gst-nvtracker plugin.

Occasionally, when memory usage on the system is very high, the program gets stuck while starting the pipeline.

After attaching GDB to the running process, I found that the program is stuck in the call to gst_element_set_state() that sets the pipeline to the PLAYING state.
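
As an application-side guard, the blocking call could be run on a worker thread with a deadline, so that the hang is at least detected instead of freezing the main thread. This is only a minimal sketch: completesWithin is a hypothetical helper, not part of GStreamer or DeepStream, and the choice of timeout is an assumption.

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <thread>
#include <utility>

// Hypothetical helper: runs `blockingCall` on a detached worker thread and
// waits up to `timeout` for it to return. Returns true if it finished in time.
// On timeout the worker stays blocked in the background, so the realistic
// recovery is to log the condition and restart the whole process.
// The callable must not throw, because the worker thread has no handler.
inline bool completesWithin(std::function<void()> blockingCall,
                            std::chrono::milliseconds timeout) {
  std::promise<void> done;
  std::future<void> finished = done.get_future();
  std::thread([call = std::move(blockingCall), p = std::move(done)]() mutable {
    call();         // may block for a long time, or forever
    p.set_value();  // signals completion to the waiting caller
  }).detach();
  return finished.wait_for(timeout) == std::future_status::ready;
}
```

In the application this would wrap the state change, for example completesWithin([&] { gst_element_set_state(pipeline, GST_STATE_PLAYING); }, std::chrono::seconds(10)), where pipeline is the application's own pipeline element. It does not unblock the stuck thread; it only lets the caller notice the hang.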

Following the call stack upward revealed the actual blocking point is a std::condition_variable::wait() call inside the libnvdsgst_tracker.so plugin.

After rebuilding this plugin with debug symbols, I traced the issue to the following call chain in the gst-nvtracker sources (DeepStream 7.0):

The program calls NvTrackerProc::deInit() at nvtracker_proc.cpp:313 because initConvBufPool() returns false:

/** Initialize the buffer pool based on config */
if (m_Config.inputTensorMeta == false){
  ret = initConvBufPool();
  if (!ret) {
    LOG_ERROR("gstnvtracker: Failed to initilaize surface transform buffer pool.\n");
    deInit();
    return false;
  }
}

The hang occurs in NvTrackerProc::deInit() at nvtracker_proc.cpp:424, where the caller waits indefinitely on the m_BufQueueCond condition variable:

void NvTrackerProc::deInit()
{
  /** Clear out all pending process requests and return surface buffer
    * and notify all threads waiting for these requests */
  unique_lock<mutex> lkProc(m_ProcQueueLock);
  if (m_Config.numTransforms > 0 && m_Config.inputTensorMeta == false) {
    while (!m_ConvBufMgr.isQueueFull())
    {
      /** printf("m_ConvBufMgr.getFreeQueueSize() %d, m_ConvBufMgr.getActualPoolSize() %d\n",
        * m_ConvBufMgr.getFreeQueueSize(), m_ConvBufMgr.getActualPoolSize()); */
      m_BufQueueCond.wait(lkProc);
    }
  }
  /* OMITTED */
}
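
One plausible reading is that, when the pool was never populated, the free queue can never become full, so nothing ever signals the condition variable. The sketch below models that pattern with a toy class and shows one possible guard: skip the wait when initialization failed, and bound it otherwise. ToyProc and all of its members are hypothetical stand-ins; this is not the DeepStream code and not necessarily the fix NVIDIA would apply.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical pool size for the toy model.
constexpr std::size_t kPoolSize = 4;

// Hypothetical stand-in for the tracker's processing object.
class ToyProc {
public:
  // Simulates pool setup; on failure no buffer is ever queued.
  bool init(bool simulateFailure) {
    std::lock_guard<std::mutex> lk(m_lock);
    m_poolReady = !simulateFailure;
    m_freeCount = m_poolReady ? kPoolSize : 0;
    return m_poolReady;
  }

  // Borrows one buffer; returns false if none is free.
  bool acquire() {
    std::lock_guard<std::mutex> lk(m_lock);
    if (m_freeCount == 0) return false;
    --m_freeCount;
    return true;
  }

  // Returns one buffer and wakes a teardown that may be waiting for it.
  void release() {
    {
      std::lock_guard<std::mutex> lk(m_lock);
      assert(m_freeCount < kPoolSize);
      ++m_freeCount;
    }
    m_cond.notify_all();
  }

  // Guarded teardown: waits for outstanding buffers only if the pool was
  // actually initialized, and never longer than `timeout`.
  // Returns false if buffers were still outstanding when the wait expired.
  bool deInit(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lk(m_lock);
    bool drained = true;
    if (m_poolReady) {
      drained = m_cond.wait_for(
          lk, timeout, [this] { return m_freeCount == kPoolSize; });
    }
    m_poolReady = false;
    m_freeCount = 0;
    return drained;
  }

private:
  std::mutex m_lock;
  std::condition_variable m_cond;
  bool m_poolReady = false;
  std::size_t m_freeCount = 0;
};
```

With this guard, a deInit() call that follows a failed init() returns immediately, while a teardown with a buffer still in flight reports the timeout instead of blocking forever.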

You can reproduce the issue by commenting out line nvtracker_proc.cpp:310 and setting ret = false, which skips the call to initConvBufPool() and simulates a failure condition.

What factors could cause the initConvBufPool() function to fail?

What solution do you suggest for fixing this issue?

junshengy · June 3, 2025, 3:39am

The hang is probably triggered by insufficient memory. You can check why ConvBufManager::init failed.
My guess is that it is caused by an NvBufSurfaceCreate failure:

/** Create the buffers. The proper way is to set number of buffer sets as a fixed number. */
for (uint32_t setInd = 0; setInd < MAX_BUFFER_POOL_SIZE; setInd++)
{
  NvBufSurface *pNewBuf = nullptr;
  int ret = NvBufSurfaceCreate(&pNewBuf, batchSize, &bufferParam);
  if (ret < 0)
  {
    LOG_ERROR("gstnvtracker: Got %d creating nvbufsurface\n", ret);
    deInit();
    return false;
  }
  /* OMITTED */
}

The best course of action might be to move to a device with more memory, since the Orin NX 8GB is a memory-limited device. Otherwise, try increasing the swap space.
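
To see how close the system is to running out of memory before the pipeline starts, the application could read MemAvailable from /proc/meminfo. The following is a minimal sketch: both function names are hypothetical, and any threshold applied to the result is an application-specific assumption. The value is only a rough indicator and does not guarantee that surface allocation or mapping will succeed.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the MemAvailable value in kB from text in
// /proc/meminfo format, or -1 if the field is not present.
inline long long parseMemAvailableKb(std::istream& in) {
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    std::string key;
    long long value = 0;
    // Each line looks like "MemAvailable:    2345678 kB".
    if (fields >> key >> value && key == "MemAvailable:") {
      return value;
    }
  }
  return -1;
}

// Reads the live value on Linux; returns -1 if the file cannot be read.
inline long long readMemAvailableKb() {
  std::ifstream meminfo("/proc/meminfo");
  return parseMemAvailableKb(meminfo);
}
```

Logging readMemAvailableKb() right before the state change to PLAYING would show whether the hangs correlate with low available memory.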

tognirob · June 9, 2025, 5:47pm

Thank you for your reply.
As I pointed out in my post, if the call to ConvBufManager::init() fails, the calling thread gets stuck on m_BufQueueCond.wait().
This is a bug. Will you provide a fix for this in the next release?

junshengy · June 10, 2025, 6:18am

We are discussing this issue internally and will reply to this topic if it is fixed.

tognirob · June 10, 2025, 11:59am

Thank you for the support.

Meanwhile, I was able to retrieve the error prints of the plugin when this issue occurred:

gstnvtracker: Got -1 mapping nvbufsurface
gstnvtracker: Failed to initialize ConvBufManager
gstnvtracker: Failed to initilaize surface transform buffer pool.

So I can confirm that the failure point is in the call to NvBufSurfaceMap() at convbufmanager.cpp:102, which returns -1.

If this error condition is handled correctly, will it be safe for the application to try to restart the pipeline?

junshengy · June 13, 2025, 3:00am

This is usually due to low memory. If you restart the pipeline it may continue to fail.
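
Because a restart can keep failing while memory stays low, any retry in the application should be bounded. The sketch below assumes the failure surfaces as an error rather than a hang; startWithRetry and the tryStart callback are hypothetical names, and the attempt count and delays are assumptions.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: calls `tryStart` up to `maxAttempts` times, doubling
// the pause between attempts. Returns the 1-based number of the attempt that
// succeeded, or 0 if every attempt failed.
inline int startWithRetry(const std::function<bool()>& tryStart,
                          int maxAttempts,
                          std::chrono::milliseconds initialDelay) {
  std::chrono::milliseconds delay = initialDelay;
  for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
    if (tryStart()) {
      return attempt;
    }
    if (attempt < maxAttempts) {
      std::this_thread::sleep_for(delay);  // give the system time to free memory
      delay *= 2;
    }
  }
  return 0;
}
```

Here tryStart would be expected to tear the pipeline down completely (for example by setting it to GST_STATE_NULL) before each new attempt, and a return value of 0 means the application should give up and report the low-memory condition.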
