Tracker element adds latency by waiting for the next frame

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU): Jetson AGX Orin Developer board
• DeepStream Version: 6.0.1
• JetPack Version: 5.0.2
• TensorRT Version: 8.4.1
• Issue Type: question/bug

We have a DeepStream Python application similar to the deepstream-python-apps rtsp-in-rtsp-out sample. Its pipeline handles a variable number of UDP multicast input streams. The source bins attach to the streammux, which links to the pgie (a YOLOv8 nano model via the DeepStream-Yolo implementation, with INT8 calibration), and the pgie links to an NvDCF tracker. The tracker is linked either to a fakesink or to a setup similar to the rtsp-in-rtsp-out app that restreams the tiled OSD via multicast and RTSP.

We achieve good performance on the Jetson AGX Orin, with a throughput of 25 fps for up to 14 streams. To measure the latency of our system we use Nsight Systems. With 8 streams, the latency from the start of the streammux element to the end of the tracker is roughly 90 ms, as shown in the figure below.
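As a cross-check of the Nsight numbers, per-buffer latency can also be measured inside the application with pad probes. Below is a minimal sketch; the `LatencyMeter` class and its method names are our own invention, not a DeepStream API. The idea is to record a wall-clock entry time at the streammux src pad, keyed by buffer PTS, and read the delta at the tracker src pad.

```python
import time


class LatencyMeter:
    """Measure per-buffer latency between two pipeline points, keyed by PTS.

    Hypothetical helper (not part of DeepStream): call mark_entry() from a
    pad probe on the upstream element and mark_exit() from a probe on the
    downstream element.
    """

    def __init__(self):
        self._entries = {}  # pts -> wall-clock entry time (seconds)

    def mark_entry(self, pts, now=None):
        # Record when this buffer passed the upstream probe point.
        self._entries[pts] = time.monotonic() if now is None else now

    def mark_exit(self, pts, now=None):
        # Return the elapsed time in milliseconds, or None if the buffer
        # was never seen at the entry point.
        start = self._entries.pop(pts, None)
        if start is None:
            return None
        end = time.monotonic() if now is None else now
        return (end - start) * 1000.0


# In the real pipeline these would be driven by Gst pad probes, roughly:
#   streammux.get_static_pad("src").add_probe(Gst.PadProbeType.BUFFER, on_mux)
#   tracker.get_static_pad("src").add_probe(Gst.PadProbeType.BUFFER, on_trk)
# where on_mux calls meter.mark_entry(info.get_buffer().pts) and
# on_trk prints meter.mark_exit(info.get_buffer().pts).
```

This only measures host-side buffer traversal, so it complements (rather than replaces) the Nsight trace, which also shows where the time goes inside each element.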

In this performance profile we can see a few things:

• The streammux element starts processing batch 1993 (first green line)
• The pgie element performs inference for batch 1993
• The metadata for batch 1993 is attached
• The tracker_convert_buffer function is called for batch 1992 (!! not 1993)
• The tracker blocks for a while in an ioctl call; we suspect this is the copy of the batch data to video memory
• The tracker is executed for batch 1992
• This process repeats, and only then is the tracker executed for batch 1993 (last green line)

From this we conclude that the tracker waits until the metadata for the next batch has been computed before determining the IDs for the current batch. This introduces roughly 40 ms of latency in our pipeline.

Our question is whether this behavior can be avoided. The tracker documentation mentions past-frame data, but setting the enable-past-frame parameter to 0 in the tracker config does not solve our problem.
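For completeness, the relevant part of our tracker configuration looks roughly like the fragment below (deepstream-app-style [tracker] group; the library path, config filename, and width/height values are illustrative, not our exact setup):

```
[tracker]
tracker-width=640
tracker-height=384
ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
ll-config-file=config_tracker_NvDCF_perf.yml
enable-batch-process=1
# disabling past-frame data did not remove the one-frame delay
enable-past-frame=0
```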

Additionally, we were wondering if there is any other way to reduce the latency of our pipeline. We notice that a lot of time is spent in ioctl calls, both before the tracker element and before the inference element, and that the duration of these calls scales linearly with the batch size. Would it be possible to reduce this?

TL;DR: Our tracker waits one frame before computing IDs; can we turn this behavior off?


Can you upgrade to DeepStream 6.3? I don't think nvtracker delays by one frame. Can you check the nsys capture with only one stream?

Unfortunately, we are not able to upgrade to Deepstream 6.3.

I have checked the nsys report for 1 stream (with the batch size changed accordingly). The tracker still skips a frame. The overall latency is lower, but this is mostly due to faster memory copies and slightly faster inference.

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

Can you share your nsys report for 1 stream to us to have a check?

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.