Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU): GPU GTX 1080
• DeepStream Version: 6.0
• TensorRT Version: 8.0.1
• NVIDIA GPU Driver Version (valid for GPU only): 495.29.05
• Issue Type (questions, new requirements, bugs): Bug?
• How to reproduce the issue?
Build a custom DeepStream pipeline using the Python bindings for object detection, drawing bounding boxes from the tensor output meta.
Take a PyTorch YOLOv5 model and optimize it with TensorRT.
Configure the DeepStream pipeline to use the Triton Inference Server and have it load the TensorRT YOLOv5 model.
Run the inference pipeline.
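For context, the pipeline layout is roughly the sketch below. This is a minimal, hypothetical version (file names, stream-mux dimensions, and the nvinferserver config path are placeholders, not our actual code), assuming the DeepStream 6.0 Python/GStreamer bindings and a Triton model repository that already serves the TensorRT YOLOv5 engine:

```python
# Minimal sketch of the pipeline layout (hypothetical names/paths), assuming
# DeepStream 6.0 with the Python (GStreamer) bindings and a Triton model repo
# that serves the TensorRT YOLOv5 engine. The nvinferserver config is expected
# to enable output_tensor_meta so raw tensors are attached for Python parsing.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

pipeline = Gst.parse_launch(
    "filesrc location=sample_720p.mp4 ! decodebin ! "
    "m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! "
    "nvinferserver config-file-path=config_infer_triton_yolov5.txt ! "
    "nvvideoconvert ! nvdsosd ! nveglglessink"
)

# Bounding boxes are drawn from the raw tensor output meta in a probe attached
# to the nvinferserver (PGIE) src pad; see the callback sketch further below.
pipeline.set_state(Gst.State.PLAYING)
bus = pipeline.get_bus()
bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE,
                       Gst.MessageType.ERROR | Gst.MessageType.EOS)
pipeline.set_state(Gst.State.NULL)
```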
When there are target objects in the video, they are detected correctly. Later, when the objects leave the scene, the network still outputs boxes from past detections (ghost bounding boxes) until new valid objects come into the scene. The ghost boxes are not random; they repeat old detections.
The TensorRT-optimized model has been tested with the Triton server alone and works fine. The problem only appears when we introduce DeepStream.
I can’t share the model or the complete pipeline at this moment. I know that makes it difficult to reproduce.
Maybe I can work on a reduced version of the code and find a public model to test.
The model is based on:
Please bear with me; maybe you can help me identify the origin of this behavior.
If I had to guess, I would call it some issue with memory management between DeepStream, the Triton TensorRT backend, and the YOLO plugin.
We are still debugging this. It may be related to a TensorRT minor version change, or something like that, affecting the yolo.so lib file. So far we couldn't find where it comes from. Thanks.
Yes, we still have this issue, although it's difficult to tell whether the problem is inside DeepStream/Triton or external to them.
In DeepStream we have rewritten our object detection code for other reasons. It works fine for all models except this one. Our code lives in the PGIE callback of DeepStream that post-processes the tensor meta output in Python.
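To make that concrete, the callback is shaped roughly like the sketch below. It is a trimmed, hypothetical version modeled on the deepstream-ssd-parser Python sample (the YOLOv5 box decoding itself is omitted); it only shows where the raw output layers are read from the tensor output meta:

```python
# Sketch of a PGIE src-pad probe reading NVDSINFER_TENSOR_OUTPUT_META in Python,
# modeled on the deepstream-ssd-parser sample (not our actual post-processing code).
import ctypes
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
import pyds


def pgie_src_pad_buffer_probe(pad, info, u_data):
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        return Gst.PadProbeReturn.OK

    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        l_user = frame_meta.frame_user_meta_list
        while l_user is not None:
            user_meta = pyds.NvDsUserMeta.cast(l_user.data)
            if user_meta.base_meta.meta_type == pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META:
                tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)
                for i in range(tensor_meta.num_output_layers):
                    layer = pyds.get_nvds_LayerInfo(tensor_meta, i)
                    # Raw float pointer to this output layer's data.
                    ptr = ctypes.cast(pyds.get_ptr(layer.buffer),
                                      ctypes.POINTER(ctypes.c_float))
                    # ...decode YOLOv5 detections from ptr and attach display meta...
            l_user = l_user.next
        l_frame = l_frame.next
    return Gst.PadProbeReturn.OK
```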
YOLOv5 is optimized with TensorRT based on the YOLO links above: one for the model, the other for the yololayer.cu file. We are using TensorRT 8.0.1. The Triton Server loads the yolo.so plugin library.
It looks like somewhere in the code, when there are no active objects in the frame, old detections are still present in the output buffer and are parsed as valid.
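One way to get exactly this symptom would be a parser that reads a fixed number of output slots instead of the detection count that the tensorrtx-style yololayer writes as the first element of the output, since the rest of the buffer is not necessarily cleared between inferences. A minimal, count-aware parsing sketch, assuming that layout (one leading count float followed by fixed 6-float records: box, confidence, class id); the constants and function name here are assumptions, not our actual code:

```python
# Hedged sketch: count-aware parsing of a tensorrtx-style YOLO output layer.
# Assumes the layout [num_detections, det_0, det_1, ...] where each detection
# is 6 floats (cx, cy, w, h, confidence, class_id). Slots past num_detections
# may hold leftover data from earlier frames and must not be parsed.
import ctypes
import pyds

DET_SIZE = 6                   # floats per detection record (assumed)
MAX_OUTPUT_BBOX_COUNT = 1000   # plugin build-time constant (assumed)


def parse_yolo_layer(layer_buffer, conf_threshold=0.4):
    ptr = ctypes.cast(pyds.get_ptr(layer_buffer), ctypes.POINTER(ctypes.c_float))
    count = min(int(ptr[0]), MAX_OUTPUT_BBOX_COUNT)
    detections = []
    for i in range(count):
        base = 1 + i * DET_SIZE
        cx, cy, w, h = ptr[base], ptr[base + 1], ptr[base + 2], ptr[base + 3]
        conf = ptr[base + 4]
        class_id = int(ptr[base + 5])
        if conf >= conf_threshold:
            detections.append((cx, cy, w, h, conf, class_id))
    return detections
```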
After some time, I think the problem is related to some mismatch between TensorRT and the yololayer.cu source code, which is outside NVIDIA's domain, so that's why I closed the topic.
I am doing custom post-processing using Python.
I succeeded in doing that, but I have a memory leak issue. @xtianhb.glb, do you have the same issue too? I am currently working on re-implementing the model using nvinfer instead of nvinferserver.
It looks like we are using the same sources for YOLOv5 and building similar pipelines (Python, NvInferServer, etc.). That is great; we are on the same page.
In my case the model runs almost correctly; it doesn't crash or anything. The only problem I noticed is that old bounding boxes show up when there are no new objects. When new objects enter the image, the old ones disappear one by one.
I have the feeling that this is related to a circular memory buffer, or some memory leak, as you said.