I need to track items as they are inserted into a bin. To do that, I have a tiny YOLOv3 detector that identifies the end effector. From that, I trace the contours around the item it is holding and fit a bounding box around the item. I want to run inference on the item with DeepStream, not on the end effector. Since I already have to provide a custom function to return the bounding boxes from the tiny YOLOv3 model (`parse-bbox-func-name`), it seems a natural fit to add my custom logic there. To do that, I need the RGB(A) frames for inference. Is there a way to get them from within that function? If not, how can I build such a pipeline with DeepStream?
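For reference, here is a rough sketch of the kind of custom parser I mean, following the `NvDsInferParseCustomFunc` prototype from `nvdsinfer_custom_impl.h`. The function name and the decode step are placeholders, not my actual code:

```cpp
#include <vector>
#include "nvdsinfer_custom_impl.h"

/* Custom parser referenced by parse-bbox-func-name in the nvinfer config.
 * The YOLO decode itself is elided; this only shows the entry point. */
extern "C" bool NvDsInferParseCustomTinyYoloV3(
    std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
    NvDsInferNetworkInfo const &networkInfo,
    NvDsInferParseDetectionParams const &detectionParams,
    std::vector<NvDsInferObjectDetectionInfo> &objectList)
{
    (void)networkInfo;
    (void)detectionParams;

    for (const NvDsInferLayerInfo &layer : outputLayersInfo) {
        /* layer.buffer points at this layer's data. */
        const float *data = static_cast<const float *>(layer.buffer);
        (void)data;
        /* ... decode YOLO grid predictions into boxes here,
         *     then push them into objectList ... */
    }
    return true;
}

/* Verifies that the signature matches what nvinfer expects. */
CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseCustomTinyYoloV3);
```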
The function is given a `std::vector` of `NvDsInferLayerInfo` objects. I'm assuming each of them represents one image frame, and the documentation says the struct contains a `void *` field named `buffer` that is a "Pointer to the buffer for the layer data". Does that mean they hold references to their associated image frames? If so, how may I get them?
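To see what those objects actually carry, I've been printing their fields with something like the sketch below. Field names are taken from `nvdsinfer.h` in DeepStream 5.x; older releases expose `dims` rather than `inferDims`, so this may need adjusting for other versions:

```cpp
#include <cstdio>
#include <vector>
#include "nvdsinfer.h"

/* Dump what each NvDsInferLayerInfo carries: the layer name, its TensorRT
 * binding index, the element count, and the raw buffer pointer. */
static void dumpLayers(const std::vector<NvDsInferLayerInfo> &layers)
{
    for (const NvDsInferLayerInfo &l : layers) {
        std::printf("layer '%s': binding=%d elements=%u buffer=%p\n",
                    l.layerName, l.bindingIndex,
                    l.inferDims.numElements, l.buffer);
    }
}
```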