Crop the buffer output from nvstreammux, then feed the cropped buffers into nvinfer for inference, to detect small person objects

Hi team,

I need to run inference on traffic videos, which often contain extremely small person objects. To detect vehicle and person objects at such small sizes, I have tried the two methods below:
1: The first method is to add an nvvideoconvert plugin before streammux (right after the source bin) in the pipeline:
uridecodebin->nvvideoconvert->nvstreammux->nvinfer->nvtracker->nvmultistreamtiler->nvvideoconvert->nvdsosd->nveglglessink
On the OSD display the whole frame is cropped and inference runs only on the cropped frame, which is good.
However, what I really want is for the OSD display to show a combination of the inference results from both the original buffer and the cropped buffer. May I know how to implement this?
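For reference, the first method can be sketched as a gst-launch pipeline (a sketch only; it assumes nvvideoconvert's "src-crop" property with the "left:top:width:height" format, and the file path, resolutions, and config file name are placeholders):

```shell
# Crop a central 960x540 region out of a 1920x1080 source before
# streammux, so small objects occupy more pixels at inference time.
gst-launch-1.0 uridecodebin uri=file:///path/to/traffic.mp4 ! \
  nvvideoconvert src-crop="480:270:960:540" ! mux.sink_0 \
  nvstreammux name=mux batch-size=1 width=960 height=540 ! \
  nvinfer config-file-path=config_infer_primary.txt ! \
  nvvideoconvert ! nvdsosd ! nveglglessink
```

Because the crop happens before streammux, everything downstream (nvinfer, OSD) only ever sees the cropped frame, which matches the behaviour described above.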

2: The second method is to do an NvBufSurfTransform in a src-pad probe on streammux, cropping the buffer output from streammux. The transform I perform is similar to the implementation in gst_dsexample_transform_ip() and get_converted_mat() in dsexample. However, I found no difference in the inference results displayed via the OSD; nothing seems to have been cropped internally, and the dst_surface does not appear to be passed downstream.
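The likely cause of the second method's symptom: NvBufSurfTransform writes into dst_surface, which is a separate allocation, so downstream elements keep reading the untouched in-place buffer. A minimal sketch of a probe that copies the result back (assuming the DeepStream headers nvbufsurface.h / nvbufsurftransform.h; dst_surface is assumed pre-allocated elsewhere with NvBufSurfaceCreate(), and names other than the SDK's are illustrative):

```c
/* Hypothetical src-pad probe on nvstreammux. dst_surface is an
 * NvBufSurface pre-allocated once elsewhere (not shown). */
static GstPadProbeReturn
streammux_src_probe (GstPad *pad, GstPadProbeInfo *info, gpointer u_data)
{
  GstBuffer *buf = GST_PAD_PROBE_INFO_BUFFER (info);
  GstMapInfo map;
  if (!gst_buffer_map (buf, &map, GST_MAP_READWRITE))
    return GST_PAD_PROBE_OK;

  NvBufSurface *surface = (NvBufSurface *) map.data;

  /* Crop region, NvBufSurfTransformRect order: top, left, width, height */
  NvBufSurfTransformRect src_rect = { 0, 0, 960, 540 };
  NvBufSurfTransformRect dst_rect = { 0, 0,
      surface->surfaceList[0].width, surface->surfaceList[0].height };
  NvBufSurfTransformParams params = {
    .src_rect = &src_rect,
    .dst_rect = &dst_rect,
    .transform_flag = NVBUFSURF_TRANSFORM_CROP_SRC | NVBUFSURF_TRANSFORM_CROP_DST,
    .transform_filter = NvBufSurfTransformInter_Default,
  };

  /* 1) Crop + scale into the scratch surface... */
  NvBufSurfTransform (surface, dst_surface, &params);
  /* 2) ...then copy the result back into the in-place buffer;
   * without this step nvinfer never sees the crop, since the
   * components after streammux read the original buffer. */
  NvBufSurfaceCopy (dst_surface, surface);

  gst_buffer_unmap (buf, &map);
  return GST_PAD_PROBE_OK;
}
```

This is only a sketch of the missing copy-back step, not a drop-in fix; error checking and the dst_surface allocation/teardown are omitted.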

Would you please give me some advice on how to move on?

Thanks in advance.

The components after streammux get input buffers from streammux.

SGIE supports crop pre-processing, but PGIE does not. We will try to support "crop" for PGIE. The source code is open, so you can also try to modify it.
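To illustrate the SGIE-side crop the reply refers to: when nvinfer runs as a secondary GIE with process-mode=2, it crops each object detected by the primary GIE out of the frame before inference. A minimal config sketch (assuming the PGIE uses gie-unique-id=1; values are illustrative):

```
# Secondary nvinfer config fragment: operate on PGIE detections,
# i.e. per-object crop pre-processing (not available for PGIE).
[property]
process-mode=2
gie-unique-id=2
operate-on-gie-id=1
```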