ROI for source video in deepstream-app

I am trying to crop (not resize) the source video before inference, since I don't need detections beyond a certain region of interest (ROI). That might also make the app faster. But I am not sure how to do this in deepstream-app.
In the test apps this is possible by modifying the source element, but I cannot find where those elements are created in the deepstream_app_main.c file. I also cannot find any properties for [source] in the config file that would make this possible. How do I proceed?

Hi, you can refer to this:

Hi, I also have a similar issue.

Hi iyaqoobi, I was able to successfully customise the dsexample plugin to crop and save images. My question is about cropping the input source video itself before feeding it to the pgie. Suppose my video dimensions are 1920x1080; I want the pgie to work on only a 900x800 region and not even process the rest.

ROI-based inference is not supported in nvinfer. However, you can run inference on the entire frame and then use the metadata to filter out the detected objects that fall outside the ROI.

Isn't this possible in the deepstream test apps?

Hi Chris, yes, filtering metadata is possible. But I assumed cropping the video frame would make the inference significantly faster. Isn't this a potential enhancement? I am trying to modify the deepstream_app_main.c file to make it happen. Correct me if I am wrong, but I think inserting a crop element (GstVideoCrop) right after the source will crop the video at the first element of the pipeline itself, thereby not affecting the working of the rest of the pipeline.
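For reference, the element I mean (videocrop, type GstVideoCrop, from gst-plugins-good) can be tried outside deepstream-app with a plain GStreamer pipeline; it takes pixel counts to trim from each edge. A hypothetical sketch (the path is a placeholder; trimming 280 rows from the top and 1020 columns from the right turns 1920x1080 into the 900x800 region discussed above):

```
gst-launch-1.0 filesrc location=/path/to/video.mp4 ! decodebin \
  ! videocrop top=280 bottom=0 left=0 right=1020 \
  ! videoconvert ! autovideosink
```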

Isn't this a potential enhancement?
Yes. We will add ROI pre-processing in the next release.

I think you can modify gstnvinfer.cpp ->

gst_nvinfer_process_full_frame(), lines 1284 ~ 1287:

    rect_params.left = 0;
    rect_params.top = 0;
    rect_params.width = in_surf->surfaceList[i].width;
    rect_params.height = in_surf->surfaceList[i].height;

    /* Scale and convert the buffer. */
    if (get_converted_buffer (nvinfer, in_surf, in_surf->surfaceList + i,
            &rect_params, memory->surf, memory->surf->surfaceList + idx,
            scale_ratio_x, scale_ratio_y,
            memory->frame_memory_ptrs[idx]) != GST_FLOW_OK) {
      GST_ELEMENT_ERROR (nvinfer, STREAM, FAILED, ("Buffer conversion failed"),
          (NULL));
      return GST_FLOW_ERROR;
    }

I tried modifying rect_params.height and width. But the problem is that get_converted_buffer() seems to measure the height from the top (the origin is at the top left). If that is the case, then no matter how much I change rect_params.height, I won't be able to crop out the top part. So I am looking to modify get_converted_buffer(). Any suggestions?

Extra cycles won't be required, since inference always happens on the image after it is scaled to the network resolution.

Is there currently any way to crop the video source before the primary inference in the Python examples? Or can some plugin be added to the GStreamer pipeline to crop the video stream?

something like this:

I understand that I can define a region of interest and, using the metadata, discard the bounding boxes that fall outside that area. I also understand that this takes no extra processing because the frame is resized to the network resolution anyway. But my model must recognize small objects (banknotes (paper money), cards, and casino chips), and I am sure they will only appear in a certain area of the frame. I therefore trained it on cropped images to improve accuracy, obtaining very good results.

Hi camilosilva91.

Please open a new topic for your issue. Thanks.

Hi All,
Just wondering if there is any solution to the above problem.
I posted a question at the link mentioned above but got no answer.

link to the question

In short: can we use nvvideoconvert in the config file to crop before inference?
Your help will be much appreciated.
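I can't confirm whether deepstream-app's config parser exposes it, but as a standalone element Gst-nvvideoconvert does have a src-crop property (a "left:top:width:height" string, in pixels), so a pipeline-level crop before nvinfer can at least be sketched like this (the URI is a placeholder; this is an untested illustration, not a verified deepstream-app recipe):

```
gst-launch-1.0 uridecodebin uri=file:///path/to/video.mp4 \
  ! nvvideoconvert src-crop="0:280:1920:800" \
  ! fakesink
```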