Affine transformation/warp on bbox

• Hardware Platform: GPU
• DeepStream Version: 5.0.0
• TensorRT Version: 7.0.0.11
• NVIDIA GPU Driver Version (valid for GPU only): 460.32.03

Hi, I want to get something done that was discussed quite a few times in this forum but was never really answered, or the provided answer was for an outdated DeepStream version.

My pipeline looks something like this: pgie (detector, outputs bboxes) → sgie1 (network-type=100, output-tensor-meta=1; outputs landmarks, which I parse and attach to the respective bbox meta) → sgie2
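For reference, sgie1 is configured roughly like this (a minimal sketch; the engine file name and the gie IDs are placeholders):

```
# sgie1 config (minimal sketch; file name and IDs are placeholders)
[property]
gpu-id=0
# secondary mode: run on objects, not full frames
process-mode=2
# operate on the pgie's detections
operate-on-gie-id=1
# network-type=100 ("other"): nvinfer does no built-in post-processing
network-type=100
# attach the raw output tensors so a pad probe can parse the landmarks
output-tensor-meta=1
model-engine-file=landmarks.engine
```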

sgie2 expects a bbox that is aligned with respect to the landmarks from sgie1’s output. So the question is: how can that be realized?

I found the following topics covering this:

https://forums.developer.nvidia.com/t/image-pre-processing-between-pgie-and-sgie/111350/6

Here it is suggested to do an affine transformation on every bbox. I have no idea how this could be realized within the DeepStream API: a bbox is defined by its upper-left corner plus width and height, i.e. an axis-aligned rectangle, and applying an affine transformation to a rectangle does not necessarily yield an axis-aligned rectangle.

Also, the piece of code that is suggested to be changed in /opt/nvidia/deepstream/deepstream-4.0/sources/libs/nvdsinfer/nvdsinfer_context_impl.cpp no longer exists in my DeepStream version.

https://forums.developer.nvidia.com/t/need-advice-image-pre-processing-between-pgie-and-sgie-custom-sgie-output-feature-vector/107444/4

This suggests following the dsexample plugin and using the NPP API. However, I didn’t manage to find any example of how to use the NPP API. Additionally, it would be nice to see an example of how to connect the DeepStream and NPP APIs. Maybe someone can point me to a resource.

I’d appreciate any comments.

You can update/change the metadata (left corner as well as width and height) by installing a probe on sgie1’s src pad in your case; sgie2 will then run inference on your updated bbox. You need to configure sgie2 to do inference based on sgie1 through operate-on-gie-id. Refer to the Gst-nvinfer — DeepStream 6.3 Release documentation.
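A minimal sketch of such a probe, assuming the aligned box has already been computed from the parsed landmarks (compute_aligned_box() is a hypothetical helper; the metadata iteration follows the standard NvDsBatchMeta layout):

```cpp
#include <gst/gst.h>
#include "gstnvdsmeta.h"

/* Hypothetical helper: derive an axis-aligned crop box from the landmarks
 * attached to this object by the sgie1 tensor-meta parser. */
static NvOSD_RectParams compute_aligned_box (NvDsObjectMeta *obj_meta);

/* Pad probe on sgie1's src pad: rewrite each object's bbox before sgie2
 * sees it. */
static GstPadProbeReturn
sgie1_src_pad_probe (GstPad *pad, GstPadProbeInfo *info, gpointer user_data)
{
  GstBuffer *buf = GST_PAD_PROBE_INFO_BUFFER (info);
  NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);
  if (!batch_meta)
    return GST_PAD_PROBE_OK;

  for (NvDsMetaList *l_frame = batch_meta->frame_meta_list; l_frame;
       l_frame = l_frame->next) {
    NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) l_frame->data;
    for (NvDsMetaList *l_obj = frame_meta->obj_meta_list; l_obj;
         l_obj = l_obj->next) {
      NvDsObjectMeta *obj_meta = (NvDsObjectMeta *) l_obj->data;
      NvOSD_RectParams aligned = compute_aligned_box (obj_meta);
      obj_meta->rect_params.left   = aligned.left;
      obj_meta->rect_params.top    = aligned.top;
      obj_meta->rect_params.width  = aligned.width;
      obj_meta->rect_params.height = aligned.height;
    }
  }
  return GST_PAD_PROBE_OK;
}
```

Attach it with gst_pad_add_probe (sgie1_srcpad, GST_PAD_PROBE_TYPE_BUFFER, sgie1_src_pad_probe, NULL, NULL). Note that rect_params can only describe an axis-aligned rectangle, so a probe like this can move and resize the crop but cannot express a rotation.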

Let me show you what I want to achieve:

Say I have this input picture of the chancellor:

Now I run a detection model on it which returns this bbox:

Now I compute an affine transformation (just a rotation in this case) which maps the box in the above picture to this one:

Unfortunately, the above picture cannot be mapped to the DS API via left corner, width, and height. However, I want to crop this box like this (excuse my crappy freehand Paint cutting):

and transform it into something I can feed into nvinfer:

I don’t see how this could be accomplished using left corner, width, and height. You might be able to do it in OpenCV, but I guess the NPP API is also capable of it. Any idea how? I would like to stay in the NVIDIA stack and ideally handle all of this, or as much as possible, on the GPU. Additionally, I would probably need to change not only the rect_params fields in the probe callbacks but also some buffers, I guess?

Any idea on how to solve this?

The source picture is taken from wikipedia: https://de.wikipedia.org/wiki/Datei:Angela_Merkel_-_World_Economic_Forum_Annual_Meeting_2011.jpg

Right, you can refer to https://docs.nvidia.com/cuda/npp/group__affine__transform.html
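For example, a pure rotation could be done with nppiWarpAffine_8u_C3R (a minimal sketch, assuming packed 8-bit RGB buffers already in device memory; the sizes, pitches, and rotation-about-center matrix are illustrative):

```cpp
#include <npp.h>
#include <cmath>

/* Warp an RGB (interleaved, 8-bit) device image with a 2x3 affine matrix.
 * Here the matrix rotates by `angle` radians around (cx, cy); pSrc/pDst
 * are device pointers and the steps are row pitches in bytes. */
NppStatus rotate_crop (const Npp8u *pSrc, int srcW, int srcH, int srcStep,
                       Npp8u *pDst, int dstW, int dstH, int dstStep,
                       double angle, double cx, double cy)
{
  const double c = std::cos (angle), s = std::sin (angle);
  /* Rotation about (cx, cy): x' = c*(x-cx) - s*(y-cy) + cx, and
   * y' = s*(x-cx) + c*(y-cy) + cy, written as a 2x3 matrix. */
  const double coeffs[2][3] = {
    { c, -s, cx - c * cx + s * cy },
    { s,  c, cy - s * cx - c * cy }
  };

  NppiSize srcSize = { srcW, srcH };
  NppiRect srcRoi  = { 0, 0, srcW, srcH };
  NppiRect dstRoi  = { 0, 0, dstW, dstH };

  return nppiWarpAffine_8u_C3R (pSrc, srcSize, srcStep, srcRoi,
                                pDst, dstStep, dstRoi,
                                coeffs, NPPI_INTER_LINEAR);
}
```

If it is more natural to specify the inverse (destination-to-source) mapping, the nppiWarpAffineBack_* variants take the matrix in that direction instead.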

Yeah, in gstnvinfer.cpp, instead of calling NvBufSurfTransform inside convert_batch_and_push_to_input_thread, you need to call the affine transformation.

Could you elaborate on that?

NvBufSurfTransform and convert_batch_and_push_to_input_thread are two functions in gstnvinfer.cpp. I mean you need to replace the current NvBufSurfTransform call with the “affine transformation”; you need to implement it yourself.
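Very roughly, the change would look like this (a sketch only, not working code: the variable and member names follow DeepStream 5.0’s gstnvinfer.cpp but may differ in your version, the surfaces are assumed to be packed RGB in device memory, and landmarks_to_affine() is a hypothetical helper that fills the 2x3 matrix for one object from its attached landmark metadata):

```cpp
/* Sketch: replacement for the NvBufSurfTransform () call inside
 * convert_batch_and_push_to_input_thread in gstnvinfer.cpp. Both the
 * source batch (nvinfer->tmp_surf) and the destination batch
 * (mem->surf) already live on the GPU, so their dataPtr members can be
 * handed straight to NPP and everything stays on the device. */
for (guint i = 0; i < nvinfer->tmp_surf.numFilled; i++) {
  NvBufSurfaceParams *src = &nvinfer->tmp_surf.surfaceList[i];
  NvBufSurfaceParams *dst = &mem->surf->surfaceList[i];

  /* landmarks_to_affine () is hypothetical: build the per-object 2x3
   * coefficient matrix from the landmarks attached by sgie1. */
  double coeffs[2][3];
  landmarks_to_affine (i, coeffs);

  NppiSize src_size = { (int) src->width, (int) src->height };
  NppiRect src_roi  = { 0, 0, (int) src->width, (int) src->height };
  NppiRect dst_roi  = { 0, 0, (int) dst->width, (int) dst->height };

  NppStatus status = nppiWarpAffine_8u_C3R (
      (const Npp8u *) src->dataPtr, src_size, (int) src->pitch, src_roi,
      (Npp8u *) dst->dataPtr, (int) dst->pitch, dst_roi,
      coeffs, NPPI_INTER_LINEAR);
  if (status != NPP_SUCCESS)
    return FALSE;
}
```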