Image pre-processing between PGIE and SGIE

Hello. I need to implement the same kind of pipeline as below, but with an affine transform applied to every box, each with its own affine transform matrix:

  1. Input
  2. PGIE
  3. Affine transform on every box
  4. SGIE

Can I implement such a pipeline without touching the gst-nvinfer code?
Thanks.


Would you mind explaining your requirement in more detail?

I have a classifier that requires aligned images, and a detector that returns landmarks along with bounding boxes, but in a separate network output. Based on these landmarks I need to calculate an affine transformation matrix and then align the detected region, so that the classifier gets an aligned image.
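
To compute the matrix I plan to fit the detected landmarks to a canonical template, roughly like this (an OpenCV-style sketch; the landmark variables and template points are only illustrative):

// Rough sketch: fit a 2x3 affine matrix mapping three detected landmarks
// (e.g. eyes and nose tip from the PGIE output) onto a fixed template.
#include <opencv2/imgproc.hpp>

cv::Point2f detected[3]  = { leftEye, rightEye, noseTip };                 // from the PGIE landmarks
cv::Point2f canonical[3] = { {38.f, 52.f}, {74.f, 52.f}, {56.f, 72.f} };   // illustrative template
cv::Mat M = cv::getAffineTransform (detected, canonical);                  // 2x3 CV_64F matrix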

In the gst-nvinfer plugin sources I see that the input buffers for the classifier are generated right before inference, so I can't align each of them without changing gst-nvinfer… or can I?

Today I tried to align each detected region in the buffer passed from the streammux, but that buffer is also used for display, so this solution doesn't suit me.
There is another problem: it's difficult to correctly match a bounding box taken from the object metadata with the landmarks taken from the raw tensor.

Please point me in the right direction on how to implement such a pipeline.

Gentle reminder.
Maybe I can split the buffers passed to the SGIE and nvosd? Then the transformed image would go to the SGIE and the original to nvosd. Can you advise how to do that?

Checking internally; will give you a response ASAP.

We suggest installing a probe on the PGIE src pad, doing the affine transform on every box there if possible, and changing/updating the metadata; then the SGIE will work on this data. A rough sketch follows below.
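
A minimal sketch of attaching such a probe (an illustration only; "pgie" is assumed to be the nvinfer element created elsewhere, and the warp itself is left as a TODO):

/* Hypothetical sketch: buffer probe on the PGIE src pad. */
static GstPadProbeReturn
pgie_src_pad_probe (GstPad *pad, GstPadProbeInfo *info, gpointer user_data)
{
  GstBuffer *buf = GST_PAD_PROBE_INFO_BUFFER (info);
  /* TODO: map the NvBufSurface, read the landmarks from the tensor meta,
     and warp each detected region here. */
  return GST_PAD_PROBE_OK;
}

GstPad *src_pad = gst_element_get_static_pad (pgie, "src");
gst_pad_add_probe (src_pad, GST_PAD_PROBE_TYPE_BUFFER,
    pgie_src_pad_probe, NULL, NULL);
gst_object_unref (src_pad);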

Maybe I can split the buffers passed to the SGIE and nvosd? Then the transformed image would go to the SGIE and the original to nvosd. Can you advise how to do that?
You can add a tee after the PGIE and then connect both the SGIE and nvosd branches to the tee's src pads, for example as sketched below.
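
A rough gst-launch-style sketch of that split (element names and sinks are illustrative, assuming DeepStream 4.x; the config file names are placeholders):

... ! nvinfer config-file-path=pgie_config.txt ! tee name=t \
    t. ! queue ! nvinfer config-file-path=sgie_config.txt ! fakesink \
    t. ! queue ! nvvideoconvert ! nvdsosd ! nveglglessink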

Thanks for your answer.

I have checked everything again and noticed that the affine transform function (nppiWarpAffine) I want to use can't work with NV12 buffers.

As mentioned in this thread, I can do an NV12 → RGBA → NV12 conversion to perform the affine transform on an RGBA buffer.

But after that, the NV12 → RGBA (RGB/BGR) conversion will be applied again inside the gst-nvinfer plugin. That looks redundant; maybe there is another way to transform the buffer?
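
For reference, the RGBA warp I have in mind looks roughly like this (a sketch; the pointers, pitches, sizes, and coefficient matrix are assumed to come from the surrounding code):

// Rough sketch: warp one 8-bit, 4-channel RGBA region with NPP.
#include <nppi_geometry_transforms.h>

NppiSize srcSize = { srcWidth, srcHeight };
NppiRect srcROI  = { 0, 0, srcWidth, srcHeight };
NppiRect dstROI  = { 0, 0, dstWidth, dstHeight };
double   coeffs[2][3] = { {1.0, 0.0, 0.0},
                          {0.0, 1.0, 0.0} };   // placeholder; real values come from the landmarks

NppStatus st = nppiWarpAffine_8u_C4R (pSrc, srcSize, srcStep, srcROI,
                                      pDst, dstStep, dstROI,
                                      coeffs, NPPI_INTER_LINEAR);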

Some sort of pre-process buffer callback in gst-nvinfer would be a very good solution that could solve my issue and similar ones. Do you have plans to implement such a feature?

Could you do the WarpAffine processing on the input image (RGB planar) that will be fed into nvinfer? Then you wouldn't need to do the WarpAffine between the PGIE and SGIE.
If that works for you, you could use one of the APIs below.

Three-channel planar 32-bit floating-point affine warp:

NppStatus
nppiWarpAffine_32f_P3R(const Npp32f * pSrc[3], NppiSize oSrcSize, int nSrcStep, NppiRect oSrcROI,
                       Npp32f * pDst[3], int nDstStep, NppiRect oDstROI,
                       const double aCoeffs[2][3], int eInterpolation);

Three-channel packed 32-bit floating-point affine warp:

NppStatus
nppiWarpAffine_32f_C3R(const Npp32f * pSrc, NppiSize oSrcSize, int nSrcStep, NppiRect oSrcROI,
                       Npp32f * pDst, int nDstStep, NppiRect oDstROI,
                       const double aCoeffs[2][3], int eInterpolation);
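
For the planar case (which matches the RGB-planar tensor nvinfer consumes), a call could look roughly like this (a sketch; the plane pointers, step, and dimensions are assumptions standing in for whatever the surrounding code provides):

// Rough sketch: warp a planar float RGB image (three separate planes).
const Npp32f *srcPlanes[3] = { srcR, srcG, srcB };   // assumed device pointers
Npp32f       *dstPlanes[3] = { dstR, dstG, dstB };
NppiSize srcSize = { width, height };
NppiRect roi     = { 0, 0, width, height };
double   coeffs[2][3] = { {1.0, 0.0, 0.0},
                          {0.0, 1.0, 0.0} };   // placeholder; computed from the landmarks

NppStatus st = nppiWarpAffine_32f_P3R (srcPlanes, srcSize, planeStep, roi,
                                       dstPlanes, planeStep, roi,
                                       coeffs, NPPI_INTER_LINEAR);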

Could you do the WarpAffine processing on the input image (RGB planar) that will be fed into nvinfer? Then you wouldn't need to do the WarpAffine between the PGIE and SGIE.

I hope I understand you correctly.
The affine transformation coefficients are calculated from the PGIE output (the additional output tensor with landmarks), so I think it's not possible to perform the warp affine transform before nvinfer.

Sorry! Please ignore my comment; it would not work.

I think the solution bcao suggested in comment #6 should work.

In PGIE/nvinfer, the current data processing flow is: NV12 → (conversion / CUDA kernel) → RGB planar → TensorRT inference → parser.

So you could just add the WarpAffine between "RGB planar" and "TensorRT inference"?

Could you give some advice on how to implement that in gst-nvinfer?

In the gst-nvinfer sources I see the get_converted_buffer function, which prepares the buffer for transformation (crops, scales, and converts to RGB planar), and convert_batch_and_push_to_input_thread, which transforms the buffer and pushes it to inference.

I assume the WarpAffine should be called in convert_batch_and_push_to_input_thread, somewhere before the process_lock mutex is taken. Is that right?

Hi

In file: /opt/nvidia/deepstream/deepstream-4.0/sources/libs/nvdsinfer/nvdsinfer_context_impl.cpp

The code below calls the TensorRT enqueue() API (see TensorRT: nvinfer1::IExecutionContext Class Reference):

/* Queue the bound buffers for inferencing. */
if (!m_InferExecutionContext->enqueue(enqueueBatchSize, bindingBuffers,
        m_InferStream, &m_InputConsumedEvent))
{
    printError("Failed to enqueue inference batch");
    status = NVDSINFER_TENSORRT_ERROR;
    goto error;
}

As described in the API reference, bindingBuffers is an array of pointers to the input and output buffers of the network, so you can get the input CUDA buffer pointer from this array and then apply the WarpAffine to it. Would that work for your case?
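
For illustration, a hook just before enqueue() could look roughly like this (the binding index, the planar float RGB layout, and netWidth/netHeight are assumptions; the real values come from the engine bindings and m_NetworkInfo):

/* Hypothetical sketch: warp the bound network input before enqueue(). */
Npp32f *inputBuf = (Npp32f *) bindingBuffers[0];   /* assumes binding 0 is the input */
nppSetStream (m_InferStream);                      /* run NPP on the inference stream */

const Npp32f *srcPlanes[3] = { inputBuf,
                               inputBuf + netWidth * netHeight,
                               inputBuf + 2 * netWidth * netHeight };
/* dstPlanes would point into a scratch device buffer of the same size; then
   call nppiWarpAffine_32f_P3R() as in the earlier sketch and copy the warped
   planes back (or bind the scratch buffer as the input) before enqueue(). */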

Thanks.

How do I access the tensor metadata where the landmarks are stored from nvdsinfer?
And where can I check that this is the PGIE model?

Hi Alexdefsen,
Sorry for the delay!
You can use the "initParams.uniqueID" field in nvdsinfer to identify the GIE; the uniqueID value comes from the "gie-unique-id" property in the GIE config file.
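
For example, with a PGIE config that sets (the file contents and value are illustrative):

[property]
gie-unique-id=1

you could check inside nvdsinfer with something like:

if (initParams.uniqueID == 1) {
    /* this context belongs to the PGIE */
}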

How do I access the tensor metadata where the landmarks are stored from nvdsinfer?
Sorry! I don't understand which tensor metadata you mean.

Thanks!

I mean the NVDSINFER_TENSOR_OUTPUT_META from the NvDsUserMeta passed with the buffer (GstBuffer). I need to access the additional PGIE output where the landmarks are stored, but I can't find any mention of GstBuffer in nvdsinfer_context_impl.cpp.
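
Downstream of nvinfer I can read it roughly like this (a sketch following the SDK's tensor-meta sample; it requires the "output-tensor-meta" property enabled on nvinfer), but inside nvdsinfer there is no such buffer:

// Rough sketch: read the PGIE's raw tensor output from a pad-probe buffer.
NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);
for (NvDsMetaList *l_frame = batch_meta->frame_meta_list; l_frame; l_frame = l_frame->next) {
  NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) l_frame->data;
  for (NvDsMetaList *l_user = frame_meta->frame_user_meta_list; l_user; l_user = l_user->next) {
    NvDsUserMeta *user_meta = (NvDsUserMeta *) l_user->data;
    if (user_meta->base_meta.meta_type != NVDSINFER_TENSOR_OUTPUT_META)
      continue;
    NvDsInferTensorMeta *tensor_meta = (NvDsInferTensorMeta *) user_meta->user_meta_data;
    // tensor_meta->out_buf_ptrs_host[i] holds the i-th output layer,
    // e.g. the landmark tensor, matched to boxes by detection order.
  }
}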

Hi Defsen,
Really sorry about the delay!

Please refer to the NVIDIA DeepStream SDK API Reference: NvDsInferTensorMeta Struct Reference. Could you append your additional output in priv_data?

Thanks!

Sorry, I'm a newbie.
How can I install a probe on the PGIE src pad? I'm deploying face recognition, and I need some facial landmark points to build a WarpAffine on the detected face image before feeding it into the face embedding model. Another question: can I do it with facial landmarks?
(Sorry if my English is not good; I come from a country where English is not the main language.)


Hi bcao. I have the same question as in #6: I need to transpose the image before passing it to the SGIE. How can I do that?