Preprocessing of image in InferPreprocessor::transform

I am not getting correct results from my secondary GIE's output, and I suspect the preprocessing.

I have inference working well in standalone TensorRT. The preprocessing for the input image in TensorRT is as follows.

Dims NumPlateRecognition::loadJPEGFile(std::vector<std::string> fileName, int num)
{
    Dims4 inputDims{num, 24, 94, 3};
    Dims4 inputDims_1img{1, 24, 94, 3};
    const size_t vol = samplesCommon::volume(inputDims);
    const size_t vol_1img = samplesCommon::volume(inputDims_1img);
    unsigned char *data = new unsigned char[vol];
    for (int f = 0; f < num; f++) {
        // Load as BGR and convert to RGB; cv::Mat frees itself on scope exit.
        cv::Mat image = cv::imread(fileName[f], cv::IMREAD_COLOR);
        cv::Mat im_rgb;
        cv::cvtColor(image, im_rgb, cv::COLOR_BGR2RGB);
        memcpy(data + (f * vol_1img), im_rgb.ptr<unsigned char>(), vol_1img);
    }
    // Normalize the whole batch once (doing this inside the loop, as before,
    // re-read the not-yet-filled tail of `data` on every iteration).
    mInput.hostBuffer.resize(inputDims);
    float* hostDataBuffer = static_cast<float*>(mInput.hostBuffer.data());
    std::transform(data, data + vol, hostDataBuffer,
        [](uint8_t x) { return static_cast<float>(x) / 255.0f; });
    delete[] data;
    return inputDims;
}

The steps are: (1) resize the input image to 24 (h) x 94 (w); (2) normalize with [](uint8_t x) { return static_cast<float>(x) / 255.0; } (see the note on the resize call just below).
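
Note that cv::Size takes width first, then height, so the step (1) resize is easy to get backwards (loadJPEGFile above omits this step):

// Step (1): resize the crop to the 24(h) x 94(w) network input.
// cv::Size is (width, height), so this is Size(94, 24), not Size(24, 94).
cv::resize(image, image, cv::Size(94, 24));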

I traced the preprocessing part in DeepStream to this function:

NvDsInferStatus InferPreprocessor::transform(
    NvDsInferContextBatchInput& batchInput, void* devBuf,
    CudaStream& mainStream, CudaEvent* waitingEvent)
{
    /* ... body omitted ... */
}

My configuration file (dstest2_sgie1_config.txt) has:

infer-dims=24;94;3
net-scale-factor=0.0039215697906911373
model-color-format=0
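
For reference, that net-scale-factor is approximately 1/255, i.e. the same /255 normalization as my TensorRT code. As I understand it, nvinfer computes y = net-scale-factor * (x - mean), which with no mean file reduces to plain scaling. A quick standalone check of the constant:

// Standalone check: net-scale-factor is (approximately) 1/255, i.e. the
// same /255 normalization as the TensorRT lambda above.
// (Assumption: nvinfer applies y = net-scale-factor * (x - mean); with no
// mean file this is just x * net-scale-factor.)
#include <cstdio>

int main()
{
    const double netScaleFactor = 0.0039215697906911373;  // from the config
    std::printf("net-scale-factor : %.12f\n", netScaleFactor);
    std::printf("1.0 / 255.0      : %.12f\n", 1.0 / 255.0);
    std::printf("scale * 255      : %.7f\n", netScaleFactor * 255.0);  // ~1.0
    return 0;
}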

I am trying to make sure DeepStream does the same preprocessing as I implemented for TensorRT. My queries are as follows.

(1) Since this processing happens in the sgie, the detection outputs from the pgie need to be resized to 24x94x3.
The conversion function used in my case is convertFcn = NvDsInferConvert_C3ToP3Float;. Looking into it, I found only the API call site; the actual implementation is in the CUDA file nvdsinfer_conversion.cu.
So is resizing done for the sgie input?

(2) I added a print to the following loop inside nvdsinfer_context_impl.cpp (lines 405-412):

if (convertFcn) {
    std::cout << "convertFcn is " << 0 << " " << m_NetworkInfo.width << " "
              << m_NetworkInfo.height << " " << m_Scale << " "
              << batchInput.inputPitch << std::endl;
    /* Input needs to be pre-processed. */
    convertFcn(outPtr, (unsigned char*)batchInput.inputFrames[i],
        m_NetworkInfo.width, m_NetworkInfo.height, batchInput.inputPitch,
        m_Scale, m_MeanDataBuffer.get() ? m_MeanDataBuffer->ptr<float>() : nullptr,
        *m_PreProcessStream);
}

The printed network input size and scale are correct.
But the pitch is 512 for input size 94x24 (the sgie input size) and 7680 for 1920x1080 (the pgie input size).
How is the pitch calculated?
I understood the pitch to be width-based, so I expected 94*3 = 282.

(3) Since the sgie was trained in TensorFlow, its data format is NHWC. Does that matter?
I checked nvinfer1::PluginFormat; it does not have a kNHWC format.
So the plugin layer (the last layer of the sgie) is set to nvinfer1::PluginFormat::kLINEAR.
Is that OK?
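
For context, this is the layout difference I am asking about, written as index math (an illustration only, not code from either framework):

// NHWC (TensorFlow default): channels interleaved per pixel.
// NCHW (planar, which is what a "P3Float" conversion produces): one plane
// per channel. Offsets of element (n, c, h, w) in each layout:
inline size_t nhwcOffset(size_t n, size_t h, size_t w, size_t c,
                         size_t H, size_t W, size_t C)
{
    return ((n * H + h) * W + w) * C + c;
}

inline size_t nchwOffset(size_t n, size_t c, size_t h, size_t w,
                         size_t H, size_t W, size_t C)
{
    return ((n * C + c) * H + h) * W + w;
}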

(4) Inside TensorRT, preprocessing is done as follows:
(uint8_t x) { return (static_cast<float>(x) / 255.0); }
The input pixel is converted from unsigned char (uint8) to float and normalized by 255.0.
That corresponds to net-scale-factor=0.0039215697906911373 in my config file.

Where can I check that the same thing is implemented in DeepStream?

I can see only this launch in nvdsinfer_conversion.cu (line 208):

NvDsInferConvert_CxToP3FloatKernel <<<blocks, threadsPerBlock, 0, stream>>>
            (outBuffer, inBuffer, width, height, pitch, 3, scaleFactor);
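
From the name and call signature, my reading is that the kernel reads interleaved uint8 pixels from the pitched input buffer and writes scaled planar floats. A CPU sketch of that per-pixel math (my interpretation, not the actual DeepStream source):

// CPU sketch of what NvDsInferConvert_CxToP3FloatKernel appears to do per
// pixel: pitched interleaved uint8 in, planar float out, times scaleFactor.
void cxToP3FloatRef(float* out, const unsigned char* in,
                    unsigned int width, unsigned int height,
                    unsigned int pitch, unsigned int inputChannels,
                    float scaleFactor)
{
    for (unsigned int row = 0; row < height; row++)
        for (unsigned int col = 0; col < width; col++)
            for (unsigned int k = 0; k < 3; k++)
                out[k * width * height + row * width + col] =
                    scaleFactor * in[row * pitch + col * inputChannels + k];
}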

Hi,

Just to clarify first:

In general, a DeepStream pipeline includes a detector and a classifier.
Do you also feed the TensorRT pipeline with the same detector output?
Or do you run the DeepStream pipeline with only the sgie?

If the workflows are not aligned, it is hard to compare the output differences.

Thanks.

TensorRT was tested with cropped images: loaded with OpenCV in BGR, converted to RGB, normalized, then inferred.

DeepStream runs pgie detection first, crops the bounding boxes, and feeds them into the sgie.

My other query is whether I can implement customized preprocessing code.

Hi,

Yes, you can. The preprocessing is open source.
Please find it in this file:

/opt/nvidia/deepstream/deepstream-5.0/sources/libs/nvdsinfer/nvdsinfer_context_impl.cpp

Thanks.

I found two things to discuss.

DeepStream's color format is NvDsInferFormat_RGBA: four channels.
So the pgie's network input size is 1920x1080 and its inputPitch is 7680 (1920x4).

But the sgie's network input size is 94x24, so its inputPitch should be 376 (94x4).
From my print, I saw that inputPitch is 512 for the sgie. When I change it to 376, the output results make more sense.
The inputPitch for the pgie is the expected 7680 when I print it.
Is that a bug in DeepStream?

Also, can I confirm that image resizing is done only in streammux?
I have only one streammux, at the beginning of the DeepStream pipeline, so there would be no image resizing for the sgie to match the network input size. Is that true? If so, I need to write my own image resizing code for the sgie.

I printed from inside this loop:

for (unsigned int i = 0; i < batchSize; i++)
{
    float* outPtr = (float*)devBuf + i * m_NetworkInputLayer.inferDims.numElements;

    if (convertFcn) {
        std::cout << "convertFcn is " << 0 << " " << m_NetworkInfo.width << " "
                  << m_NetworkInfo.height << " " << m_Scale << " "
                  << batchInput.inputPitch << std::endl;
        /* Input needs to be pre-processed. */
        convertFcn(outPtr, (unsigned char*)batchInput.inputFrames[i],
            m_NetworkInfo.width, m_NetworkInfo.height, batchInput.inputPitch,
            m_Scale, m_MeanDataBuffer.get() ? m_MeanDataBuffer->ptr<float>() : nullptr,
            *m_PreProcessStream);
    } else if (convertFcnFloat) {
        /* Input needs to be pre-processed. */
        convertFcnFloat(outPtr, (float *)batchInput.inputFrames[i],
            m_NetworkInfo.width, m_NetworkInfo.height, batchInput.inputPitch,
            m_Scale, m_MeanDataBuffer.get() ? m_MeanDataBuffer->ptr<float>() : nullptr,
            *m_PreProcessStream);
    }
}

Hi, may I have a reply on this?

Hi, any reply on this?

Hi,

Sorry for the late update.

The 512 should come from a limitation of EGL mapping: the pitch must be a multiple of 256 and greater than or equal to 512.
We are checking this internally. Will update more information later.
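
In other words, the effective pitch follows a rule like this (a sketch of the constraint above, not the actual allocator code):

// Sketch of the pitch rule described above: round the row size up to a
// multiple of 256 bytes, with a minimum of 512. This reproduces both
// pitches printed earlier in this thread.
#include <algorithm>
#include <cstdio>

static unsigned int alignedPitch(unsigned int width, unsigned int bytesPerPixel)
{
    unsigned int pitch = ((width * bytesPerPixel + 255) / 256) * 256;
    return std::max(pitch, 512u);
}

int main()
{
    std::printf("sgie    94x24   RGBA: %u\n", alignedPitch(94, 4));    // 512
    std::printf("pgie 1920x1080  RGBA: %u\n", alignedPitch(1920, 4));  // 7680
    return 0;
}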

Thanks.

To reproduce this internally, could you share the model (including the 94x24 classifier), the config, and the sources with us?

Thanks.

Hi, thanks.
I sent a link to download all the files in a private message.
Can you please let me know when you have downloaded them?

Please let me know if you need any more clarification.

Can you please confirm that I need my own resizing for the sgie input, given that resizing is done only in streammux? Thanks.

Hi,

Thanks for sharing; the files are downloaded.
We are working on reproducing this and will share more information with you later.

There is a resizer in the nvinfer component that resizes the input data to the model size before inferencing.

Thanks.

Thanks for the reply.
I would like to confirm: I have only one resizer, in streammux, for the pgie.
streammux resizes the input image for the pgie.
For the sgie, there is no resizer.
So I need my own resizer for the sgie.
Is that true?

Hi,

streammux is independent of the resizer located in the nvinfer component.
The resizer in nvinfer runs inside the GStreamer pipeline, for both the pgie and the sgie.

Please check /opt/nvidia/deepstream/deepstream-5.0/sources/gst-plugins/gst-nvinfer/gstnvinfer.cpp:

/**
 * Calls the one of the required conversion functions based on the network
 * input format.
 */
static GstFlowReturn
get_converted_buffer (GstNvInfer * nvinfer, NvBufSurface * src_surf,
    NvBufSurfaceParams * src_frame, NvOSD_RectParams * crop_rect_params,
    NvBufSurface * dest_surf, NvBufSurfaceParams * dest_frame,
    gdouble & ratio_x, gdouble & ratio_y, void *destCudaPtr)
{
  guint src_left = GST_ROUND_UP_2 ((unsigned int)crop_rect_params->left);
  guint src_top = GST_ROUND_UP_2 ((unsigned int)crop_rect_params->top);
  guint src_width = GST_ROUND_DOWN_2 ((unsigned int)crop_rect_params->width);
  guint src_height = GST_ROUND_DOWN_2 ((unsigned int)crop_rect_params->height);
  guint dest_width, dest_height;

  ...

   /* Create temporary src and dest surfaces for NvBufSurfTransform API. */
  nvinfer->tmp_surf.surfaceList[nvinfer->tmp_surf.numFilled] = *src_frame;

  /* Set the source ROI. Could be entire frame or an object. */
  nvinfer->transform_params.src_rect[nvinfer->tmp_surf.numFilled] =
      {src_top, src_left, src_width, src_height};
  /* Set the dest ROI. Could be the entire destination frame or part of it to
   * maintain aspect ratio. */
  nvinfer->transform_params.dst_rect[nvinfer->tmp_surf.numFilled] =
      {0, 0, dest_width, dest_height};

  nvinfer->tmp_surf.numFilled++;

  return GST_FLOW_OK;
}
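
The dest_width/dest_height computation is elided above; conceptually, a maintain-aspect-ratio destination rectangle can be computed like this (a hypothetical sketch, not the actual gstnvinfer code):

// Hypothetical sketch (not the gstnvinfer source): fit the source crop into
// the network input while keeping the aspect ratio, as the "maintain aspect
// ratio" comment above describes.
#include <algorithm>

static void computeDestRect(unsigned int src_w, unsigned int src_h,
                            unsigned int net_w, unsigned int net_h,
                            unsigned int &dest_w, unsigned int &dest_h)
{
    double ratio = std::min((double) net_w / src_w, (double) net_h / src_h);
    dest_w = (unsigned int) (src_w * ratio);
    dest_h = (unsigned int) (src_h * ratio);
}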

Thanks.

I see, thanks. So nvinfer has its own resizing.

May I have a reply on this?

Hi,

Sorry, we need more time to check this.
We will update here once we have any progress.

Thanks.

May I have any update on this?
I have been working on this DeepStream development for nearly half a year already; I started in September and still have issues and cannot run it successfully in DeepStream.

Hi,

Sorry to keep you waiting.

We tried to reproduce this issue, but found that the pipeline you provided depends on a custom plugin implementation.
The plugin increases the complexity of investigating the issue.

To better find the root cause, we are trying to reproduce it with one primary GIE whose input size is smaller than 128.
However, it will take some time to clarify the problem.

We will share more information with you once we make progress.
Thanks and sorry for the inconvenience.

Thanks for the reply. When I saw your reply, I was happy that the problem will be solved. It is fine. Please let me know if you need anything from my side.

May I have any update on this?