I am not getting correct results from the secondary gie's output, and I suspect the preprocessing part.
Inference with standalone TensorRT works well.
The preprocessing of the input image in my TensorRT code is as follows.
Dims NumPlateRecognition::loadJPEGFile(std::vector<std::string> fileName, int num)
{
    // NHWC dims: num x 24(h) x 94(w) x 3(c)
    Dims4 inputDims{num, 24, 94, 3};
    Dims4 inputDims_1img{1, 24, 94, 3};
    const size_t vol = samplesCommon::volume(inputDims);
    const size_t vol_1img = samplesCommon::volume(inputDims_1img);
    unsigned char *data = new unsigned char[vol];

    for (int f = 0; f < num; f++) {
        cv::Mat image, im_rgb;
        image = cv::imread(fileName[f], cv::IMREAD_COLOR);   // BGR, expected to already be 24x94
        cv::cvtColor(image, im_rgb, cv::COLOR_BGR2RGB);
        image.release();
        memcpy(data + (f * vol_1img), im_rgb.ptr<unsigned char>(), vol_1img);
        im_rgb.release();
    }

    // Normalize the whole batch once: uint8 [0,255] -> float [0,1]
    mInput.hostBuffer.resize(inputDims);
    float* hostDataBuffer = static_cast<float*>(mInput.hostBuffer.data());
    std::transform(data, data + vol, hostDataBuffer,
                   [](uint8_t x) { return static_cast<float>(x) / 255.0f; });

    delete[] data;
    return inputDims;
}
The preprocessing steps are: (1) resize the input image to 24 (h) x 94 (w); (2) normalize with [](uint8_t x) { return static_cast<float>(x) / 255.0; }.
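For reference, this is a minimal per-image sketch of those two steps with the resize made explicit (the helper name preprocessOneImage is mine, and it assumes the source image is not already 24x94):

#include <opencv2/opencv.hpp>
#include <string>

// Hypothetical helper: load, convert BGR->RGB, resize to 94(w) x 24(h),
// and normalize uint8 [0,255] -> float [0,1], keeping HWC layout.
cv::Mat preprocessOneImage(const std::string& path)
{
    cv::Mat bgr = cv::imread(path, cv::IMREAD_COLOR);
    cv::Mat rgb, resized, norm;
    cv::cvtColor(bgr, rgb, cv::COLOR_BGR2RGB);
    cv::resize(rgb, resized, cv::Size(94, 24));        // cv::Size is (width, height)
    resized.convertTo(norm, CV_32FC3, 1.0 / 255.0);    // same as x / 255.0 per pixel
    return norm;                                       // 24 x 94 x 3, float32
}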
I looked at the preprocessing part in DeepStream inside this function:
NvDsInferStatus InferPreprocessor::transform(NvDsInferContextBatchInput& batchInput,
    void* devBuf, CudaStream& mainStream, CudaEvent* waitingEvent)
{
    ...
}
My configuration file dstest2_sgie1_config.txt contains:
infer-dims=24;94;3
net-scale-factor=0.0039215697906911373
model-color-format=0
I am trying to make sure DeepStream does the same preprocessing as my TensorRT implementation. My queries are as follows.
(1) Since this processing happens in the sgie, the detection outputs (object crops) from the pgie need to be resized to 24x94x3.
The conversion function used in my case is convertFcn = NvDsInferConvert_C3ToP3Float;.
I only found the API for it; the actual implementation is in the CUDA file nvdsinfer_conversion.cu.
So is resizing done for the sgie input?
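One way I plan to check this myself (a debug sketch of my own, assuming batchInput.inputFrames[i] is a device pointer to a packed frame of height * inputPitch bytes) is to dump the buffer handed to convertFcn and inspect its dimensions offline:

#include <cuda_runtime.h>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical debug helper: copy a device buffer to the host and write it to a raw file.
static void dumpDeviceBuffer(const void* devPtr, size_t bytes, const char* path)
{
    std::vector<unsigned char> host(bytes);
    if (cudaMemcpy(host.data(), devPtr, bytes, cudaMemcpyDeviceToHost) != cudaSuccess)
        return;
    if (FILE* f = std::fopen(path, "wb")) {
        std::fwrite(host.data(), 1, bytes, f);
        std::fclose(f);
    }
}

// Example call site (inside the transform loop shown below, illustrative only):
// dumpDeviceBuffer(batchInput.inputFrames[i],
//                  (size_t)m_NetworkInfo.height * batchInput.inputPitch,
//                  "/tmp/sgie_input_frame.raw");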
(2) I printed the values used in the following block inside nvdsinfer_context_impl.cpp (lines 405-412):
if (convertFcn) {
    std::cout << "convertFcn is " << 0 << " " << m_NetworkInfo.width << " "
              << m_NetworkInfo.height << " " << m_Scale << " "
              << batchInput.inputPitch << std::endl;
    /* Input needs to be pre-processed. */
    convertFcn(outPtr, (unsigned char*)batchInput.inputFrames[i],
               m_NetworkInfo.width, m_NetworkInfo.height, batchInput.inputPitch,
               m_Scale, m_MeanDataBuffer.get() ? m_MeanDataBuffer->ptr<float>() : nullptr,
               *m_PreProcessStream);
}
The printed network input size and scale are correct, but the pitch is 512 for the sgie input size (94 x 24) and 7680 for the pgie input size (1920 x 1080).
How is the pitch calculated?
I understood pitch to be width-based, so I expected 94 * 3 = 282.
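My current guess (an assumption on my side, not taken from the source) is that the pitch is the row size in bytes rounded up to an alignment boundary chosen by the buffer allocator, so it can be larger than width * channels:

#include <cstddef>

// Hypothetical illustration: a pitched allocation rounds each row up to an alignment boundary.
static size_t alignedPitch(size_t rowBytes, size_t alignment)
{
    return ((rowBytes + alignment - 1) / alignment) * alignment;
}

// alignedPitch(94 * 3, 512)   -> 512  (282 bytes of pixel data + padding)
// alignedPitch(1920 * 4, 512) -> 7680 (which would also match a 4-byte-per-pixel RGBA row with no padding)

Is that the right way to think about the pitch values I am seeing?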
(3) Since the sgie was trained in TensorFlow, its data format is NHWC. Does that matter?
I checked nvinfer1::PluginFormat and it does not have a kNHWC value, so the plugin layer (the last layer of the sgie) uses nvinfer1::PluginFormat::kLINEAR as its data format.
Is that OK?
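For context, this is the layout difference I am worried about, written as plain index math (a generic illustration, not DeepStream or TensorRT code): the TensorFlow-trained model expects interleaved HWC, while a planar ("P3") buffer is channel-first.

#include <cstddef>

// Interleaved HWC (what the TensorFlow-trained model expects): R,G,B,R,G,B,...
static size_t hwcIndex(size_t y, size_t x, size_t c, size_t width, size_t channels)
{
    return (y * width + x) * channels + c;
}

// Planar CHW: all R values, then all G, then all B.
static size_t chwIndex(size_t y, size_t x, size_t c, size_t width, size_t height)
{
    return c * width * height + y * width + x;
}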
(4) Inside TensorRT, preprocessing is done by the lambda [](uint8_t x) { return static_cast<float>(x) / 255.0; }: the uint8 input pixel is cast to float and normalized by 255.0.
That factor is the net-scale-factor in my config file, net-scale-factor=0.0039215697906911373.
Where can I check that the same thing is implemented in DeepStream?
I can see only this line in nvdsinfer_conversion.cu (line 208):
NvDsInferConvert_CxToP3FloatKernel<<<blocks, threadsPerBlock, 0, stream>>>(
    outBuffer, inBuffer, width, height, pitch, 3, scaleFactor);
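Reading only the argument list of that launch, my understanding (an assumption, not the actual DeepStream source) is that the kernel does the packed-uint8-to-planar-float conversion and the multiplication by scaleFactor in one pass, conceptually something like this CPU reference:

#include <cstddef>
#include <cstdint>

// CPU sketch of a CxToP3Float-style conversion: packed uint8 input with a row pitch
// -> planar float output, scaled by scaleFactor. Assumed semantics, illustration only.
static void cxToP3FloatRef(float* out, const uint8_t* in,
                           unsigned width, unsigned height, unsigned pitch,
                           unsigned inputChannels, float scaleFactor)
{
    for (unsigned c = 0; c < 3; ++c)
        for (unsigned y = 0; y < height; ++y)
            for (unsigned x = 0; x < width; ++x)
                out[(size_t)c * width * height + (size_t)y * width + x] =
                    scaleFactor * in[(size_t)y * pitch + (size_t)x * inputChannels + c];
}

If that reading is correct, the scaleFactor here would correspond to my net-scale-factor, i.e. the same x / 255.0 normalization as in my TensorRT code.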