How to set the correct config for a PyTorch model in nvinfer?

Hello,
I have trained a Resnet classifier in PyTorch and exported to an ONNX model.

The FC layer of Resnet18 is set to:

model.fc = nn.Sequential(
        nn.Linear(model.fc.in_features, args.n_classes, bias=False),
        nn.Softmax(1),
    )

These are the normalization values I have used.

normalize = transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225])

The model accuracy goes up to 80%, but when I infer (using nvinfer) on the exact same images used in training, the result is very different.

I have used the following offsets and scale factor:


# RGB, torchvision = 255*[0.485;0.456;0.406]
offsets=123.675;116.28;103.53

maintain-aspect-ratio=1

#net-scale-factor=0.003921569
net-scale-factor=0.01735207357

With that in mind, I wanted to know

  1. How do I set the correct offsets and net-scale-factor to match my training normalization values?
  2. Is there any place where I can find examples of how nvinfer does the asymmetric padding when maintain-aspect-ratio=1?
  3. Is there any official documentation on using a PyTorch model as an SGIE in DeepStream?

Thanks.

Hi,

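1. The preprocessing equation used in PyTorch is y = (x - mean) / std.
In DeepStream, it is y' = net-scale-factor * (x' - mean').
Here x' is the raw pixel value in the 0-255 range, whereas x in PyTorch is typically scaled to 0-1 before Normalize (for example by ToTensor).
So please set mean' = 255 * mean (the offsets property) and net-scale-factor = 1 / (255 * std).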

One problem is that net-scale-factor is a single value, so channel-wise std normalization is not supported.
You can either use the average std value or update the source code below:

/opt/nvidia/deepstream/deepstream-5.0/sources/libs/nvdsinfer/nvdsinfer_context_impl.cpp
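
As a rough illustration of point 1, here is a minimal Python sketch that converts torchvision Normalize values into an offsets line and a single net-scale-factor using the average std. It assumes the training images were scaled to 0-1 before Normalize (e.g. by ToTensor) and that nvinfer sees 0-255 pixel values. The function names are hypothetical helpers, not part of DeepStream or PyTorch.

```python
# Hypothetical helpers: map torchvision Normalize(mean, std) to nvinfer settings.
# Assumption: PyTorch input is in [0, 1]; nvinfer input pixels are in [0, 255].

def nvinfer_preprocess_params(mean, std):
    """Return (offsets, net_scale_factor) using the average std across channels."""
    offsets = [255.0 * m for m in mean]
    avg_std = sum(std) / len(std)
    net_scale_factor = 1.0 / (255.0 * avg_std)
    return offsets, net_scale_factor

def torch_normalize(pixel, mean, std):
    """Training-side normalization for one channel: (x - mean) / std with x in [0, 1]."""
    return (pixel / 255.0 - mean) / std

def nvinfer_normalize(pixel, offset, scale):
    """nvinfer-side normalization for one channel: net-scale-factor * (x' - mean')."""
    return scale * (pixel - offset)

MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

offsets, scale = nvinfer_preprocess_params(MEAN, STD)
print("offsets=" + ";".join(f"{o:.3f}" for o in offsets))
print(f"net-scale-factor={scale:.11f}")

# Worst-case gap per channel caused by replacing per-channel std with the average.
max_gap = max(
    abs(torch_normalize(p, m, s) - nvinfer_normalize(p, o, scale))
    for m, s, o in zip(MEAN, STD, offsets)
    for p in (0, 255)
)
print(f"max difference vs. PyTorch normalization: {max_gap:.4f}")
```

With these particular std values the channels are close to each other, so the averaged scale factor stays near the per-channel result; for models with widely differing std values the gap would be larger.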

2. Do you mean the asymmetric padding in the network architecture, such as in a conv layer?
If yes, the network definition should be identical to the one used in the training framework.

3. Below is a sample for an ONNX model.
Although it is used as a PGIE, you can follow the same approach for an SGIE.

Thanks.

Hi Aasta,

Thank you for your quick reply. I will take a look at point 1 and 3.

For asymmetric padding I want to know what kind of input is being sent to the model.
In the documentation of nvinfer’s properties it is mentioned “Indicates whether to maintain aspect ratio while scaling input. DeepStream currently does asymmetric padding only.” for maintain-aspect-ratio

Does that mean that another layer is added which does the padding? Or is it a pre-processing step done in nvinfer before the object is sent to the model? If so, is there any way to see how the image looks after the padding is added?
I simply want to verify whether the input image to nvinfer (after padding) matches my preprocessing when training in PyTorch.

Thanks

Hi,

There are two different kinds of padding.

In DeepStream, padding is applied when the input data is scaled to fit the network's input buffer.
In this use case, the related configuration property is maintain-aspect-ratio.
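
To compare this pre-processing with a training pipeline, the geometry can be sketched in Python. This is only an approximation under stated assumptions: that "asymmetric" means the scaled image is placed at the top-left and the remaining area is padded on the right and bottom, and that sizes are rounded down. The function name is hypothetical and the exact rounding and pad value used by nvinfer are not confirmed here.

```python
# Hypothetical helper: geometry of an aspect-ratio-preserving resize with
# right/bottom (asymmetric) padding. Assumption: image anchored at top-left,
# scaled dimensions rounded down.

def letterbox_geometry(src_w, src_h, net_w, net_h):
    """Return (new_w, new_h, pad_right, pad_bottom) for the scaled image."""
    # Integer arithmetic avoids floating-point rounding surprises.
    if src_w * net_h >= src_h * net_w:
        # Width is the limiting dimension.
        new_w = net_w
        new_h = src_h * net_w // src_w
    else:
        # Height is the limiting dimension.
        new_h = net_h
        new_w = src_w * net_h // src_h
    return new_w, new_h, net_w - new_w, net_h - new_h

# Example: a 1920x1080 frame fed to a 224x224 network input.
print(letterbox_geometry(1920, 1080, 224, 224))
```

Applying the same resize and right/bottom padding in the PyTorch data loader is one way to check whether the training input matches what the model is likely to receive in DeepStream.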

There is also layer-level padding, which is implemented by TensorRT.
You can find some information below:

Thanks.