How to implement BatchNorm2d in TensorRT 5?

Ubuntu 16.04

python: 3.6.8
pytorch: 1.1.0 with cuda 10
tensorrt: 5.1.5
cuda: 10
cudnn: 7.5.0

I’m trying to convert a batch norm layer from PyTorch to TensorRT.

I found the Scale layer in TensorRT and decided to use it for the implementation.

The BatchNorm2d layer consists of two steps, as below (a quick PyTorch check of this decomposition follows the list).

  1. normalization (zero mean and unit variance)
    hat(x) = (x - E[x]) / sqrt(Var[x] + eps)

  2. affine transformation
    y = gamma * hat(x) + beta
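
For reference, here is a minimal sketch (mine, just for illustration) verifying in PyTorch that an eval-mode BatchNorm2d really is these two steps applied with the running statistics:

    import torch

    # eval mode so the running statistics (not batch statistics) are used
    bn = torch.nn.BatchNorm2d(3).eval()
    with torch.no_grad():  # give the layer non-trivial parameters/statistics
        bn.running_mean.normal_()
        bn.running_var.uniform_(0.5, 2.0)
        bn.weight.normal_()
        bn.bias.normal_()

    x = torch.randn(2, 3, 8, 8)

    # step 1: normalization with the running mean/var
    xhat = (x - bn.running_mean[None, :, None, None]) / torch.sqrt(
        bn.running_var[None, :, None, None] + bn.eps)
    # step 2: affine transformation with gamma (weight) and beta (bias)
    y = bn.weight[None, :, None, None] * xhat + bn.bias[None, :, None, None]

    assert torch.allclose(bn(x), y, atol=1e-5)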

So I implemented BatchNorm2d in TensorRT with two consecutive Scale layers (see the builder sketch after this list).

  1. the first layer does the normalization
    scale = 1 / sqrt(running_var + eps)
    shift = running_mean * -1.0
    power = 1.0

  2. the second layer does the affine transformation
    scale = gamma (weight in pytorch)
    shift = beta (bias in pytorch)
    power = 1.0
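
Roughly, my builder code looks like this (a sketch against the TensorRT Python API; the helper name and the way I pull parameters out of the torch module are just for illustration). Per the docs, an IScaleLayer computes (x * scale + shift) ** power, per channel in CHANNEL mode:

    import numpy as np
    import tensorrt as trt

    def add_bn_as_two_scales(network, input_tensor, bn):
        # `bn` is a torch.nn.BatchNorm2d in eval mode
        mean  = bn.running_mean.detach().cpu().numpy()
        var   = bn.running_var.detach().cpu().numpy()
        gamma = bn.weight.detach().cpu().numpy()
        beta  = bn.bias.detach().cpu().numpy()
        ones  = np.ones_like(mean, dtype=np.float32)

        # first Scale layer: the intended normalization
        scale1 = (1.0 / np.sqrt(var + bn.eps)).astype(np.float32)
        shift1 = (-1.0 * mean).astype(np.float32)
        norm = network.add_scale(input_tensor, trt.ScaleMode.CHANNEL,
                                 trt.Weights(shift1), trt.Weights(scale1),
                                 trt.Weights(ones))

        # second Scale layer: the affine transformation (gamma, beta)
        affine = network.add_scale(norm.get_output(0), trt.ScaleMode.CHANNEL,
                                   trt.Weights(beta.astype(np.float32)),
                                   trt.Weights(gamma.astype(np.float32)),
                                   trt.Weights(ones))
        return affine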

ok, done.

But… unfortunately I got different results from TensorRT than from PyTorch.
The L2 norm of the output difference for this layer is about 0.47, while for the other layers (convolution, pooling) it is about 1e-5.
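
(For reference, the difference is measured like this; l2_diff is just an illustrative helper fed the outputs of the same layer from both runtimes.)

    import numpy as np

    def l2_diff(torch_out, trt_out):
        # L2 norm of the elementwise difference between two layer outputs
        a = np.asarray(torch_out, dtype=np.float32).ravel()
        b = np.asarray(trt_out, dtype=np.float32).ravel()
        return float(np.linalg.norm(a - b))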

What is wrong with my implementation?