ConvTranspose2d output dimensions differ between PyTorch/ONNX and TensorRT

I’m running into an issue when exporting a model from PyTorch (version 1.0.0) to ONNX and then importing it into TensorRT (version 5.0.3 in JetPack 4.1.1).

In PyTorch/ONNX, the ConvTranspose2d layer (with parameters: kernel size = 3, stride = 2, padding = 1, output padding = 1, dilation = 1; input tensor dimensions 1 x 256 x 16 x 32) produces output tensors with dimensions 1 x 256 x 32 x 64 (the desired size).

When imported into TensorRT, the same layer produces output tensors with dimensions 1 x 256 x 31 x 63.

Looking at PyTorch’s documentation (https://pytorch.org/docs/stable/nn.html#torch.nn.ConvTranspose2d), the given formula for the output dimension (e.g., for height) is: (H_in - 1) * stride - 2 * padding + kernel_size + output_padding. Plugging in our parameters gives 32.

Looking at TensorRT’s documentation on the deconvolution layer (https://docs.nvidia.com/deeplearning/sdk/pdf/TensorRT-Developer-Guide.pdf, section A.1.5), the given formula for the output dimension is: (H_in - 1) * stride + t - 2 * padding, where t = kernel_size + dilation * (kernel_size - 1). There are a few confusing bits here:

  • Plugging in our parameters doesn’t actually give 31, but 33. Does the padding in this formula sum both the input padding and output_padding?
  • This differs from the PyTorch formula only in the last term: PyTorch adds output_padding, and TensorRT adds dilation * (kernel_size - 1) instead.
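
To make the arithmetic above concrete, here is a minimal sketch that evaluates both documented formulas for our layer. The function names are my own (hypothetical helpers, not part of PyTorch or TensorRT); the formulas are the ones quoted from the two docs.

```python
# Hypothetical helpers that only evaluate the formulas quoted from the docs.

def pytorch_doc_out(size_in, kernel_size, stride, padding, output_padding):
    # PyTorch docs: (H_in - 1) * stride - 2 * padding + kernel_size + output_padding
    return (size_in - 1) * stride - 2 * padding + kernel_size + output_padding


def tensorrt_doc_out(size_in, kernel_size, stride, padding, dilation):
    # TensorRT developer guide: (H_in - 1) * stride + t - 2 * padding,
    # with t = kernel_size + dilation * (kernel_size - 1)
    t = kernel_size + dilation * (kernel_size - 1)
    return (size_in - 1) * stride + t - 2 * padding


# Layer parameters from this post; the input is 16 high and 32 wide.
params = dict(kernel_size=3, stride=2, padding=1)

for name, size_in in (("height", 16), ("width", 32)):
    pt = pytorch_doc_out(size_in, output_padding=1, **params)
    trt = tensorrt_doc_out(size_in, dilation=1, **params)
    print(f"{name}: PyTorch docs -> {pt}, TensorRT docs -> {trt}")
```

This gives 32 x 64 for the PyTorch formula and 33 x 65 for the TensorRT one, so neither documented formula explains the 31 x 63 that TensorRT actually produces.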

Any thoughts on how we can get these two APIs to output the same dimensions here, and why the TensorRT dimension is not as expected?

Thanks for taking a look.

Hello,

We are reviewing and will keep you updated.

Great, thanks for the prompt response.

One addition: I looked at the PyTorch source code to see what happened to the dilation term in their dimension equation, and it looks like their docs are out of date (though the equation above still holds in our case, where dilation = 1). The equation in the source is:

(H_in - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1

So the difference between this equation and the one given in the TensorRT docs is the “output_padding + 1” term vs. the added kernel_size term.
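
A small sketch of that comparison, assuming the equation quoted from the source above (the helper names are hypothetical): the documented and source formulas agree at dilation = 1 and diverge once dilation > 1.

```python
# Hypothetical helpers; they only evaluate the two equations quoted in this thread.

def pytorch_doc_out(size_in, kernel_size, stride, padding, output_padding):
    # Formula from the PyTorch docs quoted earlier (no dilation term).
    return (size_in - 1) * stride - 2 * padding + kernel_size + output_padding


def pytorch_source_out(size_in, kernel_size, stride, padding, output_padding, dilation):
    # Formula read from the PyTorch source (includes dilation).
    return ((size_in - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


common = dict(kernel_size=3, stride=2, padding=1, output_padding=1)

for dilation in (1, 2):
    doc = pytorch_doc_out(16, **common)
    src = pytorch_source_out(16, dilation=dilation, **common)
    print(f"dilation={dilation}: docs -> {doc}, source -> {src}")
```

With dilation = 1 both give 32, because dilation * (kernel_size - 1) + 1 reduces to kernel_size. With dilation = 2 the source formula gives 34 while the docs formula still gives 32.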

I’m running into this issue as well, with the same PyTorch => ONNX => TensorRT path. I updated my model to avoid any ConvTranspose2d with padding, but I’d love to see a proper fix.

We are still reviewing, but per NvInfer.h, it looks like the formula TRT uses is:

(H_in - 1) * stride + kernelSize - 2 * padding

Additionally, it doesn’t seem like we even support dilations in transposed convolutions.
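
For what it's worth, plugging the parameters from the original post into the NvInfer.h formula does reproduce the observed 31 x 63, and the gap to the PyTorch result is exactly output_padding. A minimal sketch (the function names are hypothetical, not TensorRT or PyTorch APIs):

```python
# Hypothetical helpers that evaluate the formulas quoted in this thread.

def trt_nvinfer_out(size_in, kernel_size, stride, padding):
    # Formula per NvInfer.h: (H_in - 1) * stride + kernelSize - 2 * padding
    return (size_in - 1) * stride + kernel_size - 2 * padding


def pytorch_out(size_in, kernel_size, stride, padding, output_padding, dilation=1):
    # Formula from the PyTorch source quoted earlier in the thread.
    return ((size_in - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


output_padding = 1
for name, size_in in (("height", 16), ("width", 32)):
    trt = trt_nvinfer_out(size_in, kernel_size=3, stride=2, padding=1)
    pt = pytorch_out(size_in, kernel_size=3, stride=2, padding=1,
                     output_padding=output_padding)
    print(f"{name}: TensorRT -> {trt}, PyTorch -> {pt}, difference -> {pt - trt}")
```

The difference is 1 in both dimensions, which matches output_padding = 1. That is consistent with the output padding not being applied on the TensorRT side, though this sketch alone cannot confirm where in the import path it is lost.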

Hello,

Per engineering, can you share a copy of the model so we can see whether the ConvTranspose2D layer is working in 5.1?

You can DM me.

Thanks, sent via DM.

Hi there, wanted to see if you all were able to reproduce the behavior we’re seeing?

Thanks!

Hello,

Our engineers are still triaging. We will keep you updated.

Hello,

Our engineers believe this has been addressed in the latest TRT. Can you please verify that your model parses and runs fine with TensorRT 5.0.2.6?

No, it isn’t solved.

Is it solved?

No, there is still a mismatch, even in the latest 5.1.5 version.

There is still a mismatch, even in version 5.1.6 in JetPack 4.2.1.