Differences in performance between onnx models in Pytorch and TensorRT


GPU model: Quadro P6000
OS: Ubuntu 18.04
TensorRT version:
CUDA: 10.0
Python: 3.6.7
ML framework: PyTorch 1.0.1
ONNX version: 1.4.1

I am trying to use TensorRT to accelerate feature extraction from my models, first in float32 and then in float16 and int8.
The models I use are mainly VGG, ResNet and DenseNet variants, but whether the code works depends on the
model/precision pair I pick. My workflow is the following:

Take a model from the torchvision library -> save it to .onnx -> load it in TensorRT and do inference.
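For reference, the export step of this workflow can be sketched roughly as follows. This is only a minimal sketch, not the code from the post: the function name export_to_onnx and the 1x3x224x224 input shape are assumptions (the shape is the usual one for torchvision ImageNet models), and the model is passed in by the caller, e.g. a torchvision ResNet.

```python
def export_to_onnx(model, onnx_path, input_shape=(1, 3, 224, 224)):
    """Export a PyTorch model to ONNX and return its output on an all-ones input.

    The name and the default input shape are assumptions for illustration.
    """
    import torch  # imported here so this module loads even without PyTorch

    model.eval()  # use inference behaviour for batchnorm and dropout layers
    dummy = torch.ones(*input_shape)  # the all-ones dummy input from the post
    with torch.no_grad():
        reference = model(dummy)  # PyTorch result to compare against later
    torch.onnx.export(model, dummy, onnx_path)
    return reference
```

Keeping the returned PyTorch output around makes it easy to compare against whatever TensorRT produces for the same input.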

  1. Float32:

VGG models work well: when I pass a dummy input of all 1s through both PyTorch and TensorRT, the outputs are essentially identical
(the difference is on the order of 10^-7).
With any of the ResNets or DenseNets, however, the outputs differ substantially. Why is that? I have read that it might be due to the batchnorm
layers, but how do I fix this?
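To put a number on "basically the same" versus "extremely different", a small comparison helper along these lines may help. It is a sketch under assumptions: the names compare_outputs and random_input and the tolerances are made up for illustration, and the outputs are assumed to be flattened to plain lists of floats first (for example with .flatten().tolist()). A seeded random input is included because a constant input such as all 1s may exercise less of the network than a varied one.

```python
import random


def compare_outputs(reference, candidate, atol=1e-5, rtol=1e-3):
    """Return (max_abs_diff, max_rel_diff, ok) for two flat sequences of floats."""
    reference = list(reference)
    candidate = list(candidate)
    if len(reference) != len(candidate):
        raise ValueError("outputs have different lengths")
    max_abs = 0.0
    max_rel = 0.0
    ok = True
    for r, c in zip(reference, candidate):
        diff = abs(r - c)
        max_abs = max(max_abs, diff)
        if r != 0.0:
            max_rel = max(max_rel, diff / abs(r))
        # Element-wise tolerance check: |r - c| <= atol + rtol * |r|
        if diff > atol + rtol * abs(r):
            ok = False
    return max_abs, max_rel, ok


def random_input(n, seed=0):
    """Reproducible pseudo-random values in [-1, 1] to feed to both frameworks."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]
```

Feeding the same random_input values to both PyTorch and TensorRT and then calling compare_outputs on the two results gives a single pass/fail answer plus the worst-case differences.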

  2. int8:

NOTE: My code is an adaptation of the int8_caffe sample provided in version

The provided sample works fine, so I adapted it to my needs by just changing the way the images are loaded and substituting the caffe_parser with the onnx one.

When I use my datasets with the ResNet50.onnx in TensorRT-, I can run the int8 calibration + inference just fine, and I get
a really good speedup (0.9 s in float32 vs 0.37 s in int8).

When I use my .onnx models exported from PyTorch, I build my own calibrator and add the following two lines to build_engine_onnx:

builder.int8_mode = True
builder.int8_calibrator = calibrator

Where calibrator is the int8 calibrator I build.
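For context, a build function with those two lines in place might look roughly like the sketch below. It uses the TensorRT 5-era Python API that the post's snippets suggest (builder.int8_mode, builder.build_cuda_engine); the function name build_int8_engine and the workspace size are assumptions. The explicit check of the parser result is an addition worth ruling out, not a confirmed diagnosis: if the ONNX parse fails without being noticed, the builder is handed an incomplete network.

```python
def build_int8_engine(onnx_path, calibrator, max_workspace_size=1 << 30):
    """Parse an ONNX file and build an int8 engine, failing loudly on parse errors.

    Sketch only: name and workspace size are assumptions for illustration.
    """
    import tensorrt as trt  # imported here so this module loads without TensorRT

    logger = trt.Logger(trt.Logger.WARNING)
    with trt.Builder(logger) as builder, \
            builder.create_network() as network, \
            trt.OnnxParser(network, logger) as parser:
        with open(onnx_path, "rb") as f:
            # parse() returns False on failure; collect the parser's errors.
            if not parser.parse(f.read()):
                errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
                raise RuntimeError("ONNX parse failed: " + "; ".join(errors))
        if network.num_layers == 0:
            raise RuntimeError("parsed network contains no layers")
        builder.max_workspace_size = max_workspace_size
        builder.int8_mode = True
        builder.int8_calibrator = calibrator
        return builder.build_cuda_engine(network)
```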

By doing so, I get the following error:

ERROR: Calibration failure occured with no scaling factors detected. This could be due to no int8 calibrator or insufficient custom scales for network layers. Please see int8 sample to setup calibration correctly.

Which I traced back to this line:
return builder.build_cuda_engine(network)

So the code somehow does not detect my calibrator, which is strange, since I can create an instance just fine.
Indeed, if I write:
I get
AttributeError: unreadable attribute
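A note on this message: it is what Python raises when reading a property that has a setter but no getter, so on its own it may not mean the calibrator failed to attach. The sketch below reproduces the behaviour in plain Python. FakeBuilder is a hypothetical stand-in, and whether TensorRT's int8_calibrator is really defined as a write-only property is an assumption, though the error text is consistent with it.

```python
class FakeBuilder:
    """Hypothetical stand-in for a builder with a write-only attribute."""

    def __init__(self):
        self._calibrator = None

    def _set_calibrator(self, value):
        self._calibrator = value

    # A property with a setter but no getter: assignment works, reading raises.
    int8_calibrator = property(fget=None, fset=_set_calibrator)


builder = FakeBuilder()
builder.int8_calibrator = "my calibrator"  # assignment succeeds silently

try:
    builder.int8_calibrator  # attempt to read the value back
    read_failed = False
except AttributeError:
    read_failed = True  # reading a setter-only property raises AttributeError
```

On the Python 3.6 listed above, the exception text for this case is "unreadable attribute"; newer Python versions word it differently.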

Why is this? I only changed the way I load the images (they are .jpg), but I suspect the error might be in the .onnx generation, since I cannot even attach the calibrator to the engine.

I can share a minimal working example of these issues. If you need it, I can also include the calibrator, but it is basically a copy of the one provided in the (MNISTEntropyCalibrator). I even tried using that one, even though my dataset is not MNIST, and I get the same error.

Thank you in advance for your help.

Same problem. By the way, with an all-zeros tensor, both TensorRT and PyTorch produce the same output. So I think all the weights are loaded correctly, but not all operations behave the same way during inference. Is there any solution to this problem?