Description
I am investigating a precision problem in a GPT-2 model, using Polygraphy to debug it. I am not using reduced precision, and I set --strict-types. I also pass --onnx-outputs mark all and --trt-outputs mark all so that the output of every layer is compared. The absolute difference first becomes nonzero at a ReduceMean layer and grows in the layers after it, eventually exceeding the tolerance.
My guess is that the difference comes from a different order of addition inside the reduction. Is there any way to make ReduceMean produce exactly the same output in both runtimes? I expect the output of the network to be exactly the same.
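To illustrate the summation-order guess, here is a minimal sketch in plain Python. It emulates float32 rounding with struct and computes the same mean with two accumulation orders. The helper names and the input values are made up for the illustration (the values are deliberately extreme), and this does not claim to reflect how TensorRT or ONNX Runtime actually implement ReduceMean.

```python
import struct

def f32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sequential_sum_f32(values):
    """Left-to-right accumulation, rounding to float32 after every add."""
    total = 0.0
    for v in values:
        total = f32(total + f32(v))
    return total

def pairwise_sum_f32(values):
    """Tree-style accumulation, rounding to float32 after every add."""
    if not values:
        return 0.0
    if len(values) == 1:
        return f32(values[0])
    mid = len(values) // 2
    return f32(pairwise_sum_f32(values[:mid]) + pairwise_sum_f32(values[mid:]))

# Hypothetical inputs, chosen so that the effect is easy to see.
values = [1e8, 1.0, -1e8, 1.0]

seq_mean = f32(sequential_sum_f32(values) / len(values))
tree_mean = f32(pairwise_sum_f32(values) / len(values))

print(seq_mean, tree_mean)  # prints "0.25 0.0"; the exact mean is 0.5
```

Both orders are valid float32 computations of the same mean, yet they disagree. With realistic activations the gap would be far smaller, but it would generally still not be zero.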
Environment
TensorRT Version: 7.2.3.4
GPU Type: T4
Nvidia Driver Version: 460.73.01
CUDA Version: 10.2
CUDNN Version: 8.1.0
Operating System + Version: RedHat
Python Version (if applicable): 3.8
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):
Relevant Files
Here is part of my Polygraphy log output:
[I] Comparing Output: '231' (dtype=float32, shape=(1, 128, 1)) with '231' (dtype=float32, shape=(1, 128, 1)) | Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
…
[I] Absolute Difference | Stats: mean=1.2617e-10, std-dev=1.1599e-10, var=1.3454e-20, median=1.1642e-10, min=0 at (0, 1, 0), max=4.6566e-10 at (0, 42, 0)
Tensor 231 is the output of a ReduceMean layer.
Later, at a MatMul layer, the accumulated difference exceeds the 1e-05 tolerance:
[I] Absolute Difference | Stats: mean=8.9681e-06, std-dev=1.4915e-05, var=2.2244e-10, median=3.8147e-06, min=0 at (0, 0, 0, 0), max=0.00048828 at (0, 5, 116, 115)
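One way to read the two maxima in the log above: 4.6566e-10 is approximately 2^-31 and 0.00048828 is approximately 2^-11. That would be consistent with the two runtimes differing by a single float32 unit in the last place (ULP) at those elements, although the log does not show the tensor values, so this is only an assumption. The sketch below computes the float32 ULP at a given magnitude. The helper f32_ulp and the sample magnitudes 0.005 and 4096.0 are hypothetical, picked only because they fall in the ranges where one ULP has those sizes.

```python
import struct

def f32_ulp(x):
    """Gap between float32(x) and the next float32 further from zero (finite, nonzero x)."""
    bits = struct.unpack("<I", struct.pack("<f", abs(x)))[0]
    here = struct.unpack("<f", struct.pack("<I", bits))[0]
    nxt = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return nxt - here

# Hypothetical magnitudes, not values taken from the model.
print(f32_ulp(0.005) == 2.0 ** -31)   # True: one ULP for values in [2^-8, 2^-7)
print(f32_ulp(4096.0) == 2.0 ** -11)  # True: one ULP for values in [4096, 8192)
```

If the MatMul outputs really are in the thousands, an absolute tolerance of 1e-05 is smaller than one float32 ULP there, so the relative error may be the more meaningful number to look at.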
Steps To Reproduce
My command:
polygraphy run to_onnx/gpt.onnx --model-type onnx --onnxrt --trt -v --input-shapes input:[1,128] seg:[1,128] mask:[1,1,128,128] --int-min 0 --int-max 20000 --float-min -10000 --float-max 0 --val-range input:[0,20000] seg:[1,2] mask:[-10000.0,0] --onnx-outputs mark all --trt-outputs mark all --log-file compare.log --strict-types