However, when I run inference with the PyTorch quantized model and compare it to the onnxruntime result, I get completely different outputs.
I attach DepthNet code.txt, which describes the forward function of the model, onnx_creation.txt, which contains the code that creates the ONNX file, and the comparison between the PyTorch and onnxruntime results.
The original PyTorch quantized model and the ONNX file exported from it are also attached.
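For context, a minimal sketch of the kind of export code contained in onnx_creation.txt might look as follows; the file names, input shape, and opset version below are placeholders and assumptions, not the actual values from the attached files:

import torch

model = torch.jit.load("depthnet_nvda.pt")    # assumed: the quantized model saved as TorchScript
model.eval()
dummy_input = torch.randn(1, 3, 256, 512)     # assumed input shape
torch.onnx.export(
    model,
    dummy_input,
    "quantized_depthnet.onnx",
    opset_version=13,                         # assumed; quantized ops need a recent opset
    input_names=["input"],
    output_names=["depth"],
)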
Hi,
Request you to share the ONNX model and the script, if not shared already, so that we can assist you better.
In the meantime, you can try a few things:
1) Validate your model with the snippet below:
check_model.py
import onnx

filename = "your_model.onnx"   # replace with the path to your ONNX model
model = onnx.load(filename)
onnx.checker.check_model(model)
2) Try running your model with the trtexec command: https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec
In case you are still facing the issue, request you to share the trtexec "--verbose" log for further debugging.
Thanks!
I composed a PyCharm project which loads the quantized model, exports it to ONNX, loads the ONNX model, and produces outputs from torch and onnxruntime, comparing them (only the 'depth' output is compared).
The 4Nvda folder contains the depthnet_nvda.pt quantized model and the quantized_depthnet.onnx exported from it.
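A minimal sketch of that comparison might look like the hypothetical compare_outputs.py below; the file paths, input shape, and the position of the 'depth' output are assumptions, and the attached project is the authoritative version:

compare_outputs.py
import numpy as np
import torch
import onnxruntime as ort

model = torch.jit.load("4Nvda/depthnet_nvda.pt")   # assumed: the quantized model is a TorchScript file
model.eval()

x = torch.randn(1, 3, 256, 512)                    # assumed input shape

with torch.no_grad():
    out = model(x)
torch_depth = out[0] if isinstance(out, (tuple, list)) else out   # assuming 'depth' is the first output

sess = ort.InferenceSession("4Nvda/quantized_depthnet.onnx")
input_name = sess.get_inputs()[0].name
ort_depth = sess.run(None, {input_name: x.numpy()})[0]

print("max abs diff:", np.abs(torch_depth.numpy() - ort_depth).max())
print("allclose:", np.allclose(torch_depth.numpy(), ort_depth, rtol=1e-3, atol=1e-5))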