TensorRT Python API inference is inconsistent with trtexec inference


I converted a TensorFlow model to TensorRT (via ONNX). I then ran the generated TRT engine with both the TensorRT Python API and the trtexec CLI, using the same input. The outputs produced by the two inference paths were inconsistent.


TensorRT Version: 8.5.2
Nvidia Driver Version: 516.94
CUDA Version: 11.7
Operating System + Version: Windows 10
Python Version (if applicable): 3.8.10
TensorFlow Version (if applicable): 2.11.0

Steps To Reproduce

  1. Clone the GitHub repo DwijayDS/TensorRT-consistency-issue (Reproducing TRT consistency issue).

  2. Convert the TensorFlow model to ONNX using tf2onnx. The TensorFlow model is defined in the file tf_model.py.

  3. Use trtexec to transform the ONNX model into a .trt file.

    trtexec --onnx=model.onnx --saveEngine=model.trt

  4. Run python_infer.py to run inference on the model and compare the outputs.

The generated log will be similar to

This shows that the outputs generated by the two inference methods are inconsistent.
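To put a number on the inconsistency, the two outputs can be compared element-wise against a tolerance. The following is only a minimal sketch, not the code from python_infer.py: the function name `compare_outputs`, the tolerance defaults, and the assumption that both outputs are available as flat sequences of floats are all hypothetical choices made for illustration.

```python
import math
from typing import Dict, Sequence, Union


def compare_outputs(
    reference: Sequence[float],
    candidate: Sequence[float],
    rtol: float = 1e-3,  # hypothetical default, tune for your precision mode
    atol: float = 1e-5,  # hypothetical default
) -> Dict[str, Union[float, int, bool]]:
    """Compare two flat output buffers element-wise.

    Both arguments are assumed to be flattened tensors of equal length,
    e.g. the Python API output and the trtexec output for the same input.
    """
    if len(reference) != len(candidate):
        raise ValueError(
            f"length mismatch: {len(reference)} vs {len(candidate)}"
        )

    max_abs_diff = 0.0
    worst_index = -1  # stays -1 when every element is identical
    mismatches = 0

    for i, (r, c) in enumerate(zip(reference, candidate)):
        diff = abs(r - c)
        if diff > max_abs_diff:
            max_abs_diff, worst_index = diff, i
        # math.isclose is symmetric: it passes when
        # |r - c| <= max(rtol * max(|r|, |c|), atol).
        if not math.isclose(r, c, rel_tol=rtol, abs_tol=atol):
            mismatches += 1

    return {
        "max_abs_diff": max_abs_diff,
        "worst_index": worst_index,
        "mismatches": mismatches,
        "consistent": mismatches == 0,
    }


if __name__ == "__main__":
    # Made-up values purely to show the call; not real model outputs.
    python_api_out = [0.10, 0.20, 0.70]
    trtexec_out = [0.10, 0.25, 0.65]
    print(compare_outputs(python_api_out, trtexec_out))
```

A NumPy array can be passed in after `array.ravel().tolist()`. A comparison like this is only meaningful if both runs really see the same input: as far as I know, trtexec generates random inputs unless they are supplied explicitly (for example through its `--loadInputs` option), so that is worth checking before reading too much into the differences.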

Let me know if you need any more details related to the issue.

Please refer to the installation steps at the link below in case you are missing anything.

We also suggest using the TRT NGC containers to avoid any system-dependency issues.